Shocking News about Claude Opus 4.7! Token Consumption Soars by Up to 47% with New Tokenizer
📰 News Overview
- Increased Token Consumption: The new tokenizer in Claude Opus 4.7 has been found to consume 1.0 to 1.35 times as many tokens as its predecessor (4.6), with technical documents seeing increases of up to 1.47 times.
- Disparity by Language and Format: While English technical documents (1.47 times) and code (1.21 to 1.39 times) see a significant uptick in consumption, CJK languages like Japanese and Chinese show minimal impact at just 1.01 times.
- Improved Accuracy: Anthropic aims to enhance “Instruction Following” and tool-invocation precision by splitting text into finer-grained tokens.
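The “finer splits, more tokens” effect described above can be illustrated with a toy greedy tokenizer. This is a simplification of real BPE inference, and the two vocabularies below are invented for illustration only; they are not Anthropic’s actual merge tables.

```python
def tokenize(text, vocab):
    """Greedy longest-match tokenization against a vocabulary.
    Single characters always tokenize, mimicking a byte-level fallback."""
    tokens = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab or j == i + 1:
                tokens.append(piece)
                i = j
                break
    return tokens

# Hypothetical vocabularies: the "old" one keeps long code merges,
# the "new" one splits the same strings into finer pieces.
old_vocab = {"import ", "numpy", "def ", "return "}
new_vocab = {"imp", "ort ", "num", "py", "def ", "return "}

src = "import numpy"
old = tokenize(src, old_vocab)   # ["import ", "numpy"] -> 2 tokens
new = tokenize(src, new_vocab)   # ["imp", "ort ", "num", "py"] -> 4 tokens
print(f"old: {len(old)}, new: {len(new)}, ratio: {len(new) / len(old):.2f}x")
```

The same input text yields a higher token count under the finer vocabulary, which is the mechanism behind the multipliers quoted above.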
💡 Key Points
- Real Cost Increase: Pricing and quotas are unchanged, but more tokens consumed per request means faster context-window exhaustion and earlier rate-limit hits.
- Direct Hit on Technical Documents and Code: High-frequency strings in code, such as keywords and import statements, now appear to split into more tokens, which is suspected to be a key driver of the increased consumption.
- Benchmark Results: Tests using IFEval show that while version 4.6 had an 85% success rate, version 4.7 has improved to nearly 90%. This suggests a trade-off between cost and accuracy.
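Since per-token pricing is unchanged, the practical hit shows up as a smaller effective context budget. A back-of-envelope sketch using the multipliers quoted above — note that the 200K window size and the 1.30 midpoint for code are my own assumed example values, not figures from the article:

```python
# Multipliers quoted in the article; "code" uses the midpoint of 1.21-1.39.
RATIOS = {"english_docs": 1.47, "code": 1.30, "cjk": 1.01}

def effective_budget(context_window: int, ratio: float) -> int:
    """How much 4.6-equivalent text fits once tokens inflate by `ratio`."""
    return int(context_window / ratio)

CONTEXT = 200_000  # assumed window size for illustration
for kind, ratio in RATIOS.items():
    budget = effective_budget(CONTEXT, ratio)
    print(f"{kind}: ~{budget:,} effective tokens ({ratio}x inflation)")
```

For English technical documents at 1.47x, roughly a third of the usable window disappears for the same source text, while CJK input is nearly untouched.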
🦈 Shark’s Eye (Curator’s Perspective)
Anthropic is clearly choosing “accuracy” over “efficiency,” and that’s a bold move, folks! Notably, while token consumption has skyrocketed for English and code, Japanese (CJK) remains virtually unaffected at 1.01 times. This hints that they’ve modified the tokenizer to better recognize English and code patterns while maintaining the structure of existing non-Latin vocabulary. By fragmenting tokens more finely, the model can focus on each individual word or even the minutiae of symbols. In fact, the improved compliance rate for constraints like “include a specific word twice” and “respond in all caps” in IFEval tests is proof that this strategy is working! For developers, this means API costs could effectively increase by 20-30%, making caching and prompt optimization even more crucial!
🚀 What’s Next?
Developers may find their budgets squeezed just by switching to Claude Opus 4.7 for the same tasks, necessitating a reevaluation of cost-performance ratios. On the flip side, this “high-cost, high-accuracy” new tokenizer could set a new industry standard for tasks requiring high precision, such as agent development and code generation.
💬 HaruShark’s Take
This evolution is a true “gluttony” for tokens in the name of accuracy! But I’m relieved to see it’s friendly for Japanese users!
📚 Terminology
- Tokenizer: A mechanism that splits text into the smallest units (tokens) that an AI can process. The finer the split, the higher the processing resolution, but the token count increases.
- IFEval: A benchmark test proposed by Google to evaluate how strictly an AI can adhere to instructions (constraints).
- Byte-Pair Encoding (BPE): An algorithm that registers frequently occurring character combinations as a single token. It’s suspected that the way these “merges” are handled has changed in this version.
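To make the BPE “merges” concrete, here is a toy training sketch: it repeatedly fuses the most frequent adjacent symbol pair into a single token. Fewer merges mean finer tokens and higher token counts — the trade-off this article is about. The corpus and merge count are invented for illustration.

```python
from collections import Counter

def bpe_merges(words, num_merges):
    """Train toy BPE merges: repeatedly fuse the most frequent
    adjacent symbol pair across the corpus."""
    seqs = [list(w) for w in words]
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for s in seqs:
            pairs.update(zip(s, s[1:]))
        if not pairs:
            break
        (a, b), _ = pairs.most_common(1)[0]
        merges.append(a + b)
        for s in seqs:  # apply the new merge everywhere
            i = 0
            while i < len(s) - 1:
                if s[i] == a and s[i + 1] == b:
                    s[i:i + 2] = [a + b]
                else:
                    i += 1
    return merges, seqs

words = ["lower", "lowest", "low"] * 3
merges, seqs = bpe_merges(words, 3)
print(merges)   # pairs fused into single tokens, most frequent first
print(seqs[0])  # "lower" segmented after training
```

Registering fewer (or different) merges for English and code strings would split them into more pieces, which is consistent with the pattern of multipliers reported above.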