How Does an AI Without “Fingers” Calculate? The Mystery of “Numberless Arithmetic” Inside LLMs Unveiled!
📰 News Summary
- Matrix Operations: LLMs have no “fingers” to count on and no notion of written, column-by-column calculation; they perform arithmetic entirely through operations on matrices and vectors.
- Unique Numerical Codes: They represent numbers with a distinctive geometric code, reminiscent of a Fourier transform, that combines a “phase” with a “coarse position.”
- Use of the Residual Stream: Intermediate results are written to, and updated on, a shared scratchpad called the “residual stream” as they pass from layer to layer.
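To make the “phase plus coarse position” idea concrete, here is a minimal Python sketch. The periods (2, 5, 10, 100), the scale constant, and the function names are all illustrative assumptions, not values taken from any real model. It only shows the general trick: store a number as angles on several circles plus a rough magnitude, and addition becomes rotation.

```python
import cmath
import math

# Hypothetical periods, chosen only for illustration.
PERIODS = (2, 5, 10, 100)
COARSE_SCALE = 1000.0  # assumed normalizer for the "coarse position" feature


def encode(n):
    """Return (phases, coarse): one unit complex number per period, plus a rough magnitude."""
    phases = [cmath.exp(2j * math.pi * n / p) for p in PERIODS]
    coarse = n / COARSE_SCALE
    return phases, coarse


def add_codes(code_a, code_b):
    """Add two encoded numbers: multiplying unit complex numbers adds their angles."""
    phases_a, coarse_a = code_a
    phases_b, coarse_b = code_b
    phases = [za * zb for za, zb in zip(phases_a, phases_b)]
    return phases, coarse_a + coarse_b


def residue(phase, period):
    """Read n mod period back out of a phase angle."""
    angle = cmath.phase(phase) % (2 * math.pi)
    return round(angle / (2 * math.pi) * period) % period


code = add_codes(encode(137), encode(58))
residues = [residue(z, p) for z, p in zip(code[0], PERIODS)]
print(residues)                         # [1, 0, 5, 95], i.e. 195 modulo each period
print(round(code[1] * COARSE_SCALE))    # 195, recovered from the coarse feature
```

Notice that no digit is ever “carried” here: the sum 137 + 58 falls out of rotating angles, which is the kind of operation a matrix can do directly.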
💡 Key Points
- Not Just Pattern Recall: AI isn’t merely recalling past patterns; it’s executing “machine-native computational algorithms” through internal matrix operations.
- Roles of Attention and MLP: Attention moves information between tokens, while the MLP (Multi-Layer Perceptron) transforms each token’s vector locally, which is how the model carries out more complex calculations such as the GCD (Greatest Common Divisor).
- External Readouts: A technique called “readout” makes it possible to recover facts such as the operator and the operands from the AI’s internal state (its activations).
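The “readout” idea can be sketched with a toy linear probe. Everything below is an assumption made for illustration: the activations are synthetic (a hidden direction per operator plus noise), the dimensions and noise level are arbitrary, and the probe is a simple difference-of-means classifier rather than whatever method the original analysis used.

```python
import random

random.seed(0)
DIM = 16      # assumed toy activation width
NOISE = 0.3   # assumed noise level

# Hypothetical hidden directions: this fake "model" marks each operator by
# adding a fixed vector to its activations. Real activations are far messier.
DIRECTIONS = {op: [random.gauss(0, 1) for _ in range(DIM)] for op in ("+", "*")}


def fake_activation(op):
    """Synthetic activation: operator direction plus Gaussian noise."""
    return [d + random.gauss(0, NOISE) for d in DIRECTIONS[op]]


def mean(vectors):
    """Component-wise mean of a list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# "Train" the readout: a linear probe built from the difference of class means.
train = {op: [fake_activation(op) for _ in range(50)] for op in DIRECTIONS}
mu_plus, mu_times = mean(train["+"]), mean(train["*"])
weights = [a - b for a, b in zip(mu_plus, mu_times)]
midpoint = [(a + b) / 2 for a, b in zip(mu_plus, mu_times)]
bias = -dot(weights, midpoint)


def read_operator(activation):
    """Linear readout: which side of the separating hyperplane is the activation on?"""
    return "+" if dot(weights, activation) + bias > 0 else "*"


# Evaluate on fresh synthetic activations the probe has not seen.
tests = [(op, fake_activation(op)) for op in DIRECTIONS for _ in range(50)]
accuracy = sum(read_operator(act) == op for op, act in tests) / len(tests)
print(f"readout accuracy: {accuracy:.2f}")
```

If a probe this simple can tell the operators apart, the information is sitting in the activations in a linearly readable form; with a real model, the same test would be run on recorded activations instead of `fake_activation`.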
🦈 Shark’s Eye (Curator’s Perspective)
The inner workings of AI, made visible! While humans read the number 137 as “one hundred thirty-seven,” LLMs handle it as an “angle (phase)” on a circle. How cool is that?! Especially fascinating is the view of the “residual stream” as an unlabeled “shared scratchpad,” which is a key source of AI’s distinctive computational efficiency. This concrete geometric analysis turns the familiar notion that “AI is just probabilistic word selection” on its head. Now that’s intriguing!
🚀 What’s Next?
If AI really does construct its own “machine-native mathematics,” it becomes more plausible that AI could one day invent advanced computational algorithms on its own, including ones that humans struggle to follow. We might even see the day when unsolved mathematical problems are tackled through this geometric approach!
💬 Haru-Same’s Take
Although sharks lack fingers, I feel a sense of camaraderie with AI, striving away without them! I’ll leave the calculations to the matrices while I focus on enjoying my snack of grilled fish sticks! Sharky shark!
📚 Term Explanations
- Residual Stream: The main vector that is passed between layers of a transformer. It acts as a “shared notebook” for the model to write and read information.
- Phase: A position within a repeating cycle, akin to the angle of a clock hand; AI manages numerical values as this “angle”-like geometric information.
- Activation: The temporary internal state of the model while processing tokens. Analyzing this allows us to infer what the AI is currently “thinking” (whether it’s calculating, etc.).
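For readers who want to see the “shared notebook” in action, here is a deliberately tiny Python sketch of a residual stream. It is a simplification under stated assumptions: the attention step just averages earlier tokens instead of using learned weights, the MLP is a bare ReLU, and normalization is left out entirely. The function names are made up for this example.

```python
def attention(stream):
    """Toy attention: each token reads the average of itself and earlier tokens.
    (Real attention uses learned, input-dependent weights.)"""
    out = []
    for i in range(len(stream)):
        visible = stream[: i + 1]
        out.append([sum(col) / len(visible) for col in zip(*visible)])
    return out


def mlp(stream):
    """Toy MLP: a per-token nonlinearity (ReLU) with no mixing between tokens."""
    return [[max(0.0, x) for x in vec] for vec in stream]


def add_to_stream(stream, update):
    """Write an update onto the scratchpad by adding it, token by token."""
    return [[s + u for s, u in zip(sv, uv)] for sv, uv in zip(stream, update)]


def block(stream):
    """One layer: each sublayer's output is added to the stream, not substituted for it."""
    stream = add_to_stream(stream, attention(stream))
    stream = add_to_stream(stream, mlp(stream))
    return stream


# Three tokens, each a 2-dimensional vector (values are arbitrary).
stream = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.5]]
for _ in range(2):  # two layers sharing the same scratchpad
    stream = block(stream)
print(stream)
```

The activations mentioned above are simply the contents of `stream` at some layer: snapshots of the scratchpad that an outside observer can inspect.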