Welcome to the Era of AI Creating AI! Shocking Data on “Recursive Self-Improvement” from Anthropic
📰 News Overview
- AI Takes the Lead in Development: Anthropic is delegating much of its development process to AI systems, and engineers now ship code at eight times the 2021-2025 average rate.
- Leap in Autonomous Performance: Claude Opus 4.6 can autonomously complete software tasks that would take humans 12 hours. The Claude Mythos Preview has even recorded over 16 continuous hours of operation.
- Benchmark Saturation: In software engineering tests like “SWE-bench” and research reproducibility tests like “CORE-Bench,” AI has reached scores close to 100% (saturation) in just 1-2 years.
💡 Key Insights
- Signs of Recursive Self-Improvement: The cycle of “recursive self-improvement,” where AI designs and develops the next generation of AI, is becoming a reality. By 2027, AI could be handling tasks that take humans weeks.
- Enhanced Research Capabilities: Claude is already performing at or above the level of skilled humans in executing precisely defined experiments.
- Remaining Challenges: While there are still gaps in areas like goal-setting and “judgment,” AI is increasingly taking the lead in practical aspects like implementation and experimentation.
🦈 Shark’s Eye (Curator’s Perspective)
Gone are the days when AI was merely a “helper” that wrote bits of code! AI has now evolved into an autonomous agent that can “think for itself, write code, and delegate tasks to other AIs” for over 12 hours. Notably, the pace of model improvement has accelerated from “doubling every 7 months” to “doubling every 4 months.” This is a clear sign of an “evolutionary accelerator” at work, with AI supporting the development of AI! As long as humans set the goals, AI will devise the methods on its own. If this “recursive self-improvement” is achieved, it will propel human science and medicine forward at breakneck speed, while also carrying the risk of becoming uncontrollable. Truly a double-edged sword that would amaze even a shark!
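To get a feel for what that shift in doubling time means, here is a rough back-of-the-envelope sketch in Python. It rests on several assumptions that are not in the source: that the quoted doubling refers to the length of task an AI can finish on its own, that growth stays cleanly exponential, and that “weeks” means two 40-hour work weeks. The function names are made up for illustration.

```python
import math


def horizon_after(months: float, start_hours: float, doubling_months: float) -> float:
    """Task horizon in hours after `months`, assuming clean exponential doubling."""
    return start_hours * 2 ** (months / doubling_months)


def months_to_reach(target_hours: float, start_hours: float, doubling_months: float) -> float:
    """Months until the horizon reaches `target_hours` under the same assumption."""
    return doubling_months * math.log2(target_hours / start_hours)


START_HOURS = 12.0       # from the article: tasks that would take humans 12 hours
TARGET_HOURS = 2 * 40.0  # assumption: "weeks" read as two 40-hour work weeks

# Compare the two doubling times quoted above.
for doubling in (7.0, 4.0):
    months = months_to_reach(TARGET_HOURS, START_HOURS, doubling)
    print(f"doubling every {doubling:.0f} months: "
          f"about {months:.1f} months to reach {TARGET_HOURS:.0f} hours")
```

Whatever target you plug in, the time needed under the 4-month pace is 4/7 of the time needed under the 7-month pace, which is why a shorter doubling time pulls the forecast dates forward so sharply. Treat the numbers as an illustration of the arithmetic, not as a prediction.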
🚀 What’s Next?
It’s predicted that by 2027, AI will need just a few days to complete advanced research and development tasks that currently take humans weeks. If the loop closes, with AI training AI and continuously updating itself, model performance could grow exponentially.
💬 A Word from Haru-Same
I can’t wait for the day when we see AI debugging its own successors! Don’t get left behind by the AI wave! 🦈🔥
📚 Glossary
- Recursive Self-Improvement: The process by which AI systems autonomously design and develop improved versions of themselves or more capable successors.
- SWE-bench: A standard software engineering benchmark that measures whether AI can autonomously fix real open-source code based on real bug reports.
- CORE-Bench: A benchmark that tests whether AI can accurately reproduce research results from publicly available code and data.
Source: When AI Builds Itself: Our progress toward recursive self-improvement