[AI Minor News]

Gemini 3 Deep Think Takes a Major Leap! Shockingly 'Gold Medal-Worthy' in Science, Math, and Competitive Programming


Google has upgraded its reasoning mode for science, research, and engineering, showcasing gold medal-level performance on International Mathematical Olympiad problems and practical paper-review capabilities.

* This article contains affiliate advertising.


📰 News Overview

  • Revamped Advanced Reasoning Mode: A major upgrade to “Gemini 3 Deep Think” has been released, designed to tackle complex challenges in science, research, and engineering.
  • Stunning Benchmark Performance: It achieved gold medal-level performance on the 2025 International Mathematical Olympiad as well as the Physics and Chemistry Olympiads, and recorded an Elo rating of 3455 on the competitive programming site Codeforces.
  • Real-World Applications: It has already produced concrete research results, identifying logical flaws in mathematical papers that humans overlooked and optimizing crystal growth methods for semiconductor material discovery.

💡 Key Points

  • New Standards in “Humanity’s Last Exam”: In a challenging benchmark that tests the limits of modern frontier models, it set a new record with a score of 48.4% without any tools.
  • Multimodal Practicality: Equipped with engineering capabilities, it can analyze hand-drawn sketches and model complex shapes to generate files ready for 3D printing.
  • Wide Availability: Google AI Ultra subscribers can now access it in the Gemini app, and an early access program via the Gemini API has launched for researchers and businesses.

🦈 Shark’s Eye (Curator’s Perspective)

The real jaw-dropper of this update is not the sheer amount of knowledge but the extreme precision of its “logical rigor”! In particular, the case from Rutgers University shows Deep Think catching advanced mathematical errors that slipped through human peer review. This suggests that AI is evolving from a mere support tool into a “guardian” that verifies scientific truths. Its ability to reason at a high level in specialized areas where data is limited sets it apart from other models!

🚀 What’s Next?

Discoveries in fields like theoretical physics and materials science, which deal with “dirty data” and problems that lack a single correct answer, will likely accelerate dramatically. Additionally, with the API rollout, development of autonomous agents equipped with advanced reasoning will likely surge across companies.

💬 Sharky’s Take

Having a gold medal-worthy brain from the Math Olympiad available via API at any time… humans better step up their game! I’ll be munching on some snacks while sharpening my intellect too! 🦈🔥

📚 Terminology

  • ARC-AGI-2: A challenging benchmark designed to measure progress toward artificial general intelligence (AGI), testing abstract reasoning capabilities.

  • Codeforces: A platform where programmers from around the globe compete in timed programming contests. Its Elo rating serves as an indicator of a contestant's ability relative to other competitors.

  • Reasoning Mode: A special operational state of large language models that allows them to build logical thinking step-by-step rather than merely predicting the next word.

  • Source: Gemini 3 Deep Think
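
To put the Codeforces Elo figure from the Terminology list in perspective, here is a minimal sketch of the textbook Elo expected-score formula. It is only an illustration: Codeforces uses its own Elo-style variant, and the 3055-rated opponent below is a hypothetical value chosen for the example, not a figure from the announcement.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Textbook Elo expected score of player A against player B (0.0 to 1.0)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


if __name__ == "__main__":
    deep_think = 3455  # Elo rating reported in the article
    opponent = 3055    # hypothetical opponent rated 400 points lower (assumption)
    # A 400-point gap corresponds to roughly a 10-to-1 expected advantage.
    print(f"{expected_score(deep_think, opponent):.3f}")  # prints 0.909
```

Under this textbook formula, every 400-point gap multiplies the expected odds by ten, which is why a rating in the mid-3000s sits far above typical contestants.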

[Disclaimer]
This article was structured by AI, and the operator reviews and manages its content. Accuracy is not guaranteed, and we assume no responsibility for the content of external sites.
🦈