[AI Minor News]

Local AI Development at 30,000 Feet! An Incredible 10-Hour Journey with the M5 Max


※ Note: This article contains affiliate advertising.


📰 News Summary

  • Completed engineering tasks using the MacBook Pro M5 Max with only local LLMs on a 10-hour flight from London to Las Vegas (no Wi-Fi).
  • Ran Gemma 4 31B and Qwen 4.6 36B in LM Studio to build a DuckDB-based invoice-analysis tool and refactor roughly 4 million tokens' worth of code.
  • Discovered serious overheating, battery drain of about 1% per minute, and a dramatic difference in delivered power depending on whether an iPhone or a MacBook cable was used (60W vs 94W).

💡 Key Points

  • Model Performance: In specific coding scopes, Gemma 4 and Qwen 4.6 delivered outputs comparable to frontier models.
  • Hardware Limitations: Sustained high load heated the device significantly and drained the battery even while charging at 60W.
  • Infrastructure Blind Spots: The choice of cable (iPhone vs MacBook) made a roughly 36% difference in delivered power, turning the power supply itself into a performance bottleneck.
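The 36% figure follows directly from the two wattages in the summary: the gap between the cables is 34W, which is about 36% of the 94W the better cable delivers.

```python
# Checking the article's cable numbers.
macbook_w, iphone_w = 94, 60   # delivered power per cable, from the post
gap = macbook_w - iphone_w     # 34 W
pct = round(gap / macbook_w * 100)  # share of the better cable's output
print(f"{gap} W gap, about {pct}% of {macbook_w} W")  # 34 W gap, about 36% of 94 W
```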

🦈 Shark’s Eye View

We’ve finally entered an era where “fully offline AI development” is possible even in the skies! What’s noteworthy is not just chatting with a model but actually building an analysis tool with DuckDB. Harnessing the M5 Max’s 128GB of unified memory to run 30B-plus models like Gemma 4 in-flight is truly impressive!

In particular, the discovery that “a single cable can change delivered power by 34W” shows the sharp eye of a hands-on engineer. Physically confronting heat and power draw sharpens the intuition for “inference cost” that the cloud usually hides. This is what we call “mechanical sympathy”!

🚀 What’s Next?

Moving forward, we can expect compact LLMs optimized for Apple’s Neural Engine (ANE), enabling even more power-efficient, faster in-flight development. The importance of power-management and telemetry tools in mobile environments will also be re-emphasized among developers.

💬 A Word from Haru Shark

The dedication to keep developing despite the burning heat on their lap is something I aspire to emulate! Choosing the right cable is a matter of life and death! 🦈🔥

📚 Glossary

  • Gemma 4 / Qwen 4.6: The latest open-source LLMs as of 2026. They perform remarkably well in local environments and are starting to replace cloud models for specific engineering tasks.

  • Unified Memory: A memory architecture allowing fast sharing between CPU and GPU. A feature of Apple Silicon, crucial for efficiently processing models with massive parameters like LLMs.

  • LM Studio: A platform for easily running and managing LLMs in local environments. It also includes functionality for measuring inference statistics (throughput and latency).

  • Source: Running Local LLMs Offline on a Ten-Hour Flight
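As a concrete illustration of the LM Studio entry above: LM Studio can expose an OpenAI-compatible HTTP server on localhost (port 1234 by default), so offline tooling can drive a local model with nothing but the standard library. The model identifier below is a placeholder; use whatever name LM Studio reports for your loaded model.

```python
# Sketch of calling a local model through LM Studio's OpenAI-compatible
# server. "gemma-4-31b" is a placeholder model identifier.
import json
import urllib.request

payload = {
    "model": "gemma-4-31b",  # placeholder: use the name LM Studio shows
    "messages": [
        {"role": "user", "content": "Refactor this function to use DuckDB."}
    ],
    "temperature": 0.2,
}

req = urllib.request.Request(
    "http://localhost:1234/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Only works when a local server is actually running; entirely offline
# otherwise, which is the whole point on a plane.
try:
    with urllib.request.urlopen(req, timeout=5) as resp:
        reply = json.load(resp)["choices"][0]["message"]["content"]
        print(reply)
except OSError:
    print("LM Studio server not reachable -- start its local server first.")
```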

【Disclaimer】
This article was structured by AI and is verified and managed by the operator. Accuracy is not guaranteed, and we assume no responsibility for external content.
🦈