
Are 'AGENTS.md' Files a Waste? Shocking Study Reveals AI Agent Instructions May Increase Costs


Research from ETH Zurich shows that context files generated by LLMs lower the success rates of AI coding agents and increase inference costs by over 20%.

※ This article contains affiliate advertising.


📰 News Overview

  • LLM-generated instruction files backfire: A research team at ETH Zurich investigated the effectiveness of context files (like AGENTS.md) for AI agents. They found that files generated by LLMs decreased task success rates by an average of 3%.
  • Inference costs soar by over 20%: Including instruction files led the AI to repeat unnecessary tests and file reads, resulting in a spike in the number of inference steps. Consequently, costs jumped by over 20%.
  • Limited effectiveness of human-written files: While files written by humans improved success rates by 4%, costs still increased by up to 19%. Researchers recommend completely eliminating files generated by LLMs.

💡 Key Points

  • Validation through AGENTbench: To account for the possibility that AIs memorized existing benchmarks (like SWE-bench), the team constructed a unique dataset called “AGENTbench,” comprising 138 niche Python repositories for validation.
  • Inducing unnecessary inferences: Trace analysis revealed that AI agents, in their eagerness to follow instructions, excessively performed unrelated tasks like grep searches and code quality checks.
  • Focus on non-inferable details: When humans write instructions, they should focus on “non-inferable details,” such as unique build commands, rather than architectural overviews the AI can deduce from the code itself.
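As an illustration of that last point, a hand-written AGENTS.md might be limited to details an agent cannot deduce from the repository itself. (This sketch and its commands are hypothetical examples of the principle, not taken from the study.)

```markdown
# AGENTS.md — keep to non-inferable details

## Build & test (project-specific; an agent cannot guess these)
- Bootstrap first: `make bootstrap`, then build with `make build` (plain `make` fails).
- Run the fast test suite with `tox -e fast`; the full suite requires network access.

## Deliberately omitted
- Architecture overviews, module listings, and coding-style essays: the agent can
  infer these from the code, and per the findings above they mainly add extra
  inference steps and cost.
```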

🦈 Shark’s Eye (Curator’s Perspective)

It’s shocking to think that the “instruction manual” we created for AI is actually confusing it and driving costs up!

What’s remarkable about this research is that it highlights the AI’s vulnerability of being “too obedient to instructions.” With LLM-generated instruction files, the AI ends up on a wild goose chase of “I need to research more! I need to test more!” leading it to miss the mark entirely, while only increasing API costs—what an ironic twist!

By testing with niche repositories that the AI likely hasn’t learned from, they uncovered this truth. Developers might need the courage to switch off the auto-generation feature for AGENTS.md files!

🚀 What’s Next?

While some argue that handwritten instructions from developers still hold value, future research is expected to explore more sophisticated ways of extracting and generating only the minimal hints an AI truly needs.

💬 Sharky’s Takeaway

Over-instructing AI might be like being an overprotective parent! Sometimes, saying “swim on your own!” could be the fastest route to success! 🦈🔥

📚 Terminology

  • AI Agent: An AI system that autonomously understands objectives and uses tools (like searches or code execution) to complete tasks.

  • AGENTS.md: A text file that outlines the repository’s structure and rules to help AI agents better understand the project.

  • Inference Costs: The computational resources and API usage fees incurred while the AI generates responses. Costs rise with an increase in the number of steps.

  • Source: New Research Reassesses the Value of Agents.md Files for AI Coding

【Disclaimer】
This article was structured by AI and is verified and managed by the operator. Accuracy is not guaranteed, and we assume no responsibility for external content.
🦈