Learning to Configure Agentic AI Systems

Aditya Taparia, Som Sagar, Ransalu Senanayake

Published: February 12, 2026
Authors: 3
Word Count: 10,484
Code: Includes code

ARC learns optimal AI agent configurations per query, boosting accuracy while cutting costs.

Abstract

Configuring LLM-based agent systems involves choosing workflows, tools, token budgets, and prompts from a large combinatorial design space, and is typically handled today by large fixed templates or hand-tuned heuristics. This leads to brittle behavior and unnecessary compute, since the same cumbersome configuration is often applied to both easy and hard input queries. We formulate agent configuration as a query-wise decision problem and introduce ARC (Agentic Resource & Configuration learner), which uses reinforcement learning to learn a lightweight hierarchical policy that dynamically tailors these configurations to each query. Across multiple benchmarks spanning reasoning and tool-augmented question answering, the learned policy consistently outperforms strong hand-designed and other baselines, achieving up to 25% higher task accuracy while also reducing token and runtime costs. These results demonstrate that learning per-query agent configurations is a powerful alternative to "one-size-fits-all" designs.

Key Takeaways

1. ARC learns query-specific configurations for AI agents, achieving up to 25% higher accuracy while reducing token and runtime costs.
2. Hierarchical reinforcement learning decomposes configuration into separate structure and prompt policies instead of optimizing all choices jointly.
3. Semantic embeddings combined with hand-crafted features let the system match appropriate workflows to different question types.
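The takeaways above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the option lists, the `toy_embed` stand-in for a real sentence encoder, the three hand-crafted features, the linear softmax policies, the REINFORCE-style update, and the `reward` shape are all assumptions chosen for illustration. It shows only the general pattern of a structure policy choosing first and a prompt policy choosing second, conditioned on that choice.

```python
import math
import random

# Hypothetical option sets; the paper's actual choices are not listed here.
WORKFLOWS = ["direct", "chain_of_thought", "tool_loop"]
TOKEN_BUDGETS = [256, 1024]
PROMPTS = ["concise", "step_by_step", "verify_answer"]
STRUCTURES = [(w, b) for w in WORKFLOWS for b in TOKEN_BUDGETS]


def toy_embed(query):
    """Hypothetical stand-in for a semantic embedding model."""
    return [len(query) % 7 / 7.0, query.count(" ") % 5 / 5.0]


def featurize(query, embed=toy_embed):
    """Concatenate a semantic embedding with hand-crafted features."""
    handcrafted = [
        min(len(query.split()) / 50.0, 1.0),               # normalized length
        1.0 if any(c.isdigit() for c in query) else 0.0,   # contains a digit
        1.0 if "?" in query else 0.0,                      # looks like a question
    ]
    return list(embed(query)) + handcrafted + [1.0]        # trailing bias term


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class LinearSoftmaxPolicy:
    """Softmax policy with one weight vector per discrete action."""

    def __init__(self, n_features, n_actions):
        self.w = [[0.0] * n_features for _ in range(n_actions)]

    def probs(self, x):
        return softmax([sum(wi * xi for wi, xi in zip(row, x)) for row in self.w])

    def sample(self, x, rng):
        p = self.probs(x)
        return rng.choices(range(len(p)), weights=p)[0]

    def reinforce(self, x, action, advantage, lr=0.1):
        # Ascend advantage * grad log pi(action | x) for a linear softmax.
        p = self.probs(x)
        for a, row in enumerate(self.w):
            coeff = (1.0 if a == action else 0.0) - p[a]
            for i, xi in enumerate(x):
                row[i] += lr * advantage * coeff * xi


def configure(query, structure_policy, prompt_policy, rng):
    """Pick a structure first, then a prompt conditioned on that structure."""
    x = featurize(query)
    s = structure_policy.sample(x, rng)
    one_hot = [1.0 if i == s else 0.0 for i in range(len(STRUCTURES))]
    x_prompt = x + one_hot
    p = prompt_policy.sample(x_prompt, rng)
    return s, p, x, x_prompt


def reward(correct, tokens_used, cost_weight=1e-4):
    """Assumed reward shape: task success minus a token-cost penalty."""
    return (1.0 if correct else 0.0) - cost_weight * tokens_used
```

In a training loop under these assumptions, each query would be configured with `configure`, the agent would be run with `STRUCTURES[s]` and `PROMPTS[p]`, and both policies would be updated by calling `reinforce` with the resulting reward (minus a baseline) as the advantage.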

Limitations

The configuration space remains massive, with over 100,000 possible combinations, making exhaustive search impractical.
Long contexts hurt model performance through the lost-in-the-middle phenomenon, limiting how effectively the context window can be used.
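To make the first limitation concrete, the short sketch below multiplies out a set of per-dimension option counts. The counts are hypothetical and are not the paper's actual configuration space; they only show how a handful of modest, independent choices can push the number of joint configurations past 100,000.

```python
import math

# Hypothetical option counts per configuration dimension (illustrative only).
option_counts = {
    "workflow": 8,
    "tool_subset": 2 ** 5,   # any subset of 5 tools
    "token_budget": 6,
    "prompt_template": 70,
}

# Independent choices multiply, so the joint space grows very quickly.
total_configs = math.prod(option_counts.values())
print(total_configs)  # 107520
```

Since every evaluated configuration costs at least one agent run per query, enumerating a space of this size for each input is impractical, which is what motivates learning a policy instead.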

Keywords

LLM-based agent systems, reinforcement learning, hierarchical policy, query-wise decision problem, agent configuration, token budget, prompt engineering, tool-augmented question answering, reasoning tasks, task accuracy, computational efficiency
