AI Agents

WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning

Zelai Xu, Zhexuan Xu, Ruize Zhang, Chunyang Zhu, Shi Yu, Weilin Liu, Quanlu Zhang, Wenbo Ding, Chao Yu, Yu Wang
Published: February 4, 2026
Authors: 10
Word Count: 9,175
Code: Includes code

Enhances broad information seeking via multi-agent RL.

Abstract

Recent advancements in Large Language Models (LLMs) have largely focused on depth scaling, where a single agent solves long-horizon problems with multi-turn reasoning and tool use. However, as tasks grow broader, the key bottleneck shifts from individual competence to organizational capability. In this work, we explore a complementary dimension of width scaling with multi-agent systems to address broad information seeking. Existing multi-agent systems often rely on hand-crafted workflows and turn-taking interactions that fail to parallelize work effectively. To bridge this gap, we propose WideSeek-R1, a lead-agent-subagent framework trained via multi-agent reinforcement learning (MARL) to synergize scalable orchestration and parallel execution. By utilizing a shared LLM with isolated contexts and specialized tools, WideSeek-R1 jointly optimizes the lead agent and parallel subagents on a curated dataset of 20k broad information-seeking tasks. Extensive experiments show that WideSeek-R1-4B achieves an item F1 score of 40.0% on the WideSearch benchmark, which is comparable to the performance of single-agent DeepSeek-R1-671B. Furthermore, WideSeek-R1-4B exhibits consistent performance gains as the number of parallel subagents increases, highlighting the effectiveness of width scaling.
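The abstract reports results as an item F1 score. As a rough illustration of what such a metric measures, the sketch below computes a set-based F1 over predicted and gold items. It assumes exact-match comparison of items; the function name `item_f1` and the example data are hypothetical, and the WideSearch benchmark's actual matching rules may differ.

```python
def item_f1(predicted: set, gold: set) -> float:
    """Set-based F1 over items (assumption: items match only if exactly equal)."""
    if not predicted or not gold:
        return 0.0
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)  # share of predicted items that are correct
    recall = true_positives / len(gold)          # share of gold items that were found
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: each item is a (row key, cell value) pair.
gold = {("Alice", "2020"), ("Bob", "2021"), ("Carol", "2022"), ("Dan", "2023")}
predicted = {("Alice", "2020"), ("Bob", "2021"), ("Eve", "2019")}

score = item_f1(predicted, gold)
print(f"item F1 = {score:.3f}")
```

Here two of three predicted items are correct (precision 2/3) and two of four gold items are recovered (recall 1/2), so broad tasks with many target items penalize an agent that finds only part of the answer.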

Key Takeaways

  • Proposes a multi-agent framework for efficient information retrieval.

  • The lead agent decomposes tasks; subagents work in parallel.

  • Aims to overcome context pollution and sequential bottlenecks.
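The lead-agent-subagent pattern described above can be sketched as a simple fan-out. This is a minimal illustration under stated assumptions, not the paper's implementation: the `Subagent` class, the `decompose` helper (which just splits on semicolons), and the placeholder `run` method are all hypothetical, and no LLM or tool is actually called.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Subagent:
    name: str
    # Each subagent keeps its own context, so one subtask's history
    # never leaks into another's (the "isolated contexts" idea).
    context: list = field(default_factory=list)

    async def run(self, subtask: str) -> str:
        self.context.append(subtask)
        await asyncio.sleep(0)  # stand-in for LLM turns and tool calls
        return f"{self.name}: result for '{subtask}'"


def decompose(task: str) -> list:
    # Hypothetical decomposition: a real lead agent would use the LLM here.
    return [part.strip() for part in task.split(";") if part.strip()]


async def lead_agent(task: str) -> list:
    subtasks = decompose(task)
    subagents = [Subagent(f"subagent-{i}") for i in range(len(subtasks))]
    # Fan out: all subagents run concurrently instead of taking turns.
    results = await asyncio.gather(
        *(agent.run(sub) for agent, sub in zip(subagents, subtasks))
    )
    return list(results)


results = asyncio.run(
    lead_agent("find GDP of France; find GDP of Japan; find GDP of Brazil")
)
for line in results:
    print(line)
```

Because `asyncio.gather` returns results in the order the subtasks were submitted, the lead agent can merge them deterministically even though the subagents run concurrently.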

Limitations

  • Requires specialized tools for subagents.

  • Depends on effective training of the lead agent.

Keywords

Large Language Models, multi-agent systems, multi-agent reinforcement learning, lead-agent-subagent framework, parallel execution, information seeking, WideSearch benchmark, F1 score
