daVinci-Dev: Agent-native Mid-training for Software Engineering

Ji Zeng, Dayuan Fu, Tiantian Mi, Yumin Zhuang, Yaxing Huang, Xuefeng Li, Lyumanshan Ye, Muhang Xie, Qishuo Hua, Zhen Huang, Mohan Jiang, Hanning Wang, Jifan Lin, Yang Xiao, Jie Sun, Yunze Wu, Pengfei Liu
Published: January 26, 2026
Authors: 17
Word Count: 9,624
Code: Included

Revolutionizing LLMs for software engineering with mid-training.

Abstract

Recently, the frontier of Large Language Model (LLM) capabilities has shifted from single-turn code generation to agentic software engineering: a paradigm where models autonomously navigate, edit, and test complex repositories. While post-training methods have become the de facto approach for code agents, **agentic mid-training**, i.e., mid-training (MT) on large-scale data that mirrors authentic agentic workflows, remains critically underexplored due to substantial resource requirements, despite offering a more scalable path to instilling foundational agentic behaviors than relying solely on expensive reinforcement learning. A central challenge in realizing effective agentic mid-training is the distribution mismatch between static training data and the dynamic, feedback-rich environment of real development. To address this, we present a systematic study of agentic mid-training, establishing both the data synthesis principles and the training methodology for effective agent development at scale. Central to our approach is **agent-native data**: supervision comprising two complementary types of trajectories. **Contextually-native trajectories** preserve the complete information flow an agent experiences, offering broad coverage and diversity; **environmentally-native trajectories** are collected from executable repositories where observations stem from actual tool invocations and test executions, providing depth and interaction authenticity. We verify the model's agentic capabilities on `SWE-Bench Verified`. We demonstrate superiority over the previous open software engineering mid-training recipe `Kimi-Dev` under two post-training settings with an aligned base model and agentic scaffold, while using fewer than half the mid-training tokens (73.1B). Beyond this relative advantage, our best-performing 32B and 72B models achieve **56.1%** and **58.5%** resolution rates, respectively, which are ...
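The distinction between the two trajectory types can be pictured as a data schema in which every step records where its observation came from. The sketch below is purely illustrative and not from the paper: the class names, fields, and the `"reconstructed"`/`"executed"` labels are assumptions introduced here to make the contextually-native vs. environmentally-native split concrete.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Step:
    """One agent turn: an action and the observation that followed it."""
    action: str               # e.g. a file edit or a test command
    observation: str          # what the agent saw after acting
    observation_source: str   # "reconstructed" (from static artifacts)
                              # or "executed" (from a real tool/test run)

@dataclass
class Trajectory:
    """A mid-training trajectory over one repository task (hypothetical schema)."""
    repo: str
    task: str
    steps: List[Step] = field(default_factory=list)

    def is_environmentally_native(self) -> bool:
        # Environmentally-native: every observation stems from an actual
        # tool invocation or test execution in an executable repository.
        return bool(self.steps) and all(
            s.observation_source == "executed" for s in self.steps
        )

# A contextually-native trajectory reconstructs the information flow
# (e.g. from a PR's diff and discussion) without executing anything:
contextual = Trajectory(
    repo="example/repo",
    task="fix failing parser test",
    steps=[Step("view parser.py", "def parse(...): ...", "reconstructed")],
)

# An environmentally-native trajectory logs real execution feedback:
environmental = Trajectory(
    repo="example/repo",
    task="fix failing parser test",
    steps=[
        Step("run pytest tests/test_parser.py", "1 failed", "executed"),
        Step("edit parser.py", "file updated", "executed"),
        Step("run pytest tests/test_parser.py", "1 passed", "executed"),
    ],
)
```

Under this (assumed) schema, the two trajectory kinds share one format, so a mid-training pipeline could mix them freely while still filtering on interaction authenticity.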

Key Takeaways

  1. Agent-native mid-training enhances LLMs for software engineering.

  2. Contextually-native and environmentally-native trajectories improve performance.

  3. daVinci-Dev achieves state-of-the-art results with fewer tokens.

Limitations

  • Requires access to GitHub PRs for dataset construction.

  • Dependent on the quality of real development environments.

Keywords

Large Language Model, agentic software engineering, mid-training, distribution mismatch, agent-native data, contextually-native trajectories, environmentally-native trajectories, SWE-Bench Verified, Kimi-Dev, resolution rates
