Closing the Loop: Universal Repository Representation with RPG-Encoder

Jane Luo, Chengyu Yin, Xin Zhang, Qingtao Li, Steven Liu, Yiming Huang, Jie Wu, Hao Liu, Yangyu Huang, Yu Kang, Fangkai Yang, Ying Xin, Scarlett Li
Published: February 2, 2026
Authors: 13
Word Count: 6,076
Code: Includes code

RPG-Encoder: a unified graph representation for repository comprehension and generation.

Abstract

Current repository agents encounter a reasoning disconnect due to fragmented representations, as existing methods rely on isolated API documentation or dependency graphs that lack semantic depth. We consider repository comprehension and generation to be inverse processes within a unified cycle: generation expands intent into implementation, while comprehension compresses implementation back into intent. To address this, we propose RPG-Encoder, a framework that generalizes the Repository Planning Graph (RPG) from a static generative blueprint into a unified, high-fidelity representation. RPG-Encoder closes the reasoning loop through three mechanisms: (1) Encoding raw code into the RPG that combines lifted semantic features with code dependencies; (2) Evolving the topology incrementally to decouple maintenance costs from repository scale, reducing overhead by 95.7%; and (3) Operating as a unified interface for structure-aware navigation. In evaluations, RPG-Encoder establishes state-of-the-art repository understanding on SWE-bench Verified with 93.7% Acc@5 and exceeds the best baseline by over 10% on SWE-bench Live Lite. These results highlight our superior fine-grained localization accuracy in complex codebases. Furthermore, it achieves 98.5% reconstruction coverage on RepoCraft, confirming RPG's high-fidelity capacity to mirror the original codebase and closing the loop between intent and implementation.
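To make the three mechanisms in the abstract more concrete, here is a minimal Python sketch of a graph whose nodes carry both a semantic feature and dependency edges, that re-encodes only changed files, and that can be queried by intent or by structure. This is an illustration under stated assumptions, not the authors' implementation: the names `Node`, `RepoGraph`, `summarize`, `search`, and `dependents` are hypothetical, and the `summarize` placeholder (first non-empty line of the source) stands in for whatever semantic lifting the paper actually uses.

```python
from dataclasses import dataclass, field
import hashlib


@dataclass
class Node:
    path: str      # file or symbol identifier
    feature: str   # short semantic summary (the "lifted" intent)
    digest: str    # hash of the source the feature was derived from
    deps: set[str] = field(default_factory=set)  # outgoing dependency edges


def summarize(source: str) -> str:
    """Hypothetical stand-in for semantic lifting: first non-empty line."""
    for line in source.splitlines():
        if line.strip():
            return line.strip()
    return ""


class RepoGraph:
    """Toy graph combining semantic features with dependency edges."""

    def __init__(self) -> None:
        self.nodes: dict[str, Node] = {}

    def update(self, files: dict[str, tuple[str, set[str]]]) -> list[str]:
        """Sync with `files`; re-encode only entries whose source changed.

        Returns the paths that were (re-)encoded in this call.
        """
        touched = []
        for path, (source, deps) in files.items():
            digest = hashlib.sha256(source.encode()).hexdigest()
            old = self.nodes.get(path)
            if old is not None and old.digest == digest:
                continue  # unchanged: keep the existing node as-is
            self.nodes[path] = Node(path, summarize(source), digest, set(deps))
            touched.append(path)
        for path in set(self.nodes) - set(files):
            del self.nodes[path]  # drop nodes for deleted files
        return touched

    def search(self, keyword: str) -> list[str]:
        """Semantic lookup: nodes whose feature mentions the keyword."""
        key = keyword.lower()
        return sorted(p for p, n in self.nodes.items() if key in n.feature.lower())

    def dependents(self, path: str) -> list[str]:
        """Structural lookup: nodes that depend on `path`."""
        return sorted(p for p, n in self.nodes.items() if path in n.deps)


graph = RepoGraph()
repo = {
    "io.py": ("# load records from disk\n", set()),
    "train.py": ("# fit the model\n", {"io.py"}),
}
first = graph.update(repo)    # initial encoding touches every file
repo["train.py"] = ("# fit and evaluate the model\n", {"io.py"})
second = graph.update(repo)   # incremental pass touches only the edit
```

In this toy version the cost of the second `update` call tracks the size of the change rather than the size of the repository, which is the intuition behind the incremental evolution described in the abstract; the 95.7% figure itself comes from the paper's own measurements, not from anything this sketch demonstrates.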

Key Takeaways

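  1. RPG-Encoder improves codebase navigation and understanding by combining lifted semantic features with code dependencies in a single graph.

  2. Incremental topology evolution reduces maintenance overhead by 95.7%, decoupling upkeep cost from repository scale.

  3. The encoded RPG reaches 98.5% reconstruction coverage on RepoCraft, indicating that it closely mirrors the original codebase.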
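The Acc@5 figure quoted in the abstract is a top-k localization accuracy. As a rough illustration, the sketch below computes one common variant of Acc@k, counting an instance as a hit when any gold location appears among the top k predictions. This definition is an assumption: the summary does not state the paper's exact formula, and stricter variants require every gold location to appear in the top k. The function name `acc_at_k` and the sample data are hypothetical.

```python
def acc_at_k(rankings, gold, k=5):
    """Fraction of instances with at least one gold location in the top k.

    rankings: list of ranked prediction lists, one per instance
    gold:     list of sets of correct locations, one per instance
    """
    if not gold:
        return 0.0
    hits = sum(
        1
        for ranked, targets in zip(rankings, gold)
        if set(ranked[:k]) & set(targets)  # any overlap counts as a hit
    )
    return hits / len(gold)


# Made-up example data, for illustration only.
rankings = [["a.py", "b.py", "c.py"], ["x.py", "y.py"], ["m.py"]]
gold = [{"b.py"}, {"z.py"}, {"m.py"}]

top5 = acc_at_k(rankings, gold, k=5)  # two of three instances hit
top1 = acc_at_k(rankings, gold, k=1)  # only the third instance hits
```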

Limitations

  • Performance depends on codebase complexity and quality.

  • Computational demands may challenge very large repositories.

Keywords

Repository Planning Graph, RPG-Encoder, code representation, semantic features, code dependencies, incremental topology evolution, structure-aware navigation, fine-grained localization, repository understanding, codebase reconstruction
