PaperBanana: Automating Academic Illustration for AI Scientists

Dawei Zhu, Rui Meng, Yale Song, Xiyu Wei, Sujian Li, Tomas Pfister, Jinsung Yoon
Published: January 30, 2026
Authors: 7
Word Count: 15,241
Code: Includes code

Automate academic illustrations with the PaperBanana framework.

Abstract

Despite rapid advances in autonomous AI scientists powered by language models, generating publication-ready illustrations remains a labor-intensive bottleneck in the research workflow. To lift this burden, we introduce PaperBanana, an agentic framework for automated generation of publication-ready academic illustrations. Powered by state-of-the-art VLMs and image generation models, PaperBanana orchestrates specialized agents to retrieve references, plan content and style, render images, and iteratively refine via self-critique. To rigorously evaluate our framework, we introduce PaperBananaBench, comprising 292 test cases for methodology diagrams curated from NeurIPS 2025 publications, covering diverse research domains and illustration styles. Comprehensive experiments demonstrate that PaperBanana consistently outperforms leading baselines in faithfulness, conciseness, readability, and aesthetics. We further show that our method effectively extends to the generation of high-quality statistical plots. Collectively, PaperBanana paves the way for the automated generation of publication-ready illustrations.
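
The abstract describes a pipeline of specialized agents that retrieve references, plan content and style, render an image, and refine it through self-critique. The control flow of such a loop can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `Agents`, `Critique`, `generate_illustration`, and `max_rounds` are hypothetical, planning of content and style is collapsed into a single `plan` step, and the toy agents are plain string functions standing in for VLM and image-generation model calls.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Critique:
    acceptable: bool  # True when the critic finds nothing left to fix
    feedback: str     # free-text notes passed to the next revision


@dataclass
class Agents:
    # Every callable is a hypothetical stand-in for a model-backed agent.
    retrieve: Callable[[str], List[str]]             # method text -> reference examples
    plan: Callable[[str, List[str]], str]            # text + references -> figure description
    render: Callable[[str], bytes]                   # description -> image bytes
    critique: Callable[[str, str, bytes], Critique]  # text, description, image -> verdict
    revise: Callable[[str, str], str]                # description + feedback -> new description


def generate_illustration(
    method_text: str, agents: Agents, max_rounds: int = 3
) -> Tuple[bytes, List[str]]:
    """Retrieve, plan, render, then critique and revise for up to max_rounds."""
    references = agents.retrieve(method_text)
    description = agents.plan(method_text, references)
    image = agents.render(description)
    history = [description]  # every description that was rendered, in order

    for _ in range(max_rounds):
        verdict = agents.critique(method_text, description, image)
        if verdict.acceptable:
            break
        # Fold the critic's feedback into the description and render again.
        description = agents.revise(description, verdict.feedback)
        image = agents.render(description)
        history.append(description)

    return image, history


def toy_agents() -> Agents:
    """Deterministic placeholder agents, used only to exercise the loop."""
    return Agents(
        retrieve=lambda text: ["reference diagram A"],
        plan=lambda text, refs: f"draft of: {text}",
        render=lambda desc: desc.encode("utf-8"),
        critique=lambda text, desc, img: Critique("labels" in desc, "add labels"),
        revise=lambda desc, fb: f"{desc} [{fb}]",
    )


if __name__ == "__main__":
    image, history = generate_illustration("encoder-decoder", toy_agents())
    print(len(history), image.decode("utf-8"))
```

Passing the agents in as callables keeps the loop independent of any particular model provider, and the `max_rounds` cap bounds the refinement cost when the critic never accepts a draft.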

Key Takeaways

  • Automates academic illustration generation for AI scientists.

  • Outperforms baselines in faithfulness, conciseness, readability, aesthetics.

  • Frees researchers to focus on core work.

Limitations

  • Outputs are raster images, which are difficult to edit after generation.

  • Trade-off between style standardization and diversity.

Keywords

VLMs, image generation models, agentic framework, publication-ready illustrations, methodology diagrams, PaperBananaBench, self-critique, statistical plots
