Planning in 8 Tokens: A Compact Discrete Tokenizer for Latent World Model

Authors (5): Dongwon Kim, Gawon Seo, Jinsung Lee, Minsu Cho, Suha Kwak
Published: March 5, 2026
Word Count: 12,848
Code: Includes code

CompACT achieves 40× faster planning by compressing images to 8 tokens using frozen semantic encoders.

Abstract

World models provide a powerful framework for simulating environment dynamics conditioned on actions or instructions, enabling downstream tasks such as action planning or policy learning. Recent approaches leverage world models as learned simulators, but their application to decision-time planning remains computationally prohibitive for real-time control. A key bottleneck lies in the latent representation: conventional tokenizers encode each observation into hundreds of tokens, making planning both slow and resource-intensive. To address this, we propose CompACT, a discrete tokenizer that compresses each observation into as few as 8 tokens, drastically reducing computational cost while preserving the information essential for planning. An action-conditioned world model that employs the CompACT tokenizer achieves competitive planning performance with orders-of-magnitude faster planning, offering a practical step toward real-world deployment of world models.
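To make the token-count argument concrete, the following back-of-envelope sketch compares how many latent tokens a sampling-based planner would have to roll out at two tokenizer sizes. Everything numeric here is an assumption for illustration: the 256-token baseline stands in for "hundreds of tokens", and the horizon, candidate count, and context length are hypothetical settings, not values from the paper.

```python
def planning_token_budget(tokens_per_frame, horizon, num_candidates):
    """Total latent tokens predicted when rolling out every candidate plan."""
    return tokens_per_frame * horizon * num_candidates


def attention_pair_count(tokens_per_frame, context_frames):
    """Token-pair interactions in full self-attention over a context window."""
    seq_len = tokens_per_frame * context_frames
    return seq_len * seq_len


# Hypothetical planner settings (assumptions, not taken from the paper).
HORIZON = 8          # future frames predicted per candidate plan
NUM_CANDIDATES = 64  # candidate action sequences evaluated per decision
CONTEXT_FRAMES = 4   # frames the world model attends over

BASELINE_TOKENS = 256  # assumed stand-in for "hundreds of tokens"
COMPACT_TOKENS = 8     # token count stated in the abstract

baseline = planning_token_budget(BASELINE_TOKENS, HORIZON, NUM_CANDIDATES)
compact = planning_token_budget(COMPACT_TOKENS, HORIZON, NUM_CANDIDATES)
token_ratio = baseline // compact

attn_ratio = (attention_pair_count(BASELINE_TOKENS, CONTEXT_FRAMES)
              // attention_pair_count(COMPACT_TOKENS, CONTEXT_FRAMES))

print(f"tokens per decision: {baseline} vs {compact} ({token_ratio}x fewer)")
print(f"attention pairs per forward pass: {attn_ratio}x fewer")
```

These ratios only illustrate how cost scales with tokens per frame. The measured speedup reported for CompACT depends on the actual model, planner, and hardware, and should not be read off this arithmetic.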

Key Takeaways

  • CompACT compresses images to just 8 tokens while maintaining planning performance superior to 64-token competitors.

  • Using frozen pretrained vision encoders preserves semantic information while eliminating unnecessary perceptual details for planning.

  • Generative decoding synthesizes perceptual details from compact semantic tokens, achieving 40× speedup in planning latency.
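This summary does not say how CompACT turns encoder features into discrete tokens. As a generic illustration of what "a discrete tokenizer with 8 tokens" can mean, the sketch below uses nearest-neighbour codebook lookup (plain vector quantization), which is a swapped-in technique assumed here and not confirmed to be the authors' method. The codebook size, latent dimension, and random vectors are all hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(latents, codebook):
    """Map each latent vector to the index of its nearest codebook entry."""
    return [
        min(range(len(codebook)),
            key=lambda k: squared_distance(vec, codebook[k]))
        for vec in latents
    ]


def dequantize(token_ids, codebook):
    """Look up the codebook vector for each discrete token id."""
    return [codebook[i] for i in token_ids]


# Hypothetical sizes; random vectors stand in for real encoder outputs.
rng = random.Random(0)
DIM, CODEBOOK_SIZE, NUM_TOKENS = 4, 16, 8
codebook = [[rng.uniform(-1, 1) for _ in range(DIM)]
            for _ in range(CODEBOOK_SIZE)]
latents = [[rng.uniform(-1, 1) for _ in range(DIM)]
           for _ in range(NUM_TOKENS)]

token_ids = quantize(latents, codebook)       # 8 integers in [0, 16)
reconstructed = dequantize(token_ids, codebook)
print(token_ids)
```

With a representation like this, each observation is a short list of integers, so a world model can predict the next observation as 8 categorical choices instead of hundreds.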

Limitations

  • Extreme compression creates an information bottleneck; the approach may not generalize to tasks requiring fine-grained visual details.

  • Method relies on specific pretrained foundation models; performance may vary with different encoder architectures or domains.

Keywords

world models, latent representations, tokenizers, action-conditioned world model, planning, computational efficiency