AI Agents

AutoWebWorld: Synthesizing Infinite Verifiable Web Environments via Finite State Machines

Yifan Wu, Yiran Peng, Yiyu Chen, Jianhao Ruan, Zijie Zhuang, Cheng Yang, Jiayi Zhang, Man Chen, Yenchi Tseng, Zhaoyang Yu, Liang Chen, Yuyao Zhai, Bang Liu, Chenglin Wu, Yuyu Luo
Published: February 15, 2026
Authors: 15
Word Count: 5,061
Code: Includes code

AutoWebWorld generates synthetic verifiable websites via finite state machines for cheap web agent training.

Abstract

The performance of autonomous Web GUI agents heavily relies on the quality and quantity of their training data. However, a fundamental bottleneck persists: collecting interaction trajectories from real-world websites is expensive and difficult to verify. The underlying state transitions are hidden, leading to reliance on inconsistent and costly external verifiers to evaluate step-level correctness. To address this, we propose AutoWebWorld, a novel framework for synthesizing controllable and verifiable web environments by modeling them as Finite State Machines (FSMs) and using coding agents to translate the FSMs into interactive websites. Unlike real websites, where state transitions are implicit, AutoWebWorld explicitly defines all states, actions, and transition rules. This enables programmatic verification: action correctness is checked against predefined rules, and task success is confirmed by reaching a goal state in the FSM graph. AutoWebWorld enables a fully automated search-and-verify pipeline, generating over 11,663 verified trajectories from 29 diverse web environments at only $0.04 per trajectory. Training on this synthetic data significantly boosts real-world performance. Our 7B Web GUI agent outperforms all baselines within 15 steps on WebVoyager. Furthermore, we observe a clear scaling law: as the synthetic data volume increases, performance on WebVoyager and Online-Mind2Web consistently improves.
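The verification idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `FSM` class, the `verify` and `search` helpers, and the toy shop states and actions are all hypothetical names chosen for illustration. The sketch only assumes what the abstract states, namely that states, actions, and transition rules are explicit, that a step is correct when it matches a defined transition, and that a task succeeds when a goal state is reached.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class FSM:
    """Hypothetical explicit web environment: every transition is listed."""
    start: str
    goals: frozenset
    transitions: dict  # (state, action) -> next state


def verify(fsm, actions):
    """Replay actions from the start state.

    Returns (all_steps_valid, goal_reached). A step is valid only if the
    (state, action) pair appears in the predefined transition rules.
    """
    state = fsm.start
    for action in actions:
        key = (state, action)
        if key not in fsm.transitions:
            return False, False
        state = fsm.transitions[key]
    return True, state in fsm.goals


def search(fsm, max_steps):
    """Breadth-first search for action sequences that end in a goal state.

    States are not revisited within a single trajectory, and trajectories
    longer than max_steps are dropped.
    """
    found = []
    queue = deque([(fsm.start, (), frozenset([fsm.start]))])
    while queue:
        state, path, seen = queue.popleft()
        if state in fsm.goals:
            found.append(list(path))
            continue
        if len(path) == max_steps:
            continue
        for (src, action), dst in fsm.transitions.items():
            if src == state and dst not in seen:
                queue.append((dst, path + (action,), seen | {dst}))
    return found


# Toy shopping site (assumed example, not one of the paper's 29 environments).
shop = FSM(
    start="home",
    goals=frozenset({"order_placed"}),
    transitions={
        ("home", "search"): "results",
        ("home", "open_cart"): "cart",
        ("results", "open_item"): "product",
        ("product", "add_to_cart"): "cart",
        ("product", "back"): "results",
        ("cart", "checkout"): "order_placed",
    },
)

trajectories = search(shop, max_steps=5)
for actions in trajectories:
    print(actions, verify(shop, actions))
```

Because every trajectory returned by `search` is replayed through `verify` against the same transition table, no external judge is needed to label steps as correct or a task as complete, which is the property the abstract attributes to FSM-based environments.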

Key Takeaways

  1. AutoWebWorld uses finite state machines to generate synthetic websites with perfect verification, eliminating the verifier bottleneck.

  2. Synthetic web environments reduce data collection costs from fifteen cents to four cents per trajectory while maintaining quality.

  3. AI coding agents automatically generate runnable websites from formal FSM specifications, enabling scalable web agent training.
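As a rough worked calculation of the cost figures quoted above (fifteen cents versus four cents per trajectory, and just over 11,663 trajectories), assuming the rounded per-trajectory figures are exact:

```latex
\[
\text{relative saving} = \frac{0.15 - 0.04}{0.15} \approx 0.73,
\qquad
\text{total cost} \approx 11{,}663 \times \$0.04 = \$466.52
\]
```

Since the source reports rounded values, both results should be read as approximations: a per-trajectory cost reduction of roughly 73%, and a total generation cost on the order of a few hundred dollars.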

Limitations

  • Synthetic websites may not capture the complexity and edge cases present in real-world web applications.

  • The approach requires careful FSM specification design, which could become a bottleneck for diverse web scenarios.

Keywords

Finite State Machines, coding agents, web environments, automated search-and-verify pipeline, synthetic data, trajectory generation, verifiable environments, autonomous Web GUI agents
