WildActor: Unconstrained Identity-Preserving Video Generation

Qin Guo, Tianyu Yang, Xuanhua He, Fei Shen, Yong Zhang, Zhuoliang Kang, Xiaoming Wei, Dan Xu
Published: February 28, 2026
Authors: 8
Word Count: 6,810

Actor-18M enables production-ready human video generation with consistent identity across viewpoints and motions.

Abstract

Production-ready human video generation requires digital actors to maintain strictly consistent full-body identities across dynamic shots, viewpoints and motions, a setting that remains challenging for existing methods. Prior methods often suffer from face-centric behavior that neglects body-level consistency, or produce copy-paste artifacts where subjects appear rigid due to pose locking. We present Actor-18M, a large-scale human video dataset designed to capture identity consistency under unconstrained viewpoints and environments. Actor-18M comprises 1.6M videos with 18M corresponding human images, covering both arbitrary views and canonical three-view representations. Leveraging Actor-18M, we propose WildActor, a framework for any-view conditioned human video generation. We introduce an Asymmetric Identity-Preserving Attention mechanism coupled with a Viewpoint-Adaptive Monte Carlo Sampling strategy that iteratively re-weights reference conditions by marginal utility for balanced manifold coverage. Evaluated on the proposed Actor-Bench, WildActor consistently preserves body identity under diverse shot compositions, large viewpoint transitions, and substantial motions, surpassing existing methods in these challenging settings.
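The abstract names an Asymmetric Identity-Preserving Attention mechanism but does not describe its internals. One common way to make attention asymmetric is a one-directional mask, sketched below purely as an illustration. Everything here is an assumption rather than the authors' design: the token ordering (reference tokens first, then video tokens), the rule that video queries may read reference keys while reference queries never read video keys, and the helper names `asymmetric_mask` and `masked_softmax`.

```python
import math


def asymmetric_mask(n_ref, n_vid):
    """Return mask[q][k], True when query token q may attend to key token k.

    Assumed token order: n_ref reference tokens first, then n_vid video tokens.
    """
    n = n_ref + n_vid
    mask = [[False] * n for _ in range(n)]
    for q in range(n):
        for k in range(n):
            q_is_ref = q < n_ref
            k_is_ref = k < n_ref
            # Assumed rule: reference queries see only reference keys,
            # while video queries see every key (reference and video).
            mask[q][k] = k_is_ref or not q_is_ref
    return mask


def masked_softmax(scores, mask_row):
    """Softmax over the allowed keys only; blocked keys receive weight 0."""
    allowed = [s for s, m in zip(scores, mask_row) if m]
    peak = max(allowed)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) if m else 0.0 for s, m in zip(scores, mask_row)]
    total = sum(exps)
    return [e / total for e in exps]


# Usage: 2 reference tokens and 3 video tokens.
mask = asymmetric_mask(2, 3)
ref_weights = masked_softmax([0.0] * 5, mask[0])  # a reference query
vid_weights = masked_softmax([0.0] * 5, mask[4])  # a video query
```

Under this assumed rule the reference features stay independent of the video being generated, so the same identity representation is read by every frame, while the video tokens remain free to change pose and viewpoint.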

Key Takeaways

  1. WildActor introduces Actor-18M, a 1.6M video dataset with 18M images addressing viewpoint bias in human video generation.

  2. The framework uses Asymmetric Identity-Preserving Attention to maintain full-body consistency across dynamic viewpoints and motions.

  3. Viewpoint-Adaptive Monte Carlo Sampling re-weights reference images by marginal utility for balanced identity coverage during generation.
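The re-weighting idea in the last takeaway can be illustrated with a small sketch. This is not the paper's algorithm, whose details are not given here. It assumes each reference image is tagged with a camera azimuth in degrees, and it defines marginal utility as the angular distance to the nearest already-selected view, so that views covering new angles are drawn more often. The function names `angular_gap` and `sample_references` and the `eps` smoothing term are hypothetical.

```python
import random


def angular_gap(a, b):
    """Smallest absolute difference between two azimuths, in degrees (0 to 180)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def sample_references(azimuths, k, seed=0, eps=1e-6):
    """Draw up to k reference indices without replacement.

    After each draw the remaining candidates are re-weighted by an assumed
    marginal utility: their angular distance to the nearest chosen view.
    """
    rng = random.Random(seed)
    remaining = list(range(len(azimuths)))
    chosen = []
    while remaining and len(chosen) < k:
        if not chosen:
            weights = [1.0] * len(remaining)  # first draw is uniform
        else:
            weights = [
                min(angular_gap(azimuths[i], azimuths[j]) for j in chosen) + eps
                for i in remaining
            ]
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        chosen.append(pick)
        remaining.remove(pick)
    return chosen


# Usage: three near-frontal views plus side and back views.
azimuths = [0.0, 5.0, 10.0, 90.0, 180.0, 270.0]
picks = sample_references(azimuths, k=3)
```

Because the draw is stochastic rather than greedy, near-duplicate views can still be selected occasionally, but a candidate next to an already-chosen view receives a small weight, which pushes the selected set toward even coverage of the viewing circle.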

Limitations

  • Existing methods either focus on the face and neglect body-level consistency, or lock the subject's pose and produce rigid copy-paste artifacts.

  • Prior datasets rely on expensive studio captures that do not scale to unconstrained real-world environments and diverse viewpoints.

Keywords

human video generation, identity consistency, viewpoint adaptation, asymmetric identity-preserving attention, viewpoint-adaptive Monte Carlo sampling, manifold coverage, body identity preservation
