TDM-R1: Reinforcing Few-Step Diffusion Models with Non-Differentiable Reward
Yihong Luo, Tianyang Hu, Weijian Luo +1 more
While few-step generative models have enabled powerful image and video generation at significantly lower cost, a generic reinforcement learning (RL) paradigm for few-step models remains an unsolved problem. Existing RL approaches for few-step diffusion models rely heavily on back-propagating through ...