[Comparison panels: Input Image | Pretrained Stable Video Diffusion | Finetuned Stable Video Diffusion | Track4Gen without Refiner | Track4Gen]
Limitation of all baselines (including Track4Gen)
[Comparison panels: Input Image | Pretrained Stable Video Diffusion | Finetuned Stable Video Diffusion | Track4Gen without Refiner | Track4Gen]