Input Image
Pretrained Stable Video Diffusion
Finetuned Stable Video Diffusion
Track4Gen without Refiner
Track4Gen

Limitation of all baselines (including Track4Gen)
 
 
Input Image
Pretrained Stable Video Diffusion
Finetuned Stable Video Diffusion
Track4Gen without Refiner
Track4Gen
