We introduce Reangle-A-Video, a unified framework for generating synchronized multi-view videos from a single input video.
We achieve (a) Static view transport and (b) Dynamic camera control without relying on any multi-view generative priors; both approaches offer six degrees of freedom.
(a) Static view transport: Regeneration of the input video as if shot from target viewpoints.
Input video
View from Orbit right
View from Orbit down
View from Orbit left
View from Dolly zoom in
View from Dolly zoom out
(b) Dynamic camera control: Regeneration of the input video following target camera movements.
Input video
Camera movement: Orbit up
Camera movement: Orbit down
Camera movement: Orbit left
Camera movement: Orbit right
Camera movement: Dolly zoom in
Method
Reangle-A-Video decomposes a dynamic 4D scene into view-specific appearance (the starting image) and view-invariant motion (image-to-video),
and addresses each component separately.
We first embed the scene's view-invariant motion into a pre-trained video diffusion model using a novel self-supervised training strategy with data augmentation.
To capture diverse perspectives from a single monocular video, we repeatedly apply point-based warping to generate a set of warped videos.
These videos, together with the original video, form the training dataset for fine-tuning a pre-trained image-to-video diffusion model with a masked diffusion objective.
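The point-based warping step can be sketched as follows. This is an illustrative simplification under our own assumptions (the function name, the nearest-neighbor forward splatting, and the synthetic inputs are ours, not the paper's exact implementation): each pixel is unprojected with its depth, transformed to a target camera pose, and reprojected, and the resulting validity mask records which target pixels received content.

```python
import numpy as np

def point_warp(frame, depth, K, R, t):
    """Warp a frame to a new viewpoint by per-pixel unprojection.

    A simplified sketch of point-based warping: real pipelines would
    additionally z-buffer or softmax-splat overlapping points.
    frame: (H, W, C) image, depth: (H, W), K: 3x3 intrinsics,
    R, t: relative rotation and translation to the target view.
    """
    H, W = depth.shape
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    pix = np.stack([xs, ys, np.ones_like(xs)], axis=-1).reshape(-1, 3).T  # 3 x N
    # Unproject to 3D camera coordinates, then move to the target view.
    pts = np.linalg.inv(K) @ pix * depth.reshape(1, -1)
    pts = R @ pts + t.reshape(3, 1)
    # Reproject into the target image plane (nearest-neighbor splat).
    proj = K @ pts
    uv = proj[:2] / np.clip(proj[2:3], 1e-6, None)
    u = np.round(uv[0]).astype(int)
    v = np.round(uv[1]).astype(int)
    valid = (u >= 0) & (u < W) & (v >= 0) & (v < H) & (pts[2] > 0)
    warped = np.zeros_like(frame)
    mask = np.zeros((H, W), dtype=bool)  # which target pixels got content
    warped[v[valid], u[valid]] = frame.reshape(-1, frame.shape[-1])[valid]
    mask[v[valid], u[valid]] = True
    return warped, mask
```

Applied per frame, this yields a warped video plus per-frame validity masks; the masks also feed the masked diffusion objective, so unfilled holes are not supervised.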
To achieve (b) dynamic camera control, we sample videos using the fine-tuned model with the original first frame as input.
In contrast, for (a) static view transport, we generate view-transported starting images
by inpainting the warped first frames under inference-time view-consistency guidance from an off-the-shelf multi-view stereo reconstruction network.
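One simple way to realize inference-time view-consistency guidance is to blend the denoiser's intermediate clean-image estimate toward the multi-view stereo render wherever that render is reliable. The sketch below is a hedged illustration under our own assumptions; the function name, the convex blend, and the reliability mask are not the paper's exact formulation:

```python
import numpy as np

def consistency_guided_estimate(x0_pred, mvs_render, reliable, weight=0.5):
    """Nudge the denoiser's clean-image prediction toward the MVS render.

    Inside reliable regions the prediction is pulled toward the render;
    elsewhere it is left untouched so the inpainter can fill holes freely.
    x0_pred, mvs_render: (H, W, C) float arrays
    reliable: (H, W) boolean mask of pixels the MVS render covers
    """
    blended = (1.0 - weight) * x0_pred + weight * mvs_render
    return np.where(reliable[..., None], blended, x0_pred)
```

Applying such a correction at each denoising step keeps the inpainted starting images geometrically consistent across target viewpoints while leaving occluded regions to the generative prior.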
Multi-view motion learning pipeline
Multi-view image inpainting pipeline
Camera Visualizations
We demonstrate six degrees of freedom in both (a) Static view transport and (b) Dynamic camera control.
Here we visualize the transported viewpoints and camera movements used in our work.
Results: (a) Static view transport
Input video
View from Orbit left
View from Orbit right
View from Orbit down
View from Dolly zoom in
Input video
View from Orbit left
View from Orbit right
View from Orbit up
View from Orbit down
View from Dolly zoom in
Input video
View from Orbit left
View from Orbit right
View from Orbit up
View from Dolly zoom in
Input video
View from Orbit left
View from Orbit down
View from Orbit up
Input video
View from Orbit left
View from Orbit down
View from Dolly zoom in
Input video
View from Orbit right
View from Orbit down
View from Orbit up
Input video
View from Orbit left
View from Orbit down
View from Dolly zoom in
Results: (b) Dynamic camera control
Input video
Camera movement: Orbit down
Camera movement: Orbit up
Camera movement: Orbit left
Camera movement: Dolly zoom in
Camera movement: Dolly zoom out
Input video
Camera movement: Orbit left
Camera movement: Dolly zoom in
Camera movement: Orbit down
Camera movement: Orbit up
Input video
Camera movement: Dolly zoom in
Camera movement: Dolly zoom out
Camera movement: Orbit left
Camera movement: Orbit right
Camera movement: Orbit down
Input video
Camera movement: Orbit right
Camera movement: Dolly zoom out
Input video
Camera movement: Orbit left
Camera movement: Orbit up
Camera movement: Dolly zoom in
Input video
Camera movement: Orbit right
Camera movement: Orbit up
Camera movement: Dolly zoom in
Input video
Camera movement: Dolly zoom in
Input video
Camera movement: Orbit down
Input video
Camera movement: Orbit down
Input video
Camera movement: Dolly zoom in
Comparisons (1/2)
For (a) Static view transport, we compare with Generative Camera Dolly and vanilla CogVideoX I2V, which uses the same input frame as ours.
Input video
View from Orbit left, Reangle-A-Video (Ours)
View from Orbit down, Reangle-A-Video (Ours)
Input video
Generative Camera Dolly
Generative Camera Dolly
Input video
Vanilla CogVideoX I2V
Vanilla CogVideoX I2V
Input video
View from Orbit down, Reangle-A-Video (Ours)
View from Dolly zoom in, Reangle-A-Video (Ours)
Input video
Generative Camera Dolly
Generative Camera Dolly
Input video
Vanilla CogVideoX I2V
Vanilla CogVideoX I2V
Input video
View from Orbit up, Reangle-A-Video (Ours)
View from Orbit right, Reangle-A-Video (Ours)
Input video
Generative Camera Dolly
Generative Camera Dolly
Input video
Vanilla CogVideoX I2V
Vanilla CogVideoX I2V
Comparisons (2/2)
For (b) Dynamic camera control, we compare with NVS-Solver and Trajectory Attention.
Input video
Camera movement: Orbit right, Reangle-A-Video (Ours)
Camera movement: Orbit up, Reangle-A-Video (Ours)
Input video
NVS-Solver
NVS-Solver
Input video
Trajectory Attention
Trajectory Attention
Input video
Camera movement: Orbit right, Reangle-A-Video (Ours)
Camera movement: Dolly zoom out, Reangle-A-Video (Ours)
Input video
NVS-Solver
NVS-Solver
Input video
Trajectory Attention
Trajectory Attention
Input video
Camera movement: Dolly zoom in, Reangle-A-Video (Ours)