In-2-4D: Inbetweening from Two Single-View Images to 4D Generation
- URL: http://arxiv.org/abs/2504.08366v1
- Date: Fri, 11 Apr 2025 09:01:09 GMT
- Title: In-2-4D: Inbetweening from Two Single-View Images to 4D Generation
- Authors: Sauradip Nag, Daniel Cohen-Or, Hao Zhang, Ali Mahdavi-Amiri
- Abstract summary: We propose a new problem, In-2-4D, for generative 4D (i.e., 3D + motion) inbetweening from a minimalistic input setting. Given two images representing the start and end states of an object in motion, our goal is to generate and reconstruct the motion in 4D.
- Score: 54.62824686338408
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a new problem, In-2-4D, for generative 4D (i.e., 3D + motion) inbetweening from a minimalistic input setting: two single-view images capturing an object in two distinct motion states. Given two images representing the start and end states of an object in motion, our goal is to generate and reconstruct the motion in 4D. We utilize a video interpolation model to predict the motion, but large frame-to-frame motions can lead to ambiguous interpretations. To overcome this, we employ a hierarchical approach to identify keyframes that are visually close to the input states and show significant motion, then generate smooth fragments between them. For each fragment, we construct the 3D representation of the keyframe using Gaussian Splatting. The temporal frames within the fragment guide the motion, enabling their transformation into dynamic Gaussians through a deformation field. To improve temporal consistency and refine 3D motion, we expand the self-attention of multi-view diffusion across timesteps and apply rigid transformation regularization. Finally, we merge the independently generated 3D motion segments by interpolating boundary deformation fields and optimizing them to align with the guiding video, ensuring smooth and flicker-free transitions. Through extensive qualitative and quantitative experiments as well as a user study, we show the effectiveness of our method and its components. The project page is available at https://in-2-4d.github.io/
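The hierarchical keyframing step described in the abstract can be pictured with a short sketch. The snippet below is an illustrative outline under assumed interfaces, not the authors' implementation: `interpolate_video` stands in for a pretrained video-interpolation model, `motion_magnitude` for an image-space motion metric (e.g., mean optical-flow magnitude), and the threshold and depth limit are arbitrary placeholders.

```python
# Minimal sketch (not the authors' code) of hierarchical keyframe selection:
# recursively interpolate between two states, keep the midpoint as a keyframe
# while the apparent motion is still large, then fill each keyframe pair with
# a short, smooth fragment.

def select_keyframes(img_a, img_b, interpolate_video, motion_magnitude,
                     motion_thresh=0.1, max_depth=4, depth=0):
    """Return a list of keyframes from img_a to img_b (inclusive)."""
    if depth >= max_depth or motion_magnitude(img_a, img_b) < motion_thresh:
        return [img_a, img_b]
    # Ask the interpolation model for a short clip and take its middle frame
    # as an intermediate state that is visually close to both inputs.
    clip = interpolate_video(img_a, img_b, num_frames=9)
    mid = clip[len(clip) // 2]
    left = select_keyframes(img_a, mid, interpolate_video, motion_magnitude,
                            motion_thresh, max_depth, depth + 1)
    right = select_keyframes(mid, img_b, interpolate_video, motion_magnitude,
                             motion_thresh, max_depth, depth + 1)
    return left[:-1] + right  # drop the duplicated midpoint


def build_fragments(keyframes, interpolate_video, frames_per_fragment=16):
    """Generate a smooth video fragment between each pair of keyframes."""
    return [interpolate_video(keyframes[i], keyframes[i + 1],
                              num_frames=frames_per_fragment)
            for i in range(len(keyframes) - 1)]
```

Each fragment returned by `build_fragments` would then be lifted to 3D as the abstract describes: Gaussian Splatting for the keyframe, and a deformation field driven by the fragment's frames for the motion.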
Related papers
- Shape of Motion: 4D Reconstruction from a Single Video [51.04575075620677]
We introduce a method capable of reconstructing generic dynamic scenes, featuring explicit, full-sequence-long 3D motion.
We exploit the low-dimensional structure of 3D motion by representing scene motion with a compact set of SE3 motion bases (a brief sketch of this idea follows the list below).
Our method achieves state-of-the-art performance for both long-range 3D/2D motion estimation and novel view synthesis on dynamic scenes.
arXiv Detail & Related papers (2024-07-18T17:59:08Z) - MagicPose4D: Crafting Articulated Models with Appearance and Motion Control [17.161695123524563]
We propose MagicPose4D, a framework for refined control over both appearance and motion in 4D generation.
Unlike current 4D generation methods, MagicPose4D accepts monocular videos or mesh sequences as motion prompts.
We demonstrate that MagicPose4D significantly improves the accuracy and consistency of 4D content generation.
arXiv Detail & Related papers (2024-05-22T21:51:01Z) - DreamScene4D: Dynamic Multi-Object Scene Generation from Monocular Videos [21.93514516437402]
We present DreamScene4D, the first approach to generate 3D dynamic scenes of multiple objects from monocular videos via novel view synthesis.
Our key insight is a "decompose-recompose" approach that factorizes the video scene into the background and object tracks.
We show extensive results on challenging DAVIS, Kubric, and self-captured videos with quantitative comparisons and a user preference study.
arXiv Detail & Related papers (2024-05-03T17:55:34Z) - MAS: Multi-view Ancestral Sampling for 3D motion generation using 2D diffusion [57.90404618420159]
We introduce Multi-view Ancestral Sampling (MAS), a method for 3D motion generation.
MAS works by simultaneously denoising multiple 2D motion sequences representing different views of the same 3D motion.
We demonstrate MAS on 2D pose data acquired from videos depicting professional basketball maneuvers.
arXiv Detail & Related papers (2023-10-23T09:05:18Z) - Animation from Blur: Multi-modal Blur Decomposition with Motion Guidance [83.25826307000717]
We study the challenging problem of recovering detailed motion from a single motion-blurred image.
Existing solutions to this problem estimate a single image sequence without considering the motion ambiguity for each region.
In this paper, we explicitly account for such motion ambiguity, allowing us to generate multiple plausible solutions all in sharp detail.
arXiv Detail & Related papers (2022-07-20T18:05:53Z) - Non-linear Motion Estimation for Video Frame Interpolation using Space-time Convolutions [18.47978862083129]
Video frame interpolation aims to synthesize one or multiple frames between two consecutive frames in a video.
Some older works tackled this problem by assuming per-pixel linear motion between video frames.
We propose to approximate the per-pixel motion using a space-time convolution network that is able to adaptively select the motion model to be used.
arXiv Detail & Related papers (2022-01-27T09:49:23Z) - Motion-from-Blur: 3D Shape and Motion Estimation of Motion-blurred Objects in Videos [115.71874459429381]
We propose a method for jointly estimating the 3D motion, 3D shape, and appearance of highly motion-blurred objects from a video.
Experiments on benchmark datasets demonstrate that our method outperforms previous methods for fast moving object deblurring and 3D reconstruction.
arXiv Detail & Related papers (2021-11-29T11:25:14Z) - Learning to Segment Rigid Motions from Two Frames [72.14906744113125]
We propose a modular network, motivated by a geometric analysis of what independent object motions can be recovered from an egomotion field.
It takes two consecutive frames as input and predicts segmentation masks for the background and multiple rigidly moving objects, which are then parameterized by 3D rigid transformations.
Our method achieves state-of-the-art performance for rigid motion segmentation on KITTI and Sintel.
arXiv Detail & Related papers (2021-01-11T04:20:30Z)
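As referenced in the "Shape of Motion" entry above, its motion-basis representation can be pictured with a small sketch. This is an assumption-laden illustration, not that paper's code: each 3D point carries mixing weights over a few shared per-frame rigid transforms, and its own motion is the blended transform.

```python
# Minimal sketch of an SE(3) motion-basis representation: B shared basis
# trajectories, each giving one 4x4 rigid transform per frame, and per-point
# weights over those bases. The blend here is a plain linear combination of
# matrices (linear-blend-skinning style), so the result is only approximately
# rigid; the actual paper may blend differently (e.g., in the Lie algebra).

import numpy as np

def blend_se3(basis_transforms, weights):
    """basis_transforms: (B, 4, 4) rigid transforms for one frame.
    weights: (N, B) per-point mixing weights (rows sum to 1).
    Returns (N, 4, 4) blended per-point transforms."""
    return np.einsum('nb,bij->nij', weights, basis_transforms)

def warp_points(points, basis_transforms, weights):
    """Warp N points (N, 3) into a target frame using the blended transforms."""
    T = blend_se3(basis_transforms, weights)                       # (N, 4, 4)
    homog = np.concatenate([points, np.ones((len(points), 1))], axis=1)
    return np.einsum('nij,nj->ni', T, homog)[:, :3]
```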