Related papers: Generating Continual Human Motion in Diverse 3D Scenes

URL: http://arxiv.org/abs/2304.02061v4
Date: Sun, 02 Feb 2025 17:40:23 GMT
Title: Generating Continual Human Motion in Diverse 3D Scenes
Authors: Aymen Mir, Xavier Puig, Angjoo Kanazawa, Gerard Pons-Moll
Abstract summary: We introduce a method to synthesize animator guided human motion across 3D scenes. We decompose the continual motion synthesis problem into walking along paths and transitioning in and out of the actions specified by the keypoints. Our model can generate long sequences of diverse actions such as grabbing, sitting and leaning chained together.
Score: 51.90506920301473
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We introduce a method to synthesize animator guided human motion across 3D scenes. Given a set of sparse (3 or 4) joint locations (such as the location of a person's hand and two feet) and a seed motion sequence in a 3D scene, our method generates a plausible motion sequence starting from the seed motion while satisfying the constraints imposed by the provided keypoints. We decompose the continual motion synthesis problem into walking along paths and transitioning in and out of the actions specified by the keypoints, which enables long generation of motions that satisfy scene constraints without explicitly incorporating scene information. Our method is trained only using scene agnostic mocap data. As a result, our approach is deployable across 3D scenes with various geometries. For achieving plausible continual motion synthesis without drift, our key contribution is to generate motion in a goal-centric canonical coordinate frame where the next immediate target is situated at the origin. Our model can generate long sequences of diverse actions such as grabbing, sitting and leaning chained together in arbitrary order, demonstrated on scenes of varying geometry: HPS, Replica, Matterport, ScanNet and scenes represented using NeRFs. Several experiments demonstrate that our method outperforms existing methods that navigate paths in 3D scenes. For more results we urge the reader to watch our supplementary video available at: https://www.youtube.com/watch?v=0wZgsdyCT4A&t=1s
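
The goal-centric canonical frame mentioned in the abstract can be illustrated with a small coordinate change. The sketch below is not the authors' implementation; it only shows one plausible reading, in which every position is re-expressed relative to the next target so that the target sits at the origin. The function names, the z-up convention, and the optional rotation about the vertical axis (facing_deg) are assumptions made for illustration and do not come from the abstract.

```python
import math


def to_goal_frame(points, goal, facing_deg=0.0):
    """Express 3D points in a frame whose origin is `goal`.

    Assumptions (not from the paper): z is the up axis, and the frame is
    rotated about z by -facing_deg so that the chosen facing direction
    maps onto the +x axis.
    """
    theta = math.radians(-facing_deg)
    c, s = math.cos(theta), math.sin(theta)
    out = []
    for x, y, z in points:
        # Translate so the goal becomes the origin, then rotate about z.
        dx, dy, dz = x - goal[0], y - goal[1], z - goal[2]
        out.append((c * dx - s * dy, s * dx + c * dy, dz))
    return out


def from_goal_frame(points, goal, facing_deg=0.0):
    """Inverse of to_goal_frame: map goal-centric points back to the scene."""
    theta = math.radians(facing_deg)
    c, s = math.cos(theta), math.sin(theta)
    return [
        (c * x - s * y + goal[0], s * x + c * y + goal[1], z + goal[2])
        for x, y, z in points
    ]


# Hypothetical keypoints in scene coordinates; the first one is the target.
goal = (2.0, 3.0, 0.0)
scene_points = [goal, (2.0, 4.0, 0.0), (5.0, 3.0, 1.0)]
local_points = to_goal_frame(scene_points, goal, facing_deg=90.0)
restored = from_goal_frame(local_points, goal, facing_deg=90.0)
```

Under this reading, each motion segment would be generated in local coordinates and mapped back with from_goal_frame, and the frame would be re-centred whenever the next target changes. That keeps the generator's inputs bounded regardless of how far the character has already travelled, which is one way the drift mentioned in the abstract could be avoided.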

Related papers

AHA! Animating Human Avatars in Diverse Scenes with Gaussian Splatting [26.560838721184435]
We present a novel framework for animating humans in 3D scenes using 3D Gaussian Splatting (3DGS). By representing humans and scenes as Gaussians, our approach allows for geometry-consistent free-viewpoint rendering of humans interacting with 3D scenes. We evaluate our approach on scenes from ScanNet++ and the SuperSplat library, and on avatars reconstructed from sparse and dense multi-view human capture.
arXiv Detail & Related papers (2025-11-13T00:19:18Z)
DIMO: Diverse 3D Motion Generation for Arbitrary Objects [57.14954351767432]
DIMO is a generative approach capable of generating diverse 3D motions for arbitrary objects from a single image. We leverage the rich priors in well-trained video models to extract the common motion patterns. During inference time with learned latent space, we can instantly sample diverse 3D motions in a single-forward pass.
arXiv Detail & Related papers (2025-11-10T18:56:49Z)
AnimateScene: Camera-controllable Animation in Any Scene [34.04222775149215]
3D scene reconstruction and 4D human animation have seen rapid progress and broad adoption in recent years. One key difficulty lies in placing the human at the correct location and scale within the scene. Another challenge is that the human and the background may exhibit different lighting and style, leading to unrealistic composites. We present AnimateScene, which addresses the above issues in a unified framework.
arXiv Detail & Related papers (2025-08-08T03:28:17Z)
Recovering Dynamic 3D Sketches from Videos [30.87733869892925]
Liv3Stroke is a novel approach for abstracting objects in motion with deformable 3D strokes. We first extract noisy, 3D point cloud motion guidance from video frames using semantic features. Our approach deforms a set of curves to abstract essential motion features as a set of explicit 3D representations.
arXiv Detail & Related papers (2025-03-26T08:43:21Z)
Move-in-2D: 2D-Conditioned Human Motion Generation [54.067588636155115]
We propose Move-in-2D, a novel approach to generate human motion sequences conditioned on a scene image. Our approach accepts both a scene image and text prompt as inputs, producing a motion sequence tailored to the scene.
arXiv Detail & Related papers (2024-12-17T18:58:07Z)
Gaussians-to-Life: Text-Driven Animation of 3D Gaussian Splatting Scenes [49.26872036160368]
We propose a method for animating parts of high-quality 3D scenes in a Gaussian Splatting representation. We find that, in contrast to prior work, this enables realistic animations of complex, pre-existing 3D scenes.
arXiv Detail & Related papers (2024-11-28T16:01:58Z)
MIMO: Controllable Character Video Synthesis with Spatial Decomposed Modeling [21.1274747033854]
Character video synthesis aims to produce realistic videos of animatable characters within lifelike scenes. MIMO is a novel framework which can synthesize character videos with controllable attributes. MIMO achieves advanced scalability to arbitrary characters, generality to novel 3D motions, and applicability to interactive real-world scenes.
arXiv Detail & Related papers (2024-09-24T15:00:07Z)
Shape of Motion: 4D Reconstruction from a Single Video [51.04575075620677]
We introduce a method capable of reconstructing generic dynamic scenes, featuring explicit, full-sequence-long 3D motion. We exploit the low-dimensional structure of 3D motion by representing scene motion with a compact set of SE3 motion bases. Our method achieves state-of-the-art performance for both long-range 3D/2D motion estimation and novel view synthesis on dynamic scenes.
arXiv Detail & Related papers (2024-07-18T17:59:08Z)
LoopGaussian: Creating 3D Cinemagraph with Multi-view Images via Eulerian Motion Field [13.815932949774858]
Cinemagraph is a form of visual media that combines elements of still photography and subtle motion to create a captivating experience. We propose LoopGaussian to elevate cinemagraph from 2D image space to 3D space using 3D Gaussian modeling. Experiment results validate the effectiveness of our approach, demonstrating high-quality and visually appealing scene generation.
arXiv Detail & Related papers (2024-04-13T11:07:53Z)
BerfScene: Bev-conditioned Equivariant Radiance Fields for Infinite 3D Scene Generation [96.58789785954409]
We propose a practical and efficient 3D representation that incorporates an equivariant radiance field with the guidance of a bird's-eye view map. We produce large-scale, even infinite-scale, 3D scenes via synthesizing local scenes and then stitching them with smooth consistency.
arXiv Detail & Related papers (2023-12-04T18:56:10Z)
Synthesizing Diverse Human Motions in 3D Indoor Scenes [16.948649870341782]
We present a novel method for populating 3D indoor scenes with virtual humans that can navigate in the environment and interact with objects in a realistic manner. Existing approaches rely on training sequences that contain captured human motions and the 3D scenes they interact with. We propose a reinforcement learning-based approach that enables virtual humans to navigate in 3D scenes and interact with objects realistically and autonomously.
arXiv Detail & Related papers (2023-05-21T09:22:24Z)
3D Cinemagraphy from a Single Image [73.09720823592092]
We present 3D Cinemagraphy, a new technique that marries 2D image animation with 3D photography. Given a single still image as input, our goal is to generate a video that contains both visual content animation and camera motion.
arXiv Detail & Related papers (2023-03-10T06:08:23Z)
HUMANISE: Language-conditioned Human Motion Generation in 3D Scenes [54.61610144668777]
We present a novel scene-and-language conditioned generative model that can produce 3D human motions in 3D scenes. Our experiments demonstrate that our model generates diverse and semantically consistent human motions in 3D scenes.
arXiv Detail & Related papers (2022-10-18T10:14:11Z)
MoCaNet: Motion Retargeting in-the-wild via Canonicalization Networks [77.56526918859345]
We present a novel framework that brings the 3D motion retargeting task from controlled environments to in-the-wild scenarios. It is capable of retargeting body motion from a character in a 2D monocular video to a 3D character without using any motion capture system or 3D reconstruction procedure.
arXiv Detail & Related papers (2021-12-19T07:52:05Z)
The Wanderings of Odysseus in 3D Scenes [22.230079422580065]
We propose generative motion primitives via body surface markers, shortened as GAMMA. We exploit body surface markers and a conditional variational autoencoder to model each motion primitive. Experiments show that our method can produce more realistic and controllable motion than state-of-the-art data-driven methods.
arXiv Detail & Related papers (2021-12-16T23:24:50Z)
Action2video: Generating Videos of Human 3D Actions [31.665831044217363]
We aim to tackle the interesting yet challenging problem of generating videos of diverse and natural human motions from prescribed action categories. The key issue lies in the ability to synthesize multiple distinct motion sequences that are realistic in their visual appearances. Action2motion stochastically generates plausible 3D pose sequences of a prescribed action category, which are processed and rendered by motion2video to form 2D videos.
arXiv Detail & Related papers (2021-11-12T20:20:37Z)

This list is automatically generated from the titles and abstracts of the papers on this site.