Related papers: DreamMesh4D: Video-to-4D Generation with Sparse-Controlled Gaussian-Mesh Hybrid Representation

DreamMesh4D: Video-to-4D Generation with Sparse-Controlled Gaussian-Mesh Hybrid Representation

URL: http://arxiv.org/abs/2410.06756v1
Date: Wed, 9 Oct 2024 10:41:08 GMT
Title: DreamMesh4D: Video-to-4D Generation with Sparse-Controlled Gaussian-Mesh Hybrid Representation
Authors: Zhiqi Li, Yiming Chen, Peidong Liu,
Abstract summary: We introduce DreamMesh4D, a novel framework combining mesh representation with geometric skinning technique to generate high-quality 4D object from a monocular video. Our method is compatible with modern graphic pipelines, showcasing its potential in the 3D gaming and film industry.
Score: 10.250715657201363
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent advancements in 2D/3D generative techniques have facilitated the generation of dynamic 3D objects from monocular videos. Previous methods mainly rely on the implicit neural radiance fields (NeRF) or explicit Gaussian Splatting as the underlying representation, and struggle to achieve satisfactory spatial-temporal consistency and surface appearance. Drawing inspiration from modern 3D animation pipelines, we introduce DreamMesh4D, a novel framework combining mesh representation with geometric skinning technique to generate high-quality 4D object from a monocular video. Instead of utilizing classical texture map for appearance, we bind Gaussian splats to triangle face of mesh for differentiable optimization of both the texture and mesh vertices. In particular, DreamMesh4D begins with a coarse mesh obtained through an image-to-3D generation procedure. Sparse points are then uniformly sampled across the mesh surface, and are used to build a deformation graph to drive the motion of the 3D object for the sake of computational efficiency and providing additional constraint. For each step, transformations of sparse control points are predicted using a deformation network, and the mesh vertices as well as the surface Gaussians are deformed via a novel geometric skinning algorithm, which is a hybrid approach combining LBS (linear blending skinning) and DQS (dual-quaternion skinning), mitigating drawbacks associated with both approaches. The static surface Gaussians and mesh vertices as well as the deformation network are learned via reference view photometric loss, score distillation loss as well as other regularizers in a two-stage manner. Extensive experiments demonstrate superior performance of our method. Furthermore, our method is compatible with modern graphic pipelines, showcasing its potential in the 3D gaming and film industry.

Related papers

MILo: Mesh-In-the-Loop Gaussian Splatting for Detailed and Efficient Surface Reconstruction [28.452920446301608]
We present MILo, a novel framework that bridges the gap between volumetric and surface representations by differentiably extracting a mesh from the 3D Gaussians.<n>Our approach can reconstruct complete scenes, including backgrounds, with state-of-the-art quality while requiring an order of magnitude fewer mesh vertices than previous methods.
arXiv Detail & Related papers (2025-06-30T17:48:54Z)
PIG: Physically-based Multi-Material Interaction with 3D Gaussians [14.097146027458368]
PIG: Physically-Based Multi-Material Interaction with 3D Gaussians is a novel approach that combines 3D object segmentation with the simulation of objects interacting in high precision.<n>We show that our method not only outperforms the state-of-the-art (SOTA) in terms of visual quality, but also opens up new directions and pipelines for the field of physically realistic scene generation.
arXiv Detail & Related papers (2025-06-09T11:25:21Z)
EVolSplat: Efficient Volume-based Gaussian Splatting for Urban View Synthesis [61.1662426227688]
Existing NeRF and 3DGS-based methods show promising results in achieving photorealistic renderings but require slow, per-scene optimization. We introduce EVolSplat, an efficient 3D Gaussian Splatting model for urban scenes that works in a feed-forward manner.
arXiv Detail & Related papers (2025-03-26T02:47:27Z)
3D Gaussian Splatting with Normal Information for Mesh Extraction and Improved Rendering [8.59572577251833]
We propose a novel regularization method using the gradients of a signed distance function estimated from the Gaussians. We demonstrate the effectiveness of our approach on datasets such as Mip-NeRF360, Tanks and Temples, and Deep-Blending.
arXiv Detail & Related papers (2025-01-14T18:40:33Z)
Enhancing Single Image to 3D Generation using Gaussian Splatting and Hybrid Diffusion Priors [17.544733016978928]
3D object generation from a single image involves estimating the full 3D geometry and texture of unseen views from an unposed RGB image captured in the wild. Recent advancements in 3D object generation have introduced techniques that reconstruct an object's 3D shape and texture. We propose bridging the gap between 2D and 3D diffusion models to address this limitation.
arXiv Detail & Related papers (2024-10-12T10:14:11Z)
CraftsMan: High-fidelity Mesh Generation with 3D Native Generation and Interactive Geometry Refiner [34.78919665494048]
CraftsMan can generate high-fidelity 3D geometries with highly varied shapes, regular mesh topologies, and detailed surfaces. Our method achieves high efficacy in producing superior-quality 3D assets compared to existing methods.
arXiv Detail & Related papers (2024-05-23T18:30:12Z)
Direct Learning of Mesh and Appearance via 3D Gaussian Splatting [3.4899193297791054]
We propose a learnable scene model that incorporates 3DGS with an explicit geometry representation, namely a mesh. Our model learns the mesh and appearance in an end-to-end manner, where we bind 3D Gaussians to the mesh faces and perform differentiable rendering of 3DGS to obtain photometric supervision.
arXiv Detail & Related papers (2024-05-11T07:56:19Z)
Gaussian Opacity Fields: Efficient Adaptive Surface Reconstruction in Unbounded Scenes [50.92217884840301]
Gaussian Opacity Fields (GOF) is a novel approach for efficient, high-quality, and adaptive surface reconstruction in scenes. GOF is derived from ray-tracing-based volume rendering of 3D Gaussians. GOF surpasses existing 3DGS-based methods in surface reconstruction and novel view synthesis.
arXiv Detail & Related papers (2024-04-16T17:57:19Z)
2D Gaussian Splatting for Geometrically Accurate Radiance Fields [50.056790168812114]
3D Gaussian Splatting (3DGS) has recently revolutionized radiance field reconstruction, achieving high quality novel view synthesis and fast rendering speed without baking. We present 2D Gaussian Splatting (2DGS), a novel approach to model and reconstruct geometrically accurate radiance fields from multi-view images. We demonstrate that our differentiable terms allows for noise-free and detailed geometry reconstruction while maintaining competitive appearance quality, fast training speed, and real-time rendering.
arXiv Detail & Related papers (2024-03-26T17:21:24Z)
UV Gaussians: Joint Learning of Mesh Deformation and Gaussian Textures for Human Avatar Modeling [71.87807614875497]
We propose UV Gaussians, which models the 3D human body by jointly learning mesh deformations and 2D UV-space Gaussian textures. We collect and process a new dataset of human motion, which includes multi-view images, scanned models, parametric model registration, and corresponding texture maps. Experimental results demonstrate that our method achieves state-of-the-art synthesis of novel view and novel pose.
arXiv Detail & Related papers (2024-03-18T09:03:56Z)
Bridging 3D Gaussian and Mesh for Freeview Video Rendering [57.21847030980905]
GauMesh bridges the 3D Gaussian and Mesh for modeling and rendering the dynamic scenes. We show that our approach adapts the appropriate type of primitives to represent the different parts of the dynamic scene.
arXiv Detail & Related papers (2024-03-18T04:01:26Z)
GeoGS3D: Single-view 3D Reconstruction via Geometric-aware Diffusion Model and Gaussian Splatting [81.03553265684184]
We introduce GeoGS3D, a framework for reconstructing detailed 3D objects from single-view images. We propose a novel metric, Gaussian Divergence Significance (GDS), to prune unnecessary operations during optimization. Experiments demonstrate that GeoGS3D generates images with high consistency across views and reconstructs high-quality 3D objects.
arXiv Detail & Related papers (2024-03-15T12:24:36Z)
Controllable Text-to-3D Generation via Surface-Aligned Gaussian Splatting [9.383423119196408]
We introduce Multi-view ControlNet (MVControl), a novel neural network architecture designed to enhance existing multi-view diffusion models. MVControl is able to offer 3D diffusion guidance for optimization-based 3D generation. In pursuit of efficiency, we adopt 3D Gaussians as our representation instead of the commonly used implicit representations.
arXiv Detail & Related papers (2024-03-15T02:57:20Z)
Deep Marching Tetrahedra: a Hybrid Representation for High-Resolution 3D Shape Synthesis [90.26556260531707]
DMTet is a conditional generative model that can synthesize high-resolution 3D shapes using simple user guides such as coarse voxels. Unlike deep 3D generative models that directly generate explicit representations such as meshes, our model can synthesize shapes with arbitrary topology.
arXiv Detail & Related papers (2021-11-08T05:29:35Z)

This list is automatically generated from the titles and abstracts of the papers in this site.