Feature Splatting: Language-Driven Physics-Based Scene Synthesis and Editing
- URL: http://arxiv.org/abs/2404.01223v1
- Date: Mon, 1 Apr 2024 16:31:04 GMT
- Title: Feature Splatting: Language-Driven Physics-Based Scene Synthesis and Editing
- Authors: Ri-Zhao Qiu, Ge Yang, Weijia Zeng, Xiaolong Wang
- Abstract summary: We introduce Feature Splatting, an approach that unifies physics-based dynamic scene synthesis with rich semantics.
Our first contribution is a way to distill high-quality, object-centric vision-language features into 3D Gaussians.
Our second contribution is a way to synthesize physics-based dynamics from an otherwise static scene using a particle-based simulator.
- Score: 11.46530458561589
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Scene representations using 3D Gaussian primitives have produced excellent results in modeling the appearance of static and dynamic 3D scenes. Many graphics applications, however, demand the ability to manipulate both the appearance and the physical properties of objects. We introduce Feature Splatting, an approach that unifies physics-based dynamic scene synthesis with rich semantics from vision-language foundation models that are grounded by natural language. Our first contribution is a way to distill high-quality, object-centric vision-language features into 3D Gaussians, which enables semi-automatic scene decomposition using text queries. Our second contribution is a way to synthesize physics-based dynamics from an otherwise static scene using a particle-based simulator, in which material properties are assigned automatically via text queries. We ablate key techniques used in this pipeline to illustrate the challenges and opportunities in using feature-carrying 3D Gaussians as a unified format for appearance, geometry, material properties, and semantics grounded in natural language. Project website: https://feature-splatting.github.io/
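The text-query idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes each Gaussian carries a feature vector in the same embedding space as a text query, and it picks Gaussians by cosine similarity against a threshold. The function name `select_gaussians`, the threshold, the random stand-in embeddings, and the stiffness values are all hypothetical.

```python
import numpy as np


def select_gaussians(features, text_embedding, threshold=0.5):
    """Return a boolean mask of Gaussians whose feature matches the text query.

    features: (N, D) array, one feature vector per Gaussian (assumed given).
    text_embedding: (D,) array, embedding of the text query (assumed given).
    """
    # Normalize both sides so the dot product is a cosine similarity.
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    t = text_embedding / np.linalg.norm(text_embedding)
    similarity = f @ t
    return similarity > threshold


# Stand-in data: a real pipeline would use features distilled from a
# vision-language model; here random vectors play that role.
rng = np.random.default_rng(0)
query = rng.normal(size=16)
object_feats = query + 0.05 * rng.normal(size=(5, 16))       # close to the query
background_feats = -query + 0.05 * rng.normal(size=(5, 16))  # far from the query
features = np.vstack([object_feats, background_feats])

mask = select_gaussians(features, query)

# Hypothetical material assignment: selected Gaussians get a softer
# stiffness value than the rest (the numbers are made up).
stiffness = np.where(mask, 1e4, 1e6)
```

The resulting mask could then mark which Gaussians are handed to a particle-based simulator as the queried object, with per-Gaussian material values such as `stiffness` attached.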
Related papers
- Automated 3D Physical Simulation of Open-world Scene with Gaussian Splatting [22.40115216094332]
We present Sim Anything, a physics-based approach that endows static 3D objects with interactive dynamics.
Inspired by human visual reasoning, we propose MLLM-based Physical Property Perception.
We also simulate objects in an open-world scene with particles sampled via the Physical-Geometric Adaptive Sampling.
arXiv Detail & Related papers (2024-11-19T12:52:21Z)
- Dynamic Scene Understanding through Object-Centric Voxelization and Neural Rendering [57.895846642868904]
We present a 3D generative model named DynaVol-S for dynamic scenes that enables object-centric learning.
Voxelization infers per-object occupancy probabilities at individual spatial locations.
Our approach integrates 2D semantic features to create 3D semantic grids, representing the scene through multiple disentangled voxel grids.
arXiv Detail & Related papers (2024-07-30T15:33:58Z)
- Latent Intuitive Physics: Learning to Transfer Hidden Physics from A 3D Video [58.043569985784806]
We introduce latent intuitive physics, a transfer learning framework for physics simulation.
It can infer hidden properties of fluids from a single 3D video and simulate the observed fluid in novel scenes.
We validate our model in three ways: (i) novel scene simulation with the learned visual-world physics, (ii) future prediction of the observed fluid dynamics, and (iii) supervised particle simulation.
arXiv Detail & Related papers (2024-06-18T16:37:44Z)
- DreamPhysics: Learning Physical Properties of Dynamic 3D Gaussians with Video Diffusion Priors [75.83647027123119]
We propose to learn the physical properties of a material field with video diffusion priors.
We then utilize a physics-based Material-Point-Method simulator to generate 4D content with realistic motions.
arXiv Detail & Related papers (2024-06-03T16:05:25Z)
- DGD: Dynamic 3D Gaussians Distillation [14.7298711927857]
We tackle the task of learning dynamic 3D semantic radiance fields given a single monocular video as input.
Our learned semantic radiance field captures per-point semantics as well as color and geometric properties for a dynamic 3D scene.
We present DGD, a unified 3D representation for both the appearance and semantics of a dynamic 3D scene.
arXiv Detail & Related papers (2024-05-29T17:52:22Z)
- Reconstruction and Simulation of Elastic Objects with Spring-Mass 3D Gaussians [23.572267290979045]
Spring-Gaus is a 3D physical object representation for reconstructing and simulating elastic objects from videos of the object from multiple viewpoints.
We develop and integrate a 3D Spring-Mass model into 3D Gaussian kernels, enabling the reconstruction of the visual appearance, shape, and physical dynamics of the object.
We evaluate Spring-Gaus on both synthetic and real-world datasets, demonstrating accurate reconstruction and simulation of elastic objects.
arXiv Detail & Related papers (2024-03-14T14:25:10Z)
- CG3D: Compositional Generation for Text-to-3D via Gaussian Splatting [57.14748263512924]
CG3D is a method for compositionally generating scalable 3D assets.
Gaussian radiance fields, parameterized to allow for compositions of objects, can produce semantically and physically consistent scenes.
arXiv Detail & Related papers (2023-11-29T18:55:38Z)
- Differentiable Blocks World: Qualitative 3D Decomposition by Rendering Primitives [70.32817882783608]
We present an approach that produces a simple, compact, and actionable 3D world representation by means of 3D primitives.
Unlike existing primitive decomposition methods that rely on 3D input data, our approach operates directly on images.
We show that the resulting textured primitives faithfully reconstruct the input images and accurately model the visible 3D points.
arXiv Detail & Related papers (2023-07-11T17:58:31Z)
- 3D-IntPhys: Towards More Generalized 3D-grounded Visual Intuitive Physics under Challenging Scenes [68.66237114509264]
We present a framework capable of learning 3D-grounded visual intuitive physics models from videos of complex scenes with fluids.
We show our model can make long-horizon future predictions by learning from raw images and significantly outperforms models that do not employ an explicit 3D representation space.
arXiv Detail & Related papers (2023-04-22T19:28:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.