Control4D: Efficient 4D Portrait Editing with Text
- URL: http://arxiv.org/abs/2305.20082v2
- Date: Thu, 30 Nov 2023 03:46:37 GMT
- Title: Control4D: Efficient 4D Portrait Editing with Text
- Authors: Ruizhi Shao, Jingxiang Sun, Cheng Peng, Zerong Zheng, Boyao Zhou,
Hongwen Zhang, Yebin Liu
- Abstract summary: We introduce Control4D, an innovative framework for editing dynamic 4D portraits using text instructions.
Our method addresses the prevalent challenges in 4D editing, notably the inefficiencies of existing 4D representations and the inconsistent editing effect caused by diffusion-based editors.
- Score: 43.8606103369037
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce Control4D, an innovative framework for editing dynamic 4D
portraits using text instructions. Our method addresses the prevalent
challenges in 4D editing, notably the inefficiencies of existing 4D
representations and the inconsistent editing effect caused by diffusion-based
editors. We first propose GaussianPlanes, a novel 4D representation that makes
Gaussian Splatting more structured by applying plane-based decomposition in 3D
space and time. This enhances both efficiency and robustness in 4D editing.
Furthermore, we propose to leverage a 4D generator to learn a more continuous
generation space from inconsistent edited images produced by the
diffusion-based editor, which effectively improves the consistency and quality
of 4D editing. Comprehensive evaluation demonstrates the superiority of
Control4D, including significantly reduced training time, high-quality
rendering, and spatial-temporal consistency in 4D portrait editing. The link to
our project website is https://control4darxiv.github.io.
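
The abstract's "plane-based decomposition in 3D space and time" behind GaussianPlanes can be made concrete with a small sketch. Below is a minimal, hypothetical PyTorch sketch that factors a 4D point (x, y, z, t) onto six learned 2D feature planes; the plane layout, the product fusion, and all names here are assumptions for illustration, not the paper's exact design.

```python
import torch
import torch.nn.functional as F

class PlaneFactored4D(torch.nn.Module):
    """Hypothetical plane-factored 4D feature field: three spatial planes
    (xy, xz, yz) plus three spatio-temporal planes (xt, yt, zt)."""

    PLANES = [(0, 1), (0, 2), (1, 2), (0, 3), (1, 3), (2, 3)]  # axis pairs of (x, y, z, t)

    def __init__(self, feat_dim=32, res=128):
        super().__init__()
        self.planes = torch.nn.ParameterList(
            [torch.nn.Parameter(0.1 * torch.randn(1, feat_dim, res, res))
             for _ in self.PLANES]
        )

    def forward(self, xyzt):  # xyzt: (N, 4), coordinates normalized to [-1, 1]
        feat = 1.0
        for plane, (i, j) in zip(self.planes, self.PLANES):
            grid = xyzt[None, None, :, [i, j]]                        # (1, 1, N, 2)
            sampled = F.grid_sample(plane, grid, align_corners=True)  # (1, C, 1, N)
            feat = feat * sampled[0, :, 0].T                          # fuse planes by product
        return feat  # (N, C); decoded into Gaussian attributes downstream
```

Six 2D planes are far cheaper to store and query than a dense 4D grid, which is the structural efficiency argument the abstract makes for GaussianPlanes.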
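The second idea, a 4D generator that learns a continuous generation space from the editor's inconsistent outputs, reads as adversarial supervision: rather than regressing each noisy edited frame, the rendered scene is pushed to match the distribution of edits. A minimal sketch under that reading, with a hypothetical `renderer`, `discriminator`, and batch layout (one plausible instantiation, not the paper's exact losses):

```python
import torch.nn.functional as F

def edit_step(renderer, discriminator, opt_g, opt_d, batch):
    fake = renderer(batch["rays"], batch["time"])  # rendered 4D portrait frame
    real = batch["edited"]                         # diffusion-edited reference frame

    # Discriminator: treat edited frames as real, renders as fake
    # (non-saturating GAN loss written with softplus).
    d_loss = (F.softplus(-discriminator(real)).mean()
              + F.softplus(discriminator(fake.detach())).mean())
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator (the 4D scene): render frames the discriminator accepts.
    g_loss = F.softplus(-discriminator(fake)).mean()
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()
```

Because the discriminator only asks "does this look like an edited frame?", frame-to-frame inconsistency in the editor's outputs can average out instead of being baked into the scene.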
Related papers
- PrEditor3D: Fast and Precise 3D Shape Editing [100.09112677669376]
We propose a training-free approach to 3D editing that enables the editing of a single shape within a few minutes.
The edited 3D mesh aligns well with the prompt and remains unchanged in regions that are not intended to be altered.
arXiv Detail & Related papers (2024-12-09T15:44:47Z)
- Trans4D: Realistic Geometry-Aware Transition for Compositional Text-to-4D Synthesis [60.853577108780414]
Existing 4D generation methods can generate high-quality 4D objects or scenes based on user-friendly conditions.
We propose Trans4D, a novel text-to-4D synthesis framework that enables realistic complex scene transitions.
In experiments, Trans4D consistently outperforms existing state-of-the-art methods in generating 4D scenes with accurate and high-quality transitions.
arXiv Detail & Related papers (2024-10-09T17:56:03Z)
- Instruct 4D-to-4D: Editing 4D Scenes as Pseudo-3D Scenes Using 2D Diffusion [30.331519274430594]
Instruct 4D-to-4D generates high-quality instruction-guided dynamic scene editing results.
We treat a 4D scene as a pseudo-3D scene, decoupled into two sub-problems: achieving temporal consistency in video editing and applying these edits to the pseudo-3D scene.
We extensively evaluate our approach in various scenes and editing instructions, and demonstrate that it achieves spatially and temporally consistent editing results.
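The decoupling described above can be sketched in two stages; `video_edit`, the `scene` interface, and the per-frame attributes below are hypothetical stand-ins, not the paper's actual API:

```python
import random
import torch
import torch.nn.functional as F

def edit_4d_as_pseudo_3d(frames, scene, instruction, video_edit, iters=1000):
    # Stage 1: temporally consistent video editing of the captured frames.
    edited = video_edit(frames, instruction)  # one edited image per frame

    # Stage 2: fit the pseudo-3D scene to the edited frames, one random
    # timestep at a time, with a simple photometric loss.
    opt = torch.optim.Adam(scene.parameters(), lr=1e-3)
    for _ in range(iters):
        t = random.randrange(len(frames))
        render = scene(frames[t].camera, t)  # render at that view and time
        loss = F.l1_loss(render, edited[t])
        opt.zero_grad(); loss.backward(); opt.step()
    return scene
```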
arXiv Detail & Related papers (2024-06-13T17:59:30Z)
- DGE: Direct Gaussian 3D Editing by Consistent Multi-view Editing [72.54566271694654]
We consider the problem of editing 3D objects and scenes based on open-ended language instructions.
A common approach to this problem is to use a 2D image generator or editor to guide the 3D editing process.
This process is often inefficient due to the need for iterative updates of costly 3D representations.
arXiv Detail & Related papers (2024-04-29T17:59:30Z)
- Comp4D: LLM-Guided Compositional 4D Scene Generation [65.5810466788355]
We present Comp4D, a novel framework for Compositional 4D Generation.
Unlike conventional methods that generate a singular 4D representation of the entire scene, Comp4D innovatively constructs each 4D object within the scene separately.
Our method employs a compositional score distillation technique guided by pre-defined trajectories.
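Read literally, "score distillation guided by pre-defined trajectories" suggests an SDS-style update applied to each object as it is placed along its trajectory. A hedged sketch, with a hypothetical `diffusion` eps-prediction interface (no specific library is implied):

```python
import torch

def trajectory_sds_step(render_object, trajectory, diffusion, prompt_emb, step, opt):
    pos = trajectory(step)          # object placement at this time step
    img = render_object(pos, step)  # differentiable render, (1, 3, H, W)

    t = torch.randint(20, 980, (1,), device=img.device)  # random diffusion timestep
    noise = torch.randn_like(img)
    noisy = diffusion.add_noise(img, noise, t)           # forward diffusion
    with torch.no_grad():
        eps = diffusion.predict_eps(noisy, t, prompt_emb)  # text-conditioned score

    grad = eps - noise                  # SDS gradient direction
    loss = (grad.detach() * img).sum()  # surrogate loss whose gradient is `grad`
    opt.zero_grad(); loss.backward(); opt.step()
```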
arXiv Detail & Related papers (2024-03-25T17:55:52Z)
- 4DGen: Grounded 4D Content Generation with Spatial-temporal Consistency [118.15258850780417]
We present 4DGen, a novel framework for grounded 4D content creation.
Our pipeline facilitates controllable 4D generation, enabling users to specify the motion via a monocular video or adopt image-to-video generation.
Compared to existing video-to-4D baselines, our approach yields superior results in faithfully reconstructing input signals.
arXiv Detail & Related papers (2023-12-28T18:53:39Z)