Control4D: Efficient 4D Portrait Editing with Text
- URL: http://arxiv.org/abs/2305.20082v2
- Date: Thu, 30 Nov 2023 03:46:37 GMT
- Title: Control4D: Efficient 4D Portrait Editing with Text
- Authors: Ruizhi Shao, Jingxiang Sun, Cheng Peng, Zerong Zheng, Boyao Zhou,
Hongwen Zhang, Yebin Liu
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce Control4D, an innovative framework for editing dynamic 4D
portraits using text instructions. Our method addresses the prevalent
challenges in 4D editing, notably the inefficiencies of existing 4D
representations and the inconsistent editing effect caused by diffusion-based
editors. We first propose GaussianPlanes, a novel 4D representation that makes
Gaussian Splatting more structured by applying plane-based decomposition in 3D
space and time. This enhances both efficiency and robustness in 4D editing.
Furthermore, we propose to leverage a 4D generator to learn a more continuous
generation space from inconsistent edited images produced by the
diffusion-based editor, which effectively improves the consistency and quality
of 4D editing. Comprehensive evaluation demonstrates the superiority of
Control4D, including significantly reduced training time, high-quality
rendering, and spatial-temporal consistency in 4D portrait editing. The link to
our project website is https://control4darxiv.github.io.
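The abstract describes GaussianPlanes as making Gaussian Splatting more structured by applying plane-based decomposition in 3D space and time. As a rough illustration of what such a plane-factorized 4D feature lookup can look like (this is not the paper's implementation; the six axis-pair planes, the resolution, the feature size, and the multiplicative fusion are assumptions in the style of hex-plane representations), here is a minimal NumPy sketch:

```python
import numpy as np

# Hypothetical hex-plane-style factorization of a 4D volume (x, y, z, t):
# each of the six axis-pair planes stores a small 2D feature grid, and a
# 4D query point is answered by bilinearly sampling every plane and
# fusing the results. All sizes and the product fusion are assumptions.
PLANES = [(0, 1), (0, 2), (1, 2), (0, 3), (1, 3), (2, 3)]  # axis pairs of (x, y, z, t)

def make_planes(res=32, feat_dim=8, seed=0):
    """Create one random (res, res, feat_dim) feature grid per axis pair."""
    rng = np.random.default_rng(seed)
    return [rng.normal(size=(res, res, feat_dim)) for _ in PLANES]

def bilinear(plane, u, v):
    """Bilinearly sample a (res, res, C) feature plane at (u, v) in [0, 1]."""
    res = plane.shape[0]
    x, y = u * (res - 1), v * (res - 1)
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, res - 1), min(y0 + 1, res - 1)
    wx, wy = x - x0, y - y0
    return ((1 - wx) * (1 - wy) * plane[x0, y0]
            + wx * (1 - wy) * plane[x1, y0]
            + (1 - wx) * wy * plane[x0, y1]
            + wx * wy * plane[x1, y1])

def query_4d(planes, point):
    """Fuse features from all six planes for a point (x, y, z, t) in [0, 1]^4."""
    feat = np.ones_like(planes[0][0, 0])
    for plane, (a, b) in zip(planes, PLANES):
        feat *= bilinear(plane, point[a], point[b])
    return feat

planes = make_planes()
f = query_4d(planes, np.array([0.3, 0.7, 0.5, 0.1]))
print(f.shape)  # (8,)
```

The appeal of this kind of factorization is storage: six dense 2D grids grow quadratically with resolution, whereas a dense 4D grid grows with the fourth power, which is consistent with the efficiency motivation stated in the abstract.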
Related papers
- Trans4D: Realistic Geometry-Aware Transition for Compositional Text-to-4D Synthesis [60.853577108780414]
Existing 4D generation methods can generate high-quality 4D objects or scenes from user-friendly input conditions.
We propose Trans4D, a novel text-to-4D synthesis framework that enables realistic complex scene transitions.
In experiments, Trans4D consistently outperforms existing state-of-the-art methods in generating 4D scenes with accurate and high-quality transitions.
arXiv Detail & Related papers (2024-10-09T17:56:03Z)
- Tailor3D: Customized 3D Assets Editing and Generation with Dual-Side Images [72.70883914827687]
Tailor3D is a novel pipeline that creates customized 3D assets from editable dual-side images.
It provides a user-friendly, efficient solution for editing 3D assets, with each editing step taking only seconds to complete.
arXiv Detail & Related papers (2024-07-08T17:59:55Z)
- Instruct 4D-to-4D: Editing 4D Scenes as Pseudo-3D Scenes Using 2D Diffusion [30.331519274430594]
Instruct 4D-to-4D generates high-quality instruction-guided dynamic scene editing results.
We treat a 4D scene as a pseudo-3D scene, decoupled into two sub-problems: achieving temporal consistency in video editing and applying these edits to the pseudo-3D scene.
We extensively evaluate our approach in various scenes and editing instructions, and demonstrate that it achieves spatially and temporally consistent editing results.
arXiv Detail & Related papers (2024-06-13T17:59:30Z)
- DGE: Direct Gaussian 3D Editing by Consistent Multi-view Editing [72.54566271694654]
We consider the problem of editing 3D objects and scenes based on open-ended language instructions.
A common approach to this problem is to use a 2D image generator or editor to guide the 3D editing process.
This process is often inefficient due to the need for iterative updates of costly 3D representations.
arXiv Detail & Related papers (2024-04-29T17:59:30Z)
- Comp4D: LLM-Guided Compositional 4D Scene Generation [65.5810466788355]
We present Comp4D, a novel framework for Compositional 4D Generation.
Unlike conventional methods that generate a singular 4D representation of the entire scene, Comp4D innovatively constructs each 4D object within the scene separately.
Our method employs a compositional score distillation technique guided by the pre-defined trajectories.
arXiv Detail & Related papers (2024-03-25T17:55:52Z)
- 4DGen: Grounded 4D Content Generation with Spatial-temporal Consistency [118.15258850780417]
This work introduces 4DGen, a novel framework for grounded 4D content creation.
We identify static 3D assets and monocular video sequences as key components in constructing the 4D content.
Our pipeline facilitates conditional 4D generation, enabling users to specify geometry (3D assets) and motion (monocular videos).
arXiv Detail & Related papers (2023-12-28T18:53:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.