C3Editor: Achieving Controllable Consistency in 2D Model for 3D Editing
- URL: http://arxiv.org/abs/2510.04539v2
- Date: Fri, 31 Oct 2025 16:06:19 GMT
- Title: C3Editor: Achieving Controllable Consistency in 2D Model for 3D Editing
- Authors: Zeng Tao, Zheng Ding, Zeyuan Chen, Xiang Zhang, Leizhi Li, Zhuowen Tu
- Abstract summary: C3Editor is a controllable and consistent 2D-lifting-based 3D editing framework. Our method selectively establishes a view-consistent 2D editing model to achieve superior 3D editing results. Our approach delivers more consistent and controllable 2D and 3D editing results than existing 2D-lifting-based methods.
- Score: 37.439731931558036
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing 2D-lifting-based 3D editing methods often encounter challenges related to inconsistency, stemming from the lack of view-consistent 2D editing models and the difficulty of ensuring consistent editing across multiple views. To address these issues, we propose C3Editor, a controllable and consistent 2D-lifting-based 3D editing framework. Given an original 3D representation and a text-based editing prompt, our method selectively establishes a view-consistent 2D editing model to achieve superior 3D editing results. The process begins with the controlled selection of a ground truth (GT) view and its corresponding edited image as the optimization target, allowing for user-defined manual edits. Next, we fine-tune the 2D editing model within the GT view and across multiple views to align with the GT-edited image while ensuring multi-view consistency. To meet the distinct requirements of GT view fitting and multi-view consistency, we introduce separate LoRA modules for targeted fine-tuning. Our approach delivers more consistent and controllable 2D and 3D editing results than existing 2D-lifting-based methods, outperforming them in both qualitative and quantitative evaluations.
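The abstract's key design choice is attaching two independent LoRA modules to the same frozen 2D editing model: one fine-tuned toward the GT-view edit, the other toward multi-view consistency. The paper's actual implementation is not shown here; the following is a minimal, hypothetical NumPy sketch of the general LoRA mechanism it relies on (the layer sizes, rank, and adapter names are assumptions for illustration).

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, rank = 16, 16, 4

# Frozen base weight of one layer of the 2D editing model.
W = rng.standard_normal((d_out, d_in))

def make_lora(rank, d_in, d_out, rng):
    # Standard LoRA init: A small random, B zero, so each adapter starts as a no-op.
    A = rng.standard_normal((rank, d_in)) * 0.01
    B = np.zeros((d_out, rank))
    return {"A": A, "B": B}

# Two independent adapters, mirroring the split objectives described above:
# one fitted to the GT-view edit, one enforcing cross-view consistency.
adapters = {
    "gt_view": make_lora(rank, d_in, d_out, rng),
    "multi_view": make_lora(rank, d_in, d_out, rng),
}

def forward(x, adapter=None):
    # Base layer plus an optional low-rank update B @ A @ x.
    y = W @ x
    if adapter is not None:
        m = adapters[adapter]
        y = y + m["B"] @ (m["A"] @ x)
    return y

x = rng.standard_normal(d_in)
# With B initialized to zero, both adapters initially reproduce the base model;
# training would then update only the small A/B matrices, never W.
assert np.allclose(forward(x), forward(x, "gt_view"))
```

Because only the A/B matrices of each adapter receive gradients, the two objectives can be optimized separately without interfering with each other or with the frozen base model.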
Related papers
- Geometry-Guided Reinforcement Learning for Multi-view Consistent 3D Scene Editing [106.07976338405793]
Leveraging the priors of 2D diffusion models for 3D editing has emerged as a promising paradigm. We propose RL3DEdit, a single-pass framework driven by reinforcement learning with novel rewards derived from the 3D foundation model, VGGT. Experiments demonstrate that RL3DEdit achieves stable multi-view consistency and outperforms state-of-the-art methods in editing quality with high efficiency.
arXiv Detail & Related papers (2026-03-03T16:31:10Z)
- Edit3r: Instant 3D Scene Editing from Sparse Unposed Images [40.421700685587346]
We present Edit3r, a framework that reconstructs and edits 3D scenes in a single pass from unposed, view-inconsistent, instruction-edited images. We show that Edit3r achieves superior semantic alignment and enhanced 3D consistency compared to recent baselines.
arXiv Detail & Related papers (2025-12-31T18:59:53Z)
- Fast Multi-view Consistent 3D Editing with Video Priors [19.790628738739354]
We propose generative Video Prior based 3D Editing (ViP3DE). Our key insight is to condition the video generation model on a single edited view to generate other consistent edited views for 3D updating directly. Our proposed ViP3DE can achieve high-quality 3D editing results even within a single forward pass, significantly outperforming existing methods in both editing quality and speed.
arXiv Detail & Related papers (2025-11-28T13:31:10Z)
- Towards Scalable and Consistent 3D Editing [32.16698854719098]
3D editing has wide applications in immersive content creation, digital entertainment, and AR/VR. Unlike 2D editing, it remains challenging due to the need for cross-view consistency, structural fidelity, and fine-grained controllability. We introduce 3DEditVerse, the largest paired 3D editing benchmark to date, comprising 116,309 high-quality training pairs and 1,500 curated test pairs. On the model side, we propose 3DEditFormer, a 3D-structure-preserving conditional transformer.
arXiv Detail & Related papers (2025-10-03T13:34:55Z)
- 3D-LATTE: Latent Space 3D Editing from Textual Instructions [64.77718887666312]
We propose a training-free editing method that operates within the latent space of a native 3D diffusion model. We guide the edit synthesis by blending 3D attention maps from the generation with the source object.
arXiv Detail & Related papers (2025-08-29T22:51:59Z)
- TrAME: Trajectory-Anchored Multi-View Editing for Text-Guided 3D Gaussian Splatting Manipulation [35.951718189386845]
We propose a progressive 3D editing strategy that ensures multi-view consistency via a Trajectory-Anchored Scheme (TAS).
TAS facilitates a tightly coupled iterative process between 2D view editing and 3D updating, preventing the error accumulation that arises from the text-to-image process.
We present a tuning-free View-Consistent Attention Control (VCAC) module that leverages cross-view semantic and geometric reference from the source branch to yield aligned views from the target branch during the editing of 2D views.
arXiv Detail & Related papers (2024-07-02T08:06:58Z)
- DragGaussian: Enabling Drag-style Manipulation on 3D Gaussian Representation [57.406031264184584]
DragGaussian is a 3D object drag-editing framework based on 3D Gaussian Splatting.
Our contributions include the introduction of a new task, the development of DragGaussian for interactive point-based 3D editing, and comprehensive validation of its effectiveness through qualitative and quantitative experiments.
arXiv Detail & Related papers (2024-05-09T14:34:05Z)
- DGE: Direct Gaussian 3D Editing by Consistent Multi-view Editing [72.54566271694654]
We consider the problem of editing 3D objects and scenes based on open-ended language instructions. A common approach to this problem is to use a 2D image generator or editor to guide the 3D editing process. This process is often inefficient due to the need for iterative updates of costly 3D representations.
arXiv Detail & Related papers (2024-04-29T17:59:30Z)
- View-Consistent 3D Editing with Gaussian Splatting [50.6460814430094]
View-consistent Editing (VcEdit) is a novel framework that seamlessly incorporates 3DGS into image editing processes. By incorporating consistency modules into an iterative pattern, VcEdit proficiently resolves the issue of multi-view inconsistency.
arXiv Detail & Related papers (2024-03-18T15:22:09Z)
- SHAP-EDITOR: Instruction-guided Latent 3D Editing in Seconds [73.91114735118298]
Shap-Editor is a novel feed-forward 3D editing framework.
We demonstrate that direct 3D editing in this space is possible and efficient by building a feed-forward editor network.
arXiv Detail & Related papers (2023-12-14T18:59:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.