C3Editor: Achieving Controllable Consistency in 2D Model for 3D Editing
- URL: http://arxiv.org/abs/2510.04539v2
- Date: Fri, 31 Oct 2025 16:06:19 GMT
- Title: C3Editor: Achieving Controllable Consistency in 2D Model for 3D Editing
- Authors: Zeng Tao, Zheng Ding, Zeyuan Chen, Xiang Zhang, Leizhi Li, Zhuowen Tu
- Abstract summary: C3Editor is a controllable and consistent 2D-lifting-based 3D editing framework. Our method selectively establishes a view-consistent 2D editing model to achieve superior 3D editing results. Our approach delivers more consistent and controllable 2D and 3D editing results than existing 2D-lifting-based methods.
- Score: 37.439731931558036
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing 2D-lifting-based 3D editing methods often encounter challenges related to inconsistency, stemming from the lack of view-consistent 2D editing models and the difficulty of ensuring consistent editing across multiple views. To address these issues, we propose C3Editor, a controllable and consistent 2D-lifting-based 3D editing framework. Given an original 3D representation and a text-based editing prompt, our method selectively establishes a view-consistent 2D editing model to achieve superior 3D editing results. The process begins with the controlled selection of a ground truth (GT) view and its corresponding edited image as the optimization target, allowing for user-defined manual edits. Next, we fine-tune the 2D editing model within the GT view and across multiple views to align with the GT-edited image while ensuring multi-view consistency. To meet the distinct requirements of GT view fitting and multi-view consistency, we introduce separate LoRA modules for targeted fine-tuning. Our approach delivers more consistent and controllable 2D and 3D editing results than existing 2D-lifting-based methods, outperforming them in both qualitative and quantitative evaluations.
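The abstract's key design choice is attaching two independent LoRA modules to the same frozen 2D editing model: one fine-tuned toward the GT-view edit, the other toward multi-view consistency. The paper's actual implementation is not shown here; the following is a minimal, hypothetical NumPy sketch of the general LoRA mechanism it relies on (the layer sizes, rank, and adapter names are assumptions for illustration).

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, rank = 16, 16, 4

# Frozen base weight of one layer of the 2D editing model.
W = rng.standard_normal((d_out, d_in))

def make_lora(rank, d_in, d_out, rng):
    # Standard LoRA init: A small random, B zero, so each adapter starts as a no-op.
    A = rng.standard_normal((rank, d_in)) * 0.01
    B = np.zeros((d_out, rank))
    return {"A": A, "B": B}

# Two independent adapters, mirroring the split objectives described above:
# one fitted to the GT-view edit, one enforcing cross-view consistency.
adapters = {
    "gt_view": make_lora(rank, d_in, d_out, rng),
    "multi_view": make_lora(rank, d_in, d_out, rng),
}

def forward(x, adapter=None):
    # Base layer plus an optional low-rank update B @ A @ x.
    y = W @ x
    if adapter is not None:
        m = adapters[adapter]
        y = y + m["B"] @ (m["A"] @ x)
    return y

x = rng.standard_normal(d_in)
# With B initialized to zero, both adapters initially reproduce the base model;
# training would then update only the small A/B matrices, never W.
assert np.allclose(forward(x), forward(x, "gt_view"))
```

Because only the A/B matrices of each adapter receive gradients, the two objectives can be optimized separately without interfering with each other or with the frozen base model.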
Related papers
- Geometry-Guided Reinforcement Learning for Multi-view Consistent 3D Scene Editing [106.07976338405793]
Leveraging the priors of 2D diffusion models for 3D editing has emerged as a promising paradigm. We propose RL3DEdit, a single-pass framework driven by reinforcement learning with novel rewards derived from the 3D foundation model, VGGT. Experiments demonstrate that RL3DEdit achieves stable multi-view consistency and outperforms state-of-the-art methods in editing quality with high efficiency.
arXiv Detail & Related papers (2026-03-03T16:31:10Z)
- Edit3r: Instant 3D Scene Editing from Sparse Unposed Images [40.421700685587346]
We present Edit3r, a framework that reconstructs and edits 3D scenes in a single pass from unposed, view-inconsistent, instruction-edited images. We show that Edit3r achieves superior semantic alignment and enhanced 3D consistency compared to recent baselines.
arXiv Detail & Related papers (2025-12-31T18:59:53Z)
- Fast Multi-view Consistent 3D Editing with Video Priors [19.790628738739354]
We propose generative Video Prior based 3D Editing (ViP3DE). Our key insight is to condition the video generation model on a single edited view to generate other consistent edited views for 3D updating directly. Our proposed ViP3DE can achieve high-quality 3D editing results even within a single forward pass, significantly outperforming existing methods in both editing quality and speed.
arXiv Detail & Related papers (2025-11-28T13:31:10Z)
- Towards Scalable and Consistent 3D Editing [32.16698854719098]
3D editing has wide applications in immersive content creation, digital entertainment, and AR/VR. Unlike 2D editing, it remains challenging due to the need for cross-view consistency, structural fidelity, and fine-grained controllability. We introduce 3DEditVerse, the largest paired 3D editing benchmark to date, comprising 116,309 high-quality training pairs and 1,500 curated test pairs. On the model side, we propose 3DEditFormer, a 3D-structure-preserving conditional transformer.
arXiv Detail & Related papers (2025-10-03T13:34:55Z)
- 3D-LATTE: Latent Space 3D Editing from Textual Instructions [64.77718887666312]
We propose a training-free editing method that operates within the latent space of a native 3D diffusion model. We guide the edit synthesis by blending 3D attention maps from the generation with the source object.
arXiv Detail & Related papers (2025-08-29T22:51:59Z)
- TrAME: Trajectory-Anchored Multi-View Editing for Text-Guided 3D Gaussian Splatting Manipulation [35.951718189386845]
We propose a progressive 3D editing strategy that ensures multi-view consistency via a Trajectory-Anchored Scheme (TAS).
TAS facilitates a tightly coupled iterative process between 2D view editing and 3D updating, preventing the error accumulation that arises from the text-to-image process.
We present a tuning-free View-Consistent Attention Control (VCAC) module that leverages cross-view semantic and geometric reference from the source branch to yield aligned views from the target branch during the editing of 2D views.
arXiv Detail & Related papers (2024-07-02T08:06:58Z)
- DragGaussian: Enabling Drag-style Manipulation on 3D Gaussian Representation [57.406031264184584]
DragGaussian is a 3D object drag-editing framework based on 3D Gaussian Splatting.
Our contributions include the introduction of a new task, the development of DragGaussian for interactive point-based 3D editing, and comprehensive validation of its effectiveness through qualitative and quantitative experiments.
arXiv Detail & Related papers (2024-05-09T14:34:05Z)
- DGE: Direct Gaussian 3D Editing by Consistent Multi-view Editing [72.54566271694654]
We consider the problem of editing 3D objects and scenes based on open-ended language instructions. A common approach to this problem is to use a 2D image generator or editor to guide the 3D editing process. This process is often inefficient due to the need for iterative updates of costly 3D representations.
arXiv Detail & Related papers (2024-04-29T17:59:30Z)
- View-Consistent 3D Editing with Gaussian Splatting [50.6460814430094]
View-consistent Editing (VcEdit) is a novel framework that seamlessly incorporates 3DGS into image editing processes. By incorporating consistency modules into an iterative pattern, VcEdit proficiently resolves the issue of multi-view inconsistency.
arXiv Detail & Related papers (2024-03-18T15:22:09Z)
- SHAP-EDITOR: Instruction-guided Latent 3D Editing in Seconds [73.91114735118298]
Shap-Editor is a novel feed-forward 3D editing framework.
We demonstrate that direct 3D editing in this space is possible and efficient by building a feed-forward editor network.
arXiv Detail & Related papers (2023-12-14T18:59:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.