Easy3E: Feed-Forward 3D Asset Editing via Rectified Voxel Flow
- URL: http://arxiv.org/abs/2602.21499v1
- Date: Wed, 25 Feb 2026 02:15:14 GMT
- Title: Easy3E: Feed-Forward 3D Asset Editing via Rectified Voxel Flow
- Authors: Shimin Hu, Yuanyi Wei, Fei Zha, Yudong Guo, Juyong Zhang
- Abstract summary: We propose an effective and fully feedforward 3D editing framework based on the TRELLIS generative backbone. Our framework addresses two key issues: adapting training-free 2D editing to structured 3D representations, and overcoming the bottleneck of appearance fidelity in compressed 3D features.
- Score: 29.8200628539749
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing 3D editing methods rely on computationally intensive scene-by-scene iterative optimization and suffer from multi-view inconsistency. We propose an effective and fully feedforward 3D editing framework based on the TRELLIS generative backbone, capable of modifying 3D models from a single editing view. Our framework addresses two key issues: adapting training-free 2D editing to structured 3D representations, and overcoming the bottleneck of appearance fidelity in compressed 3D features. To ensure geometric consistency, we introduce Voxel FlowEdit, an edit-driven flow in the sparse voxel latent space that achieves globally consistent 3D deformation in a single pass. To restore high-fidelity details, we develop a normal-guided single-to-multi-view generation module as an external appearance prior, successfully recovering high-frequency textures. Experiments demonstrate that our method enables fast, globally consistent, and high-fidelity 3D model editing.
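For intuition, the sketch below illustrates the general shape of an edit-driven flow update in a latent space, in the spirit of FlowEdit-style rectified-flow editing: instead of inverting the source latent and regenerating from scratch, the edited latent is advanced along the difference between target-conditioned and source-conditioned velocity predictions. This is a minimal, hypothetical sketch: the function `voxel_flow_edit`, the `velocity_fn` interface, the noise coupling, and the step count are assumptions made for illustration, not the paper's actual Voxel FlowEdit algorithm or the TRELLIS API.

```python
# Hypothetical sketch of an edit-driven rectified-flow update in a voxel
# latent space. All names and interfaces are illustrative assumptions,
# not the paper's Voxel FlowEdit implementation.
import torch

def voxel_flow_edit(z_src, velocity_fn, src_cond, tgt_cond, num_steps=25):
    """Move a source voxel latent toward an edited latent in a single pass.

    z_src       : (N, C) latent features of the source asset's active voxels
    velocity_fn : callable (z_t, t, cond) -> predicted rectified-flow velocity
    src_cond    : conditioning describing the original (unedited) asset
    tgt_cond    : conditioning derived from the single edited view
    """
    z_edit = z_src.clone()
    ts = torch.linspace(0.0, 1.0, num_steps + 1)
    for i in range(num_steps):
        t, dt = ts[i], ts[i + 1] - ts[i]
        # Couple source and edited trajectories with shared noise so both
        # velocity evaluations see comparable intermediate states.
        noise = torch.randn_like(z_src)
        zt_src = (1.0 - t) * noise + t * z_src
        zt_edit = zt_src + (z_edit - z_src)
        # The edit direction is the difference between the two velocities.
        v_delta = velocity_fn(zt_edit, t, tgt_cond) - velocity_fn(zt_src, t, src_cond)
        z_edit = z_edit + dt * v_delta
    return z_edit

if __name__ == "__main__":
    # Toy velocity model: the conditioning acts as a target bias for the latent.
    dummy_velocity = lambda z, t, cond: cond - z
    z0 = torch.randn(1024, 8)   # 1024 active voxels, 8 latent channels
    edited = voxel_flow_edit(z0, dummy_velocity,
                             src_cond=torch.zeros(8), tgt_cond=torch.ones(8))
    print(edited.shape)         # torch.Size([1024, 8])
```

Under this toy setup the update simply shifts each active voxel's latent toward the target conditioning; in practice the velocity model would be a pretrained flow-matching network operating on sparse voxels, and the edited latent would then be decoded and refined by an appearance prior such as the paper's normal-guided multi-view module.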
Related papers
- Geometry-Guided Reinforcement Learning for Multi-view Consistent 3D Scene Editing [106.07976338405793]
Leveraging the priors of 2D diffusion models for 3D editing has emerged as a promising paradigm. We propose RL3DEdit, a single-pass framework driven by reinforcement learning with novel rewards derived from the 3D foundation model VGGT. Experiments demonstrate that RL3DEdit achieves stable multi-view consistency and outperforms state-of-the-art methods in editing quality with high efficiency.
arXiv Detail & Related papers (2026-03-03T16:31:10Z) - ShapeUP: Scalable Image-Conditioned 3D Editing [44.63222737714384]
ShapeUP is a scalable, image-conditioned 3D editing framework. It formulates editing as a supervised latent-to-latent translation within a native 3D representation. Our evaluations demonstrate that ShapeUP consistently outperforms current trained and training-free baselines in both identity preservation and edit fidelity.
arXiv Detail & Related papers (2026-02-05T13:59:16Z) - Native 3D Editing with Full Attention [47.908091876301796]
We propose a novel native 3D editing framework that directly manipulates 3D representations in a single, efficient feed-forward pass. This dataset is meticulously curated to ensure that edited objects faithfully adhere to the instructional changes. Our results demonstrate that token concatenation is more parameter-efficient and achieves superior performance.
arXiv Detail & Related papers (2025-11-21T18:59:26Z) - Towards Scalable and Consistent 3D Editing [32.16698854719098]
3D editing has wide applications in immersive content creation, digital entertainment, and AR/VR. Unlike 2D editing, it remains challenging due to the need for cross-view consistency, structural fidelity, and fine-grained controllability. We introduce 3DEditVerse, the largest paired 3D editing benchmark to date, comprising 116,309 high-quality training pairs and 1,500 curated test pairs. On the model side, we propose 3DEditFormer, a 3D-structure-preserving conditional transformer.
arXiv Detail & Related papers (2025-10-03T13:34:55Z) - 3D-LATTE: Latent Space 3D Editing from Textual Instructions [64.77718887666312]
We propose a training-free editing method that operates within the latent space of a native 3D diffusion model. We guide the edit synthesis by blending 3D attention maps from the generation with the source object.
arXiv Detail & Related papers (2025-08-29T22:51:59Z) - DiffTF++: 3D-aware Diffusion Transformer for Large-Vocabulary 3D Generation [53.20147419879056]
We introduce a diffusion-based feed-forward framework to address challenges with a single model.
Building upon our 3D-aware Diffusion model with TransFormer, we propose a stronger version for 3D generation, i.e., DiffTF++.
Experiments on ShapeNet and OmniObject3D convincingly demonstrate the effectiveness of our proposed modules.
arXiv Detail & Related papers (2024-05-13T17:59:51Z) - Coin3D: Controllable and Interactive 3D Assets Generation with Proxy-Guided Conditioning [52.81032340916171]
Coin3D allows users to control the 3D generation using a coarse geometry proxy assembled from basic shapes.
Our method achieves superior controllability and flexibility in the 3D assets generation task.
arXiv Detail & Related papers (2024-05-13T17:56:13Z) - View-Consistent 3D Editing with Gaussian Splatting [50.6460814430094]
View-consistent Editing (VcEdit) is a novel framework that seamlessly incorporates 3DGS into image editing processes. By incorporating consistency modules into an iterative pattern, VcEdit proficiently resolves the issue of multi-view inconsistency.
arXiv Detail & Related papers (2024-03-18T15:22:09Z) - Plasticine3D: 3D Non-Rigid Editing with Text Guidance by Multi-View Embedding Optimization [21.8454418337306]
We propose Plasticine3D, a novel text-guided controlled 3D editing pipeline that can perform 3D non-rigid editing.
Our work divides the editing process into a geometry editing stage and a texture editing stage to achieve separate control of structure and appearance.
For the purpose of fine-grained control, we propose Embedding-Fusion (EF) to blend the original characteristics with the editing objectives in the embedding space.
arXiv Detail & Related papers (2023-12-15T09:01:54Z)