Related papers: Light-SQ: Structure-aware Shape Abstraction with Superquadrics for Generated Meshes

URL: http://arxiv.org/abs/2509.24986v1
Date: Mon, 29 Sep 2025 16:18:32 GMT
Title: Light-SQ: Structure-aware Shape Abstraction with Superquadrics for Generated Meshes
Authors: Yuhan Wang, Weikai Chen, Zeyu Hu, Runze Zhang, Yingda Yin, Ruoyu Wu, Keyang Luo, Shengju Qian, Yiyan Ma, Hongyi Li, Yuan Gao, Yuhuan Zhou, Hao Luo, Wan Wang, Xiaobin Shen, Zhaowei Li, Kuixin Zhu, Chuanlang Hong, Yueyue Wang, Lijie Feng, Xin Wang, Chen Change Loy
Abstract summary: We present Light-SQ, a novel superquadric-based optimization framework. We propose a block-regrow-fill strategy guided by structure-aware volumetric decomposition. Experiments demonstrate that Light-SQ enables efficient, high-fidelity, and editable shape abstraction with superquadrics.
Score: 60.92139345612904
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In user-generated-content (UGC) applications, non-expert users often rely on image-to-3D generative models to create 3D assets. In this context, primitive-based shape abstraction offers a promising solution for UGC scenarios by compressing high-resolution meshes into compact, editable representations. To this end, effective shape abstraction must be structure-aware, characterized by low overlap between primitives, part-aware alignment, and primitive compactness. We present Light-SQ, a novel superquadric-based optimization framework that explicitly emphasizes structure-awareness from three aspects. (a) We introduce SDF carving to iteratively update the target signed distance field, discouraging overlap between primitives. (b) We propose a block-regrow-fill strategy guided by structure-aware volumetric decomposition, enabling structural partitioning to drive primitive placement. (c) We implement adaptive residual pruning based on SDF update history to suppress over-segmentation and ensure compact results. In addition, Light-SQ supports multiscale fitting, enabling localized refinement to preserve fine geometric details. To evaluate our method, we introduce 3DGen-Prim, a benchmark extending 3DGen-Bench with new metrics for both reconstruction quality and primitive-level editability. Extensive experiments demonstrate that Light-SQ enables efficient, high-fidelity, and editable shape abstraction with superquadrics for complex generated geometry, advancing the feasibility of 3D UGC creation.
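
Illustrative sketch (not from the paper): the abstract does not give formulas, so the snippet below only shows one plausible reading of "SDF carving". It uses the standard superquadric inside-outside function for an origin-centered, axis-aligned primitive, a common radial approximation of its signed distance, and a CSG-style subtraction `max(target, -primitive)` to remove an already-fitted primitive from the target field. The function names (`superquadric_F`, `radial_distance`, `carve`) and the carving rule itself are assumptions made for illustration, not the Light-SQ implementation.

```python
import math

def superquadric_F(p, scale, eps):
    """Standard superquadric inside-outside function.
    F < 1 inside, F == 1 on the surface, F > 1 outside.
    scale = (a1, a2, a3) are axis lengths, eps = (e1, e2) are shape exponents."""
    x, y, z = p
    a1, a2, a3 = scale
    e1, e2 = eps
    xy = (abs(x / a1) ** (2.0 / e2) + abs(y / a2) ** (2.0 / e2)) ** (e2 / e1)
    return xy + abs(z / a3) ** (2.0 / e1)

def radial_distance(p, scale, eps):
    """Approximate signed distance measured along the ray from the origin:
    |p| * (1 - F^(-e1/2)). Exact for a sphere, approximate otherwise."""
    r = math.sqrt(sum(c * c for c in p))
    if r == 0.0:
        # At the center the ray direction is undefined; use the smallest axis.
        return -min(scale)
    F = superquadric_F(p, scale, eps)
    return r * (1.0 - F ** (-eps[0] / 2.0))

def carve(target_sdf, points, scale, eps):
    """Hypothetical carving step: subtract the primitive from the target field
    so that space it already covers no longer counts as 'inside'."""
    return [max(d, -radial_distance(p, scale, eps))
            for d, p in zip(target_sdf, points)]

# Usage: target is a sphere of radius 2, the fitted primitive is a unit sphere.
points = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
target = [math.sqrt(x * x + y * y + z * z) - 2.0 for x, y, z in points]
carved = carve(target, points, scale=(1.0, 1.0, 1.0), eps=(1.0, 1.0))
```

After carving, the sample at the origin becomes positive (already explained by the primitive), while the sample at radius 1.5 stays negative, so a later primitive would be drawn toward the uncovered shell rather than overlapping the first one.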

Related papers

One-Shot Refiner: Boosting Feed-forward Novel View Synthesis via One-Step Diffusion [57.824020826432815]
We present a novel framework for high-fidelity novel view synthesis (NVS) from sparse images. We design a Dual-Domain Detail Perception Module, which enables handling high-resolution images without being limited by the ViT backbone. We develop a feature-guided diffusion network, which can preserve high-frequency details during the restoration process.
arXiv Detail & Related papers (2026-01-20T17:11:55Z)
StdGEN++: A Comprehensive System for Semantic-Decomposed 3D Character Generation [57.06461272772509]
StdGEN++ is a novel and comprehensive system for generating high-fidelity, semantically decomposed 3D characters from diverse inputs. It achieves state-of-the-art performance, significantly outperforming existing methods in geometric accuracy and semantic disentanglement. The resulting structural independence unlocks advanced downstream capabilities, including non-destructive editing, physics-compliant animation, and gaze tracking.
arXiv Detail & Related papers (2026-01-12T15:41:27Z)
LATTICE: Democratize High-Fidelity 3D Generation at Scale [27.310104395842075]
LATTICE is a new framework for high-fidelity 3D asset generation. VoxSet is a semi-structured representation that compresses 3D assets into a compact set of latent vectors anchored to a coarse voxel grid. Our method is simple at its core, but supports arbitrary resolution decoding, low-cost training, and flexible inference schemes.
arXiv Detail & Related papers (2025-11-24T03:22:19Z)
CUS-GS: A Compact Unified Structured Gaussian Splatting Framework for Multimodal Scene Representation [16.85102888388904]
CUS-GS is a compact unified structured Gaussian Splatting representation. We propose a feature-aware significance evaluation strategy to guide anchor growing and pruning. CUS-GS achieves competitive performance compared to state-of-the-art methods using as few as 6M parameters.
arXiv Detail & Related papers (2025-11-22T03:42:49Z)
DisCo-Layout: Disentangling and Coordinating Semantic and Physical Refinement in a Multi-Agent Framework for 3D Indoor Layout Synthesis [76.7196710324494]
3D indoor layout synthesis is crucial for creating virtual environments. DisCo is a novel framework that disentangles and coordinates physical and semantic refinement.
arXiv Detail & Related papers (2025-10-02T16:30:37Z)
Seeing 3D Through 2D Lenses: 3D Few-Shot Class-Incremental Learning via Cross-Modal Geometric Rectification [59.17489431187807]
We propose a framework that enhances 3D geometric fidelity by leveraging CLIP's hierarchical spatial semantics. Our method significantly improves 3D few-shot class-incremental learning, achieving superior geometric coherence and robustness to texture bias.
arXiv Detail & Related papers (2025-09-18T13:45:08Z)
RobustGS: Unified Boosting of Feedforward 3D Gaussian Splatting under Low-Quality Conditions [67.48495052903534]
We propose a general and efficient multi-view feature enhancement module, RobustGS. It substantially improves the robustness of feedforward 3DGS methods under various adverse imaging conditions. The RobustGS module can be seamlessly integrated into existing pretrained pipelines in a plug-and-play manner.
arXiv Detail & Related papers (2025-08-05T04:50:29Z)
SeqAffordSplat: Scene-level Sequential Affordance Reasoning on 3D Gaussian Splatting [85.87902260102652]
We introduce the novel task of Sequential 3D Gaussian Affordance Reasoning. We then propose SeqSplatNet, an end-to-end framework that directly maps an instruction to a sequence of 3D affordance masks. Our method sets a new state-of-the-art on our challenging benchmark, effectively advancing affordance reasoning from single-step interactions to complex, sequential tasks at the scene level.
arXiv Detail & Related papers (2025-07-31T17:56:55Z)
Boosting Multi-View Indoor 3D Object Detection via Adaptive 3D Volume Construction [10.569056109735735]
This work presents SGCDet, a novel multi-view indoor 3D object detection framework based on adaptive 3D volume construction. We introduce a geometry and context aware aggregation module to integrate geometric and contextual information within adaptive regions in each image. We show that SGCDet achieves state-of-the-art performance on the ScanNet, ScanNet200 and ARKitScenes datasets.
arXiv Detail & Related papers (2025-07-24T11:58:01Z)
PrITTI: Primitive-based Generation of Controllable and Editable 3D Semantic Scenes [30.417675568919552]
Large-scale 3D semantic scene generation has predominantly relied on voxel-based representations. Primitives represent semantic entities using compact, coarse 3D structures that are easy to manipulate and compose. PrITTI is a latent diffusion-based framework that leverages primitives as the main foundational elements for generating compositional, controllable, and editable scene layouts.
arXiv Detail & Related papers (2025-06-23T20:47:18Z)
LTM3D: Bridging Token Spaces for Conditional 3D Generation with Auto-Regressive Diffusion Framework [40.17218893870908]
LTM3D is a Latent Token space Modeling framework for conditional 3D shape generation. It integrates the strengths of diffusion and auto-regressive (AR) models. LTM3D offers a generalizable framework for multi-modal, multi-representation 3D generation.
arXiv Detail & Related papers (2025-05-30T06:08:45Z)
Geometry-Editable and Appearance-Preserving Object Compositon [67.98806888489385]
General object composition (GOC) aims to seamlessly integrate a target object into a background scene with desired geometric properties. Recent approaches derive semantic embeddings and integrate them into advanced diffusion models to enable geometry-editable generation. We introduce a Disentangled Geometry-editable and Appearance-preserving Diffusion model that first leverages semantic embeddings to implicitly capture desired geometric transformations.
arXiv Detail & Related papers (2025-05-27T09:05:28Z)
Evolving High-Quality Rendering and Reconstruction in a Unified Framework with Contribution-Adaptive Regularization [27.509109317973817]
3D Gaussian Splatting (3DGS) has garnered significant attention for its high-quality rendering and fast inference speed. Previous methods primarily focus on geometry regularization, with common approaches including primitive-based and dual-model frameworks. We propose CarGS, a unified model leveraging contribution-adaptive regularization to achieve simultaneous, high-quality surface reconstruction.
arXiv Detail & Related papers (2025-03-02T12:51:38Z)
DeFormer: Integrating Transformers with Deformable Models for 3D Shape Abstraction from a Single Image [31.154786931081087]
We propose a novel bi-channel Transformer architecture, integrated with parameterized deformable models, to simultaneously estimate the global and local deformations of primitives. DeFormer achieves better reconstruction accuracy over the state-of-the-art, and visualizes with consistent semantic correspondences for improved interpretability.
arXiv Detail & Related papers (2023-09-22T02:46:43Z)
Learning Versatile 3D Shape Generation with Improved AR Models [91.87115744375052]
Auto-regressive (AR) models have achieved impressive results in 2D image generation by modeling joint distributions in the grid space. We propose the Improved Auto-regressive Model (ImAM) for 3D shape generation, which applies discrete representation learning based on a latent vector instead of volumetric grids.
arXiv Detail & Related papers (2023-03-26T12:03:18Z)

This list is automatically generated from the titles and abstracts of the papers on this site.