Atlas3D: Physically Constrained Self-Supporting Text-to-3D for Simulation and Fabrication
- URL: http://arxiv.org/abs/2405.18515v2
- Date: Sat, 16 Nov 2024 02:54:28 GMT
- Title: Atlas3D: Physically Constrained Self-Supporting Text-to-3D for Simulation and Fabrication
- Authors: Yunuo Chen, Tianyi Xie, Zeshun Zong, Xuan Li, Feng Gao, Yin Yang, Ying Nian Wu, Chenfanfu Jiang
- Abstract summary: We introduce Atlas3D, an automatic and easy-to-implement text-to-3D method.
Our approach combines a novel differentiable simulation-based loss function with physically inspired regularization.
We verify Atlas3D's efficacy through extensive generation tasks and validate the resulting 3D models in both simulated and real-world environments.
- Score: 50.541882834405946
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Existing diffusion-based text-to-3D generation methods primarily focus on producing visually realistic shapes and appearances, often neglecting the physical constraints necessary for downstream tasks. Generated models frequently fail to maintain balance when placed in physics-based simulations or 3D printed. This balance is crucial for satisfying user design intentions in interactive gaming, embodied AI, and robotics, where stable models are needed for reliable interaction. Additionally, stable models ensure that 3D-printed objects, such as figurines for home decoration, can stand on their own without requiring additional supports. To fill this gap, we introduce Atlas3D, an automatic and easy-to-implement method that enhances existing Score Distillation Sampling (SDS)-based text-to-3D tools. Atlas3D ensures the generation of self-supporting 3D models that adhere to physical laws of stability under gravity, contact, and friction. Our approach combines a novel differentiable simulation-based loss function with physically inspired regularization, serving as either a refinement or a post-processing module for existing frameworks. We verify Atlas3D's efficacy through extensive generation tasks and validate the resulting 3D models in both simulated and real-world environments.
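As a rough, illustrative sketch only (not the authors' implementation): one physically inspired stability penalty could push a mesh's projected center of mass toward the centroid of its ground-contact footprint, keeping everything differentiable so it can refine an SDS-based pipeline. The uniform mass weighting, the contact threshold `eps`, and the gravity-along-minus-z convention below are all assumptions.

```python
import torch

def stability_penalty(verts: torch.Tensor, eps: float = 1e-3) -> torch.Tensor:
    """verts: (N, 3) mesh vertices; gravity assumed along -z."""
    com_xy = verts[:, :2].mean(dim=0)            # crude center of mass (uniform weights)
    z = verts[:, 2]
    contact = verts[z <= z.min() + eps]          # vertices near the ground plane
    support_xy = contact[:, :2].mean(dim=0)      # centroid of the contact footprint
    return ((com_xy - support_xy) ** 2).sum()    # differentiable w.r.t. vertex positions

verts = torch.randn(2048, 3, requires_grad=True)
stability_penalty(verts).backward()              # gradients flow back to the geometry
```

The paper's actual loss comes from differentiable simulation of gravity, contact, and friction; this sketch only captures the geometric intuition that the center of mass should stay over the support region.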
Related papers
- DSO: Aligning 3D Generators with Simulation Feedback for Physical Soundness [67.97235923372035]
Most 3D object generators focus on aesthetic quality, neglecting physical constraints necessary in applications.
One such constraint is that the 3D object should be self-supporting, i.e., remain balanced under gravity.
We propose Direct Simulation Optimization (DSO), a framework to use the feedback from a (non-differentiable) simulator to increase the likelihood that the 3D generator outputs stable 3D objects directly.
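DSO's stated objective is preference-style optimization from simulator verdicts, not the REINFORCE toy below; this runnable stand-in (a categorical "generator" over four shapes with a fixed stability vector) only illustrates the general idea of reweighting a generator's likelihood with binary feedback from a non-differentiable simulator.

```python
import torch

logits = torch.zeros(4, requires_grad=True)      # toy generator over four "shapes"
stable = torch.tensor([1.0, 0.0, 0.0, 1.0])      # per-shape simulator verdict (stand-in)
opt = torch.optim.SGD([logits], lr=0.5)

for _ in range(200):
    dist = torch.distributions.Categorical(logits=logits)
    idx = dist.sample((64,))                     # sample a batch of shapes
    reward = stable[idx]                         # non-differentiable feedback
    loss = -((reward - reward.mean()) * dist.log_prob(idx)).mean()  # REINFORCE with baseline
    opt.zero_grad(); loss.backward(); opt.step()

print(torch.softmax(logits, -1))                 # probability mass concentrates on stable shapes
```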
arXiv Detail & Related papers (2025-03-28T17:59:53Z)
- MagicArticulate: Make Your 3D Models Articulation-Ready [109.35703811628045]
We present MagicArticulate, an effective framework that automatically transforms static 3D models into articulation-ready assets.
Our key contributions are threefold. First, we introduce Articulation-XL, a benchmark containing over 33k 3D models with high-quality articulation annotations, carefully curated from Objaverse-XL.
Extensive experiments demonstrate that MagicArticulate significantly outperforms existing methods across diverse object categories.
arXiv Detail & Related papers (2025-02-17T18:53:27Z)
- GraphicsDreamer: Image to 3D Generation with Physical Consistency [32.26851174969898]
We introduce GraphicsDreamer, a method for creating highly usable 3D meshes from single images.
In the geometry fusion stage, we continue to enforce the PBR constraints, ensuring that the generated 3D objects possess reliable texture details.
Our method incorporates topology optimization and fast UV unwrapping capabilities, allowing the 3D products to be seamlessly imported into graphics engines.
arXiv Detail & Related papers (2024-12-18T10:01:27Z)
- GEAL: Generalizable 3D Affordance Learning with Cross-Modal Consistency [50.11520458252128]
Existing 3D affordance learning methods struggle with generalization and robustness due to limited annotated data.
We propose GEAL, a novel framework designed to enhance the generalization and robustness of 3D affordance learning by leveraging large-scale pre-trained 2D models.
GEAL consistently outperforms existing methods across seen and novel object categories, as well as corrupted data.
arXiv Detail & Related papers (2024-12-12T17:59:03Z)
- Text-to-3D Gaussian Splatting with Physics-Grounded Motion Generation [47.6666060652434]
We present an innovative framework that generates 3D models with accurate appearances and geometric structures.
By integrating text-to-3D generation with physics-grounded motion synthesis, our framework renders photo-realistic 3D objects.
arXiv Detail & Related papers (2024-12-07T06:48:16Z)
- GaussianAnything: Interactive Point Cloud Flow Matching For 3D Object Generation [75.39457097832113]
This paper introduces a novel 3D generation framework, offering scalable, high-quality 3D generation with an interactive Point Cloud-structured Latent space.
Our framework employs a Variational Autoencoder with multi-view posed RGB-D(epth)-N(ormal) renderings as input, using a unique latent space design that preserves 3D shape information.
The proposed method, GaussianAnything, supports multi-modal conditional 3D generation, allowing for point cloud, caption, and single image inputs.
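A structural sketch only: V posed RGB-D-N views contribute 3+1+3 = 7 channels each, which a VAE encoder can consume as stacked channels. The layer sizes, the channel stacking, and the plain vector latent (the paper's latent is point-cloud-structured, which this does not reproduce) are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class MultiViewEncoder(nn.Module):
    """Toy VAE encoder over `views` stacked RGB-D-N renderings."""
    def __init__(self, views: int = 4, latent: int = 256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(7 * views, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.mu, self.logvar = nn.Linear(128, latent), nn.Linear(128, latent)

    def forward(self, x):                        # x: (B, 7*views, H, W)
        h = self.conv(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization trick
        return z, mu, logvar

z, mu, logvar = MultiViewEncoder()(torch.randn(1, 28, 64, 64))
```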
arXiv Detail & Related papers (2024-11-12T18:59:32Z)
- Enhancing Single Image to 3D Generation using Gaussian Splatting and Hybrid Diffusion Priors [17.544733016978928]
3D object generation from a single image involves estimating the full 3D geometry and texture of unseen views from an unposed RGB image captured in the wild.
Recent advancements in 3D object generation have introduced techniques that reconstruct an object's 3D shape and texture.
We propose bridging the gap between 2D and 3D diffusion models to address the limitations of these techniques.
arXiv Detail & Related papers (2024-10-12T10:14:11Z)
- PhysPart: Physically Plausible Part Completion for Interactable Objects [28.91080122885566]
We tackle the problem of physically plausible part completion for interactable objects.
We propose a diffusion-based part generation model that utilizes geometric conditioning.
We also demonstrate our applications in 3D printing, robot manipulation, and sequential part generation.
arXiv Detail & Related papers (2024-08-25T04:56:09Z)
- Pushing Auto-regressive Models for 3D Shape Generation at Capacity and Scalability [118.26563926533517]
Auto-regressive models have achieved impressive results in 2D image generation by modeling joint distributions in grid space.
We extend auto-regressive models to 3D domains and pursue stronger 3D shape generation by improving the capacity and scalability of auto-regressive models simultaneously.
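A bare-bones sketch of the underlying idea, factorizing p(x) = Π_i p(x_i | x_<i) over a flattened voxel-occupancy ordering with a causal transformer; the tokenization, the ordering, and all sizes are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class TinyVoxelAR(nn.Module):
    def __init__(self, n_vox: int = 512, d: int = 64):
        super().__init__()
        self.emb = nn.Embedding(2, d)            # voxel tokens: 0 = empty, 1 = solid
        self.pos = nn.Embedding(n_vox, d)
        layer = nn.TransformerEncoderLayer(d, 4, 128, batch_first=True)
        self.tf = nn.TransformerEncoder(layer, 2)
        self.head = nn.Linear(d, 2)              # logits for the next voxel's occupancy

    def forward(self, tokens):                   # tokens: (B, T) previously decoded voxels
        T = tokens.size(1)
        h = self.emb(tokens) + self.pos(torch.arange(T, device=tokens.device))
        causal = torch.triu(torch.full((T, T), float("-inf")), diagonal=1)
        return self.head(self.tf(h, mask=causal))

logits = TinyVoxelAR()(torch.randint(0, 2, (2, 16)))  # (2, 16, 2)
```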
arXiv Detail & Related papers (2024-02-19T15:33:09Z)
- En3D: An Enhanced Generative Model for Sculpting 3D Humans from 2D Synthetic Data [36.51674664590734]
We present En3D, an enhanced generative scheme for sculpting high-quality 3D human avatars.
Unlike previous works that rely on scarce 3D datasets or limited 2D collections with imbalanced viewing angles and pose priors, our approach aims to develop a zero-shot 3D generative scheme capable of producing 3D humans.
arXiv Detail & Related papers (2024-01-02T12:06:31Z)
- FILP-3D: Enhancing 3D Few-shot Class-incremental Learning with Pre-trained Vision-Language Models [62.663113296987085]
Few-shot class-incremental learning aims to mitigate the catastrophic forgetting issue when a model is incrementally trained on limited data.
We introduce two novel components: the Redundant Feature Eliminator (RFE) and the Spatial Noise Compensator (SNC).
Considering the imbalance in existing 3D datasets, we also propose new evaluation metrics that offer a more nuanced assessment of a 3D FSCIL model.
arXiv Detail & Related papers (2023-12-28T14:52:07Z)
- AutoDecoding Latent 3D Diffusion Models [95.7279510847827]
We present a novel approach to the generation of static and articulated 3D assets that has a 3D autodecoder at its core.
The 3D autodecoder framework embeds properties learned from the target dataset in the latent space.
We then identify the appropriate intermediate volumetric latent space, and introduce robust normalization and de-normalization operations.
arXiv Detail & Related papers (2023-07-07T17:59:14Z)
- MoDA: Modeling Deformable 3D Objects from Casual Videos [84.29654142118018]
We propose neural dual quaternion blend skinning (NeuDBS) to achieve 3D point deformation without skin-collapsing artifacts.
To register 2D pixels across different frames, we establish correspondences between 2D image features and canonical feature embeddings that encode 3D points within the canonical space.
Our approach can reconstruct 3D models for humans and animals with better qualitative and quantitative performance than state-of-the-art methods.
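For reference, NeuDBS is a neural variant of classical dual quaternion blend skinning, which (in its standard form, not the paper's code) blends per-bone rigid transforms as dual quaternions, renormalizes, and applies the result; quaternions are (w, x, y, z) and the hemisphere sign-flip is the usual antipodality fix.

```python
import numpy as np

def qmul(a, b):
    """Hamilton product of (w, x, y, z) quaternions."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return np.array([aw*bw - ax*bx - ay*by - az*bz,
                     aw*bx + ax*bw + ay*bz - az*by,
                     aw*by - ax*bz + ay*bw + az*bx,
                     aw*bz + ax*by - ay*bx + az*bw])

def qconj(q):
    return q * np.array([1.0, -1.0, -1.0, -1.0])

def qrot(q, p):
    """Rotate point p by unit quaternion q."""
    return qmul(qmul(q, np.concatenate(([0.0], p))), qconj(q))[1:]

def dqb(weights, reals, duals, p):
    """Blend unit dual quaternions (reals, duals) by weights and transform p."""
    signs = np.where(reals @ reals[0] < 0.0, -1.0, 1.0)   # hemisphere sign-flip
    r = (weights * signs) @ reals
    d = (weights * signs) @ duals
    n = np.linalg.norm(r)
    r, d = r / n, d / n                                   # renormalize the blend
    t = 2.0 * qmul(d, qconj(r))[1:]                       # extract translation
    return qrot(r, p) + t

# identity bones leave the point unchanged
print(dqb(np.array([0.5, 0.5]),
          np.array([[1.0, 0, 0, 0], [1.0, 0, 0, 0]]),
          np.zeros((2, 4)), np.array([1.0, 2.0, 3.0])))
```

Blending in dual quaternion space keeps the result close to a rigid motion, which is what avoids the volume-loss ("skin-collapsing") artifacts of linear blend skinning mentioned above.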
arXiv Detail & Related papers (2023-04-17T13:49:04Z)
- GET3D: A Generative Model of High Quality 3D Textured Shapes Learned from Images [72.15855070133425]
We introduce GET3D, a Generative model that directly generates Explicit Textured 3D meshes with complex topology, rich geometric details, and high-fidelity textures.
GET3D is able to generate high-quality 3D textured meshes, ranging from cars, chairs, animals, motorbikes and human characters to buildings.
arXiv Detail & Related papers (2022-09-22T17:16:19Z)
- I-nteract 2.0: A Cyber-Physical System to Design 3D Models using Mixed Reality Technologies and Deep Learning for Additive Manufacturing [2.7986973063309875]
I-nteract is a cyber-physical system that enables real-time interaction with both virtual and real artifacts to design 3D models for additive manufacturing.
This paper presents novel advances in the development of the interaction platform I-nteract to generate 3D models using both constructive solid geometry and artificial intelligence.
arXiv Detail & Related papers (2020-10-21T14:13:21Z)
- Amodal 3D Reconstruction for Robotic Manipulation via Stability and Connectivity [3.359622001455893]
Learning-based 3D object reconstruction enables single- or few-shot estimation of 3D object models.
Existing 3D reconstruction techniques optimize for visual reconstruction fidelity, typically measured by chamfer distance or voxel IOU.
We propose ARM, an amodal 3D reconstruction system that introduces (1) a stability prior over object shapes, (2) a connectivity prior, and (3) a multi-channel input representation.
arXiv Detail & Related papers (2020-09-28T08:52:54Z)