Related papers: UniArt: Unified 3D Representation for Generating 3D Articulated Objects with Open-Set Articulation

URL: http://arxiv.org/abs/2511.21887v1
Date: Wed, 26 Nov 2025 20:09:11 GMT
Title: UniArt: Unified 3D Representation for Generating 3D Articulated Objects with Open-Set Articulation
Authors: Bu Jin, Weize Li, Songen Gu, Yupeng Zheng, Yuhang Zheng, Zhengyi Zhou, Yao Yao,
Abstract summary: UniArt is a diffusion-based framework that synthesizes fully articulated 3D objects from a single image in an end-to-end manner. We introduce a reversible joint-to-voxel embedding, which spatially aligns articulation features with volumetric geometry. Experiments on the PartNet-Mobility benchmark demonstrate that UniArt achieves state-of-the-art mesh quality and articulation accuracy.
Score: 14.687459506970301
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Articulated 3D objects play a vital role in realistic simulation and embodied robotics, yet manually constructing such assets remains costly and difficult to scale. In this paper, we present UniArt, a diffusion-based framework that directly synthesizes fully articulated 3D objects from a single image in an end-to-end manner. Unlike prior multi-stage techniques, UniArt establishes a unified latent representation that jointly encodes geometry, texture, part segmentation, and kinematic parameters. We introduce a reversible joint-to-voxel embedding, which spatially aligns articulation features with volumetric geometry, enabling the model to learn coherent motion behaviors alongside structural formation. Furthermore, we formulate articulation type prediction as an open-set problem, removing the need for fixed joint semantics and allowing generalization to novel joint categories and unseen object types. Experiments on the PartNet-Mobility benchmark demonstrate that UniArt achieves state-of-the-art mesh quality and articulation accuracy.

Related papers

ArtLLM: Generating Articulated Assets via 3D LLM [19.814132638278547]
ArtLLM is a novel framework for generating high-quality articulated assets directly from complete 3D meshes. At its core is a 3D multimodal large language model trained on a large-scale articulation dataset. Experiments show that ArtLLM significantly outperforms state-of-the-art methods in both part layout accuracy and joint prediction.
arXiv Detail & Related papers (2026-03-01T15:07:46Z)
ArtGen: Conditional Generative Modeling of Articulated Objects in Arbitrary Part-Level States [9.721009445297716]
ArtGen is a conditional diffusion-based framework capable of generating articulated 3D objects with accurate geometry and coherent kinematics. Specifically, ArtGen employs cross-state Monte Carlo sampling to explicitly enforce global kinematic consistency. A compositional 3D-VAE latent prior enhanced with local-global attention effectively captures fine-grained geometry and global part-level relationships.
arXiv Detail & Related papers (2025-12-13T17:00:03Z)
Particulate: Feed-Forward 3D Object Articulation [89.78788418174946]
Particulate is a feed-forward approach that, given a single static 3D mesh of an everyday object, directly infers all attributes of the underlying articulated structure. We train the network end-to-end on a diverse collection of articulated 3D assets from public datasets. During inference, Particulate lifts the network's feed-forward prediction to the input mesh, yielding a fully articulated 3D model in seconds.
arXiv Detail & Related papers (2025-12-12T18:59:51Z)
ArtiLatent: Realistic Articulated 3D Object Generation via Structured Latents [31.495577251319315]
ArtiLatent is a generative framework that synthesizes human-made 3D objects with fine-grained geometry, accurate articulation, and realistic appearance.
arXiv Detail & Related papers (2025-10-24T13:08:15Z)
REACT3D: Recovering Articulations for Interactive Physical 3D Scenes [96.27769519526426]
REACT3D is a framework that converts static 3D scenes into simulation-ready interactive replicas with consistent geometry. We achieve state-of-the-art performance on detection/segmentation and articulation metrics across diverse indoor scenes.
arXiv Detail & Related papers (2025-10-13T12:37:59Z)
DreamArt: Generating Interactable Articulated Objects from a Single Image [40.66232231077524]
We introduce DreamArt, a novel framework for generating high-fidelity, interactable articulated assets from single-view images. DreamArt employs a three-stage pipeline: it reconstructs part-segmented and complete 3D object meshes through a combination of image-to-3D generation, mask-prompted 3D segmentation, and part amodal completion. Experimental results demonstrate that DreamArt effectively generates high-quality articulated objects, possessing accurate part shape, high appearance fidelity, and plausible articulation.
arXiv Detail & Related papers (2025-07-08T08:06:51Z)
HiScene: Creating Hierarchical 3D Scenes with Isometric View Generation [50.206100327643284]
HiScene is a novel hierarchical framework that bridges the gap between 2D image generation and 3D object generation. We generate 3D content that aligns with 2D representations while maintaining compositional structure.
arXiv Detail & Related papers (2025-04-17T16:33:39Z)
SIGHT: Synthesizing Image-Text Conditioned and Geometry-Guided 3D Hand-Object Trajectories [124.24041272390954]
Modeling hand-object interaction priors holds significant potential to advance robotic and embodied AI systems. We introduce SIGHT, a novel task focused on generating realistic and physically plausible 3D hand-object interaction trajectories from a single image. We propose SIGHT-Fusion, a novel diffusion-based image-text conditioned generative model that tackles this task by retrieving the most similar 3D object mesh from a database.
arXiv Detail & Related papers (2025-03-28T20:53:20Z)
Articulate AnyMesh: Open-Vocabulary 3D Articulated Objects Modeling [48.78204955169967]
Articulate AnyMesh is an automated framework that converts a rigid 3D mesh into its articulated counterpart in an open-vocabulary manner. Our experiments show that Articulate AnyMesh can generate large-scale, high-quality 3D articulated objects, including tools, toys, mechanical devices, and vehicles.
arXiv Detail & Related papers (2025-02-04T18:59:55Z)
Object Wake-up: 3-D Object Reconstruction, Animation, and in-situ Rendering from a Single Image [58.69732754597448]
Given a picture of a chair, could we extract the 3-D shape of the chair, animate its plausible articulations and motions, and render in-situ in its original image space? We devise an automated approach to extract and manipulate articulated objects in single images.
arXiv Detail & Related papers (2021-08-05T16:20:12Z)

This list is automatically generated from the titles and abstracts of the papers on this site.