Related papers: Skin Tokens: A Learned Compact Representation for Unified Autoregressive Rigging

URL: http://arxiv.org/abs/2602.04805v1
Date: Wed, 04 Feb 2026 17:52:17 GMT
Title: Skin Tokens: A Learned Compact Representation for Unified Autoregressive Rigging
Authors: Jia-peng Zhang, Cheng-Feng Pu, Meng-Hao Guo, Yan-Pei Cao, Shi-Min Hu
Abstract summary: SkinTokens is a learned, compact, and discrete representation for skinning weights. TokenRig is a unified autoregressive framework that models the entire rig as a single sequence of skeletal parameters and SkinTokens. Our work presents a unified, generative approach to rigging that yields higher fidelity and robustness.
Score: 44.79819257609757
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The rapid proliferation of generative 3D models has created a critical bottleneck in animation pipelines: rigging. Existing automated methods are fundamentally limited by their approach to skinning, treating it as an ill-posed, high-dimensional regression task that is inefficient to optimize and is typically decoupled from skeleton generation. We posit this is a representation problem and introduce SkinTokens: a learned, compact, and discrete representation for skinning weights. By leveraging an FSQ-CVAE to capture the intrinsic sparsity of skinning, we reframe the task from continuous regression to a more tractable token sequence prediction problem. This representation enables TokenRig, a unified autoregressive framework that models the entire rig as a single sequence of skeletal parameters and SkinTokens, learning the complicated dependencies between skeletons and skin deformations. The unified model is then amenable to a reinforcement learning stage, where tailored geometric and semantic rewards improve generalization to complex, out-of-distribution assets. Quantitatively, the SkinTokens representation leads to a 98%-133% improvement in skinning accuracy over state-of-the-art methods, while the full TokenRig framework, refined with RL, enhances bone prediction by 17%-22%. Our work presents a unified, generative approach to rigging that yields higher fidelity and robustness, offering a scalable solution to a long-standing challenge in 3D content creation.
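Illustration (not from the paper): the abstract says an FSQ-CVAE turns continuous skinning weights into discrete tokens. Assuming FSQ here refers to finite scalar quantization, the sketch below shows only the generic quantization step: each latent coordinate is bounded, rounded to a small number of levels, and the per-coordinate codes are packed into one integer token. The function names, the level counts, and the simplified bounding (no special handling for even level counts) are assumptions for illustration, not the authors' implementation.

```python
import math


def fsq_quantize(z, levels):
    """Snap each latent coordinate to one of levels[i] integer bins.

    Assumed simplification: tanh bounds the value, then rounding picks a bin.
    """
    codes = []
    for value, num_levels in zip(z, levels):
        half = (num_levels - 1) / 2.0
        bounded = math.tanh(value) * half          # lies in (-half, half)
        codes.append(int(round(bounded + half)))   # shift to 0..num_levels-1
    return codes


def codes_to_token(codes, levels):
    """Pack per-coordinate codes into a single integer (mixed-radix index)."""
    token = 0
    for code, num_levels in zip(reversed(codes), reversed(levels)):
        token = token * num_levels + code
    return token


def token_to_codes(token, levels):
    """Invert codes_to_token, recovering the per-coordinate codes."""
    codes = []
    for num_levels in levels:
        codes.append(token % num_levels)
        token //= num_levels
    return codes


if __name__ == "__main__":
    levels = [5, 5, 5]           # hypothetical: 5 levels per latent coordinate
    latent = [0.0, 10.0, -10.0]  # hypothetical encoder output
    codes = fsq_quantize(latent, levels)
    token = codes_to_token(codes, levels)
    print(codes, token, token_to_codes(token, levels))
```

With three coordinates of five levels each, the implicit codebook has 5 * 5 * 5 = 125 entries, so an autoregressive model can predict one such integer per step instead of regressing continuous values.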

Related papers

FACE: A Face-based Autoregressive Representation for High-Fidelity and Efficient Mesh Generation [50.71369329585773]
We introduce FACE, a novel Autoregressive Autoencoder framework that generates meshes at the face level. Our one-face-one-token strategy treats each triangle face, the fundamental building block of a mesh, as a single, unified token. FACE achieves state-of-the-art reconstruction quality on standard benchmarks.
arXiv Detail & Related papers (2026-03-02T06:47:15Z)
Protein Autoregressive Modeling via Multiscale Structure Generation [51.92004892768298]
We present protein autoregressive modeling (PAR), the first multi-scale autoregressive framework for protein backbone generation. We adopt noisy context learning and scheduled sampling, enabling robust backbone generation. On the unconditional generation benchmark, PAR effectively learns protein distributions and produces backbones of high design quality.
arXiv Detail & Related papers (2026-02-04T18:59:49Z)
StdGEN++: A Comprehensive System for Semantic-Decomposed 3D Character Generation [57.06461272772509]
StdGEN++ is a novel and comprehensive system for generating high-fidelity, semantically decomposed 3D characters from diverse inputs. It achieves state-of-the-art performance, significantly outperforming existing methods in geometric accuracy and semantic disentanglement. The resulting structural independence unlocks advanced downstream capabilities, including non-destructive editing, physics-compliant animation, and gaze tracking.
arXiv Detail & Related papers (2026-01-12T15:41:27Z)
Feed-Forward 3D Gaussian Splatting Compression with Long-Context Modeling [30.948753429414648]
3DGS has emerged as a revolutionary 3D representation, but its substantial data size poses a major barrier to widespread adoption. We propose a novel feed-forward 3DGS compression framework that effectively models long-range correlations. Our method yields a $20\times$ compression ratio for 3DGS in a feed-forward inference.
arXiv Detail & Related papers (2025-11-30T12:51:43Z)
Puppeteer: Rig and Animate Your 3D Models [105.11046762553121]
Puppeteer is a comprehensive framework that addresses both automatic rigging and animation for diverse 3D objects. Our system first predicts plausible skeletal structures via an auto-regressive transformer. It then infers skinning weights via an attention-based architecture.
arXiv Detail & Related papers (2025-08-14T17:59:31Z)
Efficient Autoregressive Shape Generation via Octree-Based Adaptive Tokenization [68.07464514094299]
Existing methods encode all shapes into a fixed-size token, disregarding the inherent variations in scale and complexity across 3D data. We introduce Octree-based Adaptive Tokenization, a novel framework that adjusts the dimension of latent representations according to shape complexity. Our approach reduces token counts by 50% compared to fixed-size methods while maintaining comparable visual quality.
arXiv Detail & Related papers (2025-04-03T17:57:52Z)
ARMO: Autoregressive Rigging for Multi-Category Objects [8.030479370619458]
We introduce OmniRig, the first large-scale rigging dataset, comprising 79,499 meshes with detailed skeleton and skinning information. Unlike traditional benchmarks that rely on predefined standard poses, our dataset embraces diverse shape categories, styles, and poses. We propose ARMO, a novel rigging framework that utilizes an autoregressive model to predict both joint positions and connectivity relationships in a unified manner.
arXiv Detail & Related papers (2025-03-26T15:56:48Z)
RigAnything: Template-Free Autoregressive Rigging for Diverse 3D Assets [44.655049022141384]
We present RigAnything, a novel autoregressive transformer-based model. It makes 3D assets rig-ready by probabilistically generating joints and skeleton topologies and assigning skinning weights in a template-free manner. It demonstrates state-of-the-art performance across diverse object types, including humanoids, quadrupeds, marine creatures, insects, and many more.
arXiv Detail & Related papers (2025-02-13T18:59:13Z)
HumanRig: Learning Automatic Rigging for Humanoid Character in a Large Scale Dataset [6.978870586488504]
We present HumanRig, the first large-scale dataset specifically designed for 3D humanoid character rigging. We introduce an innovative, data-driven automatic rigging framework, which overcomes the limitations of GNN-based methods. This work not only remedies the dataset deficiency in rigging research but also propels the animation industry towards more efficient and automated character rigging pipelines.
arXiv Detail & Related papers (2024-12-03T09:33:00Z)

This list is automatically generated from the titles and abstracts of the papers on this site.