Skin Tokens: A Learned Compact Representation for Unified Autoregressive Rigging
- URL: http://arxiv.org/abs/2602.04805v1
- Date: Wed, 04 Feb 2026 17:52:17 GMT
- Title: Skin Tokens: A Learned Compact Representation for Unified Autoregressive Rigging
- Authors: Jia-peng Zhang, Cheng-Feng Pu, Meng-Hao Guo, Yan-Pei Cao, Shi-Min Hu
- Abstract summary: SkinTokens is a learned, compact, and discrete representation for skinning weights. TokenRig is a unified autoregressive framework that models the entire rig as a single sequence of skeletal parameters and SkinTokens. Our work presents a unified, generative approach to rigging that yields higher fidelity and robustness.
- Score: 44.79819257609757
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The rapid proliferation of generative 3D models has created a critical bottleneck in animation pipelines: rigging. Existing automated methods are fundamentally limited by their approach to skinning, treating it as an ill-posed, high-dimensional regression task that is inefficient to optimize and is typically decoupled from skeleton generation. We posit this is a representation problem and introduce SkinTokens: a learned, compact, and discrete representation for skinning weights. By leveraging an FSQ-CVAE to capture the intrinsic sparsity of skinning, we reframe the task from continuous regression to a more tractable token sequence prediction problem. This representation enables TokenRig, a unified autoregressive framework that models the entire rig as a single sequence of skeletal parameters and SkinTokens, learning the complicated dependencies between skeletons and skin deformations. The unified model is then amenable to a reinforcement learning stage, where tailored geometric and semantic rewards improve generalization to complex, out-of-distribution assets. Quantitatively, the SkinTokens representation leads to a 98%-133% improvement in skinning accuracy over state-of-the-art methods, while the full TokenRig framework, refined with RL, enhances bone prediction by 17%-22%. Our work presents a unified, generative approach to rigging that yields higher fidelity and robustness, offering a scalable solution to a long-standing challenge in 3D content creation.
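As a rough illustration of how a continuous latent can become a discrete token, the sketch below assumes that "FSQ" in the abstract denotes finite scalar quantization: each latent dimension is squashed to a bounded range, rounded to one of a few levels, and the per-dimension codes are packed into a single token id. The function names, the level counts, and the restriction to odd level counts are assumptions made for this example; this is not the paper's FSQ-CVAE.

```python
import math

def fsq_quantize(z, levels):
    """Round each latent value to one of levels[i] integer codes.

    Assumption for this sketch: every entry of `levels` is odd, so the
    bounded range [-half, half] is symmetric around zero.
    """
    codes = []
    for value, n_levels in zip(z, levels):
        half = (n_levels - 1) // 2
        bounded = math.tanh(value) * half   # squash into [-half, half]
        codes.append(round(bounded) + half)  # shift to 0..n_levels-1
    return codes

def codes_to_token(codes, levels):
    """Pack per-dimension codes into one token id (mixed-radix encoding)."""
    token = 0
    for code, n_levels in zip(codes, levels):
        token = token * n_levels + code
    return token

def token_to_codes(token, levels):
    """Invert codes_to_token."""
    codes = []
    for n_levels in reversed(levels):
        codes.append(token % n_levels)
        token //= n_levels
    return codes[::-1]

# Hypothetical 3-dimensional latent with 5 levels per dimension,
# giving an implicit vocabulary of 5 * 5 * 5 = 125 tokens.
levels = [5, 5, 5]
codes = fsq_quantize([0.0, 10.0, -10.0], levels)
token = codes_to_token(codes, levels)
```

The vocabulary size is the product of the level counts, so no learned codebook is needed; an autoregressive model can then predict such token ids in the same sequence as other discrete symbols.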
Related papers
- FACE: A Face-based Autoregressive Representation for High-Fidelity and Efficient Mesh Generation [50.71369329585773]
We introduce FACE, a novel Autoregressive Autoencoder framework that generates meshes at the face level. Our one-face-one-token strategy treats each triangle face, the fundamental building block of a mesh, as a single, unified token. FACE achieves state-of-the-art reconstruction quality on standard benchmarks.
arXiv Detail & Related papers (2026-03-02T06:47:15Z)
- Protein Autoregressive Modeling via Multiscale Structure Generation [51.92004892768298]
We present protein autoregressive modeling (PAR), the first multi-scale autoregressive framework for protein backbone generation. We adopt noisy context learning and scheduled sampling, enabling robust backbone generation. On the unconditional generation benchmark, PAR effectively learns protein distributions and produces backbones of high design quality.
arXiv Detail & Related papers (2026-02-04T18:59:49Z)
- StdGEN++: A Comprehensive System for Semantic-Decomposed 3D Character Generation [57.06461272772509]
StdGEN++ is a novel and comprehensive system for generating high-fidelity, semantically decomposed 3D characters from diverse inputs. It achieves state-of-the-art performance, significantly outperforming existing methods in geometric accuracy and semantic disentanglement. The resulting structural independence unlocks advanced downstream capabilities, including non-destructive editing, physics-compliant animation, and gaze tracking.
arXiv Detail & Related papers (2026-01-12T15:41:27Z)
- Feed-Forward 3D Gaussian Splatting Compression with Long-Context Modeling [30.948753429414648]
3DGS has emerged as a revolutionary 3D representation, but its substantial data size poses a major barrier to widespread adoption. We propose a novel feed-forward 3DGS compression framework that effectively models long-range correlations. Our method yields a $20\times$ compression ratio for 3DGS in a feed-forward inference.
arXiv Detail & Related papers (2025-11-30T12:51:43Z)
- Puppeteer: Rig and Animate Your 3D Models [105.11046762553121]
Puppeteer is a comprehensive framework that addresses both automatic rigging and animation for diverse 3D objects. Our system first predicts plausible skeletal structures via an auto-regressive transformer. It then infers skinning weights via an attention-based architecture.
arXiv Detail & Related papers (2025-08-14T17:59:31Z)
- Efficient Autoregressive Shape Generation via Octree-Based Adaptive Tokenization [68.07464514094299]
Existing methods encode all shapes into a fixed-size token, disregarding the inherent variations in scale and complexity across 3D data. We introduce Octree-based Adaptive Tokenization, a novel framework that adjusts the dimension of latent representations according to shape complexity. Our approach reduces token counts by 50% compared to fixed-size methods while maintaining comparable visual quality.
arXiv Detail & Related papers (2025-04-03T17:57:52Z)
- ARMO: Autoregressive Rigging for Multi-Category Objects [8.030479370619458]
We introduce OmniRig, the first large-scale rigging dataset, comprising 79,499 meshes with detailed skeleton and skinning information. Unlike traditional benchmarks that rely on predefined standard poses, our dataset embraces diverse shape categories, styles, and poses. We propose ARMO, a novel rigging framework that utilizes an autoregressive model to predict both joint positions and connectivity relationships in a unified manner.
arXiv Detail & Related papers (2025-03-26T15:56:48Z)
- RigAnything: Template-Free Autoregressive Rigging for Diverse 3D Assets [44.655049022141384]
We present RigAnything, a novel autoregressive transformer-based model. It makes 3D assets rig-ready by probabilistically generating joints and skeleton topologies and assigning skinning weights in a template-free manner. It demonstrates state-of-the-art performance across diverse object types, including humanoids, quadrupeds, marine creatures, insects, and many more.
arXiv Detail & Related papers (2025-02-13T18:59:13Z)
- HumanRig: Learning Automatic Rigging for Humanoid Character in a Large Scale Dataset [6.978870586488504]
We present HumanRig, the first large-scale dataset specifically designed for 3D humanoid character rigging. We introduce an innovative, data-driven automatic rigging framework, which overcomes the limitations of GNN-based methods. This work not only remedies the dataset deficiency in rigging research but also propels the animation industry towards more efficient and automated character rigging pipelines.
arXiv Detail & Related papers (2024-12-03T09:33:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.