MimiCAT: Mimic with Correspondence-Aware Cascade-Transformer for Category-Free 3D Pose Transfer
- URL: http://arxiv.org/abs/2511.18370v1
- Date: Sun, 23 Nov 2025 09:28:57 GMT
- Title: MimiCAT: Mimic with Correspondence-Aware Cascade-Transformer for Category-Free 3D Pose Transfer
- Authors: Zenghao Chai, Chen Tang, Yongkang Wong, Xulei Yang, Mohan Kankanhalli
- Abstract summary: 3D pose transfer aims to transfer the pose-style of a source mesh to a target character while preserving both the target's geometry and the source's pose characteristic. Existing methods are largely restricted to characters with similar structures and fail to generalize to category-free settings. We propose MimiCAT, a cascade-transformer model designed for category-free 3D pose transfer.
- Score: 29.06828928176544
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: 3D pose transfer aims to transfer the pose-style of a source mesh to a target character while preserving both the target's geometry and the source's pose characteristic. Existing methods are largely restricted to characters with similar structures and fail to generalize to category-free settings (e.g., transferring a humanoid's pose to a quadruped). The key challenge lies in the structural and transformation diversity inherent in distinct character types, which often leads to mismatched regions and poor transfer quality. To address these issues, we first construct a million-scale pose dataset across hundreds of distinct characters. We further propose MimiCAT, a cascade-transformer model designed for category-free 3D pose transfer. Instead of relying on strict one-to-one correspondence mappings, MimiCAT leverages semantic keypoint labels to learn a novel soft correspondence that enables flexible many-to-many matching across characters. The pose transfer is then formulated as a conditional generation process, in which the source transformations are first projected onto the target through soft correspondence matching and subsequently refined using shape-conditioned representations. Extensive qualitative and quantitative experiments demonstrate that MimiCAT transfers plausible poses across different characters, significantly outperforming prior methods that are limited to narrow category transfer (e.g., humanoid-to-humanoid).
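The abstract's soft-correspondence idea, many-to-many matching between the keypoints of structurally different characters, can be illustrated with a minimal sketch. All function names, feature shapes, and the cosine-softmax formulation below are illustrative assumptions; the paper's actual model learns correspondences with a cascade transformer rather than from raw feature similarity.

```python
import numpy as np

def soft_correspondence(src_feats, tgt_feats, temperature=0.1):
    """Soft many-to-many matching between keypoint feature sets.

    src_feats: (n_src, d) source keypoint features
    tgt_feats: (n_tgt, d) target keypoint features
    Returns a (n_tgt, n_src) row-stochastic matrix.
    """
    # Cosine-similarity logits between target and source keypoints.
    src = src_feats / np.linalg.norm(src_feats, axis=1, keepdims=True)
    tgt = tgt_feats / np.linalg.norm(tgt_feats, axis=1, keepdims=True)
    logits = tgt @ src.T / temperature
    # Row-wise softmax: each target keypoint distributes its attention
    # over ALL source keypoints instead of a strict one-to-one mapping.
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)

def project_transforms(corr, src_transforms):
    """Blend per-keypoint source transforms onto the target.

    corr:           (n_tgt, n_src) soft correspondence matrix
    src_transforms: (n_src, 3, 3)  per-keypoint transformations
    """
    n_tgt = corr.shape[0]
    flat = src_transforms.reshape(src_transforms.shape[0], -1)
    return (corr @ flat).reshape(n_tgt, *src_transforms.shape[1:])
```

A lower temperature sharpens the matching toward one-to-one correspondence; a higher one spreads each target keypoint's weight across many source keypoints, which is what allows transfer between, e.g., a humanoid and a quadruped.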
Related papers
- Structural Action Transformer for 3D Dexterous Manipulation [80.07649565189035]
Cross-embodiment skill transfer is a challenge for high-DoF robotic hands. Existing methods, often relying on 2D observations and temporal-centric action representations, struggle to capture 3D spatial relations. This paper proposes a new 3D dexterous manipulation policy that challenges this paradigm by introducing a structural-centric perspective.
arXiv Detail & Related papers (2026-03-04T11:38:12Z) - Weakly-supervised 3D Pose Transfer with Keypoints [57.66991032263699]
The main challenges of 3D pose transfer are: 1) Lack of paired training data with different characters performing the same pose; 2) Disentangling pose and shape information from the target mesh; 3) Difficulty in applying to meshes with different topologies.
We propose a novel weakly-supervised keypoint-based framework to overcome these difficulties.
arXiv Detail & Related papers (2023-07-25T12:40:24Z) - MAPConNet: Self-supervised 3D Pose Transfer with Mesh and Point Contrastive Learning [32.97354536302333]
3D pose transfer is a challenging generation task that aims to transfer the pose of a source geometry onto a target geometry with the target identity preserved.
Current pose transfer methods allow end-to-end correspondence learning but require the desired final output as ground truth for supervision.
We present a novel self-supervised framework for 3D pose transfer which can be trained in unsupervised, semi-supervised, or fully supervised settings.
arXiv Detail & Related papers (2023-04-26T20:42:40Z) - Unsupervised 3D Pose Transfer with Cross Consistency and Dual Reconstruction [50.94171353583328]
The goal of 3D pose transfer is to transfer the pose from the source mesh to the target mesh while preserving the identity information.
Deep learning-based methods improved the efficiency and performance of 3D pose transfer.
We present X-DualNet, a simple yet effective approach that enables unsupervised 3D pose transfer.
arXiv Detail & Related papers (2022-11-18T15:09:56Z) - Skeleton-free Pose Transfer for Stylized 3D Characters [53.33996932633865]
We present the first method that automatically transfers poses between stylized 3D characters without skeletal rigging.
We propose a novel pose transfer network that predicts the character skinning weights and deformation transformations jointly to articulate the target character to match the desired pose.
Our method is trained in a semi-supervised manner absorbing all existing character data with paired/unpaired poses and stylized shapes.
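The joint skinning-weight and deformation prediction this entry describes builds on standard linear blend skinning. A minimal sketch of plain LBS, assuming per-vertex weights and per-part homogeneous transforms (the paper predicts these quantities with a network; the function and shapes here are illustrative):

```python
import numpy as np

def linear_blend_skinning(vertices, weights, transforms):
    """Articulate a mesh with per-vertex skinning weights.

    vertices:   (n, 3)     rest-pose vertex positions
    weights:    (n, j)     skinning weights, rows sum to 1
    transforms: (j, 4, 4)  per-part homogeneous transforms
    """
    n = vertices.shape[0]
    # Homogeneous coordinates so translations apply too.
    homo = np.concatenate([vertices, np.ones((n, 1))], axis=1)   # (n, 4)
    # Blend the part transforms per vertex, then apply: standard LBS.
    blended = np.einsum('nj,jab->nab', weights, transforms)       # (n, 4, 4)
    posed = np.einsum('nab,nb->na', blended, homo)
    return posed[:, :3]
```

Skeleton-free methods sidestep manual rigging by learning the `weights` matrix directly from the mesh instead of painting it by hand.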
arXiv Detail & Related papers (2022-07-28T20:05:57Z) - Neural Human Deformation Transfer [26.60034186410921]
We consider the problem of human deformation transfer, where the goal is to retarget poses between different characters.
We take a different approach and transform the identity of a character into a new identity without modifying the character's pose.
We show experimentally that our method outperforms state-of-the-art methods both quantitatively and qualitatively.
arXiv Detail & Related papers (2021-09-03T15:51:30Z) - 3D Human Shape Style Transfer [21.73251261476412]
We consider the problem of modifying/replacing the shape style of a real moving character with that of an arbitrary static real source character.
Traditional solutions follow a pose transfer strategy, from the moving character to the source character shape, that relies on skeletal pose parametrization.
In this paper, we explore an alternative approach that transfers the source shape style onto the moving character.
arXiv Detail & Related papers (2021-09-03T15:51:30Z) - Unsupervised Geodesic-preserved Generative Adversarial Networks for Unconstrained 3D Pose Transfer [84.04540436494011]
We present an unsupervised approach to conduct pose transfer between arbitrary given 3D meshes.
Specifically, a novel Intrinsic-Extrinsic Preserved Generative Adversarial Network (IEP-GAN) is presented for both intrinsic (i.e., shape) and extrinsic (i.e., pose) information preservation.
Our proposed model produces better results and is substantially more efficient compared to recent state-of-the-art methods.
arXiv Detail & Related papers (2021-08-17T09:08:21Z) - Human Pose Transfer by Adaptive Hierarchical Deformation [24.70009597455219]
We propose an adaptive human pose transfer network with two hierarchical deformation levels.
The first level generates human semantic parsing aligned with the target pose.
The second level generates the final textured person image in the target pose with the semantic guidance.
arXiv Detail & Related papers (2020-12-13T01:49:26Z) - Neural Pose Transfer by Spatially Adaptive Instance Normalization [73.04483812364127]
We propose the first neural pose transfer model that solves pose transfer via techniques adapted from image style transfer.
Our model does not require any correspondences between the source and target meshes.
Experiments show that the proposed model can effectively transfer deformation from source to target meshes, and has good generalization ability to deal with unseen identities or poses of meshes.
arXiv Detail & Related papers (2020-03-16T14:33:59Z)
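The style-transfer technique the last entry adapts is adaptive instance normalization (AdaIN): re-normalizing content features to match the style's statistics. A minimal channel-wise sketch, assuming features of shape (channels, points); the paper's spatially adaptive variant conditions per vertex, which this global version omits:

```python
import numpy as np

def adaptive_instance_norm(content, style, eps=1e-5):
    """Re-normalize content features to carry the style's statistics.

    content, style: arrays of shape (channels, n_points).
    For pose transfer, "content" would come from the identity mesh
    and "style" from the pose mesh (naming here is illustrative).
    """
    c_mean = content.mean(axis=1, keepdims=True)
    c_std = content.std(axis=1, keepdims=True) + eps
    s_mean = style.mean(axis=1, keepdims=True)
    s_std = style.std(axis=1, keepdims=True) + eps
    # Whiten the content per channel, then re-color with style stats.
    return s_std * (content - c_mean) / c_std + s_mean
```

Because only first- and second-order statistics are exchanged, no point-to-point correspondence between the two meshes is required, which matches the entry's claim that the model needs no correspondences.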
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.