Geometry-Contrastive Transformer for Generalized 3D Pose Transfer
- URL: http://arxiv.org/abs/2112.07374v1
- Date: Tue, 14 Dec 2021 13:14:24 GMT
- Title: Geometry-Contrastive Transformer for Generalized 3D Pose Transfer
- Authors: Haoyu Chen, Hao Tang, Zitong Yu, Nicu Sebe, Guoying Zhao
- Abstract summary: The intuition of this work is to perceive the geometric inconsistency between the given meshes with the powerful self-attention mechanism.
We propose a novel geometry-contrastive Transformer that efficiently perceives the global geometric inconsistencies across the given meshes in a 3D-structured manner.
We present a latent isometric regularization module together with a novel semi-synthesized dataset for the cross-dataset 3D pose transfer task.
- Score: 95.56457218144983
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a customized 3D mesh Transformer model for the pose transfer task.
As 3D pose transfer is essentially a deformation procedure dependent on the given
meshes, the intuition of this work is to perceive the geometric inconsistency
between the given meshes with the powerful self-attention mechanism. Specifically,
we propose a novel geometry-contrastive Transformer that efficiently perceives the
global geometric inconsistencies across the given meshes in a 3D-structured manner.
Moreover, at the local level, a simple yet efficient central geodesic contrastive
loss is proposed to improve regional geometric-inconsistency learning. Finally, we
present a latent isometric regularization module, together with a novel
semi-synthesized dataset, for the cross-dataset 3D pose transfer task towards
unknown spaces. Extensive experimental results demonstrate the efficacy of our
approach, showing state-of-the-art quantitative performance on the SMPL-NPT,
FAUST, and our newly proposed SMG-3D datasets, as well as promising qualitative
results on the MG-cloth and SMAL datasets. We show that our method achieves robust
3D pose transfer and generalizes to challenging meshes from unknown spaces in
cross-dataset tasks. The code and dataset are available at
https://github.com/mikecheninoulu/CGT.
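The abstract names the central geodesic contrastive loss but not its formulation. As a hedged illustration, the PyTorch sketch below shows one plausible reading: per-vertex features that are geodesically close to a sampled central (anchor) vertex act as positives in an InfoNCE-style objective, while far vertices act as negatives. The function name, the positive-radius scheme, and the temperature are illustrative assumptions, not the paper's code.

```python
import torch
import torch.nn.functional as F

def central_geodesic_contrastive_loss(feats, geodesic_dist, anchor_idx,
                                      pos_radius=0.1, tau=0.07):
    """Illustrative sketch of a central geodesic contrastive loss
    (assumed formulation, not the paper's exact objective).

    feats:         (V, C) per-vertex features of the deformed mesh
    geodesic_dist: (V,)   geodesic distance of every vertex to the anchor
    anchor_idx:    int    index of the central (anchor) vertex
    """
    feats = F.normalize(feats, dim=-1)
    anchor = feats[anchor_idx]                 # (C,)
    sim = feats @ anchor / tau                 # (V,) scaled cosine similarities

    pos_mask = geodesic_dist < pos_radius      # geodesically near vertices
    pos_mask[anchor_idx] = False               # exclude the anchor itself
    if pos_mask.sum() == 0:
        return sim.new_zeros(())               # no positives in this region

    # InfoNCE: near vertices are positives; all vertices form the denominator.
    log_denom = torch.logsumexp(sim, dim=0)
    return -(sim[pos_mask] - log_denom).mean()
```

In practice one would sample several anchors per mesh and average the loss; the geodesic distances can be precomputed, e.g., with Dijkstra on the edge graph or the heat method.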
Related papers
- Boosting Cross-Domain Point Classification via Distilling Relational Priors from 2D Transformers [59.0181939916084]
Traditional 3D networks mainly focus on local geometric details and ignore the topological structure between local geometries.
We propose a novel Relational Priors Distillation (RPD) method to extract relational priors from transformers well-trained on massive images.
Experiments on the PointDA-10 and the Sim-to-Real datasets verify that the proposed method consistently achieves the state-of-the-art performance of UDA for point cloud classification.
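The snippet above does not spell out RPD's objective. As a generic, hedged sketch of how relational priors are commonly distilled, one can match pairwise token-similarity matrices between a frozen 2D-transformer teacher and a point-cloud student; the function below assumes aligned token counts and is not the paper's method.

```python
import torch.nn.functional as F

def relation_distillation_loss(student_tokens, teacher_tokens):
    """Generic relational knowledge-distillation sketch (not RPD's exact loss).

    student_tokens: (B, N, Cs) point-cloud features
    teacher_tokens: (B, N, Ct) frozen image-transformer features
    """
    def relation(x):
        x = F.normalize(x, dim=-1)
        return x @ x.transpose(1, 2)   # (B, N, N) pairwise token similarities

    # Match the relational *structure* rather than raw features, so teacher
    # and student channel widths (Cs, Ct) are free to differ.
    return F.mse_loss(relation(student_tokens), relation(teacher_tokens))
```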
arXiv Detail & Related papers (2024-07-26T06:29:09Z)
- Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding [50.448520056844885]
We propose a generative Bayesian network to produce diverse synthetic scenes with real-world patterns.
A series of experiments robustly demonstrates our method's consistent superiority over existing state-of-the-art pre-training approaches.
arXiv Detail & Related papers (2024-06-17T07:43:53Z)
- Self-supervised Learning for Enhancing Geometrical Modeling in 3D-Aware Generative Adversarial Network [42.16520614686877]
3D-GANs exhibit artifacts in their 3D geometrical modeling, such as mesh imperfections and holes.
These shortcomings are primarily attributed to the limited availability of annotated 3D data.
We present a Self-Supervised Learning technique tailored as an auxiliary loss for any 3D-GAN.
arXiv Detail & Related papers (2023-12-19T04:55:33Z)
- MeT: A Graph Transformer for Semantic Segmentation of 3D Meshes [10.667492516216887]
We propose a transformer-based method for semantic segmentation of 3D meshes.
We perform positional encoding by means of the Laplacian eigenvectors of the adjacency matrix.
We show how the proposed approach yields state-of-the-art performance on semantic segmentation of 3D meshes.
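As a concrete illustration of the positional-encoding step named above, the NumPy sketch below computes the first k non-trivial eigenvectors of the normalized graph Laplacian built from the mesh adjacency; this is a minimal stand-in, not MeT's pipeline, and a real implementation would use sparse eigensolvers.

```python
import numpy as np

def laplacian_positional_encoding(adj, k=16):
    """Sketch: first k non-trivial Laplacian eigenvectors as per-vertex
    positional encodings (dense computation, illustrative only).

    adj: (V, V) symmetric 0/1 adjacency matrix of the mesh graph
    """
    deg = adj.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    # Symmetric normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}
    lap = np.eye(adj.shape[0]) - adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(lap)   # eigenvalues in ascending order
    return vecs[:, 1:k + 1]            # drop the trivial constant eigenvector
```

Laplacian eigenvectors are defined only up to sign, so training pipelines typically randomize the signs as augmentation.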
arXiv Detail & Related papers (2023-07-03T15:45:14Z)
- OcTr: Octree-based Transformer for 3D Object Detection [30.335788698814444]
A key challenge for LiDAR-based 3D object detection is to capture sufficient features from large scale 3D scenes.
We propose an Octree-based Transformer, named OcTr, to address this issue.
For enhanced foreground perception, we propose a hybrid positional embedding composed of a semantic-aware positional embedding and an attention mask.
arXiv Detail & Related papers (2023-03-22T15:01:20Z)
- Normal Transformer: Extracting Surface Geometry from LiDAR Points Enhanced by Visual Semantics [6.516912796655748]
This paper presents a technique for estimating surface normals from 3D point clouds and 2D colour images.
We have developed a transformer neural network that learns to utilise hybrid visual-semantic and 3D geometric information.
arXiv Detail & Related papers (2022-11-19T03:55:09Z)
- Unsupervised Geodesic-preserved Generative Adversarial Networks for Unconstrained 3D Pose Transfer [84.04540436494011]
We present an unsupervised approach to conduct pose transfer between arbitrary given 3D meshes.
Specifically, a novel Intrinsic-Extrinsic Preserved Generative Adversarial Network (IEP-GAN) is presented to preserve both intrinsic (i.e., shape) and extrinsic (i.e., pose) information.
Our proposed model produces better results and is substantially more efficient compared to recent state-of-the-art methods.
arXiv Detail & Related papers (2021-08-17T09:08:21Z)
- Adjoint Rigid Transform Network: Task-conditioned Alignment of 3D Shapes [86.2129580231191]
Adjoint Rigid Transform (ART) Network is a neural module that can be integrated with a variety of 3D networks.
ART learns to rotate input shapes to a learned canonical orientation, which is crucial for many tasks.
We will release our code and pre-trained models for further research.
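ART's architecture is not detailed in this snippet; one standard way such a module can predict a valid rotation is the continuous 6D representation of Zhou et al. (2019), orthonormalized by Gram-Schmidt. The PyTorch sketch below is an assumed, minimal stand-in for a canonicalizing module, not ART itself.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def rot6d_to_matrix(x6):
    """Map a 6D vector to a rotation matrix via Gram-Schmidt (Zhou et al., 2019)."""
    a1, a2 = x6[..., :3], x6[..., 3:]
    b1 = F.normalize(a1, dim=-1)
    b2 = F.normalize(a2 - (b1 * a2).sum(-1, keepdim=True) * b1, dim=-1)
    b3 = torch.cross(b1, b2, dim=-1)
    return torch.stack([b1, b2, b3], dim=-2)          # (..., 3, 3)

class CanonicalAligner(nn.Module):
    """Hypothetical ART-style module: predict a rotation from a global shape
    feature and apply it, mapping the input toward a learned canonical pose."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(3, feat_dim), nn.ReLU(),
                                     nn.Linear(feat_dim, feat_dim))
        self.head = nn.Linear(feat_dim, 6)

    def forward(self, pts):                           # pts: (B, N, 3)
        feat = self.encoder(pts).max(dim=1).values    # global max-pooled feature
        rot = rot6d_to_matrix(self.head(feat))        # (B, 3, 3)
        return pts @ rot.transpose(1, 2), rot         # rotated points + rotation
```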
arXiv Detail & Related papers (2021-02-01T20:58:45Z)
- Reinforced Axial Refinement Network for Monocular 3D Object Detection [160.34246529816085]
Monocular 3D object detection aims to extract the 3D position and properties of objects from a 2D input image.
Conventional approaches sample 3D bounding boxes from the space and infer the relationship between the target object and each of them; however, the probability of drawing effective samples is relatively small in 3D space.
We propose to start with an initial prediction and refine it gradually towards the ground truth, with only one 3D parameter changed in each step.
This requires designing a policy which gets a reward after several steps, and thus we adopt reinforcement learning to optimize it.
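As a hedged sketch of the described refinement loop under assumed interfaces (the policy network, the 7-parameter box layout, and the step size are placeholders, not the paper's code), each step nudges a single box parameter, and the reward, e.g., the IoU gain against the ground truth, is only observed after the episode:

```python
import torch

def refine_box(policy, image_feat, box, steps=10, delta=0.1):
    """Illustrative axial-refinement loop (assumed interfaces).

    policy:     callable (image_feat, box) -> (14,) action logits
    box:        (7,) tensor of box parameters (x, y, z, w, h, l, yaw)
    """
    for _ in range(steps):
        logits = policy(image_feat, box)              # 7 parameters x 2 signs
        action = torch.distributions.Categorical(logits=logits).sample().item()
        param, sign = action // 2, 1.0 if action % 2 == 0 else -1.0
        box = box.clone()
        box[param] += sign * delta                    # change one 3D parameter
    return box  # the episode reward (e.g., IoU with GT) is computed afterwards
```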
arXiv Detail & Related papers (2020-08-31T17:10:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.