Geometry-Contrastive Transformer for Generalized 3D Pose Transfer
- URL: http://arxiv.org/abs/2112.07374v1
- Date: Tue, 14 Dec 2021 13:14:24 GMT
- Title: Geometry-Contrastive Transformer for Generalized 3D Pose Transfer
- Authors: Haoyu Chen, Hao Tang, Zitong Yu, Nicu Sebe, Guoying Zhao
- Abstract summary: The intuition of this work is to perceive the geometric inconsistency between the given meshes with the powerful self-attention mechanism.
We propose a novel geometry-contrastive Transformer that has an efficient 3D structured perceiving ability to the global geometric inconsistencies.
We present a latent isometric regularization module together with a novel semi-synthesized dataset for the cross-dataset 3D pose transfer task.
- Score: 95.56457218144983
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a customized 3D mesh Transformer model for the pose transfer task.
As the 3D pose transfer essentially is a deformation procedure dependent on the
given meshes, the intuition of this work is to perceive the geometric
inconsistency between the given meshes with the powerful self-attention
mechanism. Specifically, we propose a novel geometry-contrastive Transformer
that has an efficient 3D structured perceiving ability to the global geometric
inconsistencies across the given meshes. Moreover, locally, a simple yet
efficient central geodesic contrastive loss is further proposed to improve the
regional geometric-inconsistency learning. At last, we present a latent
isometric regularization module together with a novel semi-synthesized dataset
for the cross-dataset 3D pose transfer task towards unknown spaces. The massive
experimental results prove the efficacy of our approach by showing
state-of-the-art quantitative performances on SMPL-NPT, FAUST and our new
proposed dataset SMG-3D datasets, as well as promising qualitative results on
MG-cloth and SMAL datasets. It's demonstrated that our method can achieve
robust 3D pose transfer and be generalized to challenging meshes from unknown
spaces on cross-dataset tasks. The code and dataset are made available. Code is
available: https://github.com/mikecheninoulu/CGT.
Related papers
- Joint Semantic and Rendering Enhancements in 3D Gaussian Modeling with Anisotropic Local Encoding [86.55824709875598]
We propose a joint enhancement framework for 3D semantic Gaussian modeling that synergizes both semantic and rendering branches.<n>Unlike conventional point cloud shape encoding, we introduce an anisotropic 3D Gaussian Chebyshev descriptor to capture fine-grained 3D shape details.<n>We employ a cross-scene knowledge transfer module to continuously update learned shape patterns, enabling faster convergence and robust representations.
arXiv Detail & Related papers (2026-01-05T18:33:50Z) - A Lightweight 3D Anomaly Detection Method with Rotationally Invariant Features [60.76577388438418]
3D anomaly detection (AD) is a crucial task in computer vision, aiming to identify anomalous points or regions from point cloud data.<n>Existing methods may encounter challenges when handling point clouds with changes in orientation and position because the resulting features may vary significantly.<n>We propose a novel Rotationally Invariant Features (RIF) framework for 3D AD, which maps each point into a rotationally invariant space to maintain consistency of representation.
arXiv Detail & Related papers (2025-11-17T08:16:05Z) - Boosting Cross-Domain Point Classification via Distilling Relational Priors from 2D Transformers [59.0181939916084]
Traditional 3D networks mainly focus on local geometric details and ignore the topological structure between local geometries.
We propose a novel Priors Distillation (RPD) method to extract priors from the well-trained transformers on massive images.
Experiments on the PointDA-10 and the Sim-to-Real datasets verify that the proposed method consistently achieves the state-of-the-art performance of UDA for point cloud classification.
arXiv Detail & Related papers (2024-07-26T06:29:09Z) - Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding [50.448520056844885]
We propose a generative Bayesian network to produce diverse synthetic scenes with real-world patterns.
A series of experiments robustly display our method's consistent superiority over existing state-of-the-art pre-training approaches.
arXiv Detail & Related papers (2024-06-17T07:43:53Z) - Self-supervised Learning for Enhancing Geometrical Modeling in 3D-Aware
Generative Adversarial Network [42.16520614686877]
3D-GANs exhibit artifacts in their 3D geometrical modeling, such as mesh imperfections and holes.
These shortcomings are primarily attributed to the limited availability of annotated 3D data.
We present a Self-Supervised Learning technique tailored as an auxiliary loss for any 3D-GAN.
arXiv Detail & Related papers (2023-12-19T04:55:33Z) - MeT: A Graph Transformer for Semantic Segmentation of 3D Meshes [10.667492516216887]
We propose a transformer-based method for semantic segmentation of 3D mesh.
We perform positional encoding by means of the Laplacian eigenvectors of the adjacency matrix.
We show how the proposed approach yields state-of-the-art performance on semantic segmentation of 3D meshes.
arXiv Detail & Related papers (2023-07-03T15:45:14Z) - OcTr: Octree-based Transformer for 3D Object Detection [30.335788698814444]
A key challenge for LiDAR-based 3D object detection is to capture sufficient features from large scale 3D scenes.
We propose an Octree-based Transformer, named OcTr, to address this issue.
For enhanced foreground perception, we propose a hybrid positional embedding, composed of the semantic-aware positional embedding and attention mask.
arXiv Detail & Related papers (2023-03-22T15:01:20Z) - Normal Transformer: Extracting Surface Geometry from LiDAR Points
Enhanced by Visual Semantics [6.516912796655748]
This paper presents a technique for estimating the normal from 3D point clouds and 2D colour images.
We have developed a transformer neural network that learns to utilise the hybrid information of visual semantic and 3D geometric data.
arXiv Detail & Related papers (2022-11-19T03:55:09Z) - Unsupervised Geodesic-preserved Generative Adversarial Networks for
Unconstrained 3D Pose Transfer [84.04540436494011]
We present an unsupervised approach to conduct the pose transfer between any arbitrated given 3D meshes.
Specifically, a novel Intrinsic-Extrinsic Preserved Generative Adrative Network (IEP-GAN) is presented for both intrinsic (i.e., shape) and extrinsic (i.e., pose) information preservation.
Our proposed model produces better results and is substantially more efficient compared to recent state-of-the-art methods.
arXiv Detail & Related papers (2021-08-17T09:08:21Z) - Adjoint Rigid Transform Network: Task-conditioned Alignment of 3D Shapes [86.2129580231191]
Adjoint Rigid Transform (ART) Network is a neural module which can be integrated with a variety of 3D networks.
ART learns to rotate input shapes to a learned canonical orientation, which is crucial for a lot of tasks.
We will release our code and pre-trained models for further research.
arXiv Detail & Related papers (2021-02-01T20:58:45Z) - Reinforced Axial Refinement Network for Monocular 3D Object Detection [160.34246529816085]
Monocular 3D object detection aims to extract the 3D position and properties of objects from a 2D input image.
Conventional approaches sample 3D bounding boxes from the space and infer the relationship between the target object and each of them, however, the probability of effective samples is relatively small in the 3D space.
We propose to start with an initial prediction and refine it gradually towards the ground truth, with only one 3d parameter changed in each step.
This requires designing a policy which gets a reward after several steps, and thus we adopt reinforcement learning to optimize it.
arXiv Detail & Related papers (2020-08-31T17:10:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.