Rotation Equivariant 3D Hand Mesh Generation from a Single RGB Image
- URL: http://arxiv.org/abs/2111.13023v1
- Date: Thu, 25 Nov 2021 11:07:27 GMT
- Title: Rotation Equivariant 3D Hand Mesh Generation from a Single RGB Image
- Authors: Joshua Mitton, Chaitanya Kaul, Roderick Murray-Smith
- Abstract summary: We develop a rotation equivariant model for generating 3D hand meshes from 2D RGB images.
This guarantees that, as the input image of a hand is rotated, the generated mesh undergoes a corresponding rotation.
- Score: 1.8692254863855962
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We develop a rotation equivariant model for generating 3D hand meshes from 2D
RGB images. This guarantees that, as the input image of a hand is rotated, the
generated mesh undergoes a corresponding rotation. Furthermore, this removes
undesirable deformations in the meshes often generated by methods without
rotation equivariance. By building a rotation equivariant model that exploits
the symmetries of the problem, we reduce the need for very large training
datasets to achieve good mesh reconstruction.
The encoder takes images defined on $\mathbb{Z}^{2}$ and maps these to latent
functions defined on the group $C_{8}$. We introduce a novel vector mapping
function to map the function defined on $C_{8}$ to a latent point cloud space
defined on the group $\mathrm{SO}(2)$. Further, we introduce a 3D projection
function that learns a 3D function from the $\mathrm{SO}(2)$ latent space.
Finally, we use an $\mathrm{SO}(3)$ equivariant decoder to ensure rotation
equivariance. Our rotation equivariant model outperforms state-of-the-art
methods on a real-world dataset and we demonstrate that it accurately captures
the shape and pose in the generated meshes under rotation of the input hand.
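To make the equivariance guarantee concrete, here is a minimal sanity check one could run against any image-to-mesh model: rotating the input in the image plane should rotate the predicted vertices by the corresponding 3D rotation. The `model` call, the rotation axis, and the sign convention are illustrative assumptions, not the authors' code.

```python
import math
import torch

def check_rotation_equivariance(model, image, k=1):
    """Check that a k*90-degree image rotation rotates the predicted mesh.

    `model` is a hypothetical image -> (N, 3) vertex network; the choice of
    rotation axis and sign convention here is an assumption for illustration.
    """
    rotated = torch.rot90(image, k=k, dims=(-2, -1))  # in-plane rotation

    theta = k * math.pi / 2
    # Matching 3D rotation about the camera's optical (z) axis.
    Rz = torch.tensor([
        [math.cos(theta), -math.sin(theta), 0.0],
        [math.sin(theta),  math.cos(theta), 0.0],
        [0.0,              0.0,             1.0],
    ])

    mesh = model(image)        # (N, 3) predicted vertices
    mesh_rot = model(rotated)  # vertices predicted from the rotated image

    # Equivariance: model(rotate(image)) ~= Rz @ model(image)
    return torch.allclose(mesh_rot, mesh @ Rz.T, atol=1e-4)
```

The first stage the abstract describes, lifting features from $\mathbb{Z}^{2}$ to functions on $C_{8}$, is the standard group-convolution lift. Below is a minimal sketch assuming the e2cnn library; the layer widths and kernel size are illustrative, not the authors' architecture.

```python
import torch
from e2cnn import gspaces
from e2cnn import nn as enn

# Planar rotations by multiples of 45 degrees: the cyclic group C_8.
r2_c8 = gspaces.Rot2dOnR2(N=8)

# Input: a 3-channel RGB image on Z^2, transforming trivially under rotation.
in_type = enn.FieldType(r2_c8, 3 * [r2_c8.trivial_repr])
# Output: 16 fields carrying the regular representation of C_8
# (8 values per field, one per group element).
out_type = enn.FieldType(r2_c8, 16 * [r2_c8.regular_repr])

lift = enn.R2Conv(in_type, out_type, kernel_size=5, padding=2)

x = enn.GeometricTensor(torch.randn(1, 3, 64, 64), in_type)
y = lift(x)  # y.tensor: (1, 16 * 8, 64, 64), i.e. a function on C_8 per pixel
```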
Related papers
- Image to Sphere: Learning Equivariant Features for Efficient Pose Prediction [3.823356975862006]
Methods that predict a single point estimate handle the pose of symmetric objects poorly and cannot represent uncertainty.
We propose a novel mapping of features from the image domain to the 3D rotation manifold.
We demonstrate the effectiveness of our method at object orientation prediction, and achieve state-of-the-art performance on the popular PASCAL3D+ dataset.
arXiv Detail & Related papers (2023-02-27T16:23:19Z)
- PaRot: Patch-Wise Rotation-Invariant Network via Feature Disentanglement and Pose Restoration [16.75367717130046]
State-of-the-art models are not robust to rotations, and the orientation of inputs is typically unknown in real applications.
We introduce a novel Patch-wise Rotation-invariant network (PaRot).
Our disentanglement module extracts high-quality rotation-robust features and the proposed lightweight model achieves competitive results.
arXiv Detail & Related papers (2023-02-06T02:13:51Z)
- Category-Level 6D Object Pose Estimation with Flexible Vector-Based Rotation Representation [51.67545893892129]
We propose a novel 3D graph convolution based pipeline for category-level 6D pose and size estimation from monocular RGB-D images.
We first design an orientation-aware autoencoder with 3D graph convolution for latent feature learning.
Then, to efficiently decode the rotation information from the latent feature, we design a novel flexible vector-based decomposable rotation representation (see the sketch below).
arXiv Detail & Related papers (2022-12-09T02:13:43Z)
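For context on what a vector-based rotation representation looks like, below is the widely used continuous 6D parameterization (Zhou et al., 2019), which maps a vector to a rotation matrix via Gram-Schmidt. It is a generic example of the family, not necessarily the decomposition proposed in the paper above.

```python
import torch
import torch.nn.functional as F

def rotation_from_6d(v: torch.Tensor) -> torch.Tensor:
    """Map a (..., 6) vector to a (..., 3, 3) rotation matrix (Zhou et al.)."""
    a1, a2 = v[..., :3], v[..., 3:]
    b1 = F.normalize(a1, dim=-1)
    # Remove the b1 component from a2, then normalize.
    b2 = F.normalize(a2 - (b1 * a2).sum(-1, keepdim=True) * b1, dim=-1)
    b3 = torch.cross(b1, b2, dim=-1)
    # Columns b1, b2, b3 form an orthonormal, right-handed frame.
    return torch.stack([b1, b2, b3], dim=-1)
```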
- 3D Equivariant Graph Implicit Functions [51.5559264447605]
We introduce a novel family of graph implicit functions with equivariant layers that facilitates modeling fine local details.
Our method improves over the existing rotation-equivariant implicit function from 0.69 to 0.89 (IoU) on the ShapeNet reconstruction task.
arXiv Detail & Related papers (2022-03-31T16:51:25Z)
- ConDor: Self-Supervised Canonicalization of 3D Pose for Partial Shapes [55.689763519293464]
ConDor is a self-supervised method that learns to canonicalize the 3D orientation and position for full and partial 3D point clouds.
During inference, our method takes an unseen full or partial 3D point cloud at an arbitrary pose and outputs an equivariant canonical pose (see the sketch below).
arXiv Detail & Related papers (2022-01-19T18:57:21Z)
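The canonicalization step itself is simple once a pose is predicted: undo the translation, then the rotation. A generic sketch is below; ConDor's actual contribution is learning the canonical frame self-supervised, which this snippet does not reproduce.

```python
import torch

def canonicalize(points: torch.Tensor, R_pred: torch.Tensor,
                 t_pred: torch.Tensor) -> torch.Tensor:
    """Bring a point cloud into a canonical frame given a predicted pose.

    points: (B, N, 3); R_pred: (B, 3, 3); t_pred: (B, 3).
    Computes x_canon = R^T (x - t) per batch element.
    """
    centered = points - t_pred.unsqueeze(1)
    return torch.einsum('bij,bnj->bni', R_pred.transpose(1, 2), centered)
```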
- ZZ-Net: A Universal Rotation Equivariant Architecture for 2D Point Clouds [17.35440223078089]
We propose a novel neural network architecture for processing 2D point clouds.
We show how to extend the architecture to accept a set of 2D-2D correspondences as input data.
Experiments are presented on the estimation of essential matrices in stereo vision.
arXiv Detail & Related papers (2021-11-30T12:37:36Z)
- Extreme Rotation Estimation using Dense Correlation Volumes [73.35119461422153]
We present a technique for estimating the relative 3D rotation of an RGB image pair in an extreme setting.
We observe that, even when images do not overlap, there may be rich hidden cues as to their geometric relationship.
We propose a network design that can automatically learn such implicit cues by comparing all pairs of points between the two input images (see the correlation-volume sketch below).
arXiv Detail & Related papers (2021-04-28T02:00:04Z)
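The "compare all pairs of points" idea corresponds to a dense 4D correlation volume over two feature maps. A minimal sketch is below; the scaling and any subsequent processing are assumptions, not the authors' exact design.

```python
import torch

def correlation_volume(f1: torch.Tensor, f2: torch.Tensor) -> torch.Tensor:
    """All-pairs similarity between two feature maps.

    f1, f2: (B, C, H, W). Returns (B, H, W, H, W), where entry
    [b, i, j, k, l] compares pixel (i, j) of image 1 with (k, l) of image 2.
    """
    B, C, H, W = f1.shape
    a = f1.flatten(2)  # (B, C, H*W)
    b = f2.flatten(2)  # (B, C, H*W)
    corr = torch.einsum('bci,bcj->bij', a, b) / C ** 0.5  # scaled dot products
    return corr.view(B, H, W, H, W)
```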
- Adjoint Rigid Transform Network: Task-conditioned Alignment of 3D Shapes [86.2129580231191]
Adjoint Rigid Transform (ART) Network is a neural module which can be integrated with a variety of 3D networks.
ART learns to rotate input shapes to a learned canonical orientation, which is crucial for many tasks.
We will release our code and pre-trained models for further research.
arXiv Detail & Related papers (2021-02-01T20:58:45Z)
- Rotation-Invariant Autoencoders for Signals on Spheres [10.406659081400354]
We study the problem of unsupervised learning of rotation-invariant representations for spherical images.
In particular, we design an autoencoder architecture consisting of $S^2$ and $\mathrm{SO}(3)$ convolutional layers.
Experiments on multiple datasets demonstrate the usefulness of the learned representations on clustering, retrieval and classification applications.
arXiv Detail & Related papers (2020-12-08T15:15:03Z)
- Implicit Functions in Feature Space for 3D Shape Reconstruction and Completion [53.885984328273686]
Implicit Feature Networks (IF-Nets) deliver continuous outputs, can handle multiple topologies, and complete shapes for missing or sparse input data (see the sketch below).
IF-Nets clearly outperform prior work in 3D object reconstruction on ShapeNet and obtain significantly more accurate 3D human reconstructions.
arXiv Detail & Related papers (2020-03-03T11:14:29Z)
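The core implicit-function idea is a network that can be queried at arbitrary continuous 3D points. The toy occupancy decoder below illustrates this; it omits IF-Nets' multi-scale feature grids and is not the authors' architecture.

```python
import torch
from torch import nn

class OccupancyDecoder(nn.Module):
    """Toy implicit function: (3D point, local feature) -> occupancy in [0, 1]."""

    def __init__(self, feat_dim: int = 128, hidden: int = 256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 + feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, points: torch.Tensor, feats: torch.Tensor) -> torch.Tensor:
        # points: (B, N, 3) query locations; feats: (B, N, feat_dim) features
        # sampled at those locations. Occupancy > 0.5 means "inside".
        return torch.sigmoid(self.mlp(torch.cat([points, feats], dim=-1))).squeeze(-1)
```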