An Analysis of SVD for Deep Rotation Estimation
- URL: http://arxiv.org/abs/2006.14616v1
- Date: Thu, 25 Jun 2020 17:58:28 GMT
- Title: An Analysis of SVD for Deep Rotation Estimation
- Authors: Jake Levinson, Carlos Esteves, Kefan Chen, Noah Snavely, Angjoo
Kanazawa, Afshin Rostamizadeh, Ameesh Makadia
- Abstract summary: We present a theoretical analysis that shows SVD is the natural choice for projecting onto the rotation group.
Our analysis shows simply replacing existing representations with the SVD orthogonalization procedure obtains state of the art performance in many deep learning applications.
- Score: 63.97835949897361
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Symmetric orthogonalization via SVD, and closely related procedures, are
well-known techniques for projecting matrices onto $O(n)$ or $SO(n)$. These
tools have long been used for applications in computer vision, for example
optimal 3D alignment problems solved by orthogonal Procrustes, rotation
averaging, or Essential matrix decomposition. Despite its utility in different
settings, SVD orthogonalization as a procedure for producing rotation matrices
is typically overlooked in deep learning models, where the preferences tend
toward classic representations like unit quaternions, Euler angles, and
axis-angle, or more recently-introduced methods. Despite the importance of 3D
rotations in computer vision and robotics, a single universally effective
representation is still missing. Here, we explore the viability of SVD
orthogonalization for 3D rotations in neural networks. We present a theoretical
analysis that shows SVD is the natural choice for projecting onto the rotation
group. Our extensive quantitative analysis shows simply replacing existing
representations with the SVD orthogonalization procedure obtains state of the
art performance in many deep learning applications covering both supervised and
unsupervised training.
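The SVD orthogonalization the abstract describes has a simple closed form: given an unconstrained 3x3 matrix M predicted by a network, the nearest rotation in Frobenius norm is U diag(1, 1, det(UVᵀ)) Vᵀ, where M = UΣVᵀ is the SVD; the determinant correction ensures a proper rotation rather than a reflection. A minimal NumPy sketch (the function name and the example matrix are illustrative, not taken from the paper):

```python
import numpy as np

def svd_orthogonalize(m):
    """Project an arbitrary 3x3 matrix onto SO(3) via SVD.

    Returns the special orthogonal Procrustes solution: the
    rotation matrix closest to m in Frobenius norm.
    """
    u, _, vt = np.linalg.svd(m)
    # Flip the sign of the last singular direction if needed so the
    # result has determinant +1 (a rotation, not a reflection).
    d = np.sign(np.linalg.det(u @ vt))
    return u @ np.diag([1.0, 1.0, d]) @ vt

# A noisy, unconstrained "prediction" (illustrative values):
m = np.array([[0.9, -0.1, 0.0],
              [0.2,  1.1, 0.0],
              [0.0,  0.0, 0.8]])
r = svd_orthogonalize(m)
print(np.allclose(r @ r.T, np.eye(3)))    # orthogonality check
print(np.isclose(np.linalg.det(r), 1.0))  # determinant +1 check
```

In a deep learning setting the same operation is applied to a 9D network output reshaped to 3x3, and the SVD is differentiable almost everywhere, which is what allows end-to-end training.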
Related papers
- Object Gaussian for Monocular 6D Pose Estimation from Sparse Views [4.290993205307184]
We introduce SGPose, a novel framework for sparse view object pose estimation using Gaussian-based methods.
Given as few as ten views, SGPose generates a geometric-aware representation by starting with a random cuboid.
Experiments on typical benchmarks, especially on the Occlusion LM-O dataset, demonstrate that SGPose outperforms existing methods even under sparse view constraints.
arXiv Detail & Related papers (2024-09-04T10:03:11Z)
- Learning Unorthogonalized Matrices for Rotation Estimation [83.94986875750455]
Estimating 3D rotations is a common procedure for 3D computer vision.
One form of representation -- rotation matrices -- is popular due to its continuity.
We propose directly outputting unorthogonalized 'Pseudo' Rotation Matrices (PRoM).
arXiv Detail & Related papers (2023-12-01T09:56:29Z)
- Evaluating 3D Shape Analysis Methods for Robustness to Rotation Invariance [22.306775502181818]
This paper analyzes the robustness of recent 3D shape descriptors to SO(3) rotations.
We consider a database of 3D indoor scenes, where objects occur in different orientations.
arXiv Detail & Related papers (2023-05-29T18:39:31Z)
- Orthogonal Matrix Retrieval with Spatial Consensus for 3D Unknown-View Tomography [58.60249163402822]
Unknown-view tomography (UVT) reconstructs a 3D density map from its 2D projections at unknown, random orientations.
The proposed OMR is more robust and performs significantly better than the previous state-of-the-art OMR approach.
arXiv Detail & Related papers (2022-07-06T21:40:59Z)
- Why Approximate Matrix Square Root Outperforms Accurate SVD in Global Covariance Pooling? [59.820507600960745]
We propose a new GCP meta-layer that uses SVD in the forward pass, and Padé approximants in the backward propagation to compute the gradients.
The proposed meta-layer has been integrated into different CNN models and achieves state-of-the-art performances on both large-scale and fine-grained datasets.
arXiv Detail & Related papers (2021-05-06T08:03:45Z)
- Deep regression on manifolds: a 3D rotation case study [0.0]
We formulate the properties that a differentiable function mapping arbitrary inputs of a Euclidean space onto this manifold should satisfy to allow proper training.
We compare various differentiable mappings on the 3D rotation space, and conjecture about the importance of the local linearity of the mapping.
We notably show that a mapping based on Procrustes orthonormalization of a 3x3 matrix generally performs best among the ones considered.
arXiv Detail & Related papers (2021-03-30T13:07:36Z)
- Adjoint Rigid Transform Network: Task-conditioned Alignment of 3D Shapes [86.2129580231191]
Adjoint Rigid Transform (ART) Network is a neural module which can be integrated with a variety of 3D networks.
ART learns to rotate input shapes to a learned canonical orientation, which is crucial for many tasks.
We will release our code and pre-trained models for further research.
arXiv Detail & Related papers (2021-02-01T20:58:45Z)
- Learning Local Neighboring Structure for Robust 3D Shape Representation [143.15904669246697]
Representation learning for 3D meshes is important in many computer vision and graphics applications.
We propose a local structure-aware anisotropic convolutional operation (LSA-Conv)
Our model produces significant improvement in 3D shape reconstruction compared to state-of-the-art methods.
arXiv Detail & Related papers (2020-04-21T13:40:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.