Procrustean Regression Networks: Learning 3D Structure of Non-Rigid
Objects from 2D Annotations
- URL: http://arxiv.org/abs/2007.10961v1
- Date: Tue, 21 Jul 2020 17:29:20 GMT
- Title: Procrustean Regression Networks: Learning 3D Structure of Non-Rigid
Objects from 2D Annotations
- Authors: Sungheon Park, Minsik Lee, Nojun Kwak
- Abstract summary: We propose a novel framework for training neural networks that can learn the 3D structure of non-rigid objects.
The proposed framework shows superior reconstruction performance to the state-of-the-art method on the Human 3.6M, 300-VW, and SURREAL datasets.
- Score: 42.476537776831314
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a novel framework for training neural networks that can learn
the 3D structure of non-rigid objects when only 2D annotations are available as
ground truth. Recently, several approaches have incorporated the problem setting
of non-rigid structure-from-motion (NRSfM) into deep learning to learn 3D
structure reconstruction. The central difficulty of NRSfM is estimating the
rotation and the deformation at the same time, and previous works handle this by
regressing both of them. In this paper, we resolve this difficulty by proposing
a loss function wherein the suitable rotation is determined automatically.
Trained with a cost function consisting of a reprojection error and a low-rank
term on the aligned shapes, the network learns the 3D structures of objects such
as human skeletons and faces during training, whereas testing is done on a
single-frame basis. The proposed method can handle inputs with missing entries,
and experimental results validate that the proposed framework achieves superior
reconstruction performance to the state-of-the-art method on the Human 3.6M,
300-VW, and SURREAL datasets, even though the underlying network structure is
very simple.
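To make the idea concrete, here is a minimal PyTorch sketch of a cost of this
kind: an orthographic reprojection term plus a nuclear-norm (low-rank) penalty
on shapes aligned to a common reference by a closed-form Procrustes rotation,
so the rotation is determined rather than regressed. The tensor layout, the
orthographic camera, the batch-mean reference shape, the weight `lam`, and all
function names are illustrative assumptions, not the authors' exact formulation.

```python
import torch

def procrustes_rotation(X, Y):
    """Closed-form rotation R minimizing ||R @ X - Y||_F
    (orthogonal Procrustes). X, Y: (3, J) centered point sets."""
    U, _, Vt = torch.linalg.svd(Y @ X.T)
    D = torch.eye(3, device=X.device)
    D[2, 2] = torch.sign(torch.linalg.det(U @ Vt))  # keep det(R) = +1
    return U @ D @ Vt

def prn_style_loss(X3d, W2d, lam=0.1):
    """X3d: (B, 3, J) predicted shapes; W2d: (B, 2, J) 2D annotations."""
    # Reprojection term under an orthographic camera: the first two
    # coordinates of the 3D prediction should match the 2D annotations.
    reproj = ((X3d[:, :2] - W2d) ** 2).mean()

    # Align every centered shape to a reference (here: the batch mean)
    # with the closed-form rotation, instead of regressing rotations.
    Xc = X3d - X3d.mean(dim=2, keepdim=True)
    ref = Xc.mean(dim=0)
    with torch.no_grad():  # rotations are treated as constants here
        Rs = [procrustes_rotation(x, ref) for x in Xc]
    aligned = torch.stack([R @ x for R, x in zip(Rs, Xc)])

    # Low-rank term: nuclear norm of the stacked, vectorized aligned shapes.
    S = aligned.reshape(aligned.shape[0], -1)   # (B, 3 * J)
    return reproj + lam * torch.linalg.matrix_norm(S, ord='nuc')
```

Missing 2D entries could be handled by masking the reprojection term; note also
that the rotations are simply detached here, whereas the treatment of the
alignment gradients in the actual framework is specified in the paper.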
Related papers
- FILP-3D: Enhancing 3D Few-shot Class-incremental Learning with
Pre-trained Vision-Language Models [62.663113296987085]
Few-shot class-incremental learning aims to mitigate the catastrophic forgetting issue when a model is incrementally trained on limited data.
We introduce two novel components: the Redundant Feature Eliminator (RFE) and the Spatial Noise Compensator (SNC).
Considering the imbalance in existing 3D datasets, we also propose new evaluation metrics that offer a more nuanced assessment of a 3D FSCIL model.
arXiv Detail & Related papers (2023-12-28T14:52:07Z)
- Single-view 3D Mesh Reconstruction for Seen and Unseen Categories [69.29406107513621]
Single-view 3D Mesh Reconstruction is a fundamental computer vision task that aims at recovering 3D shapes from single-view RGB images.
This paper tackles single-view 3D mesh reconstruction to study model generalization to unseen categories.
We propose an end-to-end two-stage network, GenMesh, to break the category boundaries in reconstruction.
arXiv Detail & Related papers (2022-08-04T14:13:35Z)
- Enforcing connectivity of 3D linear structures using their 2D projections [54.0598511446694]
We propose to improve the 3D connectivity of our results by minimizing a sum of topology-aware losses on their 2D projections.
This suffices to increase accuracy and to reduce the effort required to provide annotated training data.
arXiv Detail & Related papers (2022-07-14T11:42:18Z)
- Learning monocular 3D reconstruction of articulated categories from motion [39.811816510186475]
Video self-supervision enforces the consistency of consecutive 3D reconstructions through a motion-based cycle loss (a toy version is given in the first sketch after this list).
We introduce an interpretable model of 3D template deformations that controls a 3D surface through the displacement of a small number of local, learnable handles.
We obtain state-of-the-art reconstructions with diverse shapes, viewpoints and textures for multiple articulated object categories.
arXiv Detail & Related papers (2021-03-30T13:50:27Z)
- Adjoint Rigid Transform Network: Task-conditioned Alignment of 3D Shapes [86.2129580231191]
The Adjoint Rigid Transform (ART) Network is a neural module that can be integrated with a variety of 3D networks.
ART learns to rotate input shapes to a learned canonical orientation, which is crucial for many tasks (a toy version of such a module is given in the second sketch after this list).
We will release our code and pre-trained models for further research.
arXiv Detail & Related papers (2021-02-01T20:58:45Z)
- Next-best-view Regression using a 3D Convolutional Neural Network [0.9449650062296823]
We propose a data-driven approach to address the next-best-view problem.
The proposed approach trains a 3D convolutional neural network on previous reconstructions in order to regress the position of the next-best-view.
We validate the proposed approach in two groups of experiments.
arXiv Detail & Related papers (2021-01-23T01:50:26Z)
- SDF-SRN: Learning Signed Distance 3D Object Reconstruction from Static Images [44.78174845839193]
Recent efforts have turned to learning 3D reconstruction without 3D supervision from RGB images with annotated 2D silhouettes.
These techniques still require multi-view annotations of the same object instance during training.
We propose SDF-SRN, an approach that requires only a single view of objects at training time.
arXiv Detail & Related papers (2020-10-20T17:59:47Z)
- Neural Descent for Visual 3D Human Pose and Shape [67.01050349629053]
We present a deep neural network methodology to reconstruct the 3D pose and shape of people from an input RGB image.
We rely on a recently introduced, expressive full-body statistical 3D human model, GHUM, trained end-to-end.
Central to our methodology is a learning-to-learn-and-optimize approach, referred to as HUman Neural Descent (HUND), which avoids second-order differentiation.
arXiv Detail & Related papers (2020-08-16T13:38:41Z)
- Few-Shot Single-View 3-D Object Reconstruction with Compositional Priors [30.262308825799167]
We show that complex encoder-decoder architectures perform similarly to nearest-neighbor baselines in standard benchmarks.
We propose three approaches that efficiently integrate a class prior into a 3D reconstruction model.
arXiv Detail & Related papers (2020-04-14T04:53:34Z)
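As referenced in the entry on learning articulated categories from motion,
here is a toy forward-backward cycle-consistency term of the kind that ties
consecutive 3D reconstructions together. The per-point 3D motion fields and
all names are illustrative assumptions, not the authors' formulation.

```python
import torch

def motion_cycle_loss(X_t, M_fwd, M_bwd):
    """X_t: (B, N, 3) reconstruction at frame t.
    M_fwd: predicted per-point 3D motion t -> t+1; M_bwd: t+1 -> t.
    Going forward and then back should return every point to where it
    started (forward-backward cycle consistency)."""
    X_t1 = X_t + M_fwd       # advance the reconstruction to frame t+1
    X_back = X_t1 + M_bwd    # return to frame t
    return ((X_back - X_t) ** 2).mean()
```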
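And a toy version of an ART-style canonical-alignment module, as referenced in
the Adjoint Rigid Transform entry: a small network predicts a rotation from the
input point set and applies it before any downstream task network. The
point-wise MLP with max pooling and the 6D rotation parameterization are
illustrative assumptions, not the authors' design.

```python
import torch
import torch.nn as nn

def rotation_from_6d(x):
    """Map a 6D vector to a rotation matrix via Gram-Schmidt
    (a common continuous rotation parameterization)."""
    a, b = x[..., :3], x[..., 3:]
    r1 = nn.functional.normalize(a, dim=-1)
    b = b - (r1 * b).sum(-1, keepdim=True) * r1   # remove component along r1
    r2 = nn.functional.normalize(b, dim=-1)
    r3 = torch.cross(r1, r2, dim=-1)
    return torch.stack([r1, r2, r3], dim=-2)      # (..., 3, 3)

class CanonicalAligner(nn.Module):
    """Predicts a rotation that maps an input shape (B, N, 3) to a learned
    canonical orientation, then applies it."""
    def __init__(self, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.head = nn.Linear(hidden, 6)

    def forward(self, pts):
        feat = self.mlp(pts).max(dim=1).values    # order-invariant pooling
        R = rotation_from_6d(self.head(feat))     # (B, 3, 3)
        return pts @ R.transpose(-1, -2), R       # rotated points, rotation
```

Usage: `aligned, R = CanonicalAligner()(torch.randn(4, 1024, 3))`.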