Data-efficient learning for 3D mirror symmetry detection
- URL: http://arxiv.org/abs/2112.12579v1
- Date: Thu, 23 Dec 2021 14:37:52 GMT
- Title: Data-efficient learning for 3D mirror symmetry detection
- Authors: Yancong Lin, Silvia-Laura Pintea, Jan van Gemert
- Abstract summary: We introduce a geometry-inspired deep learning method for detecting 3D mirror planes from single-view images.
We extract semantic features, calculate intra-pixel correlations, and build a 3D correlation volume for each plane.
Experiments on both synthetic and real-world datasets show the benefit of 3D mirror geometry in improving data efficiency and inference speed.
- Score: 9.904746542801838
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce a geometry-inspired deep learning method for detecting 3D mirror
planes from single-view images. We reduce the demand for massive training data
by explicitly adding 3D mirror geometry into learning as an inductive prior. We
extract semantic features, calculate intra-pixel correlations, and build a 3D
correlation volume for each plane. The correlation volume indicates the extent
to which the input resembles its mirrors at various depths, allowing us to
identify the likelihood of the given plane being a mirror plane. Subsequently,
we treat the correlation volumes as feature descriptors for sampled planes and
map them to a unit hemisphere on which the normals of the sampled planes lie. Lastly,
we design multi-stage spherical convolutions to identify the optimal mirror
plane in a coarse-to-fine manner. Experiments on both synthetic and real-world
datasets show the benefit of 3D mirror geometry in improving data efficiency
and inference speed (up to 25 FPS).
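The geometric idea behind the abstract (reflect the scene across a candidate plane, measure how well it matches itself, and search over plane normals on a hemisphere) can be illustrated with a small sketch. This is not the authors' method: the paper works on single-view images with learned features, correlation volumes, and spherical convolutions, whereas the toy below assumes a 3D point cloud is already available, assumes the mirror plane passes through the centroid, and swaps the learned correlation for a brute-force nearest-neighbour distance. All function names are hypothetical.

```python
import numpy as np


def reflect_points(points, normal, offset=0.0):
    """Reflect (N, 3) points across the plane {x : n . x + offset = 0}, n a unit normal."""
    n = np.asarray(normal, dtype=float)
    n = n / np.linalg.norm(n)
    signed_dist = points @ n + offset          # signed distance of each point to the plane
    return points - 2.0 * signed_dist[:, None] * n


def sample_hemisphere_normals(count):
    """Roughly uniform unit vectors on the z > 0 hemisphere (Fibonacci spiral)."""
    i = np.arange(count)
    z = (i + 0.5) / count                      # uniform in z gives uniform area on a sphere
    phi = i * np.pi * (3.0 - np.sqrt(5.0))     # golden-angle increments
    r = np.sqrt(1.0 - z**2)
    return np.stack([r * np.cos(phi), r * np.sin(phi), z], axis=1)


def symmetry_score(points, normal):
    """Mean nearest-neighbour distance between the cloud and its mirror image (0 = perfect)."""
    centred = points - points.mean(axis=0)     # assumption: the plane passes through the centroid
    mirrored = reflect_points(centred, normal)
    dists = np.linalg.norm(mirrored[:, None, :] - centred[None, :, :], axis=2)
    return dists.min(axis=1).mean()


def find_mirror_normal(points, count=500):
    """Brute-force search over sampled normals; returns (best_normal, best_score)."""
    normals = sample_hemisphere_normals(count)
    scores = np.array([symmetry_score(points, n) for n in normals])
    best = int(scores.argmin())
    return normals[best], scores[best]


# Toy usage: build a cloud that is exactly symmetric about a known plane.
rng = np.random.default_rng(0)
true_normal = np.array([0.6, 0.0, 0.8])
half = rng.normal(size=(100, 3)) * np.array([3.0, 1.0, 0.5])
cloud = np.vstack([half, reflect_points(half, true_normal)])
best_normal, best_score = find_mirror_normal(cloud)
```

Only a hemisphere is sampled because a normal and its negation describe the same plane. A coarse-to-fine variant, in the spirit of the multi-stage search the abstract mentions, would resample normals in a small cone around `best_normal` rather than increasing `count` globally.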
Related papers
- Q-SLAM: Quadric Representations for Monocular SLAM [89.05457684629621]
Monocular SLAM has long grappled with the challenge of accurately modeling 3D geometries.
Recent advances in Neural Radiance Fields (NeRF)-based monocular SLAM have shown promise.
We propose a novel approach that reimagines volumetric representations through the lens of quadric forms.
arXiv Detail & Related papers (2024-03-12T23:27:30Z)
- Single-Stage 3D Geometry-Preserving Depth Estimation Model Training on Dataset Mixtures with Uncalibrated Stereo Data [4.199844472131922]
We propose GP$^{2}$, a General-Purpose and Geometry-Preserving training scheme for single-view depth estimation.
We show that GP$^{2}$-trained models outperform methods relying on PCM in both accuracy and speed.
We also show that SVDE models can learn to predict geometrically correct depth even when geometrically complete data comprises the minor part of the training set.
arXiv Detail & Related papers (2023-06-05T13:49:24Z)
- Normal Transformer: Extracting Surface Geometry from LiDAR Points Enhanced by Visual Semantics [6.516912796655748]
This paper presents a technique for estimating the normal from 3D point clouds and 2D colour images.
We have developed a transformer neural network that learns to utilise the hybrid information of visual semantic and 3D geometric data.
arXiv Detail & Related papers (2022-11-19T03:55:09Z)
- SketchSampler: Sketch-based 3D Reconstruction via View-dependent Depth Sampling [75.957103837167]
Reconstructing a 3D shape based on a single sketch image is challenging due to the large domain gap between a sparse, irregular sketch and a regular, dense 3D shape.
Existing works try to use a global feature extracted from the sketch to directly predict the 3D coordinates, but they usually lose fine details and produce results that are not faithful to the input sketch.
arXiv Detail & Related papers (2022-08-14T16:37:51Z)
- Geometry-Contrastive Transformer for Generalized 3D Pose Transfer [95.56457218144983]
The intuition of this work is to perceive the geometric inconsistency between the given meshes with the powerful self-attention mechanism.
We propose a novel geometry-contrastive Transformer that efficiently perceives global geometric inconsistencies in 3D structure.
We present a latent isometric regularization module together with a novel semi-synthesized dataset for the cross-dataset 3D pose transfer task.
arXiv Detail & Related papers (2021-12-14T13:14:24Z)
- Efficient 3D Deep LiDAR Odometry [16.388259779644553]
An efficient 3D point cloud learning architecture, named PWCLO-Net, is first proposed in this paper.
The entire architecture is holistically optimized end-to-end to achieve adaptive learning of cost volume and mask.
arXiv Detail & Related papers (2021-11-03T11:09:49Z)
- Neural Radiance Fields Approach to Deep Multi-View Photometric Stereo [103.08512487830669]
We present a modern solution to the multi-view photometric stereo (MVPS) problem.
We procure the surface orientation using a photometric stereo (PS) image formation model and blend it with a multi-view neural radiance field representation to recover the object's surface geometry.
Our method performs neural rendering of multi-view images while utilizing surface normals estimated by a deep photometric stereo network.
arXiv Detail & Related papers (2021-10-11T20:20:03Z)
- Aug3D-RPN: Improving Monocular 3D Object Detection by Synthetic Images with Virtual Depth [64.29043589521308]
We propose a rendering module to augment the training data by synthesizing images with virtual-depths.
The rendering module takes as input the RGB image and its corresponding sparse depth image, and outputs a variety of photo-realistic synthetic images.
Besides, we introduce an auxiliary module to improve the detection model by jointly optimizing it through a depth estimation task.
arXiv Detail & Related papers (2021-07-28T11:00:47Z)
- Recurrently Estimating Reflective Symmetry Planes from Partial Pointclouds [5.098175145801009]
We present an alternative novel encoding that instead slices the data along the height dimension and passes it sequentially to a 2D convolutional recurrent regression scheme.
We show that our approach has an accuracy comparable to state-of-the-art techniques on the task of planar reflective symmetry estimation on full synthetic objects.
arXiv Detail & Related papers (2021-06-30T15:26:15Z)
- KAPLAN: A 3D Point Descriptor for Shape Completion [80.15764700137383]
KAPLAN is a 3D point descriptor that aggregates local shape information via a series of 2D convolutions.
In each of those planes, point properties like normals or point-to-plane distances are aggregated into a 2D grid and abstracted into a feature representation with an efficient 2D convolutional encoder.
Experiments on public datasets show that KAPLAN achieves state-of-the-art performance for 3D shape completion.
arXiv Detail & Related papers (2020-07-31T21:56:08Z)
- Leveraging Planar Regularities for Point Line Visual-Inertial Odometry [13.51108336267342]
With a monocular Visual-Inertial Odometry (VIO) system, the 3D point cloud and camera motion can be estimated simultaneously.
We propose PLP-VIO, which exploits point features and line features as well as plane regularities.
The effectiveness of the proposed method is verified on both synthetic data and public datasets.
arXiv Detail & Related papers (2020-04-16T18:20:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.