Related papers: Data-efficient learning for 3D mirror symmetry detection

URL: http://arxiv.org/abs/2112.12579v1
Date: Thu, 23 Dec 2021 14:37:52 GMT
Title: Data-efficient learning for 3D mirror symmetry detection
Authors: Yancong Lin, Silvia-Laura Pintea, Jan van Gemert
Abstract summary: We introduce a geometry-inspired deep learning method for detecting 3D mirror planes from single-view images. We extract semantic features, calculate intra-pixel correlations, and build a 3D correlation volume for each plane. Experiments on both synthetic and real-world datasets show the benefit of 3D mirror geometry in improving data efficiency and inference speed.
Score: 9.904746542801838
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We introduce a geometry-inspired deep learning method for detecting 3D mirror planes from single-view images. We reduce the demand for massive training data by explicitly adding 3D mirror geometry into learning as an inductive prior. We extract semantic features, calculate intra-pixel correlations, and build a 3D correlation volume for each plane. The correlation volume indicates the extent to which the input resembles its mirrors at various depths, allowing us to identify the likelihood of the given plane being a mirror plane. Subsequently, we treat the correlation volumes as feature descriptors for sampled planes and map them to a unit hemisphere on which the normals of the sampled planes lie. Lastly, we design multi-stage spherical convolutions to identify the optimal mirror plane in a coarse-to-fine manner. Experiments on both synthetic and real-world datasets show the benefit of 3D mirror geometry in improving data efficiency and inference speed (up to 25 FPS).
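
Illustration: the geometry behind the abstract (reflecting across a candidate plane, sampling plane normals on a unit hemisphere, and scoring how well the input resembles its mirror) can be sketched in a few lines of plain Python. This is a toy stand-in and not the authors' method: it replaces the learned features, correlation volumes, and spherical convolutions with a brute-force nearest-neighbour error on 3D points. The function names (`reflect_point`, `sample_hemisphere_normals`, `symmetry_error`) and the Fibonacci-spiral sampling scheme are assumptions made for this example only.

```python
import math


def reflect_point(point, normal, offset=0.0):
    """Reflect a 3D point across the plane {x : normal . x + offset = 0}."""
    norm = math.sqrt(sum(c * c for c in normal))
    unit = [c / norm for c in normal]
    # Signed distance from the point to the plane along the unit normal.
    signed_dist = sum(p * c for p, c in zip(point, unit)) + offset / norm
    return [p - 2.0 * signed_dist * c for p, c in zip(point, unit)]


def sample_hemisphere_normals(count):
    """Spread `count` unit normals over the hemisphere z > 0.

    Uses a Fibonacci spiral (an assumption of this sketch). A hemisphere is
    enough because n and -n describe the same mirror plane.
    """
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    normals = []
    for i in range(count):
        z = (i + 0.5) / count
        radius = math.sqrt(1.0 - z * z)
        theta = golden_angle * i
        normals.append((radius * math.cos(theta), radius * math.sin(theta), z))
    return normals


def symmetry_error(points, normal, offset=0.0):
    """Mean distance from each reflected point to its nearest original point.

    Zero means the point set is exactly mirror-symmetric about the plane.
    """
    total = 0.0
    for p in points:
        mirrored = reflect_point(p, normal, offset)
        total += min(math.dist(mirrored, q) for q in points)
    return total / len(points)


if __name__ == "__main__":
    # A small point set that is symmetric about the plane x = 0.
    cloud = [(1, 0, 0), (-1, 0, 0), (1, 2, 1), (-1, 2, 1)]
    candidates = [(1.0, 0.0, 0.0)] + sample_hemisphere_normals(64)
    best = min(candidates, key=lambda n: symmetry_error(cloud, n))
    print("best normal:", best, "error:", symmetry_error(cloud, best))
```

A coarse-to-fine search in this toy setting would amount to resampling normals more densely around the current best candidate; the paper instead learns that refinement with multi-stage spherical convolutions.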

Related papers

LaRI: Layered Ray Intersections for Single-view 3D Geometric Reasoning [75.9814389360821]
Layered ray intersections (LaRI) is a new method for unseen geometry reasoning from a single image. Benefiting from the compact and layered representation, LaRI enables complete, efficient, and view-aligned geometric reasoning. We build a complete training data generation pipeline for synthetic and real-world data, including 3D objects and scenes.
arXiv Detail & Related papers (2025-04-25T15:31:29Z)
MVSDet: Multi-View Indoor 3D Object Detection via Efficient Plane Sweeps [51.44887282336391]
The key challenge of multi-view indoor 3D object detection is to infer accurate geometry information from images for precise 3D detection. A previous method relies on NeRF for geometry reasoning. We propose MVSDet, which utilizes plane sweep for geometry-aware 3D object detection.
arXiv Detail & Related papers (2024-10-28T21:58:41Z)
Object Modeling from Underwater Forward-Scan Sonar Imagery with Sea-Surface Multipath [16.057203527513632]
A key contribution, for objects imaged in the proximity of the sea surface, is to resolve the multipath artifacts due to the air-water interface. Here, the object image formed by the direct target backscatter is almost always corrupted by the ghost and sometimes by the mirror components. We model, localize, and discard the corrupted object region within each view, thus avoiding the distortion of recovered 3-D shape.
arXiv Detail & Related papers (2024-09-10T18:46:25Z)
VFMM3D: Releasing the Potential of Image by Vision Foundation Model for Monocular 3D Object Detection [80.62052650370416]
Monocular 3D object detection holds significant importance across various applications, including autonomous driving and robotics. In this paper, we present VFMM3D, an innovative framework that leverages the capabilities of Vision Foundation Models (VFMs) to accurately transform single-view images into LiDAR point cloud representations.
arXiv Detail & Related papers (2024-04-15T03:12:12Z)
GeoGS3D: Single-view 3D Reconstruction via Geometric-aware Diffusion Model and Gaussian Splatting [81.03553265684184]
We introduce GeoGS3D, a framework for reconstructing detailed 3D objects from single-view images. We propose a novel metric, Gaussian Divergence Significance (GDS), to prune unnecessary operations during optimization. Experiments demonstrate that GeoGS3D generates images with high consistency across views and reconstructs high-quality 3D objects.
arXiv Detail & Related papers (2024-03-15T12:24:36Z)
Q-SLAM: Quadric Representations for Monocular SLAM [85.82697759049388]
We reimagine volumetric representations through the lens of quadrics. We use a quadric assumption to rectify noisy depth estimations from RGB inputs. We introduce a novel quadric-decomposed transformer to aggregate information across quadrics.
arXiv Detail & Related papers (2024-03-12T23:27:30Z)
Normal Transformer: Extracting Surface Geometry from LiDAR Points Enhanced by Visual Semantics [6.516912796655748]
This paper presents a technique for estimating the normal from 3D point clouds and 2D colour images. We have developed a transformer neural network that learns to utilise the hybrid information of visual semantic and 3D geometric data.
arXiv Detail & Related papers (2022-11-19T03:55:09Z)
SketchSampler: Sketch-based 3D Reconstruction via View-dependent Depth Sampling [75.957103837167]
Reconstructing a 3D shape based on a single sketch image is challenging due to the large domain gap between a sparse, irregular sketch and a regular, dense 3D shape. Existing works try to employ the global feature extracted from the sketch to directly predict the 3D coordinates, but they usually suffer from losing fine details that are not faithful to the input sketch.
arXiv Detail & Related papers (2022-08-14T16:37:51Z)
Geometry-Contrastive Transformer for Generalized 3D Pose Transfer [95.56457218144983]
The intuition of this work is to perceive the geometric inconsistency between the given meshes with the powerful self-attention mechanism. We propose a novel geometry-contrastive Transformer with an efficient, 3D-structure-aware ability to perceive global geometric inconsistencies. We present a latent isometric regularization module together with a novel semi-synthesized dataset for the cross-dataset 3D pose transfer task.
arXiv Detail & Related papers (2021-12-14T13:14:24Z)
Efficient 3D Deep LiDAR Odometry [16.388259779644553]
An efficient 3D point cloud learning architecture, named PWCLO-Net, is first proposed in this paper. The entire architecture is holistically optimized end-to-end to achieve adaptive learning of cost volume and mask.
arXiv Detail & Related papers (2021-11-03T11:09:49Z)
Recurrently Estimating Reflective Symmetry Planes from Partial Pointclouds [5.098175145801009]
We present an alternative novel encoding that instead slices the data along the height dimension and passes it sequentially to a 2D convolutional recurrent regression scheme. We show that our approach has an accuracy comparable to state-of-the-art techniques on the task of planar reflective symmetry estimation on full synthetic objects.
arXiv Detail & Related papers (2021-06-30T15:26:15Z)
Leveraging Planar Regularities for Point Line Visual-Inertial Odometry [13.51108336267342]
With a monocular Visual-Inertial Odometry (VIO) system, the 3D point cloud and camera motion can be estimated simultaneously. We propose PLP-VIO, which exploits point features and line features as well as plane regularities. The effectiveness of the proposed method is verified on both synthetic data and public datasets.
arXiv Detail & Related papers (2020-04-16T18:20:00Z)

This list is automatically generated from the titles and abstracts of the papers on this site.