Zero in on Shape: A Generic 2D-3D Instance Similarity Metric learned from Synthetic Data
- URL: http://arxiv.org/abs/2108.04091v1
- Date: Mon, 9 Aug 2021 14:44:08 GMT
- Title: Zero in on Shape: A Generic 2D-3D Instance Similarity Metric learned from Synthetic Data
- Authors: Maciej Janik, Niklas Gard, Anna Hilsmann, Peter Eisert
- Abstract summary: We present a network architecture which compares RGB images and untextured 3D models by the similarity of the represented shape.
Our system is optimised for zero-shot retrieval, meaning it can recognise shapes never shown in training.
- Score: 3.71630298053787
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a network architecture which compares RGB images and untextured 3D
models by the similarity of the represented shape. Our system is optimised for
zero-shot retrieval, meaning it can recognise shapes never shown in training.
We use a view-based shape descriptor and a siamese network to learn object
geometry from pairs of 3D models and 2D images. Due to scarcity of datasets
with exact photograph-mesh correspondences, we train our network with only
synthetic data. Our experiments investigate the effect of different qualities
and quantities of training data on retrieval accuracy and present insights from
bridging the domain gap. We show that increasing the variety of synthetic data
improves retrieval accuracy and that our system's performance in zero-shot mode
can match that of the instance-aware mode when the task is to narrow the search down to the top 10% of objects.
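The abstract describes a view-based shape descriptor combined with a siamese network trained only on synthetic image-mesh pairs. A minimal sketch of that kind of setup is given below; the concrete choices (ResNet-18 backbone, 12 rendered views per mesh, max-pooling, cosine-contrastive loss and margin) are illustrative assumptions, not the authors' exact configuration.

```python
# Sketch of a siamese 2D-3D similarity model: an RGB photo and a set of
# untextured renderings of a mesh are embedded with a shared CNN and compared
# by cosine similarity. Backbone, view count, pooling and loss are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models


class ViewBasedSiamese(nn.Module):
    def __init__(self, embed_dim: int = 256):
        super().__init__()
        # Shared encoder applied to both the query photo and the rendered views.
        backbone = models.resnet18(weights=None)
        backbone.fc = nn.Linear(backbone.fc.in_features, embed_dim)
        self.encoder = backbone

    def embed_image(self, image: torch.Tensor) -> torch.Tensor:
        # image: (B, 3, H, W) RGB query photograph
        return F.normalize(self.encoder(image), dim=-1)

    def embed_shape(self, views: torch.Tensor) -> torch.Tensor:
        # views: (B, V, 3, H, W) untextured renderings of each mesh from V viewpoints
        b, v = views.shape[:2]
        feats = self.encoder(views.flatten(0, 1)).view(b, v, -1)
        # Max-pool over views -> one view-based shape descriptor per mesh.
        return F.normalize(feats.max(dim=1).values, dim=-1)


def contrastive_loss(img_emb, shape_emb, label, margin: float = 0.5):
    # label = 1 for matching image/mesh pairs, 0 for non-matching pairs.
    dist = 1.0 - (img_emb * shape_emb).sum(dim=-1)  # cosine distance
    return (label * dist.pow(2)
            + (1 - label) * F.relu(margin - dist).pow(2)).mean()


if __name__ == "__main__":
    model = ViewBasedSiamese()
    photos = torch.randn(4, 3, 224, 224)       # synthetic RGB queries
    renders = torch.randn(4, 12, 3, 224, 224)  # 12 untextured views per mesh
    labels = torch.tensor([1.0, 0.0, 1.0, 0.0])
    loss = contrastive_loss(model.embed_image(photos),
                            model.embed_shape(renders), labels)
    loss.backward()
```

At retrieval time, the shape descriptors for the whole model database can be precomputed once; a query photograph is then embedded and ranked against them by cosine similarity, which is how a zero-shot system can narrow the search to, for example, the top 10% of candidate objects.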
Related papers
- Robust 3D Tracking with Quality-Aware Shape Completion [67.9748164949519]
For robust 3D tracking, we propose a synthetic target representation composed of dense and complete point clouds, generated by shape completion, that depict the target shape precisely.
Specifically, we design a voxelized 3D tracking framework with shape completion, in which we propose a quality-aware shape completion mechanism to alleviate the adverse effect of noisy historical predictions.
arXiv Detail & Related papers (2023-12-17T04:50:24Z)
- 3DiffTection: 3D Object Detection with Geometry-Aware Diffusion Features [70.50665869806188]
3DiffTection is a state-of-the-art method for 3D object detection from single images.
We fine-tune a diffusion model to perform novel view synthesis conditioned on a single image.
We further train the model on target data with detection supervision.
arXiv Detail & Related papers (2023-11-07T23:46:41Z)
- Real-time Detection of 2D Tool Landmarks with Synthetic Training Data [0.0]
In this paper a deep learning architecture is presented that can, in real time, detect the 2D locations of certain landmarks of physical tools, such as a hammer or screwdriver.
To avoid the labor of manual labeling, the network is trained on synthetically generated data.
It is shown that the model presented in this paper, named Intermediate Heatmap Model (IHM), generalizes to real images when trained on synthetic data.
arXiv Detail & Related papers (2022-10-21T14:31:43Z)
- VoloGAN: Adversarial Domain Adaptation for Synthetic Depth Data [0.0]
We present VoloGAN, an adversarial domain adaptation network that translates synthetic RGB-D images of a high-quality 3D model of a person, into RGB-D images that could be generated with a consumer depth sensor.
This system is especially useful for generating large amounts of training data for single-view 3D reconstruction algorithms while replicating real-world capture conditions.
arXiv Detail & Related papers (2022-07-19T11:30:41Z)
- RiCS: A 2D Self-Occlusion Map for Harmonizing Volumetric Objects [68.85305626324694]
Ray-marching in Camera Space (RiCS) is a new method that represents the self-occlusions of 3D foreground objects as a 2D self-occlusion map.
We show that our representation map not only allows us to enhance the image quality but also to model temporally coherent complex shadow effects.
arXiv Detail & Related papers (2022-05-14T05:35:35Z)
- Pixel2Mesh++: 3D Mesh Generation and Refinement from Multi-View Images [82.32776379815712]
We study the problem of shape generation in 3D mesh representation from a small number of color images with or without camera poses.
We further improve the shape quality by leveraging cross-view information with a graph convolution network.
Our model is robust to the quality of the initial mesh and the error of camera pose, and can be combined with a differentiable function for test-time optimization.
arXiv Detail & Related papers (2022-04-21T03:42:31Z)
- Learning Dense Correspondence from Synthetic Environments [27.841736037738286]
Existing methods map manually labelled human pixels in real 2D images onto the 3D surface, which is prone to human error.
We propose to solve the problem of data scarcity by training 2D-3D human mapping algorithms using automatically generated synthetic data.
arXiv Detail & Related papers (2022-03-24T08:13:26Z)
- Diverse Plausible Shape Completions from Ambiguous Depth Images [7.652701739127332]
PSSNet is a network architecture for generating plausible 3D reconstructions from a single 2.5D depth image.
We perform experiments using Shapenet mugs and partially-occluded YCB objects and find that our method performs comparably in datasets with little ambiguity.
arXiv Detail & Related papers (2020-11-18T16:42:51Z)
- Hard Example Generation by Texture Synthesis for Cross-domain Shape Similarity Learning [97.56893524594703]
Image-based 3D shape retrieval (IBSR) aims to find the corresponding 3D shape of a given 2D image from a large 3D shape database.
Metric learning with some adaptation techniques seems to be a natural solution to shape similarity learning.
We develop a geometry-focused multi-view metric learning framework empowered by texture synthesis.
arXiv Detail & Related papers (2020-10-23T08:52:00Z)
- Canonical 3D Deformer Maps: Unifying parametric and non-parametric methods for dense weakly-supervised category reconstruction [79.98689027127855]
We propose a new representation of the 3D shape of common object categories that can be learned from a collection of 2D images of independent objects.
Our method builds in a novel way on concepts from parametric deformation models, non-parametric 3D reconstruction, and canonical embeddings.
It achieves state-of-the-art results in dense 3D reconstruction on public in-the-wild datasets of faces, cars, and birds.
arXiv Detail & Related papers (2020-08-28T15:44:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.