Patch2CAD: Patchwise Embedding Learning for In-the-Wild Shape Retrieval from a Single Image
- URL: http://arxiv.org/abs/2108.09368v1
- Date: Fri, 20 Aug 2021 20:58:52 GMT
- Title: Patch2CAD: Patchwise Embedding Learning for In-the-Wild Shape Retrieval from a Single Image
- Authors: Weicheng Kuo, Anelia Angelova, Tsung-Yi Lin, Angela Dai
- Abstract summary: We propose a novel approach towards constructing a joint embedding space between 2D images and 3D CAD models in a patch-wise fashion.
Our approach is more robust than state of the art in real-world scenarios without any exact CAD matches.
- Score: 58.953160501596805
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: 3D perception of object shapes from RGB image input is fundamental to
semantic scene understanding, grounding image-based perception in our spatially
three-dimensional real-world environments. To achieve a mapping between image views
of objects and 3D shapes, we leverage CAD model priors from existing
large-scale databases and propose a novel approach for constructing a
joint embedding space between 2D images and 3D CAD models in a patch-wise
fashion -- establishing correspondences between patches of an image view of an
object and patches of CAD geometry. This enables part-similarity reasoning for
retrieving similar CAD models for a new image view without exact matches in the
database. Our patch embedding provides more robust CAD retrieval for shape
estimation in our end-to-end estimation of CAD model shape and pose for
detected objects in a single input image. Experiments on in-the-wild, complex
imagery from ScanNet show that our approach is more robust than the state of the
art in real-world scenarios without any exact CAD matches.
Related papers
- Img2CAD: Conditioned 3D CAD Model Generation from Single Image with Structured Visual Geometry [12.265852643914439]
We present Img2CAD, the first approach that uses 2D image inputs to generate CAD models with editable parameters.
Img2CAD enables seamless integration between AI 3D reconstruction and CAD representation.
arXiv Detail & Related papers (2024-10-04T13:27:52Z)
- DiffCAD: Weakly-Supervised Probabilistic CAD Model Retrieval and Alignment from an RGB Image [34.47379913018661]
We propose DiffCAD, the first weakly-supervised probabilistic approach to CAD retrieval and alignment from an RGB image.
We formulate this as a conditional generative task, leveraging diffusion to learn implicit probabilistic models capturing the shape, pose, and scale of CAD objects in an image.
Our approach is trained only on synthetic data, leveraging monocular depth and mask estimates to enable robust zero-shot adaptation to various real target domains.
arXiv Detail & Related papers (2023-11-30T15:10:21Z)
- Sparse Multi-Object Render-and-Compare [33.97243145891282]
Reconstructing 3D shape and pose of static objects from a single image is an essential task for various industries.
Directly predicting 3D shapes produces unrealistic, overly smoothed or tessellated shapes.
Retrieving CAD models ensures realistic shapes but requires robust and accurate alignment.
arXiv Detail & Related papers (2023-10-17T12:01:32Z)
- Pixel2Mesh++: 3D Mesh Generation and Refinement from Multi-View Images [82.32776379815712]
We study the problem of shape generation in 3D mesh representation from a small number of color images with or without camera poses.
We further improve the shape quality by leveraging cross-view information with a graph convolutional network.
Our model is robust to the quality of the initial mesh and the error of camera pose, and can be combined with a differentiable function for test-time optimization.
arXiv Detail & Related papers (2022-04-21T03:42:31Z)
- ROCA: Robust CAD Model Retrieval and Alignment from a Single Image [22.03752392397363]
We present ROCA, a novel end-to-end approach that retrieves and aligns 3D CAD models from a shape database to a single input image.
Experiments on challenging, real-world imagery from ScanNet show that ROCA significantly improves on the state of the art, from 9.5% to 17.6% in retrieval-aware CAD alignment accuracy.
arXiv Detail & Related papers (2021-12-03T16:02:32Z)
- Learning Canonical 3D Object Representation for Fine-Grained Recognition [77.33501114409036]
We propose a novel framework for fine-grained object recognition that learns to recover object variation in 3D space from a single image.
We represent an object as a composition of 3D shape and its appearance, while eliminating the effect of camera viewpoint.
By incorporating 3D shape and appearance jointly in a deep representation, our method learns the discriminative representation of the object.
arXiv Detail & Related papers (2021-08-10T12:19:34Z)
- From Points to Multi-Object 3D Reconstruction [71.17445805257196]
We propose a method to detect and reconstruct multiple 3D objects from a single RGB image.
A keypoint detector localizes objects as center points and directly predicts all object properties, including 9-DoF bounding boxes and 3D shapes.
The presented approach performs lightweight reconstruction in a single stage; it is real-time capable, fully differentiable, and end-to-end trainable.
arXiv Detail & Related papers (2020-12-21T18:52:21Z)
- Mask2CAD: 3D Shape Prediction by Learning to Segment and Retrieve [54.054575408582565]
We propose to leverage existing large-scale datasets of 3D models to understand the underlying 3D structure of objects seen in an image.
We present Mask2CAD, which jointly detects objects in real-world images and, for each detected object, optimizes for the most similar CAD model and its pose.
This produces a clean, lightweight representation of the objects in an image.
arXiv Detail & Related papers (2020-07-26T00:08:37Z)
- Self-Supervised 2D Image to 3D Shape Translation with Disentangled Representations [92.89846887298852]
We present a framework to translate between 2D image views and 3D object shapes.
We propose SIST, a Self-supervised Image to Shape Translation framework.
arXiv Detail & Related papers (2020-03-22T22:44:02Z)