Related papers: ROCA: Robust CAD Model Retrieval and Alignment from a Single Image

URL: http://arxiv.org/abs/2112.01988v1
Date: Fri, 3 Dec 2021 16:02:32 GMT
Title: ROCA: Robust CAD Model Retrieval and Alignment from a Single Image
Authors: Can Gümeli, Angela Dai, Matthias Nießner
Abstract summary: We present ROCA, a novel end-to-end approach that retrieves and aligns 3D CAD models from a shape database to a single input image. Experiments on challenging, real-world imagery from ScanNet show that ROCA significantly improves on the state of the art, from 9.5% to 17.6% in retrieval-aware CAD alignment accuracy.
Score: 22.03752392397363
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We present ROCA, a novel end-to-end approach that retrieves and aligns 3D CAD models from a shape database to a single input image. This enables 3D perception of an observed scene from a 2D RGB observation, characterized as a lightweight, compact, clean CAD representation. Core to our approach is our differentiable alignment optimization based on dense 2D-3D object correspondences and Procrustes alignment. ROCA can thus provide a robust CAD alignment while simultaneously informing CAD retrieval by leveraging the 2D-3D correspondences to learn geometrically similar CAD models. Experiments on challenging, real-world imagery from ScanNet show that ROCA significantly improves on state of the art, from 9.5% to 17.6% in retrieval-aware CAD alignment accuracy.
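
The Procrustes alignment step named in the abstract can be illustrated with a minimal sketch. This is not ROCA's implementation: it assumes the dense 2D-3D correspondences have already been lifted to matched 3D point pairs, and it solves the standard weighted least-squares similarity fit (uniform scale, rotation, translation) via SVD. The function name `procrustes_align` and the optional per-point `weights` are hypothetical, introduced here only for illustration.

```python
import numpy as np


def procrustes_align(src, dst, weights=None):
    """Weighted similarity Procrustes (Umeyama-style closed form).

    Finds scale s, rotation R, and translation t minimizing
    sum_i w_i * || s * R @ src[i] + t - dst[i] ||^2.
    src, dst: (N, 3) arrays of matched points (assumed inputs).
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    w = np.ones(len(src)) if weights is None else np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize weights so they sum to 1

    # Weighted centroids and centered point sets.
    mu_src = w @ src
    mu_dst = w @ dst
    src_c = src - mu_src
    dst_c = dst - mu_dst

    # Weighted 3x3 cross-covariance between target and source.
    cov = dst_c.T @ (src_c * w[:, None])
    U, S, Vt = np.linalg.svd(cov)

    # Flip the last axis if needed so R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(U @ Vt))
    D = np.diag([1.0, 1.0, d])
    R = U @ D @ Vt

    # Closed-form uniform scale and translation.
    var_src = (w * (src_c ** 2).sum(axis=1)).sum()
    scale = (S * np.diag(D)).sum() / var_src
    t = mu_dst - scale * (R @ mu_src)
    return scale, R, t


# Usage: recover a known transform from noise-free synthetic correspondences.
rng = np.random.default_rng(0)
src = rng.normal(size=(50, 3))
angle = 0.7
R_true = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle), np.cos(angle), 0.0],
    [0.0, 0.0, 1.0],
])
t_true = np.array([0.2, -1.0, 3.0])
dst = 1.5 * src @ R_true.T + t_true
scale, R, t = procrustes_align(src, dst)
```

Because every step is a closed-form linear-algebra operation, a solve of this kind can sit inside a network and pass gradients back to the predicted correspondences and weights, which is the general sense in which such an alignment is called differentiable.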

Related papers

RAG-6DPose: Retrieval-Augmented 6D Pose Estimation via Leveraging CAD as Knowledge Base [112.72361202480154]
We present RAG-6DPose, a retrieval-augmented approach that leverages 3D CAD models as a knowledge base. Experimental results on standard benchmarks and real-world robotic tasks demonstrate the effectiveness and robustness of our approach.
arXiv Detail & Related papers (2025-06-23T17:19:41Z)
CADCrafter: Generating Computer-Aided Design Models from Unconstrained Images [69.7768227804928]
CADCrafter is an image-to-parametric CAD model generation framework that trains solely on synthetic textureless CAD data. We introduce a geometry encoder to accurately capture diverse geometric features. Our approach can robustly handle real unconstrained CAD images, and even generalize to unseen general objects.
arXiv Detail & Related papers (2025-04-07T06:01:35Z)
Img2CAD: Conditioned 3D CAD Model Generation from Single Image with Structured Visual Geometry [12.265852643914439]
We present Img2CAD, the first approach, to our knowledge, that uses 2D image inputs to generate CAD models with editable parameters. Img2CAD enables seamless integration between AI 3D reconstruction and CAD representation.
arXiv Detail & Related papers (2024-10-04T13:27:52Z)
PS-CAD: Local Geometry Guidance via Prompting and Selection for CAD Reconstruction [86.726941702182]
We introduce geometric guidance into the reconstruction network PS-CAD. First, we provide the geometry of surfaces where the current reconstruction differs from the complete model as a point cloud. Second, we use geometric analysis to extract a set of planar prompts that correspond to candidate surfaces.
arXiv Detail & Related papers (2024-05-24T03:43:55Z)
FastCAD: Real-Time CAD Retrieval and Alignment from Scans and Videos [4.36478623815937]
FastCAD is a real-time method that simultaneously retrieves and aligns CAD models for all objects in a given scene. Our single-stage method accelerates the inference time by a factor of 50 compared to other methods operating on RGB-D scans. This enables the real-time generation of precise CAD model-based reconstructions from videos at 10 FPS.
arXiv Detail & Related papers (2024-03-22T12:20:23Z)
Model2Scene: Learning 3D Scene Representation via Contrastive Language-CAD Models Pre-training [105.3421541518582]
Current successful methods of 3D scene perception rely on large-scale annotated point clouds. We propose Model2Scene, a novel paradigm that learns free 3D scene representation from Computer-Aided Design (CAD) models and languages. Model2Scene yields impressive label-free 3D object salient detection with an average mAP of 46.08% and 55.49% on the ScanNet and S3DIS datasets, respectively.
arXiv Detail & Related papers (2023-09-29T03:51:26Z)
SECAD-Net: Self-Supervised CAD Reconstruction by Learning Sketch-Extrude Operations [21.000539206470897]
SECAD-Net is an end-to-end neural network aimed at reconstructing compact and easy-to-edit CAD models. We show superiority over state-of-the-art alternatives including the closely related method for supervised CAD reconstruction.
arXiv Detail & Related papers (2023-03-19T09:26:03Z)
XDGAN: Multi-Modal 3D Shape Generation in 2D Space [60.46777591995821]
We propose a novel method to convert 3D shapes into compact 1-channel geometry images and leverage StyleGAN3 and image-to-image translation networks to generate 3D objects in 2D space. The generated geometry images are quick to convert to 3D meshes, enabling real-time 3D object synthesis, visualization and interactive editing. We show both quantitatively and qualitatively that our method is highly effective at various tasks such as 3D shape generation, single view reconstruction and shape manipulation, while being significantly faster and more flexible compared to recent 3D generative models.
arXiv Detail & Related papers (2022-10-06T15:54:01Z)
Patch2CAD: Patchwise Embedding Learning for In-the-Wild Shape Retrieval from a Single Image [58.953160501596805]
We propose a novel approach towards constructing a joint embedding space between 2D images and 3D CAD models in a patch-wise fashion. Our approach is more robust than the state of the art in real-world scenarios without any exact CAD matches.
arXiv Detail & Related papers (2021-08-20T20:58:52Z)
3D-to-2D Distillation for Indoor Scene Parsing [78.36781565047656]
We present a new approach that enables us to leverage 3D features extracted from a large-scale 3D data repository to enhance 2D features extracted from RGB images. First, we distill 3D knowledge from a pretrained 3D network to supervise a 2D network to learn simulated 3D features from 2D features during training. Second, we design a two-stage dimension normalization scheme to calibrate the 2D and 3D features for better integration. Third, we design a semantic-aware adversarial training model to extend our framework for training with unpaired 3D data.
arXiv Detail & Related papers (2021-04-06T02:22:24Z)
Mask2CAD: 3D Shape Prediction by Learning to Segment and Retrieve [54.054575408582565]
We propose to leverage existing large-scale datasets of 3D models to understand the underlying 3D structure of objects seen in an image. We present Mask2CAD, which jointly detects objects in real-world images and, for each detected object, optimizes for the most similar CAD model and its pose. This produces a clean, lightweight representation of the objects in an image.
arXiv Detail & Related papers (2020-07-26T00:08:37Z)
CAD-Deform: Deformable Fitting of CAD Models to 3D Scans [30.451330075135076]
We introduce CAD-Deform, a method which obtains more accurate CAD-to-scan fits by non-rigidly deforming retrieved CAD models. A series of experiments demonstrate that our method achieves significantly tighter scan-to-CAD fits, allowing a more accurate digital replica of the scanned real-world environment.
arXiv Detail & Related papers (2020-07-23T12:30:20Z)

This list is automatically generated from the titles and abstracts of the papers on this site.