Category-Level 3D Non-Rigid Registration from Single-View RGB Images
- URL: http://arxiv.org/abs/2008.07203v1
- Date: Mon, 17 Aug 2020 10:35:19 GMT
- Title: Category-Level 3D Non-Rigid Registration from Single-View RGB Images
- Authors: Diego Rodriguez, Florian Huber, Sven Behnke
- Abstract summary: We propose a novel approach to solve the 3D non-rigid registration problem from RGB images using CNNs.
Our objective is to find a deformation field that warps a given 3D canonical model into a novel instance observed by a single-view RGB image.
- Score: 28.874008960264202
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a novel approach to solve the 3D non-rigid
registration problem from RGB images using Convolutional Neural Networks
(CNNs). Our objective is to find a deformation field (typically used for
transferring knowledge between instances, e.g., grasping skills) that warps a
given 3D canonical model into a novel instance observed by a single-view RGB
image. This is done by training a CNN that infers a deformation field for the
visible parts of the canonical model and by employing a learned shape (latent)
space for inferring the deformations of the occluded parts. As a result of the
registration, the observed model is reconstructed. Because our method does not
need depth information, it can register objects that are typically hard to
perceive with RGB-D sensors, e.g., objects with transparent or shiny surfaces. Even
without depth data, our approach outperforms the Coherent Point Drift (CPD)
registration method for the evaluated object categories.
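For intuition, a minimal sketch of the registration idea follows, with random placeholders standing in for the learned CNN and the latent shape space; all sizes and values are illustrative assumptions, not the paper's actual components:

```python
import numpy as np

# Hypothetical sizes; the CNN and the latent shape-space decoder are
# random stand-ins for the learned components described in the abstract.
num_vertices = 1000
canonical = np.random.randn(num_vertices, 3)      # canonical 3D model vertices
visible = np.random.rand(num_vertices) > 0.4      # visibility mask from the RGB view

# Deformation field for visible vertices, as the CNN would predict it
# from the single-view RGB image (random placeholder here).
cnn_offsets = 0.1 * np.random.randn(num_vertices, 3)

# Offsets for occluded vertices, decoded from a learned latent shape
# space (also a placeholder); the latent code would be chosen so the
# decoded deformation agrees with the CNN prediction on visible parts.
latent_offsets = 0.1 * np.random.randn(num_vertices, 3)

deformation = np.where(visible[:, None], cnn_offsets, latent_offsets)
reconstructed = canonical + deformation           # registered / reconstructed instance
print(reconstructed.shape)                        # (1000, 3)
```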
Related papers
- Towards Human-Level 3D Relative Pose Estimation: Generalizable, Training-Free, with Single Reference [62.99706119370521]
Humans can easily deduce the relative pose of an unseen object, without labels or training, given only a single query-reference image pair.
We propose a novel 3D generalizable relative pose estimation method built on (i) a 2.5D shape from an RGB-D reference, (ii) an off-the-shelf differentiable renderer, and (iii) semantic cues from a pretrained model like DINOv2.
arXiv Detail & Related papers (2024-06-26T16:01:10Z)
- PT43D: A Probabilistic Transformer for Generating 3D Shapes from Single Highly-Ambiguous RGB Images [26.900974153235456]
We propose a transformer-based autoregressive model to generate 3D shapes from single RGB images.
To handle realistic scenarios such as field-of-view truncation, we create simulated image-to-shape training pairs.
We then adopt cross-attention to effectively identify the most relevant region of interest from the input image for shape generation.
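For illustration, a generic scaled dot-product cross-attention over image patch features is sketched below; the dimensions and token names are assumptions, not PT43D's actual architecture:

```python
import numpy as np

def cross_attention(queries, keys, values):
    """Generic scaled dot-product cross-attention (illustrative only)."""
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)            # (num_q, num_kv)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over image regions
    return weights @ values                           # attended features

# Hypothetical dimensions: shape tokens attend over image patch features,
# so generation can focus on the most relevant region of the input image.
shape_tokens = np.random.randn(16, 64)                # queries from the shape decoder
image_patches = np.random.randn(196, 64)              # keys/values from the image encoder
attended = cross_attention(shape_tokens, image_patches, image_patches)
print(attended.shape)                                 # (16, 64)
```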
arXiv Detail & Related papers (2024-05-20T09:49:13Z)
- MatchU: Matching Unseen Objects for 6D Pose Estimation from RGB-D Images [57.71600854525037]
We propose a Fuse-Describe-Match strategy for 6D pose estimation from RGB-D images.
MatchU is a generic approach that fuses 2D texture and 3D geometric cues for 6D pose prediction of unseen objects.
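A hedged sketch of the describe-then-match idea, using random stand-in descriptors and mutual nearest-neighbour matching; this is not MatchU's actual pipeline, and the final pose fit (e.g., Kabsch/RANSAC) is omitted:

```python
import numpy as np

# Illustrative describe-then-match step: fused per-point descriptors
# (2D texture + 3D geometry, here random placeholders) are matched by
# mutual nearest neighbours; a rigid 6D pose would then be estimated
# from the resulting correspondences.
desc_model = np.random.randn(500, 32)    # descriptors on the object model
desc_scene = np.random.randn(800, 32)    # descriptors from the RGB-D scene

sim = desc_model @ desc_scene.T          # similarity matrix (unnormalised)
best_scene = sim.argmax(axis=1)          # model -> scene nearest neighbour
best_model = sim.argmax(axis=0)          # scene -> model nearest neighbour
mutual = best_model[best_scene] == np.arange(len(desc_model))
matches = np.stack([np.arange(len(desc_model))[mutual], best_scene[mutual]], axis=1)
print(matches.shape)                     # (num_mutual_matches, 2)
```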
arXiv Detail & Related papers (2024-03-03T14:01:03Z)
- Anyview: Generalizable Indoor 3D Object Detection with Variable Frames [63.51422844333147]
We present a novel 3D detection framework named AnyView for practical applications.
Our method achieves both great generalizability and high detection accuracy with a simple and clean architecture.
arXiv Detail & Related papers (2023-10-09T02:15:45Z)
- Registering Neural Radiance Fields as 3D Density Images [55.64859832225061]
We propose to use universal pre-trained neural networks that can be trained and tested on different scenes.
We demonstrate that our method, as a global approach, can effectively register NeRF models.
arXiv Detail & Related papers (2023-05-22T09:08:46Z)
- $PC^2$: Projection-Conditioned Point Cloud Diffusion for Single-Image 3D Reconstruction [97.06927852165464]
Reconstructing the 3D shape of an object from a single RGB image is a long-standing and highly challenging problem in computer vision.
We propose a novel method for single-image 3D reconstruction which generates a sparse point cloud via a conditional denoising diffusion process.
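Below is a generic DDPM-style reverse (denoising) loop over point coordinates, meant only to illustrate the conditional diffusion idea; the noise-prediction network and the image-projection conditioning of $PC^2$ are stubbed out:

```python
import numpy as np

# Generic DDPM reverse process over point coordinates (not $PC^2$ code).
num_points, steps = 2048, 50
betas = np.linspace(1e-4, 0.02, steps)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def predict_noise(points, t):
    return np.zeros_like(points)  # stand-in for the image-conditioned network

points = np.random.randn(num_points, 3)               # start from pure noise
for t in reversed(range(steps)):
    eps = predict_noise(points, t)
    points = (points - betas[t] / np.sqrt(1 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
    if t > 0:                                          # add noise except at the last step
        points += np.sqrt(betas[t]) * np.random.randn(num_points, 3)
print(points.shape)                                    # (2048, 3) sparse point cloud
```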
arXiv Detail & Related papers (2023-02-21T13:37:07Z)
- Pose Estimation of Specific Rigid Objects [0.7931904787652707]
We address the problem of estimating the 6D pose of rigid objects from a single RGB or RGB-D input image.
This problem is of great importance to many application fields such as robotic manipulation, augmented reality, and autonomous driving.
arXiv Detail & Related papers (2021-12-30T14:36:47Z)
- Spatially Invariant Unsupervised 3D Object Segmentation with Graph Neural Networks [23.729853358582506]
We propose a framework, SPAIR3D, to model a point cloud as a spatial mixture model.
We jointly learn the multiple-object representation and segmentation in 3D via Variational Autoencoders (VAE).
Experimental results demonstrate that SPAIR3D is capable of detecting and segmenting a variable number of objects without appearance information.
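A minimal sketch of the spatial-mixture view of a point cloud, with fixed isotropic Gaussian components standing in for what SPAIR3D actually learns with VAEs:

```python
import numpy as np

# Treating a point cloud as a spatial mixture model: soft-assign each
# point to one of K components by Gaussian responsibilities. Component
# centers are random here, purely for illustration.
points = np.random.randn(1000, 3)
centers = np.random.randn(5, 3)          # K = 5 hypothetical components
sq_dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
log_p = -0.5 * sq_dist                   # isotropic unit-variance Gaussians
resp = np.exp(log_p - log_p.max(axis=1, keepdims=True))
resp /= resp.sum(axis=1, keepdims=True)  # per-point segmentation weights
segments = resp.argmax(axis=1)           # hard assignment per point
print(np.bincount(segments, minlength=5))
```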
arXiv Detail & Related papers (2021-06-10T09:20:16Z)
- From Points to Multi-Object 3D Reconstruction [71.17445805257196]
We propose a method to detect and reconstruct multiple 3D objects from a single RGB image.
A keypoint detector localizes objects as center points and directly predicts all object properties, including 9-DoF bounding boxes and 3D shapes.
The presented approach performs lightweight reconstruction in a single stage; it is real-time capable, fully differentiable, and end-to-end trainable.
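For intuition, a CenterNet-style decoding sketch follows; the head names, dimensions, and top-k choice are assumptions rather than the paper's exact design:

```python
import numpy as np

# "Objects as center points": pick heatmap peaks and read per-object
# properties regressed at those locations. All heads are placeholders.
H = W = 64
heatmap = np.random.rand(H, W)                 # object-center confidence
boxes_9dof = np.random.randn(H, W, 9)          # per-pixel 9-DoF box regression
shape_codes = np.random.randn(H, W, 32)        # per-pixel latent shape codes

flat = heatmap.ravel()
topk = np.argsort(flat)[-10:][::-1]            # 10 strongest center candidates
ys, xs = np.unravel_index(topk, (H, W))
for y, x in zip(ys, xs):
    box = boxes_9dof[y, x]                     # 9-DoF bounding box at this center
    code = shape_codes[y, x]                   # shape code for reconstruction
print(len(ys))                                 # number of detected candidates
```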
arXiv Detail & Related papers (2020-12-21T18:52:21Z)
- Learning Canonical Shape Space for Category-Level 6D Object Pose and Size Estimation [21.7030393344051]
We learn a canonical shape space (CASS), a unified representation for a large variety of instances of a given object category.
We train a variational auto-encoder (VAE) to generate 3D point clouds in the canonical space from an RGBD image.
The VAE is trained in a cross-category fashion, exploiting publicly available large 3D shape repositories.
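A skeleton of the VAE idea, with random linear maps as stand-in encoder and decoder (purely illustrative, not the CASS implementation):

```python
import numpy as np

# Encode an observation to a latent code, reparameterise, and decode a
# point cloud in the canonical space. Layers are random linear maps.
rng = np.random.default_rng(0)
latent_dim, num_points = 64, 1024

def encode(features):                  # stand-in encoder: features -> (mu, logvar)
    mu = features @ rng.standard_normal((features.size, latent_dim))
    logvar = np.zeros(latent_dim)
    return mu, logvar

def decode(z):                         # stand-in decoder: z -> canonical point cloud
    return (z @ rng.standard_normal((latent_dim, num_points * 3))).reshape(num_points, 3)

features = rng.standard_normal(128)    # e.g., pooled RGB-D image features
mu, logvar = encode(features)
z = mu + np.exp(0.5 * logvar) * rng.standard_normal(latent_dim)  # reparameterisation
canonical_points = decode(z)
print(canonical_points.shape)          # (1024, 3) in the canonical shape space
```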
arXiv Detail & Related papers (2020-01-25T14:16:17Z)