Piecewise Planar Hulls for Semi-Supervised Learning of 3D Shape and Pose from 2D Images
- URL: http://arxiv.org/abs/2211.07491v1
- Date: Mon, 14 Nov 2022 16:18:11 GMT
- Title: Piecewise Planar Hulls for Semi-Supervised Learning of 3D Shape and Pose from 2D Images
- Authors: Yigit Baran Can, Alexander Liniger, Danda Pani Paudel, Luc Van Gool
- Abstract summary: We study the problem of estimating 3D shape and pose of an object in terms of keypoints, from a single 2D image.
The shape and pose are learned directly from images collected per object category, together with their partial 2D keypoint annotations.
- Score: 133.68032636906133
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of estimating 3D shape and pose of an object in terms of
keypoints, from a single 2D image.
The shape and pose are learned directly from images collected per object
category, together with their partial 2D keypoint annotations. In this work, we
first propose an end-to-end training framework for intermediate 2D keypoint
extraction and final 3D shape and pose estimation. The proposed framework is then trained
using only the weak supervision of the intermediate 2D keypoints. Additionally,
we devise a semi-supervised training framework that benefits from both labeled
and unlabeled data. To leverage the unlabeled data, we introduce and exploit
the "piece-wise planar hull" prior of the canonical object shape. These
planar hulls are defined manually once per object category, with the help of
the keypoints. On the one hand, the proposed method learns to segment these
planar hulls from the labeled data. On the other hand, it simultaneously
enforces the consistency between predicted keypoints and the segmented hulls on
the unlabeled data. The enforced consistency allows us to efficiently use the
unlabeled data for the task at hand. The proposed method achieves results
comparable to fully supervised state-of-the-art methods while using only half of
the annotations. Our source code will be made publicly available.
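The abstract does not spell out the keypoint-hull consistency term, but a minimal differentiable version might look like the sketch below. This is an illustration, not the authors' released code: the tensor shapes, the post-sigmoid hull_probs, and the kp_to_hull assignment table are assumptions.

```python
import torch
import torch.nn.functional as F

def hull_consistency_loss(kp_xy, hull_probs, kp_to_hull, eps=1e-6):
    """Encourage predicted keypoints to land inside their planar hulls.

    kp_xy:      (B, K, 2) predicted keypoints, normalized to [-1, 1] (x, y).
    hull_probs: (B, P, H, W) per-hull segmentation probabilities (post-sigmoid).
    kp_to_hull: (K,) long tensor; the manual keypoint -> hull assignment
                defined once per object category.
    """
    B, K, _ = kp_xy.shape
    grid = kp_xy.view(B, K, 1, 2)                    # grid_sample wants (B, Hout, Wout, 2)
    sampled = F.grid_sample(hull_probs, grid,
                            align_corners=False)      # (B, P, K, 1)
    sampled = sampled.squeeze(-1)                     # (B, P, K)
    idx = kp_to_hull.view(1, 1, K).expand(B, 1, K)    # select each keypoint's own hull
    p = sampled.gather(1, idx).squeeze(1)             # (B, K) hull prob at each keypoint
    return -(p + eps).log().mean()                    # high prob inside hull -> low loss
```

On unlabeled images, minimizing this loss pushes each predicted keypoint toward the interior of its predicted planar hull, which is one plausible way to realize the consistency described above.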
Related papers
- Bayesian Self-Training for Semi-Supervised 3D Segmentation [59.544558398992386]
3D segmentation is a core problem in computer vision, but densely labeling 3D
point clouds for fully supervised training remains too labor-intensive and expensive.
Semi-supervised training provides a more practical alternative, where only a small set of labeled data is given, accompanied by a larger unlabeled set.
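The summary does not detail the Bayesian machinery; below is a rough sketch of a generic self-training step that uses Monte Carlo dropout for uncertainty. The estimator and thresholds are assumptions, not necessarily the paper's choices.

```python
import torch

@torch.no_grad()
def pseudo_label(model, points, n_samples=8, max_entropy=0.5):
    # Monte Carlo dropout as an approximate Bayesian posterior: keep dropout
    # active at inference and average several stochastic forward passes.
    model.train()                                     # enables dropout layers
    probs = torch.stack([model(points).softmax(-1) for _ in range(n_samples)])
    mean_p = probs.mean(0)                            # (N, C) predictive mean
    entropy = -(mean_p * mean_p.clamp_min(1e-9).log()).sum(-1)
    keep = entropy < max_entropy                      # discard uncertain points
    return mean_p.argmax(-1), keep                    # pseudo-labels + reliability mask
```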
arXiv Detail & Related papers (2024-09-12T14:54:31Z)
- OpenGaussian: Towards Point-Level 3D Gaussian-based Open Vocabulary Understanding [54.981605111365056]
This paper introduces OpenGaussian, a method based on 3D Gaussian Splatting (3DGS) capable of 3D point-level open vocabulary understanding.
Our primary motivation stems from observing that existing 3DGS-based open vocabulary methods mainly focus on 2D pixel-level parsing.
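The blurb does not say how point-level queries are resolved; one common mechanism, sketched here under the assumption of language-aligned per-point (or per-Gaussian) features, is cosine matching against a text embedding.

```python
import numpy as np

def open_vocab_point_query(point_feats, text_feat, thresh=0.25):
    # point_feats: (N, D) per-point features aligned to a text embedding space
    # text_feat:   (D,) embedding of an open-vocabulary text query
    # Returns a boolean mask of points matching the query: the basic
    # mechanism behind point-level open-vocabulary selection.
    pf = point_feats / np.linalg.norm(point_feats, axis=1, keepdims=True)
    tf = text_feat / np.linalg.norm(text_feat)
    return (pf @ tf) > thresh
```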
arXiv Detail & Related papers (2024-06-04T07:42:33Z)
- When 3D Bounding-Box Meets SAM: Point Cloud Instance Segmentation with Weak-and-Noisy Supervision [20.625754683390536]
We propose a complementary image prompt-induced weakly-supervised point cloud instance segmentation (CIP-WPIS) method.
We leverage pretrained knowledge embedded in the 2D foundation model SAM and 3D geometric prior to achieve accurate point-wise instance labels.
Our method is robust against noisy 3D bounding-box annotations and achieves state-of-the-art performance.
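CIP-WPIS's full prompting and voting scheme is more involved, but the basic mechanics of turning a noisy 3D box into SAM point prompts could look like the sketch below. The inputs pts_3d, intrinsics K, and extrinsics T_wc are assumed; the SAM calls follow the public segment_anything API.

```python
import numpy as np
from segment_anything import SamPredictor, sam_model_registry

def project_box_points(pts_3d, box_min, box_max, K, T_wc):
    # Keep points inside the (possibly noisy) 3D box and project to pixels.
    inside = np.all((pts_3d >= box_min) & (pts_3d <= box_max), axis=1)
    cam = T_wc[:3, :3] @ pts_3d[inside].T + T_wc[:3, 3:4]   # world -> camera
    uv = (K @ cam).T
    return uv[:, :2] / uv[:, 2:3], inside                   # (M, 2) pixels, (N,) mask

def sam_mask_from_points(predictor, image, uv):
    # Prompt SAM with the projected interior points; the winning mask can
    # then be voted back onto the 3D points it covers.
    predictor.set_image(image)                              # (H, W, 3) uint8 RGB
    masks, scores, _ = predictor.predict(
        point_coords=uv, point_labels=np.ones(len(uv), dtype=int),
        multimask_output=True)
    return masks[scores.argmax()]                           # (H, W) bool

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)
```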
arXiv Detail & Related papers (2023-09-02T05:17:03Z)
- You Only Need One Thing One Click: Self-Training for Weakly Supervised 3D Scene Understanding [107.06117227661204]
We propose "One Thing One Click", meaning that the annotator only needs to label one point per object.
We iteratively conduct the training and label propagation, facilitated by a graph propagation module.
Our model is also compatible with 3D instance segmentation when equipped with a point-clustering strategy.
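As a rough stand-in for the paper's learned graph propagation module, classic label propagation over a k-NN graph conveys the idea of spreading one click per object; the feature choice and hyperparameters here are illustrative.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph

def propagate(feats, seeds, n_classes, k=10, alpha=0.99, n_iters=20):
    # feats: (N, D) point features; seeds: (N,) class id, or -1 if unlabeled.
    W = kneighbors_graph(feats, k, mode="distance")
    W.data = np.exp(-(W.data / (W.data.mean() + 1e-9)) ** 2)  # distances -> similarities
    W = 0.5 * (W + W.T)                                       # symmetrize the graph
    d_inv = 1.0 / np.maximum(np.asarray(W.sum(1)).ravel(), 1e-9)
    Y = np.zeros((len(feats), n_classes))
    Y[seeds >= 0, seeds[seeds >= 0]] = 1.0                    # clamp the clicked points
    F = Y.copy()
    for _ in range(n_iters):
        F = alpha * d_inv[:, None] * (W @ F) + (1 - alpha) * Y
    return F.argmax(1)
```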
arXiv Detail & Related papers (2023-03-26T13:57:00Z)
- OSOP: A Multi-Stage One Shot Object Pose Estimation Framework [35.89334617258322]
We present a novel one-shot method for object detection and 6 DoF pose estimation that does not require training on target objects.
At test time, it takes as input a target image and a textured 3D query model.
We evaluate the method on LineMOD, Occlusion, Homebrewed, YCB-V and TLESS datasets.
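The summary only fixes the input/output contract: a target image in, a 6 DoF pose out, given a textured 3D query model. A toy retrieval step over viewpoint templates rendered offline from that model, a generic one-shot baseline rather than OSOP's multi-stage pipeline, might look like this:

```python
import numpy as np

def retrieve_pose(query_desc, template_descs, template_poses):
    # Nearest-neighbour retrieval over viewpoint templates rendered from the
    # textured query model; descriptors are assumed precomputed.
    t = template_descs / np.linalg.norm(template_descs, axis=1, keepdims=True)
    q = query_desc / np.linalg.norm(query_desc)
    sims = t @ q                                    # cosine similarity per template
    best = sims.argmax()
    return template_poses[best], sims[best]         # coarse 6 DoF pose + confidence
```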
arXiv Detail & Related papers (2022-03-29T13:12:00Z)
- Weakly Supervised Learning of Keypoints for 6D Object Pose Estimation [73.40404343241782]
We propose a weakly supervised 6D object pose estimation approach based on 2D keypoint detection.
Our approach achieves performance comparable to state-of-the-art fully supervised approaches.
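Given detected 2D keypoints, known canonical 3D keypoints, and camera intrinsics, the standard lifting step is RANSAC PnP. Here is a minimal OpenCV sketch under those assumed inputs, not the paper's full training pipeline:

```python
import cv2
import numpy as np

def pose_from_keypoints(kp_2d, kp_3d, K):
    # Lift detected 2D keypoints to a 6 DoF pose with RANSAC PnP; the
    # canonical 3D keypoints kp_3d and intrinsics K are assumed known.
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        kp_3d.astype(np.float64), kp_2d.astype(np.float64),
        K.astype(np.float64), distCoeffs=None)
    R, _ = cv2.Rodrigues(rvec)                      # axis-angle -> 3x3 rotation
    return R, tvec, inliers
```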
arXiv Detail & Related papers (2022-03-07T16:23:47Z)
- End-to-End Learning of Multi-category 3D Pose and Shape Estimation [128.881857704338]
We propose an end-to-end method that simultaneously detects 2D keypoints from an image and lifts them to 3D.
The proposed method learns both 2D detection and 3D lifting only from 2D keypoint annotations.
In addition to being end-to-end in image to 3D learning, our method also handles objects from multiple categories using a single neural network.
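A minimal sketch of the detect-then-lift idea: a spatial soft-argmax keeps 2D keypoint extraction differentiable, so detection and 3D lifting can train end-to-end from 2D annotations alone. The architecture below is illustrative, not the paper's network.

```python
import torch
import torch.nn as nn

class DetectAndLift(nn.Module):
    """Joint 2D keypoint detection and 3D lifting with one shared network."""
    def __init__(self, n_kp=16, feat=64):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, feat, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(feat, feat, 3, stride=2, padding=1), nn.ReLU())
        self.heatmaps = nn.Conv2d(feat, n_kp, 1)          # one heatmap per keypoint
        self.lift = nn.Sequential(                        # 2D -> 3D lifting MLP
            nn.Linear(2 * n_kp, 256), nn.ReLU(), nn.Linear(256, 3 * n_kp))

    def forward(self, img):
        h = self.heatmaps(self.backbone(img))             # (B, K, H, W) logits
        B, K, H, W = h.shape
        prob = h.flatten(2).softmax(-1).view(B, K, H, W)  # per-keypoint spatial softmax
        xs = prob.sum(2) @ torch.linspace(-1, 1, W)       # soft-argmax along x
        ys = prob.sum(3) @ torch.linspace(-1, 1, H)       # soft-argmax along y
        kp2d = torch.stack([xs, ys], dim=-1)              # (B, K, 2), differentiable
        kp3d = self.lift(kp2d.flatten(1)).view(B, K, 3)   # lift to canonical 3D
        return kp2d, kp3d
```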
arXiv Detail & Related papers (2021-12-19T17:10:40Z)
- 3D Guided Weakly Supervised Semantic Segmentation [27.269847900950943]
We propose a weakly supervised 2D semantic segmentation model by incorporating sparse bounding box labels with available 3D information.
We manually labeled a subset of the 2D-3D Semantics (2D-3D-S) dataset with bounding boxes, and introduce our 2D-3D inference module to generate accurate pixel-wise segment proposal masks.
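One simple way such a 2D-3D inference step can turn a sparse box label into a pixel-wise proposal, assuming per-pixel 3D coordinates (e.g., back-projected depth) are available:

```python
import numpy as np

def box_to_mask(points_3d, box, radius=0.3):
    # points_3d: (H, W, 3) per-pixel 3D coordinates; box: (x0, y0, x1, y1)
    # sparse 2D bounding-box label. Pixels inside the box that lie near the
    # object's dominant 3D location become a pixel-wise segment proposal.
    x0, y0, x1, y1 = box
    roi = points_3d[y0:y1, x0:x1]                          # (h, w, 3) box region
    center = np.median(roi.reshape(-1, 3), axis=0)         # robust object center
    mask = np.zeros(points_3d.shape[:2], dtype=bool)
    mask[y0:y1, x0:x1] = np.linalg.norm(roi - center, axis=-1) < radius
    return mask
```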
arXiv Detail & Related papers (2020-12-01T03:34:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.