DR-WLC: Dimensionality Reduction cognition for object detection and pose
estimation by Watching, Learning and Checking
- URL: http://arxiv.org/abs/2301.06944v1
- Date: Tue, 17 Jan 2023 15:08:32 GMT
- Title: DR-WLC: Dimensionality Reduction cognition for object detection and pose
estimation by Watching, Learning and Checking
- Authors: Yu Gao, Xi Xu, Tianji Jiang, Siyuan Chen, Yi Yang, Yufeng Yue, Mengyin
Fu
- Abstract summary: Existing object detection and pose estimation methods mostly adopt the same-dimensional data for training.
DR-WLC, a dimensionality reduction cognitive model, can perform both object detection and pose estimation tasks at the same time.
- Score: 30.58114448119465
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Object detection and pose estimation are difficult tasks in robotics and
autonomous driving. Existing object detection and pose estimation methods
mostly adopt the same-dimensional data for training. For example, 2D object
detection usually requires a large amount of 2D annotation data with high cost.
Using high-dimensional information to supervise lower-dimensional tasks is a
feasible way to reduce dataset size. In this work, we propose DR-WLC, a
dimensionality reduction cognitive model that can perform both object
detection and pose estimation tasks at the same time. The model
only requires the 3D models of objects and unlabeled environment images (with or
without objects) for training. In addition, a bounding box
generation strategy is proposed to build the relationship between the 3D model
and the 2D object detection task. Experiments show that our method can accomplish
these tasks without any manual annotations and is easy to deploy in practical
applications. Source code is at https://github.com/IN2-ViAUn/DR-WLC.
Related papers
- STONE: A Submodular Optimization Framework for Active 3D Object Detection [20.54906045954377]
A key requirement for training an accurate 3D object detector is the availability of a large amount of LiDAR-based point cloud data.
This paper proposes a unified active 3D object detection framework that greatly reduces the labeling cost of training 3D object detectors.
arXiv Detail & Related papers (2024-10-04T20:45:33Z) - Generalized Few-Shot 3D Object Detection of LiDAR Point Cloud for
Autonomous Driving [91.39625612027386]
We propose a novel task, called generalized few-shot 3D object detection, where we have a large amount of training data for common (base) objects, but only a few data for rare (novel) classes.
Specifically, we analyze in-depth differences between images and point clouds, and then present a practical principle for the few-shot setting in the 3D LiDAR dataset.
To solve this task, we propose an incremental fine-tuning method to extend existing 3D detection models to recognize both common and rare objects.
arXiv Detail & Related papers (2023-02-08T07:11:36Z) - Objects as Spatio-Temporal 2.5D points [5.588892124219713]
We propose a weakly supervised method to estimate the 3D position of objects by jointly learning to regress the 2D object detections and the scene's depth prediction in a single feed-forward pass of a network.
Our proposed method extends a single-point-based object detector and introduces a novel object representation where each object is modeled as a BEV point spatio-temporally, without the need for any 3D or BEV annotations for training or LiDAR data at query time.
arXiv Detail & Related papers (2022-12-06T05:14:30Z) - Unseen Object 6D Pose Estimation: A Benchmark and Baselines [62.8809734237213]
We propose a new task that enables and facilitates algorithms to estimate the 6D pose of novel objects during testing.
We collect a dataset with both real and synthetic images and up to 48 unseen objects in the test set.
By training an end-to-end 3D correspondences network, our method finds corresponding points between an unseen object and a partial view RGBD image accurately and efficiently.
arXiv Detail & Related papers (2022-06-23T16:29:53Z) - RandomRooms: Unsupervised Pre-training from Synthetic Shapes and
Randomized Layouts for 3D Object Detection [138.2892824662943]
A promising solution is to make better use of the synthetic dataset, which consists of CAD object models, to boost the learning on real datasets.
Recent work on 3D pre-training exhibits failure when transferring features learned on synthetic objects to other real-world applications.
In this work, we put forward a new method called RandomRooms to accomplish this objective.
arXiv Detail & Related papers (2021-08-17T17:56:12Z) - Learning Geometry-Guided Depth via Projective Modeling for Monocular 3D Object Detection [70.71934539556916]
We learn geometry-guided depth estimation with projective modeling to advance monocular 3D object detection.
Specifically, a principled geometry formula with projective modeling of 2D and 3D depth predictions in the monocular 3D object detection network is devised.
Our method remarkably improves the detection performance over the state-of-the-art monocular method by 2.80% on the moderate test setting, without using extra data.
arXiv Detail & Related papers (2021-07-29T12:30:39Z) - PLUME: Efficient 3D Object Detection from Stereo Images [95.31278688164646]
Existing methods tackle the problem in two steps: first, depth estimation is performed and a pseudo-LiDAR point cloud representation is computed from the depth estimates; then object detection is performed in 3D space.
We propose a model that unifies these two tasks in the same metric space.
Our approach achieves state-of-the-art performance on the challenging KITTI benchmark, with significantly reduced inference time compared with existing methods.
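The two-step pipeline described above relies on converting a depth map into a pseudo-LiDAR point cloud. A minimal sketch of that standard back-projection step, assuming a pinhole camera with intrinsics `K` (the function name is hypothetical):

```python
import numpy as np

def depth_to_pseudo_lidar(depth, K):
    """Back-project a depth map into a pseudo-LiDAR point cloud.

    depth: (H, W) per-pixel depth in meters; K: (3, 3) camera intrinsics.
    Returns (H*W, 3) points in camera coordinates.
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))  # pixel grid
    z = depth.ravel()
    x = (u.ravel() - K[0, 2]) * z / K[0, 0]         # X = (u - cx) * Z / fx
    y = (v.ravel() - K[1, 2]) * z / K[1, 1]         # Y = (v - cy) * Z / fy
    return np.stack([x, y, z], axis=1)
```

PLUME's contribution is to avoid this intermediate representation by operating in a unified metric space, which is where the reduced inference time comes from.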
arXiv Detail & Related papers (2021-01-17T05:11:38Z) - 3D for Free: Crossmodal Transfer Learning using HD Maps [36.70550754737353]
We leverage the large class-taxonomies of modern 2D datasets and the robustness of state-of-the-art 2D detection methods.
We mine a collection of 1151 unlabeled, multimodal driving logs from an autonomous vehicle.
We show that detector performance increases as we mine more unlabeled data.
arXiv Detail & Related papers (2020-08-24T17:54:51Z) - DOPS: Learning to Detect 3D Objects and Predict their 3D Shapes [54.239416488865565]
We propose a fast single-stage 3D object detection method for LIDAR data.
The core novelty of our method is a fast, single-pass architecture that both detects objects in 3D and estimates their shapes.
We find that our proposed method improves on the state of the art by 5% for object detection in ScanNet scenes and achieves top results by 3.4% on the Open dataset.
arXiv Detail & Related papers (2020-04-02T17:48:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it lists and is not responsible for any consequences.