Related papers: End-to-End 3D Object Detection using LiDAR Point Cloud

URL: http://arxiv.org/abs/2312.15377v1
Date: Sun, 24 Dec 2023 00:52:14 GMT
Title: End-to-End 3D Object Detection using LiDAR Point Cloud
Authors: Gaurav Raut, Advait Patole
Abstract summary: We present an approach wherein, using a novel encoding of the LiDAR point cloud, we infer the location of different classes near the autonomous vehicle. The output is predictions about the location and orientation of objects in the scene in the form of 3D bounding boxes and labels of scene objects.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: There has been significant progress in the field of autonomous vehicles. Object detection and tracking are the primary tasks for any autonomous vehicle. Object detection in autonomous vehicles relies on a variety of sensors, such as cameras and LiDAR. Although image features are typically preferred, numerous approaches take spatial data as input. Exploiting this information, we present an approach wherein, using a novel encoding of the LiDAR point cloud, we infer the location of different classes near the autonomous vehicle. This approach does not use a bird's eye view representation, which is generally applied for this application, and thus saves the extensive pre-processing required. After studying the numerous networks and approaches used to solve this problem, we have implemented a novel model with the intention of incorporating their advantages and avoiding their shortcomings. The output is predictions about the location and orientation of objects in the scene in the form of 3D bounding boxes and labels of scene objects.
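
To make the described output concrete, the following is a minimal sketch of how one predicted box with a location, an orientation, and a class label could be represented. The abstract does not state the box parameterization, so the 7-parameter form used here (center, size, and yaw about the vertical axis) is an assumption, and the names Box3D and corners are hypothetical and not taken from the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class Box3D:
    """Hypothetical container for one detection (not the paper's data structure)."""
    x: float        # box center in the sensor frame, meters
    y: float
    z: float
    length: float   # extent along the box's own forward axis
    width: float    # extent along the box's own lateral axis
    height: float   # extent along the vertical axis
    yaw: float      # assumed: rotation about the vertical (z) axis, radians
    label: str      # class label, e.g. "Car"

    def corners(self):
        """Return the 8 box corners as (x, y, z) tuples in the sensor frame."""
        c, s = math.cos(self.yaw), math.sin(self.yaw)
        points = []
        for dx in (-0.5, 0.5):
            for dy in (-0.5, 0.5):
                for dz in (-0.5, 0.5):
                    # Corner offset in the box's local frame.
                    lx = dx * self.length
                    ly = dy * self.width
                    lz = dz * self.height
                    # Rotate about z by yaw, then translate to the center.
                    points.append((
                        self.x + c * lx - s * ly,
                        self.y + s * lx + c * ly,
                        self.z + lz,
                    ))
        return points


# Illustrative values only: a car-sized box rotated a quarter turn.
box = Box3D(x=10.0, y=2.0, z=-1.0, length=4.0, width=2.0, height=1.5,
            yaw=math.pi / 2, label="Car")
corners = box.corners()
```

With a yaw of a quarter turn, the box's length runs along the sensor's y axis, so the corners span 4 m in y and 2 m in x around the center.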

Related papers

LISO: Lidar-only Self-Supervised 3D Object Detection [25.420879730860936]
We introduce a novel self-supervised method to train SOTA lidar object detection networks. It works on unlabeled sequences of lidar point clouds only. It utilizes a SOTA self-supervised lidar scene flow network under the hood to generate, track, and iteratively refine pseudo ground truth.
arXiv Detail & Related papers (2024-03-11T18:02:52Z)
View-to-Label: Multi-View Consistency for Self-Supervised 3D Object Detection [46.077668660248534]
We propose a novel approach to self-supervise 3D object detection from RGB sequences alone. Our experiments on the KITTI 3D dataset demonstrate performance on par with state-of-the-art self-supervised methods.
arXiv Detail & Related papers (2023-05-29T09:30:39Z)
Once Detected, Never Lost: Surpassing Human Performance in Offline LiDAR based 3D Object Detection [50.959453059206446]
This paper aims for high-performance offline LiDAR-based 3D object detection. We first observe that experienced human annotators annotate objects from a track-centric perspective. We propose a high-performance offline detector in a track-centric perspective instead of the conventional object-centric perspective.
arXiv Detail & Related papers (2023-04-24T17:59:05Z)
Generalized Few-Shot 3D Object Detection of LiDAR Point Cloud for Autonomous Driving [91.39625612027386]
We propose a novel task, called generalized few-shot 3D object detection, where we have a large amount of training data for common (base) objects, but only a few samples for rare (novel) classes. Specifically, we analyze in depth the differences between images and point clouds, and then present a practical principle for the few-shot setting in the 3D LiDAR dataset. To solve this task, we propose an incremental fine-tuning method to extend existing 3D detection models to recognize both common and rare objects.
arXiv Detail & Related papers (2023-02-08T07:11:36Z)
BEV-MAE: Bird's Eye View Masked Autoencoders for Point Cloud Pre-training in Autonomous Driving Scenarios [51.285561119993105]
We present BEV-MAE, an efficient masked autoencoder pre-training framework for LiDAR-based 3D object detection in autonomous driving. Specifically, we propose a bird's eye view (BEV) guided masking strategy to guide the 3D encoder learning feature representation. We introduce a learnable point token to maintain a consistent receptive field size of the 3D encoder.
arXiv Detail & Related papers (2022-12-12T08:15:03Z)
Exploring Diversity-based Active Learning for 3D Object Detection in Autonomous Driving [45.405303803618]
We investigate diversity-based active learning (AL) as a potential solution to alleviate the annotation burden. We propose a novel acquisition function that enforces spatial and temporal diversity in the selected samples. We demonstrate the effectiveness of the proposed method on the nuScenes dataset and show that it outperforms existing AL strategies significantly.
arXiv Detail & Related papers (2022-05-16T14:21:30Z)
Weakly Supervised Training of Monocular 3D Object Detectors Using Wide Baseline Multi-view Traffic Camera Data [19.63193201107591]
7DoF prediction of vehicles at an intersection is an important task for assessing potential conflicts between road users. We develop an approach using a weakly supervised method of fine-tuning 3D object detectors for traffic observation cameras. Our method achieves vehicle 7DoF pose prediction accuracy on our dataset comparable to the top-performing monocular 3D object detectors on autonomous vehicle datasets.
arXiv Detail & Related papers (2021-10-21T08:26:48Z)
CFTrack: Center-based Radar and Camera Fusion for 3D Multi-Object Tracking [9.62721286522053]
We propose an end-to-end network for joint object detection and tracking based on radar and camera sensor fusion. Our proposed method uses a center-based radar-camera fusion algorithm for object detection and utilizes a greedy algorithm for object association. We evaluate our method on the challenging nuScenes dataset, where it achieves 20.0 AMOTA and outperforms all vision-based 3D tracking methods in the benchmark.
arXiv Detail & Related papers (2021-07-11T23:56:53Z)
Learnable Online Graph Representations for 3D Multi-Object Tracking [156.58876381318402]
We propose a unified and learning based approach to the 3D MOT problem. We employ a Neural Message Passing network for data association that is fully trainable. We show the merit of the proposed approach on the publicly available nuScenes dataset by achieving state-of-the-art performance of 65.6% AMOTA and 58% fewer ID-switches.
arXiv Detail & Related papers (2021-04-23T17:59:28Z)
Fine-Grained Vehicle Perception via 3D Part-Guided Visual Data Augmentation [77.60050239225086]
We propose an effective training data generation process by fitting a 3D car model with dynamic parts to vehicles in real images. Our approach is fully automatic without any human interaction. We present a multi-task network for VUS parsing and a multi-stream network for VHI parsing.
arXiv Detail & Related papers (2020-12-15T03:03:38Z)
DOPS: Learning to Detect 3D Objects and Predict their 3D Shapes [54.239416488865565]
We propose a fast single-stage 3D object detection method for LIDAR data. The core novelty of our method is a fast, single-pass architecture that both detects objects in 3D and estimates their shapes. We find that our proposed method improves on state-of-the-art results by 5% on object detection in ScanNet scenes, and it achieves top results by 3.4% on the Waymo Open Dataset.
arXiv Detail & Related papers (2020-04-02T17:48:50Z)

This list is automatically generated from the titles and abstracts of the papers on this site.