LiDAR guided Small obstacle Segmentation
- URL: http://arxiv.org/abs/2003.05970v1
- Date: Thu, 12 Mar 2020 18:34:46 GMT
- Title: LiDAR guided Small obstacle Segmentation
- Authors: Aasheesh Singh, Aditya Kamireddypalli, Vineet Gandhi, K Madhava Krishna
- Abstract summary: Detecting small obstacles on the road is critical for autonomous driving.
We present a method to reliably detect such obstacles through a multi-modal framework of sparse LiDAR and monocular vision.
We show significant performance gains when LiDAR context is fed as an additional input to monocular semantic segmentation frameworks.
- Score: 14.880698940693609
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Detecting small obstacles on the road is critical for autonomous driving. In
this paper, we present a method to reliably detect such obstacles through a
multi-modal framework of sparse LiDAR (VLP-16) and monocular vision. LiDAR is
employed to provide additional context, in the form of confidence maps, to
monocular segmentation networks. We show significant performance gains when this
context is fed as an additional input to monocular semantic segmentation
frameworks. We further present a new semantic segmentation dataset to the
community, comprising over 3,000 image frames with corresponding LiDAR
observations. The images come with pixel-wise annotations for three classes:
off-road, road, and small obstacle. We stress that precise calibration between
LiDAR and camera is crucial for this task and thus propose a novel Hausdorff
distance based calibration refinement method over the extrinsic parameters. As a
first benchmark over this dataset, we report 73% instance detection up to a
distance of 50 meters in challenging scenarios. We demonstrate the method's
efficacy qualitatively, by showcasing accurate segmentation of obstacles smaller
than 15 cm at 50 m depth, and quantitatively, through favourable comparisons
with prior art. Our project page and dataset are hosted at
https://small-obstacle-dataset.github.io/
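
The abstract attributes much of the method's reliability to precise LiDAR-camera calibration, refined by minimizing a Hausdorff distance over the extrinsic parameters, but gives no pseudocode here. The sketch below is a minimal illustration of what such a refinement objective could look like, assuming edge points are extracted from both the LiDAR scan and the image; the function names, the symmetric Hausdorff cost, and the use of NumPy/SciPy are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff


def project_lidar_to_image(points_xyz, K, R, t, image_shape):
    """Project LiDAR points into the image plane.

    points_xyz: (N, 3) LiDAR points, K: 3x3 camera intrinsics,
    R (3x3) and t (3,): candidate LiDAR-to-camera extrinsics.
    Returns the (M, 2) pixel coordinates of points that fall inside the image.
    """
    cam = R @ points_xyz.T + t.reshape(3, 1)   # points in the camera frame, (3, N)
    cam = cam[:, cam[2] > 0]                   # keep points in front of the camera
    uv = K @ cam
    uv = (uv[:2] / uv[2]).T                    # perspective division -> (M, 2)
    h, w = image_shape
    inside = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    return uv[inside]


def hausdorff_calibration_cost(lidar_edges_xyz, image_edges_uv, K, R, t, image_shape):
    """Symmetric Hausdorff distance between projected LiDAR edge points and image
    edge pixels; a smaller value indicates better extrinsic alignment."""
    projected = project_lidar_to_image(lidar_edges_xyz, K, R, t, image_shape)
    if len(projected) == 0:
        return np.inf
    forward = directed_hausdorff(projected, image_edges_uv)[0]
    backward = directed_hausdorff(image_edges_uv, projected)[0]
    return max(forward, backward)
```

A refinement step would then search small perturbations of (R, t) that minimize this cost, e.g. by grid search or a local optimizer. Once calibrated, the projected LiDAR points can be rendered into per-pixel confidence maps and supplied alongside the RGB image as the additional context described in the abstract; the exact confidence-map construction is specified in the paper, not here.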
Related papers
- Few-shot Semantic Learning for Robust Multi-Biome 3D Semantic Mapping in Off-Road Environments [4.106846770364469]
Off-road environments pose significant perception challenges for high-speed autonomous navigation.
We propose an approach that leverages a pre-trained Vision Transformer (ViT) with fine-tuning on a small (500 images), sparse and coarsely labeled (30% pixels) multi-biome dataset.
The predicted semantic classes are fused over time via a novel range-based metric and aggregated into a 3D semantic voxel map.
arXiv Detail & Related papers (2024-11-10T23:52:24Z)
- Reflectivity Is All You Need!: Advancing LiDAR Semantic Segmentation [11.684330305297523]
This paper explores the advantages of employing calibrated intensity (also referred to as reflectivity) within learning-based LiDAR semantic segmentation frameworks.
We show that replacing intensity with reflectivity results in a 4% improvement in mean Intersection over Union for off-road scenarios.
We demonstrate the potential benefits of using calibrated intensity for semantic segmentation in urban environments.
arXiv Detail & Related papers (2024-03-19T22:57:03Z)
- LIP-Loc: LiDAR Image Pretraining for Cross-Modal Localization [0.9562145896371785]
We apply Contrastive Language-Image Pre-Training to the domains of 2D image and 3D LiDAR points on the task of cross-modal localization.
Our method outperforms state-of-the-art recall@1 accuracy on the KITTI-360 dataset by 22.4%, using only perspective images.
We also demonstrate the zero-shot capabilities of our model, beating the state of the art by 8% without even training on it.
arXiv Detail & Related papers (2023-12-27T17:23:57Z)
- Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image [85.91935485902708]
We show that the key to a zero-shot single-view metric depth model lies in the combination of large-scale data training and resolving the metric ambiguity from various camera models.
We propose a canonical camera space transformation module, which explicitly addresses the ambiguity problems and can be effortlessly plugged into existing monocular models.
Our method enables the accurate recovery of metric 3D structures on randomly collected internet images.
arXiv Detail & Related papers (2023-07-20T16:14:23Z)
- Rethinking Range View Representation for LiDAR Segmentation [66.73116059734788]
"Many-to-one" mapping, semantic incoherence, and shape deformation are possible impediments against effective learning from range view projections.
We present RangeFormer, a full-cycle framework comprising novel designs across network architecture, data augmentation, and post-processing.
We show that, for the first time, a range view method is able to surpass the point, voxel, and multi-view fusion counterparts on competitive LiDAR semantic and panoptic segmentation benchmarks.
arXiv Detail & Related papers (2023-03-09T16:13:27Z)
- Benchmarking the Robustness of LiDAR Semantic Segmentation Models [78.6597530416523]
In this paper, we aim to comprehensively analyze the robustness of LiDAR semantic segmentation models under various corruptions.
We propose a new benchmark called SemanticKITTI-C, which features 16 out-of-domain LiDAR corruptions in three groups, namely adverse weather, measurement noise and cross-device discrepancy.
We design a robust LiDAR segmentation model (RLSeg) which greatly boosts the robustness with simple but effective modifications.
arXiv Detail & Related papers (2023-01-03T06:47:31Z)
- Boosting 3D Object Detection by Simulating Multimodality on Point Clouds [51.87740119160152]
This paper presents a new approach to boost a single-modality (LiDAR) 3D object detector by teaching it to simulate features and responses that follow a multi-modality (LiDAR-image) detector.
The approach needs LiDAR-image data only when training the single-modality detector, and once well-trained, it only needs LiDAR data at inference.
Experimental results on the nuScenes dataset show that our approach outperforms all SOTA LiDAR-only 3D detectors.
arXiv Detail & Related papers (2022-06-30T01:44:30Z)
- Drive&Segment: Unsupervised Semantic Segmentation of Urban Scenes via Cross-modal Distillation [32.33170182669095]
This work investigates learning pixel-wise semantic image segmentation in urban scenes without any manual annotation, just from the raw non-curated data collected by cars.
We propose a novel method for cross-modal unsupervised learning of semantic image segmentation by leveraging synchronized LiDAR and image data.
arXiv Detail & Related papers (2022-03-21T17:35:46Z)
- Highly Accurate Dichotomous Image Segmentation [139.79513044546]
A new task called dichotomous image segmentation (DIS) aims to segment highly accurate objects from natural images.
We collect the first large-scale dataset, DIS5K, which contains 5,470 high-resolution (e.g., 2K, 4K or larger) images.
We also introduce a simple intermediate supervision baseline (IS-Net) using both feature-level and mask-level guidance for DIS model training.
arXiv Detail & Related papers (2022-03-06T20:09:19Z)
- Progressive Coordinate Transforms for Monocular 3D Object Detection [52.00071336733109]
We propose a novel and lightweight approach, dubbed Progressive Coordinate Transforms (PCT), to facilitate learning coordinate representations.
arXiv Detail & Related papers (2021-08-12T15:22:33Z)
- SemanticVoxels: Sequential Fusion for 3D Pedestrian Detection using LiDAR Point Cloud and Semantic Segmentation [4.350338899049983]
We propose a generalization of PointPainting to be able to apply fusion at different levels.
We show that SemanticVoxels achieves state-of-the-art performance in both 3D and bird's eye view pedestrian detection benchmarks.
arXiv Detail & Related papers (2020-09-25T14:52:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.