UdeerLID+: Integrating LiDAR, Image, and Relative Depth with Semi-Supervised Learning
- URL: http://arxiv.org/abs/2409.06197v1
- Date: Tue, 10 Sep 2024 03:57:30 GMT
- Title: UdeerLID+: Integrating LiDAR, Image, and Relative Depth with Semi-Supervised Learning
- Authors: Tao Ni, Xin Zhan, Tao Luo, Wenbin Liu, Zhan Shi, JunBo Chen,
- Abstract summary: Road segmentation is a critical task for autonomous driving systems.
Our work introduces an innovative approach that integrates LiDAR point cloud data, visual images, and relative depth maps.
One of the primary challenges is the scarcity of large-scale, accurately labeled datasets.
- Score: 12.440461420762265
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Road segmentation is a critical task for autonomous driving systems, requiring accurate and robust methods to classify road surfaces from various environmental data. Our work introduces an innovative approach that integrates LiDAR point cloud data, visual images, and relative depth maps derived from images. The integration of multiple data sources in road segmentation presents both opportunities and challenges. One of the primary challenges is the scarcity of large-scale, accurately labeled datasets that are necessary for training robust deep learning models. To address this, we have developed the UdeerLID+ framework under a semi-supervised learning paradigm. Experimental results on the KITTI dataset validate the superior performance of our approach.
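The abstract does not include code; as an illustration of the semi-supervised paradigm it names, here is a minimal confidence-thresholded pseudo-labeling step over the three fused modalities. All names (fuse_net, conf_threshold) are hypothetical, and real pipelines typically add weak/strong augmentation on top of this.

```python
# Illustrative sketch only -- not the authors' implementation.
import torch
import torch.nn.functional as F

def pseudo_label_loss(fuse_net, lidar, image, rel_depth, conf_threshold=0.9):
    """One semi-supervised step on unlabeled data: predict road masks with
    the fused LiDAR/image/relative-depth model, keep only confident pixels,
    and train against those pseudo-labels."""
    with torch.no_grad():
        probs = fuse_net(lidar, image, rel_depth).softmax(dim=1)  # (B, 2, H, W)
        conf, pseudo = probs.max(dim=1)       # per-pixel confidence and hard label
        mask = conf > conf_threshold          # trust only confident pixels
    logits = fuse_net(lidar, image, rel_depth)  # gradient-enabled forward pass
    loss = F.cross_entropy(logits, pseudo, reduction="none")  # (B, H, W)
    return (loss * mask).sum() / mask.sum().clamp(min=1)
```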
Related papers
- Multiple data sources and domain generalization learning method for road surface defect classification [2.9109581496560044]
We propose a method for classifying road surface defects using camera images.
We present a domain generalization training algorithm for developing a generalized model.
The results show that our method can efficiently classify road surface defects on previously unseen data.
arXiv Detail & Related papers (2024-07-14T13:37:47Z)
- Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding [50.448520056844885]
We propose a generative Bayesian network to produce diverse synthetic scenes with real-world patterns.
A series of experiments demonstrates our method's consistent superiority over existing state-of-the-art pre-training approaches.
arXiv Detail & Related papers (2024-06-17T07:43:53Z)
- Multi-Modal Multi-Task (3MT) Road Segmentation [0.8287206589886879]
Unlike many SOTA works, which rely on architectures with high pre-processing costs, we focus on using raw sensor inputs.
This study presents a cost-effective and highly accurate solution for road segmentation by integrating data from multiple sensors within a multi-task learning architecture.
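As a rough illustration of the multi-task idea (shared encoder over raw sensor channels, one head per task), a minimal sketch follows; the layer sizes, channel counts, and choice of auxiliary task are assumptions, not the paper's architecture.

```python
# Hedged sketch of multi-task road segmentation from raw sensor inputs.
import torch.nn as nn

class MultiTaskSeg(nn.Module):
    def __init__(self, in_ch=4, base=32):  # e.g. RGB + a raw LiDAR depth channel
        super().__init__()
        self.encoder = nn.Sequential(       # features shared by all tasks
            nn.Conv2d(in_ch, base, 3, padding=1), nn.ReLU(),
            nn.Conv2d(base, base, 3, padding=1), nn.ReLU(),
        )
        self.seg_head = nn.Conv2d(base, 2, 1)  # road / non-road logits
        self.aux_head = nn.Conv2d(base, 1, 1)  # auxiliary regression, e.g. depth

    def forward(self, x):
        feats = self.encoder(x)
        return self.seg_head(feats), self.aux_head(feats)

# Training combines the per-task losses, so the auxiliary task
# regularizes the shared features:
#   loss = ce(seg_logits, road_gt) + lambda_aux * l1(aux_pred, depth_gt)
```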
arXiv Detail & Related papers (2023-08-23T08:15:15Z)
- SynDrone -- Multi-modal UAV Dataset for Urban Scenarios [11.338399194998933]
The scarcity of large-scale real datasets with pixel-level annotations poses a significant challenge to researchers.
We propose a multimodal synthetic dataset containing both images and 3D data taken at multiple flying heights.
The dataset will be made publicly available to support the development of novel computer vision methods targeting UAV applications.
arXiv Detail & Related papers (2023-08-21T06:22:10Z)
- Scaling Data Generation in Vision-and-Language Navigation [116.95534559103788]
We propose an effective paradigm for generating large-scale data for learning.
We apply 1200+ photo-realistic environments from the HM3D and Gibson datasets and synthesize 4.9 million instruction-trajectory pairs.
Thanks to our large-scale dataset, the performance of an existing agent can be pushed up (+11% absolute over the previous SoTA) to a new best of 80% single-run success rate on the R2R test split by simple imitation learning.
arXiv Detail & Related papers (2023-07-28T16:03:28Z)
- LVLane: Deep Learning for Lane Detection and Classification in Challenging Conditions [2.5641096293146712]
We present an end-to-end lane detection and classification system based on deep learning methodologies.
In our study, we introduce a unique dataset meticulously curated to encompass scenarios that pose significant challenges for state-of-the-art (SOTA) lane localization models.
We propose a CNN-based classification branch, seamlessly integrated with the detector, facilitating the identification of distinct lane types.
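A hedged sketch of what such an integrated classification branch could look like: a small head that predicts each detected lane's type from pooled detector features. The names (LaneTypeHead, feat_ch, num_lane_types) are hypothetical.

```python
# Hypothetical lane-type classification branch sharing the detector backbone.
import torch.nn as nn

class LaneTypeHead(nn.Module):
    """Classifies detected lanes (e.g. solid, dashed, double) from features
    pooled along each lane's predicted points."""
    def __init__(self, feat_ch=64, num_lane_types=4):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_ch, 128), nn.ReLU(),
            nn.Linear(128, num_lane_types),
        )

    def forward(self, lane_feats):   # (num_lanes, feat_ch) pooled features
        return self.mlp(lane_feats)  # (num_lanes, num_lane_types) logits
```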
arXiv Detail & Related papers (2023-07-13T16:09:53Z)
- PreTraM: Self-Supervised Pre-training via Connecting Trajectory and Map [58.53373202647576]
We propose PreTraM, a self-supervised pre-training scheme for trajectory forecasting.
It consists of two parts: 1) Trajectory-Map Contrastive Learning, where we project trajectories and maps to a shared embedding space with cross-modal contrastive learning, and 2) Map Contrastive Learning, where we enhance map representation with contrastive learning on large quantities of HD-maps.
On top of popular baselines such as AgentFormer and Trajectron++, PreTraM boosts their performance by 5.5% and 6.9% relatively in FDE-10 on the challenging nuScenes dataset.
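For illustration, a minimal version of the trajectory-map contrastive term as a symmetric InfoNCE loss over a batch of paired embeddings; the encoders and temperature value are placeholders, not PreTraM's actual settings.

```python
# Sketch of cross-modal contrastive learning between trajectories and maps.
import torch
import torch.nn.functional as F

def traj_map_contrastive(traj_emb, map_emb, temperature=0.07):
    """traj_emb, map_emb: (B, D) embeddings of paired trajectories and maps.
    Matched pairs lie on the diagonal of the similarity matrix."""
    traj_emb = F.normalize(traj_emb, dim=1)
    map_emb = F.normalize(map_emb, dim=1)
    logits = traj_emb @ map_emb.t() / temperature  # (B, B) scaled cosine sims
    targets = torch.arange(traj_emb.size(0), device=traj_emb.device)
    # Symmetric: trajectory->map retrieval and map->trajectory retrieval.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))
```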
arXiv Detail & Related papers (2022-04-21T23:01:21Z)
- Towards Scale Consistent Monocular Visual Odometry by Learning from the Virtual World [83.36195426897768]
We propose VRVO, a novel framework for retrieving the absolute scale from virtual data.
We first train a scale-aware disparity network using both monocular real images and stereo virtual data.
The resulting scale-consistent disparities are then integrated with a direct VO system.
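The scale argument can be made concrete with the standard stereo relation: because the disparity network is supervised with (virtual) stereo data of known baseline, its disparities live in a metric frame, and absolute depth follows as f*B/d. A minimal sketch with illustrative names:

```python
# Standard stereo disparity-to-depth conversion; names are illustrative.
import torch

def disparity_to_metric_depth(disp, focal_px, baseline_m, eps=1e-6):
    """disp: (B, 1, H, W) predicted disparity in pixels; focal_px is the
    focal length in pixels and baseline_m the stereo baseline in meters."""
    return focal_px * baseline_m / disp.clamp(min=eps)
```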
arXiv Detail & Related papers (2022-03-11T01:51:54Z)
- Aerial Images Meet Crowdsourced Trajectories: A New Approach to Robust Road Extraction [110.61383502442598]
We introduce a novel neural network framework termed Cross-Modal Message Propagation Network (CMMPNet).
CMMPNet is composed of two deep Auto-Encoders for modality-specific representation learning and a tailor-designed Dual Enhancement Module for cross-modal representation refinement.
Experiments on three real-world benchmarks demonstrate the effectiveness of our CMMPNet for robust road extraction.
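As an illustration of cross-modal refinement (not CMMPNet's exact Dual Enhancement Module), here is a sketch in which each modality's features are updated with a gated message from the other:

```python
# Illustrative cross-modal feature-refinement step between two modalities.
import torch.nn as nn

class DualEnhance(nn.Module):
    def __init__(self, ch=64):
        super().__init__()
        # 1x1 conv + sigmoid produces a per-pixel, per-channel gate.
        self.gate_to_img = nn.Sequential(nn.Conv2d(ch, ch, 1), nn.Sigmoid())
        self.gate_to_traj = nn.Sequential(nn.Conv2d(ch, ch, 1), nn.Sigmoid())

    def forward(self, f_img, f_traj):  # aerial-image / trajectory features
        f_img_out = f_img + self.gate_to_img(f_traj) * f_traj   # traj -> img message
        f_traj_out = f_traj + self.gate_to_traj(f_img) * f_img  # img -> traj message
        return f_img_out, f_traj_out
```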
arXiv Detail & Related papers (2021-11-30T04:30:10Z)
- Salient Objects in Clutter [130.63976772770368]
This paper identifies and addresses a serious design bias of existing salient object detection (SOD) datasets.
This design bias has led to a saturation in performance for state-of-the-art SOD models when evaluated on existing datasets.
We propose a new high-quality dataset and update the previous saliency benchmark.
arXiv Detail & Related papers (2021-05-07T03:49:26Z)
- RELLIS-3D Dataset: Data, Benchmarks and Analysis [16.803548871633957]
RELLIS-3D is a multimodal dataset collected in an off-road environment.
The data was collected on the Rellis Campus of Texas A&M University.
arXiv Detail & Related papers (2020-11-17T18:28:01Z)