Uncertainty-aware Perception Models for Off-road Autonomous Unmanned
Ground Vehicles
- URL: http://arxiv.org/abs/2209.11115v1
- Date: Thu, 22 Sep 2022 15:59:33 GMT
- Title: Uncertainty-aware Perception Models for Off-road Autonomous Unmanned
Ground Vehicles
- Authors: Zhaoyuan Yang, Yewteck Tan, Shiraj Sen, Johan Reimann, John
Karigiannis, Mohammed Yousefhussien, Nurali Virani
- Abstract summary: Off-road autonomous unmanned ground vehicles (UGVs) are being developed for military and commercial use to deliver crucial supplies in remote locations.
Current datasets used to train perception models for off-road autonomous navigation lack diversity in seasons, locations, semantic classes, and time of day.
We investigate how to combine multiple datasets to train a semantic segmentation-based environment perception model.
We show that training the model to capture uncertainty can improve model performance by a significant margin.
- Score: 6.2574402913714575
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Off-road autonomous unmanned ground vehicles (UGVs) are being developed for military and commercial use to deliver crucial supplies in remote locations, help with mapping and surveillance, and assist war-fighters in contested environments. Due to the complexity of off-road environments and the variability in terrain, lighting conditions, and diurnal and seasonal changes, the models used to perceive the environment must handle substantial input variability. Current datasets used to train perception models for off-road autonomous navigation lack diversity in seasons, locations, semantic classes, and time of day. We test the hypothesis that a model trained on a single dataset may not generalize to other off-road navigation datasets and new locations due to input distribution drift. Additionally, we investigate how to combine multiple datasets to train a semantic segmentation-based environment perception model, and we show that training the model to capture uncertainty can improve model performance by a significant margin. We extend the Masksembles approach for uncertainty quantification to the semantic segmentation task and compare it with Monte Carlo Dropout and standard baselines. Finally, we test the approach against data collected from a UGV platform in a new testing environment. We show that the developed perception model with uncertainty quantification can be feasibly deployed on a UGV to support online perception and navigation tasks.
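The abstract names the core mechanism only briefly, so a minimal sketch may help. The following is an illustrative PyTorch example, not the authors' implementation: the names (MaskedConvBlock, TinySegNet, predict_with_uncertainty) and all hyperparameters are assumptions. With K fixed pre-drawn channel masks the block behaves Masksembles-style; resampling a random mask on every forward pass would instead reduce it to Monte Carlo Dropout, the baseline the paper compares against.

```python
# Illustrative sketch only: Masksembles-style uncertainty for semantic
# segmentation. Names and hyperparameters are assumptions, not the paper's.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedConvBlock(nn.Module):
    """Conv block whose output channels are gated by one of K fixed binary
    masks (Masksembles-style); resampling masks per pass gives MC Dropout."""
    def __init__(self, in_ch, out_ch, num_masks=4, keep_prob=0.75):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)
        # Draw K channel masks once; they stay fixed for training and inference.
        self.register_buffer(
            "masks", (torch.rand(num_masks, out_ch) < keep_prob).float()
        )

    def forward(self, x, mask_idx):
        h = F.relu(self.conv(x))
        return h * self.masks[mask_idx].view(1, -1, 1, 1)  # broadcast over H, W

class TinySegNet(nn.Module):
    """Stand-in segmentation model; a real one would be an encoder-decoder."""
    def __init__(self, num_classes=5, num_masks=4):
        super().__init__()
        self.num_masks = num_masks
        self.block = MaskedConvBlock(3, 16, num_masks=num_masks)
        self.head = nn.Conv2d(16, num_classes, kernel_size=1)

    def forward(self, x, mask_idx):
        return self.head(self.block(x, mask_idx))

@torch.no_grad()
def predict_with_uncertainty(model, image):
    """Average the softmax over the K masked passes; per-pixel predictive
    entropy serves as the uncertainty map."""
    probs = torch.stack(
        [F.softmax(model(image, k), dim=1) for k in range(model.num_masks)]
    ).mean(dim=0)                                                # (B, C, H, W)
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1)  # (B, H, W)
    return probs.argmax(dim=1), entropy

# Usage: class labels plus an entropy map for a dummy RGB frame.
labels, uncertainty = predict_with_uncertainty(TinySegNet(), torch.rand(1, 3, 64, 64))
```

High-entropy pixels flag inputs the model finds unfamiliar, the kind of signal an online navigation stack can use to slow down or replan; this matches the deployment use case the abstract describes, though the actual architecture and thresholds are not specified here.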
Related papers
- AdaCropFollow: Self-Supervised Online Adaptation for Visual Under-Canopy Navigation [31.214318150001947]
Under-canopy agricultural robots can enable applications such as precise monitoring, spraying, weeding, and plant manipulation.
We propose a self-supervised online adaptation method for adapting the semantic keypoint representation using a visual foundational model, geometric prior, and pseudo labeling.
This can enable fully autonomous row-following capability in under-canopy robots across fields and crops without requiring human intervention.
arXiv Detail & Related papers (2024-10-16T09:52:38Z)
- Mitigating Covariate Shift in Imitation Learning for Autonomous Vehicles Using Latent Space Generative World Models [60.87795376541144]
A world model is a neural network capable of predicting an agent's next state given past states and actions.
During end-to-end training, our policy learns how to recover from errors by aligning with states observed in human demonstrations.
We present qualitative and quantitative results, demonstrating significant improvements upon prior state of the art in closed-loop testing.
arXiv Detail & Related papers (2024-09-25T06:48:25Z)
- Hard Cases Detection in Motion Prediction by Vision-Language Foundation Models [16.452638202694246]
This work explores the potential of Vision-Language Foundation Models (VLMs) in detecting hard cases in autonomous driving.
We introduce a feasible pipeline where VLMs, fed with sequential image frames with designed prompts, effectively identify challenging agents or scenarios.
We show the effectiveness and feasibility of incorporating our pipeline with state-of-the-art methods on the nuScenes dataset.
arXiv Detail & Related papers (2024-05-31T16:35:41Z)
- JRDB-Traj: A Dataset and Benchmark for Trajectory Forecasting in Crowds [79.00975648564483]
Trajectory forecasting models, employed in fields such as robotics, autonomous vehicles, and navigation, face challenges in real-world scenarios.
This dataset provides comprehensive data, including the locations of all agents, scene images, and point clouds, all from the robot's perspective.
The objective is to predict the future positions of agents relative to the robot using raw sensory input data.
arXiv Detail & Related papers (2023-11-05T18:59:31Z)
- METAVerse: Meta-Learning Traversability Cost Map for Off-Road Navigation [5.036362492608702]
This paper presents METAVerse, a meta-learning framework for learning a global model that accurately predicts terrain traversability.
We train the traversability prediction network to generate a dense and continuous terrain cost map from a sparse LiDAR point cloud.
Online adaptation is performed to rapidly adapt the network to the local environment by exploiting recent interaction experiences.
arXiv Detail & Related papers (2023-07-26T06:58:19Z)
- Safe Navigation in Unstructured Environments by Minimizing Uncertainty in Control and Perception [5.46262127926284]
Uncertainty in control and perception poses challenges for autonomous vehicle navigation in unstructured environments.
This paper introduces a framework that minimizes control and perception uncertainty to ensure safe and reliable navigation.
arXiv Detail & Related papers (2023-06-26T11:24:03Z)
- TrafficBots: Towards World Models for Autonomous Driving Simulation and Motion Prediction [149.5716746789134]
We show data-driven traffic simulation can be formulated as a world model.
We present TrafficBots, a multi-agent policy built upon motion prediction and end-to-end driving.
Experiments on the Waymo Open Motion Dataset show that TrafficBots can simulate realistic multi-agent behaviors.
arXiv Detail & Related papers (2023-03-07T18:28:41Z)
- LOPR: Latent Occupancy PRediction using Generative Models [49.15687400958916]
LiDAR-generated occupancy grid maps (L-OGMs) offer a robust bird's-eye-view scene representation.
We propose a framework that decouples occupancy prediction into representation learning and prediction within the learned latent space.
arXiv Detail & Related papers (2022-10-03T22:04:00Z)
- Federated Deep Learning Meets Autonomous Vehicle Perception: Design and Verification [168.67190934250868]
Federated learning-empowered connected autonomous vehicles (FLCAV) have been proposed.
FLCAV preserves privacy while reducing communication and annotation costs.
It is challenging to determine the network resources and road sensor poses for multi-stage training.
arXiv Detail & Related papers (2022-06-03T23:55:45Z)
- PSE-Match: A Viewpoint-free Place Recognition Method with Parallel Semantic Embedding [9.265785042748158]
PSE-Match is a viewpoint-free place recognition method based on parallel semantic analysis of isolated semantic attributes from 3D point-cloud models.
PSE-Match incorporates a divergence place learning network to capture different semantic attributes in parallel through the spherical harmonics domain.
arXiv Detail & Related papers (2021-08-01T22:16:40Z)
- SMART: Simultaneous Multi-Agent Recurrent Trajectory Prediction [72.37440317774556]
We propose advances that address two key challenges in future trajectory prediction: multimodality in both training data and predictions, and constant-time inference regardless of the number of agents (see the sketch after this list).
arXiv Detail & Related papers (2020-07-26T08:17:10Z)
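As a rough illustration of the constant-time inference claim in the SMART summary above, the sketch below batches all agents and modes through one recurrent decoder in a single forward pass. This is a toy PyTorch example under assumed names and sizes (MultiModalDecoder, hidden width, horizon), not the SMART architecture; in a real model, multimodality comes from learned mode representations rather than random hidden-state perturbations.

```python
# Toy sketch: batching N agents x M modes into one recurrent decoder so the
# number of forward passes does not grow with the number of agents. Names,
# sizes, and the random mode perturbation are illustrative assumptions.
import torch
import torch.nn as nn

class MultiModalDecoder(nn.Module):
    def __init__(self, state_dim=2, hidden=64, num_modes=3, horizon=12):
        super().__init__()
        self.num_modes, self.horizon = num_modes, horizon
        self.encode = nn.Linear(state_dim, hidden)
        self.cell = nn.GRUCell(state_dim, hidden)
        self.to_offset = nn.Linear(hidden, state_dim)

    def forward(self, last_pos):                    # last_pos: (N, 2)
        n = last_pos.size(0)
        # Tile agents across modes: one batch of size N * M, decoded together.
        pos = last_pos.repeat_interleave(self.num_modes, dim=0)
        h = torch.tanh(self.encode(pos))
        h = h + 0.1 * torch.randn_like(h)           # toy stand-in for mode diversity
        steps = []
        for _ in range(self.horizon):
            h = self.cell(pos, h)
            pos = pos + self.to_offset(h)           # predict a displacement per step
            steps.append(pos)
        out = torch.stack(steps, dim=1)             # (N*M, horizon, 2)
        return out.view(n, self.num_modes, self.horizon, -1)

# Usage: 5 agents -> (5, 3, 12, 2) multimodal trajectories in one pass.
trajectories = MultiModalDecoder()(torch.rand(5, 2))
```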
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.