Related papers: Unsupervised Self-Driving Attention Prediction via Uncertainty Mining and Knowledge Embedding

URL: http://arxiv.org/abs/2303.09706v3
Date: Sat, 15 Jul 2023 12:39:08 GMT
Title: Unsupervised Self-Driving Attention Prediction via Uncertainty Mining and Knowledge Embedding
Authors: Pengfei Zhu, Mengshi Qi, Xia Li, Weijian Li and Huadong Ma
Abstract summary: We propose an unsupervised way to predict self-driving attention by uncertainty modeling and driving knowledge integration. Results show equivalent or even more impressive performance compared to fully-supervised state-of-the-art approaches.
Score: 51.8579160500354
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Predicting attention regions of interest is an important yet challenging task for self-driving systems. Existing methodologies rely on large-scale labeled traffic datasets that are labor-intensive to obtain. Besides, the huge domain gap between natural scenes and traffic scenes in current datasets also limits the potential for model training. To address these challenges, we are the first to introduce an unsupervised way to predict self-driving attention by uncertainty modeling and driving knowledge integration. Our approach's Uncertainty Mining Branch (UMB) discovers commonalities and differences from multiple generated pseudo-labels achieved from models pre-trained on natural scenes by actively measuring the uncertainty. Meanwhile, our Knowledge Embedding Block (KEB) bridges the domain gap by incorporating driving knowledge to adaptively refine the generated pseudo-labels. Quantitative and qualitative results with equivalent or even more impressive performance compared to fully-supervised state-of-the-art approaches across all three public datasets demonstrate the effectiveness of the proposed method and the potential of this direction. The code will be made publicly available.

Related papers

INSIGHT: Enhancing Autonomous Driving Safety through Vision-Language Models on Context-Aware Hazard Detection and Edge Case Evaluation [7.362380225654904]
INSIGHT is a hierarchical vision-language model (VLM) framework designed to enhance hazard detection and edge-case evaluation. By using multimodal data fusion, our approach integrates semantic and visual representations, enabling precise interpretation of driving scenarios. Experimental results on the BDD100K dataset demonstrate a substantial improvement in hazard prediction straightforwardness and accuracy over existing models.
arXiv Detail & Related papers (2025-02-01T01:43:53Z)
Towards Robust Unsupervised Attention Prediction in Autonomous Driving [40.84001015982244]
We propose a robust unsupervised attention prediction method for self-driving systems. An Uncertainty Mining Branch refines predictions by analyzing commonalities and differences across multiple pre-trained models on natural scenes. A Knowledge Embedding Block bridges the domain gap by incorporating driving knowledge to adaptively enhance pseudo-labels. A novel data augmentation method improves robustness against corruption through soft attention and dynamic augmentation.
arXiv Detail & Related papers (2025-01-25T03:01:26Z)
Enhancing Lane Segment Perception and Topology Reasoning with Crowdsourcing Trajectory Priors [12.333249510969289]
In this paper, we investigate prior augmentation from a novel perspective of trajectory priors. We design a confidence-based fusion module that takes alignment into account during the fusion process. The results indicate that our method's performance significantly outperforms the current state-of-the-art methods.
arXiv Detail & Related papers (2024-11-26T07:05:05Z)
Enhancing End-to-End Autonomous Driving with Latent World Model [78.22157677787239]
We propose a novel self-supervised method to enhance end-to-end driving without the need for costly labels. Our framework LAW uses a LAtent World model to predict future latent features based on the predicted ego actions and the latent feature of the current frame. As a result, our approach achieves state-of-the-art performance in both open-loop and closed-loop benchmarks without costly annotations.
arXiv Detail & Related papers (2024-06-12T17:59:21Z)
Stochastic Vision Transformers with Wasserstein Distance-Aware Attention [8.407731308079025]
Self-supervised learning is one of the most promising approaches to acquiring knowledge from limited labeled data. We introduce a new vision transformer that integrates uncertainty and distance awareness into self-supervised learning pipelines. Our proposed method achieves superior accuracy and calibration, surpassing the self-supervised baseline in a wide range of experiments on a variety of datasets.
arXiv Detail & Related papers (2023-11-30T15:53:37Z)
Implicit Occupancy Flow Fields for Perception and Prediction in Self-Driving [68.95178518732965]
A self-driving vehicle (SDV) must be able to perceive its surroundings and predict the future behavior of other traffic participants. Existing works either perform object detection followed by trajectory forecasting of the detected objects, or predict dense occupancy and flow grids for the whole scene. This motivates our unified approach to perception and future prediction that implicitly represents occupancy and flow over time with a single neural network.
arXiv Detail & Related papers (2023-08-02T23:39:24Z)
Interpretable Self-Aware Neural Networks for Robust Trajectory Prediction [50.79827516897913]
We introduce an interpretable paradigm for trajectory prediction that distributes the uncertainty among semantic concepts. We validate our approach on real-world autonomous driving data, demonstrating superior performance over state-of-the-art baselines.
arXiv Detail & Related papers (2022-11-16T06:28:20Z)
Motion Inspired Unsupervised Perception and Prediction in Autonomous Driving [29.731790562352344]
This paper pioneers a novel and challenging direction, i.e., training perception and prediction models to understand open-set moving objects. Our proposed framework uses self-learned flow to trigger an automated meta labeling pipeline to achieve automatic supervision. We show that our approach generates highly promising results in open-set 3D detection and trajectory prediction.
arXiv Detail & Related papers (2022-10-14T18:55:44Z)
Domain Knowledge Driven Pseudo Labels for Interpretable Goal-Conditioned Interactive Trajectory Prediction [29.701029725302586]
We study the joint trajectory prediction problem with the goal-conditioned framework. We introduce a conditional-variational-autoencoder-based (CVAE) model to explicitly encode different interaction modes into the latent space. We propose a novel approach to avoid KL vanishing and induce an interpretable interactive latent space with pseudo labels.
arXiv Detail & Related papers (2022-03-28T21:41:21Z)
Important Object Identification with Semi-Supervised Learning for Autonomous Driving [37.654878298744855]
We propose a novel approach for important object identification in egocentric driving scenarios. We present a semi-supervised learning pipeline to enable the model to learn from unlimited unlabeled data. Our approach also outperforms rule-based baselines by a large margin.
arXiv Detail & Related papers (2022-03-05T01:23:13Z)
Just Label What You Need: Fine-Grained Active Selection for Perception and Prediction through Partially Labeled Scenes [78.23907801786827]
We introduce generalizations that ensure that our approach is both cost-aware and allows for fine-grained selection of examples through partially labeled scenes. Our experiments on a real-world, large-scale self-driving dataset suggest that fine-grained selection can improve the performance across perception, prediction, and downstream planning tasks.
arXiv Detail & Related papers (2021-04-08T17:57:41Z)
TraND: Transferable Neighborhood Discovery for Unsupervised Cross-domain Gait Recognition [77.77786072373942]
This paper proposes a Transferable Neighborhood Discovery (TraND) framework to bridge the domain gap for unsupervised cross-domain gait recognition. We design an end-to-end trainable approach to automatically discover the confident neighborhoods of unlabeled samples in the latent space. Our method achieves state-of-the-art results on two public datasets, i.e., CASIA-B and OU-LP.
arXiv Detail & Related papers (2021-02-09T03:07:07Z)
Diverse Complexity Measures for Dataset Curation in Self-driving [80.55417232642124]
We propose a new data selection method that exploits a diverse set of criteria that quantify the interestingness of traffic scenes. Our experiments show that the proposed curation pipeline is able to select datasets that lead to better generalization and higher performance.
arXiv Detail & Related papers (2021-01-16T23:45:02Z)

This list is automatically generated from the titles and abstracts of the papers on this site.