Related papers: What Matters in Autonomous Driving Anomaly Detection: A Weakly Supervised Horizon

What Matters in Autonomous Driving Anomaly Detection: A Weakly Supervised Horizon

URL: http://arxiv.org/abs/2408.05562v1
Date: Sat, 10 Aug 2024 14:04:52 GMT
Title: What Matters in Autonomous Driving Anomaly Detection: A Weakly Supervised Horizon
Authors: Utkarsh Tiwari, Snehashis Majhi, Michal Balazia, François Brémond,
Abstract summary: Video anomaly detection (VAD) in autonomous driving scenario is an important task, however it involves several challenges due to the ego-centric views and moving camera. Recent developments in weakly-supervised VAD methods have shown remarkable progress in detecting critical real-world anomalies in static camera scenario. We aim to promote weakly-supervised method development for autonomous driving VAD.
Score: 12.88166582566313
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Video anomaly detection (VAD) in autonomous driving scenario is an important task, however it involves several challenges due to the ego-centric views and moving camera. Due to this, it remains largely under-explored. While recent developments in weakly-supervised VAD methods have shown remarkable progress in detecting critical real-world anomalies in static camera scenario, the development and validation of such methods are yet to be explored for moving camera VAD. This is mainly due to existing datasets like DoTA not following training pre-conditions of weakly-supervised learning. In this paper, we aim to promote weakly-supervised method development for autonomous driving VAD. We reorganize the DoTA dataset and aim to validate recent powerful weakly-supervised VAD methods on moving camera scenarios. Further, we provide a detailed analysis of what modifications on state-of-the-art methods can significantly improve the detection performance. Towards this, we propose a "feature transformation block" and through experimentation we show that our propositions can empower existing weakly-supervised VAD methods significantly in improving the VAD in autonomous driving. Our codes/dataset/demo will be released at github.com/ut21/WSAD-Driving

Related papers

RAD: Retrieval-Augmented Decision-Making of Meta-Actions with Vision-Language Models in Autonomous Driving [10.984203470464687]
Vision-language models (VLMs) often suffer from limitations such as inadequate spatial perception and hallucination. We propose a retrieval-augmented decision-making (RAD) framework to enhance VLMs' capabilities to reliably generate meta-actions in autonomous driving scenes. We fine-tune VLMs on a dataset derived from the NuScenes dataset to enhance their spatial perception and bird's-eye view image comprehension capabilities.
arXiv Detail & Related papers (2025-03-18T03:25:57Z)
VLM-AD: End-to-End Autonomous Driving through Vision-Language Model Supervision [20.43366384946928]
Vision-language models (VLMs) as teachers to enhance training. VLM-AD achieves significant improvements in planning accuracy and reduced collision rates on the nuScenes dataset.
arXiv Detail & Related papers (2024-12-19T01:53:36Z)
Hard Cases Detection in Motion Prediction by Vision-Language Foundation Models [16.452638202694246]
This work explores the potential of Vision-Language Foundation Models (VLMs) in detecting hard cases in autonomous driving. We introduce a feasible pipeline where VLMs, fed with sequential image frames with designed prompts, effectively identify challenging agents or scenarios. We show the effectiveness and feasibility of incorporating our pipeline with state-of-the-art methods on NuScenes datasets.
arXiv Detail & Related papers (2024-05-31T16:35:41Z)
Unsupervised Domain Adaptation for Self-Driving from Past Traversal Features [69.47588461101925]
We propose a method to adapt 3D object detectors to new driving environments. Our approach enhances LiDAR-based detection models using spatial quantized historical features. Experiments on real-world datasets demonstrate significant improvements.
arXiv Detail & Related papers (2023-09-21T15:00:31Z)
Unsupervised Adaptation from Repeated Traversals for Autonomous Driving [54.59577283226982]
Self-driving cars must generalize to the end-user's environment to operate reliably. One potential solution is to leverage unlabeled data collected from the end-users' environments. There is no reliable signal in the target domain to supervise the adaptation process. We show that this simple additional assumption is sufficient to obtain a potent signal that allows us to perform iterative self-training of 3D object detectors on the target domain.
arXiv Detail & Related papers (2023-03-27T15:07:55Z)
AutoFed: Heterogeneity-Aware Federated Multimodal Learning for Robust Autonomous Driving [15.486799633600423]
AutoFed is a framework to fully exploit multimodal sensory data on autonomous vehicles. We propose a novel model leveraging pseudo-labeling to avoid mistakenly treating unlabeled objects as the background. We also propose an autoencoder-based data imputation method to fill missing data modality.
arXiv Detail & Related papers (2023-02-17T01:31:53Z)
Improving Variational Autoencoder based Out-of-Distribution Detection for Embedded Real-time Applications [2.9327503320877457]
Out-of-distribution (OD) detection is an emerging approach to address the challenge of detecting out-of-distribution in real-time. In this paper, we show how we can robustly detect hazardous motion around autonomous driving agents. Our methods significantly improve detection capabilities of OoD factors to unique driving scenarios, 42% better than state-of-the-art approaches. Our model also generalized near-perfectly, 97% better than the state-of-the-art across the real-world and simulation driving data sets experimented.
arXiv Detail & Related papers (2021-07-25T07:52:53Z)
SODA10M: Towards Large-Scale Object Detection Benchmark for Autonomous Driving [94.11868795445798]
We release a Large-Scale Object Detection benchmark for Autonomous driving, named as SODA10M, containing 10 million unlabeled images and 20K images labeled with 6 representative object categories. To improve diversity, the images are collected every ten seconds per frame within 32 different cities under different weather conditions, periods and location scenes. We provide extensive experiments and deep analyses of existing supervised state-of-the-art detection models, popular self-supervised and semi-supervised approaches, and some insights about how to develop future models.
arXiv Detail & Related papers (2021-06-21T13:55:57Z)
Exploiting Playbacks in Unsupervised Domain Adaptation for 3D Object Detection [55.12894776039135]
State-of-the-art 3D object detectors, based on deep learning, have shown promising accuracy but are prone to over-fit to domain idiosyncrasies. We propose a novel learning approach that drastically reduces this gap by fine-tuning the detector on pseudo-labels in the target domain. We show, on five autonomous driving datasets, that fine-tuning the detector on these pseudo-labels substantially reduces the domain gap to new driving environments.
arXiv Detail & Related papers (2021-03-26T01:18:11Z)
Fine-Grained Vehicle Perception via 3D Part-Guided Visual Data Augmentation [77.60050239225086]
We propose an effective training data generation process by fitting a 3D car model with dynamic parts to vehicles in real images. Our approach is fully automatic without any human interaction. We present a multi-task network for VUS parsing and a multi-stream network for VHI parsing.
arXiv Detail & Related papers (2020-12-15T03:03:38Z)
Self-supervised Video Object Segmentation [76.83567326586162]
The objective of this paper is self-supervised representation learning, with the goal of solving semi-supervised video object segmentation (a.k.a. dense tracking) We make the following contributions: (i) we propose to improve the existing self-supervised approach, with a simple, yet more effective memory mechanism for long-term correspondence matching; (ii) by augmenting the self-supervised approach with an online adaptation module, our method successfully alleviates tracker drifts caused by spatial-temporal discontinuity; (iv) we demonstrate state-of-the-art results among the self-supervised approaches on DAVIS-2017 and YouTube
arXiv Detail & Related papers (2020-06-22T17:55:59Z)

This list is automatically generated from the titles and abstracts of the papers in this site.