Latency-aware Road Anomaly Segmentation in Videos: A Photorealistic
Dataset and New Metrics
- URL: http://arxiv.org/abs/2401.04942v1
- Date: Wed, 10 Jan 2024 05:38:48 GMT
- Authors: Beiwen Tian, Huan-ang Gao, Leiyao Cui, Yupeng Zheng, Lan Luo, Baofeng
Wang, Rong Zhi, Guyue Zhou, Hao Zhao
- Abstract summary: We contribute the first video anomaly segmentation dataset for autonomous driving.
Our dataset consists of 120,000 high-resolution frames at 60 FPS, recorded in 7 different towns.
We focus on two new metrics: temporal consistency and latency-aware streaming accuracy.
- Score: 15.09892709945568
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In the past several years, road anomaly segmentation has been actively
explored in academia and has drawn growing attention in industry. The rationale
behind this is straightforward: if an autonomous car can brake before hitting an
anomalous object, safety is promoted. However, this rationale naturally calls
for a temporally informed setting, while existing methods and benchmarks are
designed in an unrealistic frame-wise manner. To bridge this gap, we contribute
the first video anomaly segmentation dataset for autonomous driving. Since
placing various anomalous objects on busy roads and annotating them in every
frame are dangerous and expensive, we resort to synthetic data. To improve the
relevance of this synthetic dataset to real-world applications, we train a
generative adversarial network conditioned on rendering G-buffers for
photorealism enhancement. Our dataset consists of 120,000 high-resolution
frames at 60 FPS, recorded in 7 different towns. As an initial
benchmark, we provide baselines using the latest supervised and unsupervised
road anomaly segmentation methods. Beyond conventional metrics, we focus on
two new ones: temporal consistency and latency-aware streaming accuracy. We
believe the latter is valuable as it measures whether an anomaly segmentation
algorithm can truly prevent a car from crashing in a temporally informed
setting.
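The two proposed metrics can be illustrated with a minimal sketch. The function names, IoU-based scoring, and the exact matching rule below are assumptions for illustration, not the paper's definitions: temporal consistency is sketched as mean IoU between consecutive predicted masks, and latency-aware streaming accuracy scores each prediction against the ground truth of whichever frame is current when the prediction is actually emitted, so a slow model is compared against a world that has already moved on.

```python
import numpy as np

def temporal_consistency(masks):
    """Hypothetical temporal-consistency metric: mean IoU between
    consecutive binary anomaly masks. Stable predictions score near 1."""
    ious = []
    for prev, curr in zip(masks[:-1], masks[1:]):
        inter = np.logical_and(prev, curr).sum()
        union = np.logical_or(prev, curr).sum()
        ious.append(inter / union if union > 0 else 1.0)
    return float(np.mean(ious))

def streaming_accuracy(pred_times, preds, gt_masks, fps=60):
    """Hypothetical latency-aware streaming evaluation: each prediction
    is scored against the ground truth of the frame on screen at the
    wall-clock time the prediction finishes, penalizing slow models.

    pred_times: seconds at which each prediction becomes available.
    """
    ious = []
    for t, pred in zip(pred_times, preds):
        # Index of the frame that is current when the prediction arrives.
        idx = min(int(t * fps), len(gt_masks) - 1)
        gt = gt_masks[idx]
        inter = np.logical_and(pred, gt).sum()
        union = np.logical_or(pred, gt).sum()
        ious.append(inter / union if union > 0 else 1.0)
    return float(np.mean(ious))
```

Under this sketch, a model that predicts perfectly but finishes two frames late is evaluated against ground truth two frames ahead of its input, which is exactly the gap a frame-wise benchmark cannot see.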
Related papers
- Homography Guided Temporal Fusion for Road Line and Marking Segmentation [73.47092021519245]
Road lines and markings are frequently occluded in the presence of moving vehicles, shadow, and glare.
We propose a Homography Guided Fusion (HomoFusion) module to exploit temporally-adjacent video frames for complementary cues.
We show that exploiting available camera intrinsic data and ground plane assumption for cross-frame correspondence can lead to a light-weight network with significantly improved performances in speed and accuracy.
arXiv Detail & Related papers (2024-04-11T10:26:40Z) - Leveraging the Edge and Cloud for V2X-Based Real-Time Object Detection
in Autonomous Driving [0.0]
Environmental perception is a key element of autonomous driving.
In this paper, we investigate the best trade-off between detection quality and latency for real-time perception in autonomous vehicles.
We show that models with adequate compression can be run in real-time on the cloud while outperforming local detection performance.
arXiv Detail & Related papers (2023-08-09T21:39:10Z) - Event-Free Moving Object Segmentation from Moving Ego Vehicle [88.33470650615162]
Moving object segmentation (MOS) in dynamic scenes is an important, challenging, but under-explored research topic for autonomous driving.
Most segmentation methods leverage motion cues obtained from optical flow maps.
We propose to exploit event cameras for better video understanding, which provide rich motion cues without relying on optical flow.
arXiv Detail & Related papers (2023-04-28T23:43:10Z) - Real-Time Driver Monitoring Systems through Modality and View Analysis [28.18784311981388]
Driver distractions are known to be the dominant cause of road accidents.
State-of-the-art methods prioritize accuracy while ignoring latency.
We propose time-efficient detection models that neglect the temporal relation between video frames.
arXiv Detail & Related papers (2022-10-17T21:22:41Z) - Distortion-Aware Network Pruning and Feature Reuse for Real-time Video
Segmentation [49.17930380106643]
We propose a novel framework to speed up any architecture with skip-connections for real-time vision tasks.
Specifically, at the arrival of each frame, we transform the features from the previous frame to reuse them at specific spatial bins.
We then perform partial computation of the backbone network on the regions of the current frame that captures temporal differences between the current and previous frame.
arXiv Detail & Related papers (2022-06-20T07:20:02Z) - Real-time Object Detection for Streaming Perception [84.2559631820007]
Streaming perception is proposed to jointly evaluate latency and accuracy with a single metric for online video perception.
We build a simple and effective framework for streaming perception.
Our method achieves competitive performance on Argoverse-HD dataset and improves the AP by 4.9% compared to the strong baseline.
arXiv Detail & Related papers (2022-03-23T11:33:27Z) - Real Time Monocular Vehicle Velocity Estimation using Synthetic Data [78.85123603488664]
We look at the problem of estimating the velocity of road vehicles from a camera mounted on a moving car.
We propose a two-step approach where first an off-the-shelf tracker is used to extract vehicle bounding boxes and then a small neural network is used to regress the vehicle velocity.
arXiv Detail & Related papers (2021-09-16T13:10:27Z) - Predicting Pedestrian Crossing Intention with Feature Fusion and
Spatio-Temporal Attention [0.0]
Pedestrian crossing intention should be recognized in real-time for urban driving.
Recent works have shown the potential of using vision-based deep neural network models for this task.
This work introduces a neural network architecture to fuse inherently different spatio-temporal features for pedestrian crossing intention prediction.
arXiv Detail & Related papers (2021-04-12T14:10:25Z) - Exploiting Playbacks in Unsupervised Domain Adaptation for 3D Object
Detection [55.12894776039135]
State-of-the-art 3D object detectors, based on deep learning, have shown promising accuracy but are prone to over-fit to domain idiosyncrasies.
We propose a novel learning approach that drastically reduces this gap by fine-tuning the detector on pseudo-labels in the target domain.
We show, on five autonomous driving datasets, that fine-tuning the detector on these pseudo-labels substantially reduces the domain gap to new driving environments.
arXiv Detail & Related papers (2021-03-26T01:18:11Z) - Decoupled Appearance and Motion Learning for Efficient Anomaly Detection
in Surveillance Video [9.80717374118619]
We propose a new neural network architecture that learns the normal behavior in a purely unsupervised fashion.
Our model can process 16 to 45 times more frames per second than related approaches.
arXiv Detail & Related papers (2020-11-10T11:40:06Z) - Towards Anomaly Detection in Dashcam Videos [9.558392439655012]
We propose to apply data-driven anomaly detection ideas from deep learning to dashcam videos.
We present a large and diverse dataset of truck dashcam videos, namely RetroTrucks.
We apply: (i) one-class classification loss and (ii) reconstruction-based loss, for anomaly detection on RetroTrucks.
arXiv Detail & Related papers (2020-04-11T00:10:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.