Latency-aware Road Anomaly Segmentation in Videos: A Photorealistic
Dataset and New Metrics
- URL: http://arxiv.org/abs/2401.04942v1
- Date: Wed, 10 Jan 2024 05:38:48 GMT
- Authors: Beiwen Tian, Huan-ang Gao, Leiyao Cui, Yupeng Zheng, Lan Luo, Baofeng
Wang, Rong Zhi, Guyue Zhou, Hao Zhao
- Abstract summary: We contribute the first video anomaly segmentation dataset for autonomous driving.
Our dataset consists of 120,000 high-resolution frames at 60 FPS, recorded in 7 different towns.
We focus on two new metrics: temporal consistency and latency-aware streaming accuracy.
- Score: 15.09892709945568
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In the past several years, road anomaly segmentation has been actively
explored in academia and has drawn growing attention in industry. The rationale
behind this is straightforward: if an autonomous car can brake before hitting an
anomalous object, safety is promoted. However, this rationale naturally calls
for a temporally informed setting, while existing methods and benchmarks are
designed in an unrealistic frame-wise manner. To bridge this gap, we contribute
the first video anomaly segmentation dataset for autonomous driving. Since
placing various anomalous objects on busy roads and annotating them in every
frame are dangerous and expensive, we resort to synthetic data. To improve the
relevance of this synthetic dataset to real-world applications, we train a
generative adversarial network conditioned on rendering G-buffers for
photorealism enhancement. Our dataset consists of 120,000 high-resolution
frames at 60 FPS, recorded in 7 different towns. As an initial
benchmark, we provide baselines using the latest supervised and unsupervised
road anomaly segmentation methods. Beyond conventional metrics, we focus on
two new ones: temporal consistency and latency-aware streaming accuracy. We
believe the latter is valuable as it measures whether an anomaly segmentation
algorithm can truly prevent a car from crashing in a temporally informed
setting.
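The two proposed metrics can be illustrated with a minimal sketch. The function names, IoU-based scoring, and the exact matching rule below are assumptions for illustration, not the paper's definitions: temporal consistency is sketched as mean IoU between consecutive predicted masks, and latency-aware streaming accuracy scores each prediction against the ground truth of whichever frame is current when the prediction is actually emitted, so a slow model is compared against a world that has already moved on.

```python
import numpy as np

def temporal_consistency(masks):
    """Hypothetical temporal-consistency metric: mean IoU between
    consecutive binary anomaly masks. Stable predictions score near 1."""
    ious = []
    for prev, curr in zip(masks[:-1], masks[1:]):
        inter = np.logical_and(prev, curr).sum()
        union = np.logical_or(prev, curr).sum()
        ious.append(inter / union if union > 0 else 1.0)
    return float(np.mean(ious))

def streaming_accuracy(pred_times, preds, gt_masks, fps=60):
    """Hypothetical latency-aware streaming evaluation: each prediction
    is scored against the ground truth of the frame on screen at the
    wall-clock time the prediction finishes, penalizing slow models.

    pred_times: seconds at which each prediction becomes available.
    """
    ious = []
    for t, pred in zip(pred_times, preds):
        # Index of the frame that is current when the prediction arrives.
        idx = min(int(t * fps), len(gt_masks) - 1)
        gt = gt_masks[idx]
        inter = np.logical_and(pred, gt).sum()
        union = np.logical_or(pred, gt).sum()
        ious.append(inter / union if union > 0 else 1.0)
    return float(np.mean(ious))
```

Under this sketch, a model that predicts perfectly but finishes two frames late is evaluated against ground truth two frames ahead of its input, which is exactly the gap a frame-wise benchmark cannot see.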
Related papers
- Homography Guided Temporal Fusion for Road Line and Marking Segmentation [73.47092021519245]
Road lines and markings are frequently occluded in the presence of moving vehicles, shadow, and glare.
We propose a Homography Guided Fusion (HomoFusion) module to exploit temporally-adjacent video frames for complementary cues.
We show that exploiting available camera intrinsic data and ground plane assumption for cross-frame correspondence can lead to a light-weight network with significantly improved performances in speed and accuracy.
arXiv Detail & Related papers (2024-04-11T10:26:40Z) - Leveraging the Edge and Cloud for V2X-Based Real-Time Object Detection
in Autonomous Driving [0.0]
Environmental perception is a key element of autonomous driving.
In this paper, we investigate the best trade-off between detection quality and latency for real-time perception in autonomous vehicles.
We show that models with adequate compression can be run in real-time on the cloud while outperforming local detection performance.
arXiv Detail & Related papers (2023-08-09T21:39:10Z) - Event-Free Moving Object Segmentation from Moving Ego Vehicle [88.33470650615162]
Moving object segmentation (MOS) in dynamic scenes is an important, challenging, but under-explored research topic for autonomous driving.
Most segmentation methods leverage motion cues obtained from optical flow maps.
We propose to exploit event cameras for better video understanding, which provide rich motion cues without relying on optical flow.
arXiv Detail & Related papers (2023-04-28T23:43:10Z) - Real-Time Driver Monitoring Systems through Modality and View Analysis [28.18784311981388]
Driver distractions are known to be the dominant cause of road accidents.
State-of-the-art methods prioritize accuracy while ignoring latency.
We propose time-efficient detection models that neglect the temporal relation between video frames.
arXiv Detail & Related papers (2022-10-17T21:22:41Z) - Distortion-Aware Network Pruning and Feature Reuse for Real-time Video
Segmentation [49.17930380106643]
We propose a novel framework to speed up any architecture with skip-connections for real-time vision tasks.
Specifically, at the arrival of each frame, we transform the features from the previous frame to reuse them at specific spatial bins.
We then perform partial computation of the backbone network on the regions of the current frame that captures temporal differences between the current and previous frame.
arXiv Detail & Related papers (2022-06-20T07:20:02Z) - Real-time Object Detection for Streaming Perception [84.2559631820007]
Streaming perception is proposed to jointly evaluate latency and accuracy with a single metric for online video perception.
We build a simple and effective framework for streaming perception.
Our method achieves competitive performance on Argoverse-HD dataset and improves the AP by 4.9% compared to the strong baseline.
arXiv Detail & Related papers (2022-03-23T11:33:27Z) - Real Time Monocular Vehicle Velocity Estimation using Synthetic Data [78.85123603488664]
We look at the problem of estimating the velocity of road vehicles from a camera mounted on a moving car.
We propose a two-step approach where first an off-the-shelf tracker is used to extract vehicle bounding boxes and then a small neural network is used to regress the vehicle velocity.
arXiv Detail & Related papers (2021-09-16T13:10:27Z) - Predicting Pedestrian Crossing Intention with Feature Fusion and
Spatio-Temporal Attention [0.0]
Pedestrian crossing intention should be recognized in real-time for urban driving.
Recent works have shown the potential of using vision-based deep neural network models for this task.
This work introduces a neural network architecture to fuse inherently different spatio-temporal features for pedestrian crossing intention prediction.
arXiv Detail & Related papers (2021-04-12T14:10:25Z) - Exploiting Playbacks in Unsupervised Domain Adaptation for 3D Object
Detection [55.12894776039135]
State-of-the-art 3D object detectors, based on deep learning, have shown promising accuracy but are prone to over-fit to domain idiosyncrasies.
We propose a novel learning approach that drastically reduces this gap by fine-tuning the detector on pseudo-labels in the target domain.
We show, on five autonomous driving datasets, that fine-tuning the detector on these pseudo-labels substantially reduces the domain gap to new driving environments.
arXiv Detail & Related papers (2021-03-26T01:18:11Z) - Decoupled Appearance and Motion Learning for Efficient Anomaly Detection
in Surveillance Video [9.80717374118619]
We propose a new neural network architecture that learns the normal behavior in a purely unsupervised fashion.
Our model can process 16 to 45 times more frames per second than related approaches.
arXiv Detail & Related papers (2020-11-10T11:40:06Z) - Towards Anomaly Detection in Dashcam Videos [9.558392439655012]
We propose to apply data-driven anomaly detection ideas from deep learning to dashcam videos.
We present a large and diverse dataset of truck dashcam videos, namely RetroTrucks.
We apply: (i) one-class classification loss and (ii) reconstruction-based loss, for anomaly detection on RetroTrucks.
arXiv Detail & Related papers (2020-04-11T00:10:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.