Real-time Object Detection for Streaming Perception
- URL: http://arxiv.org/abs/2203.12338v1
- Date: Wed, 23 Mar 2022 11:33:27 GMT
- Title: Real-time Object Detection for Streaming Perception
- Authors: Jinrong Yang, Songtao Liu, Zeming Li, Xiaoping Li and Jian Sun
- Abstract summary: Streaming perception is proposed to jointly evaluate latency and accuracy in a single metric for online video perception.
We build a simple and effective framework for streaming perception.
Our method achieves competitive performance on the Argoverse-HD dataset and improves the AP by 4.9% compared to the strong baseline.
- Score: 84.2559631820007
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Autonomous driving requires the model to perceive the environment and (re)act
within a low latency for safety. While past works ignore the inevitable changes
in the environment after processing, streaming perception is proposed to
jointly evaluate the latency and accuracy into a single metric for video online
perception. In this paper, instead of searching trade-offs between accuracy and
speed like previous works, we point out that endowing real-time models with the
ability to predict the future is the key to dealing with this problem. We build
a simple and effective framework for streaming perception. It equips a novel
DualFlow Perception module (DFP), which includes dynamic and static flows to
capture the moving trend and basic detection feature for streaming prediction.
Further, we introduce a Trend-Aware Loss (TAL) combined with a trend factor to
generate adaptive weights for objects with different moving speeds. Our simple
method achieves competitive performance on Argoverse-HD dataset and improves
the AP by 4.9% compared to the strong baseline, validating its effectiveness.
Our code will be made available at https://github.com/yancie-yjr/StreamYOLO.
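The abstract describes two components: a Dual-Flow Perception module (DFP) that fuses current- and previous-frame features through a dynamic and a static flow, and a Trend-Aware Loss (TAL) that reweights objects by how fast they move. The sketch below is a minimal illustration of these two ideas, not the authors' implementation; the layer shapes and the trend-factor formula are assumptions made for clarity (see the official repository above for the actual code).

```python
# Minimal sketch of the DFP and TAL ideas, assuming PyTorch and illustrative
# layer sizes; the exact architecture and trend factor are in the StreamYOLO repo.
import torch
import torch.nn as nn


class DualFlowPerception(nn.Module):
    """Fuse current- and previous-frame features for one pyramid level."""

    def __init__(self, channels: int):
        super().__init__()
        # dynamic flow: looks at both frames to capture the moving trend
        self.dynamic = nn.Conv2d(2 * channels, channels, kernel_size=1)
        # static flow: preserves the basic detection feature of the current frame
        self.static = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, feat_t: torch.Tensor, feat_tm1: torch.Tensor) -> torch.Tensor:
        dyn = self.dynamic(torch.cat([feat_t, feat_tm1], dim=1))
        sta = self.static(feat_t)
        return torch.cat([dyn, sta], dim=1)  # concatenated dual-flow feature


def trend_aware_weights(boxes_t: torch.Tensor, boxes_tm1: torch.Tensor,
                        kappa: float = 1.0) -> torch.Tensor:
    """Illustrative trend factor: larger weight for faster-moving objects.

    boxes_* are matched (N, 4) xyxy boxes of the same objects in consecutive
    frames; the weight grows with center displacement normalized by box size.
    """
    ctr_t = (boxes_t[:, :2] + boxes_t[:, 2:]) / 2
    ctr_tm1 = (boxes_tm1[:, :2] + boxes_tm1[:, 2:]) / 2
    size = (boxes_t[:, 2:] - boxes_t[:, :2]).clamp(min=1e-6).norm(dim=1)
    speed = (ctr_t - ctr_tm1).norm(dim=1) / size
    return 1.0 + kappa * speed  # adaptive per-object weight


if __name__ == "__main__":
    dfp = DualFlowPerception(channels=256)
    f_t, f_tm1 = torch.randn(1, 256, 20, 20), torch.randn(1, 256, 20, 20)
    print(dfp(f_t, f_tm1).shape)  # torch.Size([1, 512, 20, 20])

    b_t = torch.tensor([[10., 10., 50., 50.], [100., 100., 140., 140.]])
    b_tm1 = torch.tensor([[10., 10., 50., 50.], [80., 80., 120., 120.]])
    print(trend_aware_weights(b_t, b_tm1))  # static object ~1.0, moving object > 1.0
```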
Related papers
- Real-time Stereo-based 3D Object Detection for Streaming Perception [12.52037626475608]
We introduce StreamDSGN, the first real-time stereo-based 3D object detection framework designed for streaming perception.
StreamDSGN directly predicts the 3D properties of objects in the next moment by leveraging historical information.
Compared with the strong baseline, StreamDSGN significantly improves the streaming average precision by up to 4.33%.
arXiv Detail & Related papers (2024-10-16T09:23:02Z) - Physics-guided Active Sample Reweighting for Urban Flow Prediction [75.24539704456791]
Urban flow prediction is a spatio-temporal modeling task that estimates the throughput of transportation services like buses, taxis and ride-sharing.
Some recent prediction solutions bring remedies with the notion of physics-guided machine learning (PGML).
We develop a physics-guided network (PN) and propose a data-aware framework, Physics-guided Active Sample Reweighting (P-GASR).
arXiv Detail & Related papers (2024-07-18T15:44:23Z) - OFMPNet: Deep End-to-End Model for Occupancy and Flow Prediction in Urban Environment [0.0]
We introduce an end-to-end neural network methodology designed to predict the future behaviors of all dynamic objects in the environment.
We propose a novel time-weighted motion flow loss, whose application has shown a substantial decrease in end-point error.
arXiv Detail & Related papers (2024-04-02T19:37:58Z) - Streaming Motion Forecasting for Autonomous Driving [71.7468645504988]
We introduce a benchmark that queries future trajectories on streaming data, which we refer to as "streaming forecasting".
Our benchmark inherently captures the disappearance and re-appearance of agents, which is a safety-critical problem yet overlooked by snapshot-based benchmarks.
We propose a plug-and-play meta-algorithm called "Predictive Streamer" that can adapt any snapshot-based forecaster into a streaming forecaster.
arXiv Detail & Related papers (2023-10-02T17:13:16Z) - Leveraging the Edge and Cloud for V2X-Based Real-Time Object Detection in Autonomous Driving [0.0]
Environmental perception is a key element of autonomous driving.
In this paper, we investigate the best trade-off between detection quality and latency for real-time perception in autonomous vehicles.
We show that models with adequate compression can be run in real-time on the cloud while outperforming local detection performance.
arXiv Detail & Related papers (2023-08-09T21:39:10Z) - Rethinking Voxelization and Classification for 3D Object Detection [68.8204255655161]
The main challenge in 3D object detection from LiDAR point clouds is achieving real-time performance without affecting the reliability of the network.
We present a solution to improve network inference speed and precision at the same time by implementing a fast dynamic voxelizer.
In addition, we propose a lightweight detection sub-head model for classifying predicted objects and filtering out falsely detected objects.
arXiv Detail & Related papers (2023-01-10T16:22:04Z) - StreamYOLO: Real-time Object Detection for Streaming Perception [84.2559631820007]
We endow the models with the capacity of predicting the future, significantly improving the results for streaming perception.
We consider driving scenes with multiple velocities and propose a velocity-aware streaming AP (VsAP) to jointly evaluate accuracy.
Our simple method achieves state-of-the-art performance on the Argoverse-HD dataset and improves the sAP and VsAP by 4.7% and 8.2%, respectively.
arXiv Detail & Related papers (2022-07-21T12:03:02Z) - Teaching BERT to Wait: Balancing Accuracy and Latency for Streaming Disfluency Detection [3.884530687475798]
A streaming BERT-based sequence tagging model is capable of detecting disfluencies in real time.
The model attains state-of-the-art latency and stability scores when compared with recent work on incremental disfluency detection.
arXiv Detail & Related papers (2022-05-02T02:13:24Z) - Towards Streaming Perception [70.68520310095155]
We present an approach that coherently integrates latency and accuracy into a single metric for real-time online perception.
The key insight behind this metric is to jointly evaluate the output of the entire perception stack at every time instant (see the alignment sketch after this list).
We focus on the illustrative tasks of object detection and instance segmentation in urban video streams, and contribute a novel dataset with high-quality and temporally-dense annotations.
arXiv Detail & Related papers (2020-05-21T01:51:35Z)
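The streaming metric referenced in "Towards Streaming Perception" scores a detector against whatever output it had already finished emitting at each ground-truth time instant, so processing latency directly costs accuracy. The sketch below illustrates only that time-alignment step; the data layout and the downstream AP computation are assumptions for illustration, not the benchmark's actual code.

```python
# Minimal sketch of streaming-evaluation alignment, assuming (emit_time, detections)
# pairs sorted by emit_time; the AP computation itself is omitted.
from bisect import bisect_right
from typing import Any, List, Tuple

Prediction = Tuple[float, Any]  # emit_time = frame_time + model latency


def align_streaming_outputs(predictions: List[Prediction],
                            gt_times: List[float]) -> List[Any]:
    """For each ground-truth timestamp, pick the latest already-emitted output."""
    emit_times = [t for t, _ in predictions]
    aligned = []
    for t in gt_times:
        i = bisect_right(emit_times, t) - 1  # last prediction finished by time t
        aligned.append(predictions[i][1] if i >= 0 else None)  # None = nothing ready yet
    return aligned


if __name__ == "__main__":
    # A 60 ms detector evaluated against 30 Hz ground truth: every frame is
    # matched to detections that are one or two frames old.
    preds = [(0.060, "dets@0ms"), (0.093, "dets@33ms"), (0.126, "dets@66ms")]
    gts = [0.000, 0.033, 0.066, 0.100]
    print(align_streaming_outputs(preds, gts))
    # [None, None, 'dets@0ms', 'dets@33ms']
```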