Transtreaming: Adaptive Delay-aware Transformer for Real-time Streaming Perception
- URL: http://arxiv.org/abs/2409.06584v1
- Date: Tue, 10 Sep 2024 15:26:38 GMT
- Title: Transtreaming: Adaptive Delay-aware Transformer for Real-time Streaming Perception
- Authors: Xiang Zhang, Yufei Cui, Chenchen Fu, Weiwei Wu, Zihao Wang, Yuyang Sun, Xue Liu,
- Abstract summary: This work presents an innovative real-time streaming perception method, Transtreaming, which addresses the challenge of real-time object detection with dynamic computational delay.
The proposed model outperforms the existing state-of-the-art methods, even in single-frame detection scenarios.
Transtreaming meets the stringent real-time processing requirements on all kinds of devices.
- Score: 18.403242474776764
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Real-time object detection is critical for the decision-making process for many real-world applications, such as collision avoidance and path planning in autonomous driving. This work presents an innovative real-time streaming perception method, Transtreaming, which addresses the challenge of real-time object detection with dynamic computational delay. The core innovation of Transtreaming lies in its adaptive delay-aware transformer, which can concurrently predict multiple future frames and select the output that best matches the real-world present time, compensating for any system-induced computation delays. The proposed model outperforms the existing state-of-the-art methods, even in single-frame detection scenarios, by leveraging a transformer-based methodology. It demonstrates robust performance across a range of devices, from powerful V100 to modest 2080Ti, achieving the highest level of perceptual accuracy on all platforms. Unlike most state-of-the-art methods that struggle to complete computation within a single frame on less powerful devices, Transtreaming meets the stringent real-time processing requirements on all kinds of devices. The experimental results emphasize the system's adaptability and its potential to significantly improve the safety and reliability for many real-world systems, such as autonomous driving.
Related papers
- Fast-COS: A Fast One-Stage Object Detector Based on Reparameterized Attention Vision Transformer for Autonomous Driving [3.617580194719686]
This paper introduces Fast-COS, a novel single-stage object detection framework crafted specifically for driving scenes.
RAViT achieves 81.4% Top-1 accuracy on the ImageNet-1K dataset.
It surpasses leading models in efficiency, delivering up to 75.9% faster GPU inference and 1.38 higher throughput on edge devices.
arXiv Detail & Related papers (2025-02-11T09:54:09Z) - Robotic World Model: A Neural Network Simulator for Robust Policy Optimization in Robotics [50.191655141020505]
We introduce a novel framework for learning world models.
By providing a scalable and robust framework, we pave the way for adaptive and efficient robotic systems in real-world applications.
arXiv Detail & Related papers (2025-01-17T10:39:09Z) - CorrDiff: Adaptive Delay-aware Detector with Temporal Cue Inputs for Real-time Object Detection [11.714072240331518]
CorrDiff is designed to tackle the challenge of delays in real-time detection systems.
It is able to utilize runtime-estimated temporal cues to predict objects' locations for multiple future frames.
It meets the stringent real-time processing requirements on all kinds of devices.
arXiv Detail & Related papers (2025-01-09T10:34:25Z) - Event-Aided Time-to-Collision Estimation for Autonomous Driving [28.13397992839372]
We present a novel method that estimates the time to collision using a neuromorphic event-based camera.
The proposed algorithm consists of a two-step approach for efficient and accurate geometric model fitting on event data.
Experiments on both synthetic and real data demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2024-07-10T02:37:36Z) - MTD: Multi-Timestep Detector for Delayed Streaming Perception [0.5439020425819]
Streaming perception is a task of reporting the current state of the world, which is used to evaluate the delay and accuracy of autonomous driving systems.
This paper propose the Multi- Timestep Detector (MTD), an end-to-end detector which uses dynamic routing for multi-branch future prediction.
The proposed method has been evaluated on the Argoverse-HD dataset, and the experimental results show that it has achieved state-of-the-art performance across various delay settings.
arXiv Detail & Related papers (2023-09-13T06:23:58Z) - StreamYOLO: Real-time Object Detection for Streaming Perception [84.2559631820007]
We endow the models with the capacity of predicting the future, significantly improving the results for streaming perception.
We consider multiple velocities driving scene and propose Velocity-awared streaming AP (VsAP) to jointly evaluate the accuracy.
Our simple method achieves the state-of-the-art performance on Argoverse-HD dataset and improves the sAP and VsAP by 4.7% and 8.2% respectively.
arXiv Detail & Related papers (2022-07-21T12:03:02Z) - Scalable Vehicle Re-Identification via Self-Supervision [66.2562538902156]
Vehicle Re-Identification is one of the key elements in city-scale vehicle analytics systems.
Many state-of-the-art solutions for vehicle re-id mostly focus on improving the accuracy on existing re-id benchmarks and often ignore computational complexity.
We propose a simple yet effective hybrid solution empowered by self-supervised training which only uses a single network during inference time.
arXiv Detail & Related papers (2022-05-16T12:14:42Z) - Real-time Object Detection for Streaming Perception [84.2559631820007]
Streaming perception is proposed to jointly evaluate the latency and accuracy into a single metric for video online perception.
We build a simple and effective framework for streaming perception.
Our method achieves competitive performance on Argoverse-HD dataset and improves the AP by 4.9% compared to the strong baseline.
arXiv Detail & Related papers (2022-03-23T11:33:27Z) - Adaptive Anomaly Detection for Internet of Things in Hierarchical Edge
Computing: A Contextual-Bandit Approach [81.5261621619557]
We propose an adaptive anomaly detection scheme with hierarchical edge computing (HEC)
We first construct multiple anomaly detection DNN models with increasing complexity, and associate each of them to a corresponding HEC layer.
Then, we design an adaptive model selection scheme that is formulated as a contextual-bandit problem and solved by using a reinforcement learning policy network.
arXiv Detail & Related papers (2021-08-09T08:45:47Z) - Real-time Streaming Perception System for Autonomous Driving [2.6058660721533187]
We present the real-time steaming perception system, which is also the 2nd Place solution of Streaming Perception Challenge.
Unlike traditional object detection challenges, which focus mainly on the absolute performance, streaming perception task requires achieving a balance of accuracy and latency.
On the Argoverse-HD test set, our method achieves 33.2 streaming AP (34.6 streaming AP verified by the organizer) under the required hardware.
arXiv Detail & Related papers (2021-07-30T01:32:44Z) - Towards Streaming Perception [70.68520310095155]
We present an approach that coherently integrates latency and accuracy into a single metric for real-time online perception.
The key insight behind this metric is to jointly evaluate the output of the entire perception stack at every time instant.
We focus on the illustrative tasks of object detection and instance segmentation in urban video streams, and contribute a novel dataset with high-quality and temporally-dense annotations.
arXiv Detail & Related papers (2020-05-21T01:51:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.