Transformer-based models and hardware acceleration analysis in
autonomous driving: A survey
- URL: http://arxiv.org/abs/2304.10891v1
- Date: Fri, 21 Apr 2023 11:15:31 GMT
- Title: Transformer-based models and hardware acceleration analysis in
autonomous driving: A survey
- Authors: Juan Zhong, Zheng Liu, Xi Chen
- Abstract summary: This survey reviews Transformer-based models specifically tailored for autonomous driving tasks such as lane detection, segmentation, tracking, planning, and decision-making.
We review different architectures for organizing Transformer inputs and outputs, such as encoder-decoder and encoder-only structures.
We discuss Transformer-related operators and their hardware acceleration schemes in depth, taking into account key factors such as quantization and runtime.
- Score: 7.129512302898792
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transformer architectures have exhibited promising performance in various
autonomous driving applications in recent years. Meanwhile, their dedicated
hardware acceleration on portable computational platforms has become the next
critical step for practical deployment in real autonomous vehicles.
This survey paper provides a comprehensive overview, benchmark, and analysis of
Transformer-based models specifically tailored for autonomous driving tasks
such as lane detection, segmentation, tracking, planning, and decision-making.
We review different architectures for organizing Transformer inputs and
outputs, such as encoder-decoder and encoder-only structures, and explore their
respective advantages and disadvantages. Furthermore, we discuss
Transformer-related operators and their hardware acceleration schemes in depth,
taking into account key factors such as quantization and runtime. We
specifically illustrate the operator-level comparison between layers from
convolutional neural networks, the Swin Transformer, and the Transformer with 4D
encoder. The paper also highlights the challenges, trends, and current insights
in Transformer-based models, addressing their hardware deployment and
acceleration issues within the context of long-term autonomous driving
applications.
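As a concrete illustration of the operator-level and quantization factors the survey analyzes, the following sketch (not from the paper) implements scaled dot-product attention in NumPy and compares a float32 score matmul against a symmetric per-tensor int8-quantized one, the kind of trade-off hardware accelerators must make. All shapes and the quantization scheme are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    return softmax(scores) @ V

def quantize_int8(x):
    """Symmetric per-tensor int8 quantization: x ~= scale * q."""
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((4, 8)) for _ in range(3))

# Full-precision reference path.
out_fp32 = attention(Q, K, V)

# Quantized path: integer matmul for Q K^T, dequantize before the softmax,
# mimicking how an int8 accelerator would compute the score operator.
qQ, sQ = quantize_int8(Q)
qK, sK = quantize_int8(K)
scores = (qQ.astype(np.int32) @ qK.astype(np.int32).T) * (sQ * sK) / np.sqrt(8)
out_int8 = softmax(scores) @ V

print(np.abs(out_fp32 - out_int8).max())  # quantization error stays small
```

The nonlinear softmax is the part that resists pure integer execution, which is why operator-level analyses of Transformer accelerators treat it separately from the matmuls.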
Related papers
- A Survey of Vision Transformers in Autonomous Driving: Current Trends
and Future Directions [0.0]
This survey explores the adaptation of visual transformer models in Autonomous Driving.
It focuses on foundational concepts such as self-attention, multi-head attention, and encoder-decoder architecture.
The survey concludes with future research directions, highlighting the growing role of Vision Transformers in Autonomous Driving.
arXiv Detail & Related papers (2024-03-12T11:29:40Z)
- Isomer: Isomerous Transformer for Zero-shot Video Object Segmentation [59.91357714415056]
We propose two Transformer variants: the Context-Sharing Transformer (CST) and the Semantic Gathering-Scattering Transformer (SGST).
CST learns globally shared contextual information within image frames with lightweight computation; SGST models the semantic correlation separately for the foreground and background.
Compared with a baseline that uses vanilla Transformers for multi-stage fusion, ours increases speed by 13x and achieves new state-of-the-art ZVOS performance.
arXiv Detail & Related papers (2023-08-13T06:12:00Z)
- Exploring the Performance and Efficiency of Transformer Models for NLP on Mobile Devices [3.809702129519641]
New deep neural network (DNN) architectures and approaches are emerging every few years, driving the field's advancement.
Transformers are a relatively new model family that has achieved new levels of accuracy across AI tasks, but poses significant computational challenges.
This work takes steps toward bridging that gap by examining the current state of Transformers' on-device execution.
arXiv Detail & Related papers (2023-06-20T10:15:01Z)
- A Comprehensive Survey on Applications of Transformers for Deep Learning Tasks [60.38369406877899]
Transformer is a deep neural network that employs a self-attention mechanism to comprehend the contextual relationships within sequential data.
Transformer models excel at handling long-range dependencies between input sequence elements and enable parallel processing.
Our survey encompasses the identification of the top five application domains for transformer-based models.
arXiv Detail & Related papers (2023-06-11T23:13:51Z)
- Transformers in Time-series Analysis: A Tutorial [0.0]
Transformer architecture has widespread applications, particularly in Natural Language Processing and computer vision.
This tutorial provides an overview of the Transformer architecture, its applications, and a collection of examples from recent research papers in time-series analysis.
arXiv Detail & Related papers (2022-04-28T05:17:45Z)
- Thinking Like Transformers [64.96770952820691]
We propose a computational model for the transformer-encoder in the form of a programming language.
We show how RASP can be used to program solutions to tasks that could conceivably be learned by a Transformer.
We provide RASP programs for histograms, sorting, and Dyck-languages.
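The histogram program mentioned above can be paraphrased outside RASP itself. The following Python sketch mimics the RASP pattern of selecting matching tokens and counting matches per position; `select` and `selector_width` here are simplified stand-ins for the RASP primitives, not the paper's exact syntax.

```python
def select(keys, queries, predicate):
    # Boolean selection matrix: entry [q][k] is True when the predicate
    # holds, analogous to RASP's select (an un-normalized attention pattern).
    return [[predicate(k, q) for k in keys] for q in queries]

def selector_width(sel):
    # Number of selected positions per query, analogous to selector_width.
    return [sum(row) for row in sel]

def histogram(tokens):
    # hist = selector_width(select(tokens, tokens, ==)):
    # each position receives the count of its own token in the sequence.
    return selector_width(select(tokens, tokens, lambda k, q: k == q))

print(histogram("hello"))  # [1, 1, 2, 2, 1]
```

The point of the RASP abstraction is that such a program maps directly onto one attention layer: the selection matrix plays the role of the attention pattern, and the width computation is an aggregation over it.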
arXiv Detail & Related papers (2021-06-13T13:04:46Z)
- Spatiotemporal Transformer for Video-based Person Re-identification [102.58619642363958]
We show that, despite the strong learning ability, the vanilla Transformer suffers from an increased risk of over-fitting.
We propose a novel pipeline where the model is pre-trained on a set of synthesized video data and then transferred to the downstream domains.
The derived algorithm achieves significant accuracy gain on three popular video-based person re-identification benchmarks.
arXiv Detail & Related papers (2021-03-30T16:19:27Z)
- Transformers Solve the Limited Receptive Field for Monocular Depth Prediction [82.90445525977904]
We propose TransDepth, an architecture which benefits from both convolutional neural networks and transformers.
This is the first paper to apply transformers to pixel-wise prediction problems involving continuous labels.
arXiv Detail & Related papers (2021-03-22T18:00:13Z)
- Transformers in Vision: A Survey [101.07348618962111]
Transformers enable modeling long-range dependencies between input sequence elements and support parallel processing of sequences.
Transformers require minimal inductive biases for their design and are naturally suited as set-functions.
This survey aims to provide a comprehensive overview of the Transformer models in the computer vision discipline.
arXiv Detail & Related papers (2021-01-04T18:57:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.