Transformer-Based Sensor Fusion for Autonomous Driving: A Survey
- URL: http://arxiv.org/abs/2302.11481v1
- Date: Wed, 22 Feb 2023 16:28:20 GMT
- Title: Transformer-Based Sensor Fusion for Autonomous Driving: A Survey
- Authors: Apoorv Singh
- Abstract summary: A transformer-based detection head paired with a CNN-based feature encoder that extracts features from raw sensor data has emerged as one of the best-performing sensor-fusion frameworks for 3D detection.
We briefly cover the basics of Vision Transformers (ViT) so that readers can easily follow the paper.
We conclude by summarizing sensor-fusion trends to follow and by encouraging future research.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sensor fusion is an essential topic in many perception systems, such as
autonomous driving and robotics. A transformer-based detection head paired with a
CNN-based feature encoder that extracts features from raw sensor data has emerged
as one of the best-performing sensor-fusion frameworks for 3D detection, according
to the dataset leaderboards. In this work we provide an in-depth literature
survey of recent transformer-based 3D object detection, focusing primarily on
sensor fusion. We also briefly cover the basics of Vision Transformers (ViT) so
that readers can easily follow the paper. In addition, we briefly review a few
of the less dominant, non-transformer-based methods for sensor fusion in
autonomous driving. We conclude by summarizing sensor-fusion trends to follow
and by encouraging future research.
An updated summary is available at:
https://github.com/ApoorvRoboticist/Transformers-Sensor-Fusion
Related papers
- ACROSS: A Deformation-Based Cross-Modal Representation for Robotic Tactile Perception [1.5566524830295307]
ACROSS is a framework for translating data between tactile sensors by exploiting sensor deformation information.
We demonstrate our approach on the most challenging problem: going from a low-dimensional tactile representation to a high-dimensional one.
arXiv Detail & Related papers (2024-11-13T11:29:14Z)
- Transferring Tactile Data Across Sensors [1.5566524830295307]
This article introduces a novel method for translating data between tactile sensors.
We demonstrate the approach by translating BioTac signals into the DIGIT sensor.
Our framework consists of three steps: first, converting signal data into corresponding 3D deformation meshes; second, translating these 3D deformation meshes from one sensor to another; and third, generating output images.
arXiv Detail & Related papers (2024-10-18T09:15:47Z)
- Multi-Modal 3D Object Detection by Box Matching [109.43430123791684]
We propose a novel Fusion network by Box Matching (FBMNet) for multi-modal 3D detection.
With the learned assignments between 3D and 2D object proposals, fusion for detection can be performed effectively by combining their ROI features.
arXiv Detail & Related papers (2023-05-12T18:08:51Z)
- Transformers in Remote Sensing: A Survey [76.95730131233424]
We are the first to present a systematic review of advances based on transformers in remote sensing.
Our survey covers more than 60 recent transformer-based methods for different remote sensing problems.
We conclude the survey by discussing different challenges and open issues of transformers in remote sensing.
arXiv Detail & Related papers (2022-09-02T17:57:05Z)
- TransFuser: Imitation with Transformer-Based Sensor Fusion for Autonomous Driving [46.409930329699336]
We propose TransFuser, a mechanism to integrate image and LiDAR representations using self-attention.
Our approach uses transformer modules at multiple resolutions to fuse perspective view and bird's eye view feature maps.
We experimentally validate its efficacy on a challenging new benchmark with long routes and dense traffic, as well as the official leaderboard of the CARLA urban driving simulator.
arXiv Detail & Related papers (2022-05-31T17:57:19Z)
- TransFusion: Robust LiDAR-Camera Fusion for 3D Object Detection with Transformers [49.689566246504356]
We propose TransFusion, a robust solution to LiDAR-camera fusion with a soft-association mechanism to handle inferior image conditions.
TransFusion achieves state-of-the-art performance on large-scale datasets.
We extend the proposed method to the 3D tracking task and achieve 1st place on the nuScenes tracking leaderboard.
arXiv Detail & Related papers (2022-03-22T07:15:13Z)
- ViDT: An Efficient and Effective Fully Transformer-based Object Detector [97.71746903042968]
Detection transformers are the first fully end-to-end learning systems for object detection.
Vision transformers are the first fully transformer-based architecture for image classification.
In this paper, we integrate Vision and Detection Transformers (ViDT) to build an effective and efficient object detector.
arXiv Detail & Related papers (2021-10-08T06:32:05Z)
- Radar Voxel Fusion for 3D Object Detection [0.0]
This paper develops a low-level sensor fusion network for 3D object detection.
The radar sensor fusion proves especially beneficial in inclement conditions such as rain and night scenes.
arXiv Detail & Related papers (2021-06-26T20:34:12Z)
- Multi-Modal 3D Object Detection in Autonomous Driving: a Survey [10.913958563906931]
Self-driving cars are equipped with a suite of sensors to conduct robust and accurate environment perception.
As the number and type of sensors keep increasing, combining them for better perception is becoming a natural trend.
This survey is devoted to reviewing recent fusion-based deep learning models for 3D detection that leverage multiple sensor data sources.
arXiv Detail & Related papers (2021-06-24T02:52:12Z)
- Spatiotemporal Transformer for Video-based Person Re-identification [102.58619642363958]
We show that, despite its strong learning ability, the vanilla Transformer suffers from an increased risk of over-fitting.
We propose a novel pipeline where the model is pre-trained on a set of synthesized video data and then transferred to the downstream domains.
The derived algorithm achieves a significant accuracy gain on three popular video-based person re-identification benchmarks.
arXiv Detail & Related papers (2021-03-30T16:19:27Z)