TransParking: A Dual-Decoder Transformer Framework with Soft Localization for End-to-End Automatic Parking
- URL: http://arxiv.org/abs/2503.06071v1
- Date: Sat, 08 Mar 2025 05:41:24 GMT
- Title: TransParking: A Dual-Decoder Transformer Framework with Soft Localization for End-to-End Automatic Parking
- Authors: Hangyu Du, Chee-Meng Chew
- Abstract summary: We present a vision-based transformer model for end-to-end automatic parking, trained using expert trajectories. Experimental results show that the model's errors are approximately 50% lower than those of the current state-of-the-art end-to-end trajectory prediction algorithm of the same type.
- Score: 2.209921757303168
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, fully differentiable end-to-end autonomous driving systems have become a research hotspot in the field of intelligent transportation. Among various research directions, automatic parking is particularly critical as it aims to enable precise vehicle parking in complex environments. In this paper, we present a purely vision-based transformer model for end-to-end automatic parking, trained using expert trajectories. Given camera-captured data as input, the proposed model directly outputs future trajectory coordinates. Experimental results demonstrate that the various errors of our model have decreased by approximately 50% in comparison with the current state-of-the-art end-to-end trajectory prediction algorithm of the same type. Our approach thus provides an effective solution for fully differentiable automatic parking.
Related papers
- ParkDiffusion: Heterogeneous Multi-Agent Multi-Modal Trajectory Prediction for Automated Parking using Diffusion Models [6.58562706945347]
ParkDiffusion is a novel approach that predicts the trajectories of both vehicles and pedestrians in automated parking scenarios.
ParkDiffusion employs diffusion models to capture the inherent uncertainty and multi-modality of future trajectories.
We evaluate ParkDiffusion on the Dragon Lake Parking dataset and the Intersections Drone dataset.
arXiv Detail & Related papers (2025-05-01T15:16:59Z)
- Spatial Temporal Attention based Target Vehicle Trajectory Prediction for Internet of Vehicles [4.1268583353286]
This study introduces a Spatio-Temporal Attention-based methodology for Target Vehicle Trajectory Prediction (VTPred).
We map the vehicle trajectory onto a directed graph, after which spatial attributes are extracted via Graph Attention Networks (GATs).
The Transformer is employed to yield temporal features from the sequence, resulting in precise trajectory prediction.
arXiv Detail & Related papers (2025-01-01T16:37:24Z)
- DiFSD: Ego-Centric Fully Sparse Paradigm with Uncertainty Denoising and Iterative Refinement for Efficient End-to-End Self-Driving [55.53171248839489]
We propose an ego-centric fully sparse paradigm, named DiFSD, for end-to-end self-driving.
Specifically, DiFSD mainly consists of sparse perception, hierarchical interaction and iterative motion planner.
Experiments conducted on nuScenes and Bench2Drive datasets demonstrate the superior planning performance and great efficiency of DiFSD.
arXiv Detail & Related papers (2024-09-15T15:55:24Z)
- Automated Parking Planning with Vision-Based BEV Approach [10.936433798200907]
This paper proposes an improved automated parking algorithm based on the A* algorithm, integrating vehicle kinematic models, function optimization, bidirectional search, and Bezier curve optimization.
Compared to traditional algorithms, this approach demonstrates reduced computation time with more challenging collision-risk test cases and improved performance in comfort metrics.
arXiv Detail & Related papers (2024-05-24T15:26:09Z)
- Leverage Multi-source Traffic Demand Data Fusion with Transformer Model for Urban Parking Prediction [4.672121078249809]
This study proposes a parking availability prediction framework integrating spatial-temporal deep learning with multi-source data fusion.
The framework is based on the Transformer as the spatial-temporal deep learning model.
Real-world empirical data was used to verify the effectiveness of the proposed method.
arXiv Detail & Related papers (2024-05-02T07:28:27Z)
- AIDE: An Automatic Data Engine for Object Detection in Autonomous Driving [68.73885845181242]
We propose an Automatic Data Engine (AIDE) that automatically identifies issues, efficiently curates data, improves the model through auto-labeling, and verifies the model through generation of diverse scenarios.
We further establish a benchmark for open-world detection on AV datasets to comprehensively evaluate various learning paradigms, demonstrating our method's superior performance at a reduced cost.
arXiv Detail & Related papers (2024-03-26T04:27:56Z)
- Leveraging Driver Field-of-View for Multimodal Ego-Trajectory Prediction [69.29802752614677]
RouteFormer is a novel ego-trajectory prediction network combining GPS data, environmental context, and the driver's field-of-view.
To tackle data scarcity and enhance diversity, we introduce GEM, a dataset of urban driving scenarios enriched with synchronized driver field-of-view and gaze data.
arXiv Detail & Related papers (2023-12-13T23:06:30Z)
- Automated Automotive Radar Calibration With Intelligent Vehicles [73.15674960230625]
We present an approach for automated and geo-referenced calibration of automotive radar sensors.
Our method does not require external modifications of a vehicle and instead uses the location data obtained from automated vehicles.
Our evaluation on data from a real testing site shows that our method can correctly calibrate infrastructure sensors in an automated manner.
arXiv Detail & Related papers (2023-06-23T07:01:10Z)
- Exploring Contextual Representation and Multi-Modality for End-to-End Autonomous Driving [58.879758550901364]
Recent perception systems enhance spatial understanding with sensor fusion but often lack full environmental context.
We introduce a framework that integrates three cameras to emulate the human field of view, coupled with top-down bird-eye-view semantic data to enhance contextual representation.
Our method achieves a displacement error of 0.67 m in open-loop settings, surpassing current methods by 6.9% on the nuScenes dataset.
arXiv Detail & Related papers (2022-10-13T05:56:20Z)
- ParkPredict+: Multimodal Intent and Motion Prediction for Vehicles in Parking Lots with CNN and Transformer [11.287187018907284]
This paper addresses multimodal intent and trajectory prediction for human-driven vehicles in parking lots.
Using models designed with CNN and Transformer networks, we extract temporal-spatial and contextual information from trajectory history and local bird's eye view semantic images.
Our method outperforms existing models in accuracy, while allowing an arbitrary number of modes.
In addition, we present the first public human driving dataset in parking lots with high resolution and rich traffic scenarios.
arXiv Detail & Related papers (2022-04-17T01:54:25Z)
- Model-based Decision Making with Imagination for Autonomous Parking [50.41076449007115]
The proposed algorithm consists of three parts: an imaginative model for anticipating results before parking, an improved rapidly-exploring random tree (RRT), and a path smoothing module.
Our algorithm is based on a real kinematic vehicle model, which makes it more suitable for application on real autonomous cars.
In order to evaluate the algorithm's effectiveness, we have compared our algorithm with traditional RRT, within three different parking scenarios.
arXiv Detail & Related papers (2021-08-25T18:24:34Z)
- ParkPredict: Motion and Intent Prediction of Vehicles in Parking Lots [65.33650222396078]
We develop a parking lot environment and collect a dataset of human parking maneuvers.
We compare a multi-modal Long Short-Term Memory (LSTM) prediction model and a Convolutional Neural Network LSTM (CNN-LSTM) to a physics-based Extended Kalman Filter (EKF) baseline.
Our results show that 1) intent can be estimated well (roughly 85% top-1 accuracy and nearly 100% top-3 accuracy with the LSTM and CNN-LSTM model); 2) knowledge of the human driver's intended parking spot has a major impact on predicting parking trajectory; and 3) the semantic representation of the environment
arXiv Detail & Related papers (2020-04-21T20:46:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.