ParkFormer: A Transformer-Based Parking Policy with Goal Embedding and Pedestrian-Aware Control
- URL: http://arxiv.org/abs/2506.16856v1
- Date: Fri, 20 Jun 2025 09:14:09 GMT
- Title: ParkFormer: A Transformer-Based Parking Policy with Goal Embedding and Pedestrian-Aware Control
- Authors: Jun Fu, Bin Tian, Haonan Chen, Shi Meng, Tingting Yao,
- Abstract summary: A Transformer-based end-to-end framework for autonomous parking learns from expert demonstrations. The network takes as input surround-view camera images, goal-point representations, ego vehicle motion, and pedestrian trajectories. Experiments show our model achieves a high success rate of 96.57%, with average positional and orientation errors of 0.21 meters and 0.41 degrees, respectively.
- Score: 8.713707183974304
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Autonomous parking plays a vital role in intelligent vehicle systems, particularly in constrained urban environments where high-precision control is required. While traditional rule-based parking systems struggle with environmental uncertainties and lack adaptability in crowded or dynamic scenes, human drivers demonstrate the ability to park intuitively without explicit modeling. Inspired by this observation, we propose a Transformer-based end-to-end framework for autonomous parking that learns from expert demonstrations. The network takes as input surround-view camera images, goal-point representations, ego vehicle motion, and pedestrian trajectories. It outputs discrete control sequences including throttle, braking, steering, and gear selection. A novel cross-attention module integrates BEV features with target points, and a GRU-based pedestrian predictor enhances safety by modeling dynamic obstacles. We validate our method on the CARLA 0.9.14 simulator in both vertical and parallel parking scenarios. Experiments show our model achieves a high success rate of 96.57%, with average positional and orientation errors of 0.21 meters and 0.41 degrees, respectively. The ablation studies further demonstrate the effectiveness of key modules such as pedestrian prediction and goal-point attention fusion. The code and dataset will be released at: https://github.com/little-snail-f/ParkFormer.
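The abstract describes the model's interfaces concretely enough to sketch its rough shape: BEV image features fused with a goal-point embedding through cross-attention, a GRU encoding observed pedestrian trajectories, and discrete heads for throttle, brake, steering, and gear. The PyTorch snippet below is only a minimal sketch of that structure; every module name, dimension, and bin count is an illustrative assumption rather than the authors' released implementation (see the linked repository for the real code).

```python
# Minimal, illustrative sketch of a ParkFormer-style policy head.
# All dimensions, bin counts, and module names are assumptions made for this
# example; they are NOT taken from the authors' released implementation.
import torch
import torch.nn as nn


class ParkingPolicySketch(nn.Module):
    def __init__(self, bev_dim=256, goal_dim=3, ped_dim=2, ego_dim=4,
                 hidden=256, n_throttle=10, n_brake=10, n_steer=21, n_gear=3):
        super().__init__()
        # Goal point (e.g. x, y, yaw) embedded as the cross-attention query.
        self.goal_embed = nn.Linear(goal_dim, hidden)
        # Cross-attention: the goal query attends over flattened BEV feature tokens.
        self.goal_attn = nn.MultiheadAttention(hidden, num_heads=8,
                                               kdim=bev_dim, vdim=bev_dim,
                                               batch_first=True)
        # GRU over observed pedestrian (x, y) positions models dynamic obstacles.
        self.ped_gru = nn.GRU(ped_dim, hidden, batch_first=True)
        # Ego motion (speed, yaw rate, ...) projected into the same feature space.
        self.ego_proj = nn.Linear(ego_dim, hidden)
        # Fused features feed discrete control heads ("discrete control sequences").
        self.fuse = nn.Linear(3 * hidden, hidden)
        self.throttle = nn.Linear(hidden, n_throttle)
        self.brake = nn.Linear(hidden, n_brake)
        self.steer = nn.Linear(hidden, n_steer)
        self.gear = nn.Linear(hidden, n_gear)

    def forward(self, bev_tokens, goal, ped_hist, ego_state):
        # bev_tokens: (B, N, bev_dim), goal: (B, goal_dim),
        # ped_hist: (B, T, ped_dim), ego_state: (B, ego_dim)
        q = self.goal_embed(goal).unsqueeze(1)                 # (B, 1, hidden)
        goal_ctx, _ = self.goal_attn(q, bev_tokens, bev_tokens)
        _, ped_h = self.ped_gru(ped_hist)                      # (1, B, hidden)
        fused = torch.cat([goal_ctx.squeeze(1), ped_h[-1],
                           self.ego_proj(ego_state)], dim=-1)
        h = torch.relu(self.fuse(fused))
        return {
            "throttle": self.throttle(h),  # logits over discretized throttle bins
            "brake": self.brake(h),
            "steer": self.steer(h),
            "gear": self.gear(h),          # e.g. reverse / neutral / drive
        }
```

Under imitation of expert demonstrations, each head would naturally be supervised with a cross-entropy loss against the expert's binned action, though the paper's exact loss and discretization are not stated in this digest.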
Related papers
- ParkDiffusion: Heterogeneous Multi-Agent Multi-Modal Trajectory Prediction for Automated Parking using Diffusion Models [6.58562706945347]
ParkDiffusion is a novel approach that predicts the trajectories of both vehicles and pedestrians in automated parking scenarios. ParkDiffusion employs diffusion models to capture the inherent uncertainty and multi-modality of future trajectories. We evaluate ParkDiffusion on the Dragon Lake Parking dataset and the Intersections Drone dataset.
arXiv Detail & Related papers (2025-05-01T15:16:59Z) - GPD-1: Generative Pre-training for Driving [77.06803277735132]
We propose a unified Generative Pre-training for Driving (GPD-1) model to accomplish all these tasks. We represent each scene with ego, agent, and map tokens and formulate autonomous driving as a unified token generation problem. Our GPD-1 successfully generalizes to various tasks without finetuning, including scene generation, traffic simulation, closed-loop simulation, map prediction, and motion planning.
arXiv Detail & Related papers (2024-12-11T18:59:51Z) - Planning with Adaptive World Models for Autonomous Driving [50.4439896514353]
We present nuPlan, a real-world motion planning benchmark that captures multi-agent interactions. We learn to model such unique behaviors with BehaviorNet, a graph convolutional neural network (GCNN). We also present AdaptiveDriver, a model-predictive control (MPC) based planner that unrolls different world models conditioned on BehaviorNet's predictions.
arXiv Detail & Related papers (2024-06-15T18:53:45Z) - Online Calibration of a Single-Track Ground Vehicle Dynamics Model by Tight Fusion with Visual-Inertial Odometry [8.165828311550152]
We present ST-VIO, a novel approach which tightly fuses a single-track dynamics model for wheeled ground vehicles with visual-inertial odometry (VIO).
Our method calibrates and adapts the dynamics model online to improve the accuracy of forward prediction conditioned on future control inputs.
arXiv Detail & Related papers (2023-09-20T08:50:30Z) - Robust Autonomous Vehicle Pursuit without Expert Steering Labels [41.168074206046164]
We present a learning method for lateral and longitudinal motion control of an ego-vehicle for vehicle pursuit.
The controlled car does not follow a pre-defined route; rather, it reactively adapts to follow a target vehicle while maintaining a safety distance.
We extensively validate our approach using the CARLA simulator on a wide range of terrains.
arXiv Detail & Related papers (2023-08-16T14:09:39Z) - Motion Planning and Control for Multi Vehicle Autonomous Racing at High Speeds [100.61456258283245]
This paper presents a multi-layer motion planning and control architecture for autonomous racing.
The proposed solution has been applied to a Dallara AV-21 racecar and tested on oval race tracks, achieving lateral accelerations up to 25 $m/s^2$.
arXiv Detail & Related papers (2022-07-22T15:16:54Z) - Control-Aware Prediction Objectives for Autonomous Driving [78.19515972466063]
We present control-aware prediction objectives (CAPOs) to evaluate the downstream effect of predictions on control without requiring the planner be differentiable.
We propose two types of importance weights that weight the predictive likelihood: one using an attention model between agents, and another based on control variation when exchanging predicted trajectories for ground truth trajectories.
arXiv Detail & Related papers (2022-04-28T07:37:21Z) - Fully End-to-end Autonomous Driving with Semantic Depth Cloud Mapping and Multi-Agent [2.512827436728378]
We propose a novel deep learning model trained in an end-to-end, multi-task manner to perform perception and control tasks simultaneously.
The model is evaluated on the CARLA simulator in scenarios combining normal and adversarial situations with different weather conditions to mimic real-world settings.
arXiv Detail & Related papers (2022-04-12T03:57:01Z) - Detecting 32 Pedestrian Attributes for Autonomous Vehicles [103.87351701138554]
In this paper, we address the problem of jointly detecting pedestrians and recognizing 32 pedestrian attributes.
We introduce a Multi-Task Learning (MTL) model relying on a composite field framework, which achieves both goals in an efficient way.
We show competitive detection and attribute recognition results, as well as a more stable MTL training.
arXiv Detail & Related papers (2020-12-04T15:10:12Z) - BayesRace: Learning to race autonomously using prior experience [20.64931380046805]
We present a model-based planning and control framework for autonomous racing.
Our approach alleviates the gap induced by simulation-based controller design by learning from on-board sensor measurements.
arXiv Detail & Related papers (2020-05-10T19:15:06Z) - ParkPredict: Motion and Intent Prediction of Vehicles in Parking Lots [65.33650222396078]
We develop a parking lot environment and collect a dataset of human parking maneuvers.
We compare a multi-modal Long Short-Term Memory (LSTM) prediction model and a Convolutional Neural Network LSTM (CNN-LSTM) to a physics-based Extended Kalman Filter (EKF) baseline (a minimal illustrative sketch contrasting a learned predictor with such a baseline follows this list).
Our results show that 1) intent can be estimated well (roughly 85% top-1 accuracy and nearly 100% top-3 accuracy with the LSTM and CNN-LSTM model); 2) knowledge of the human driver's intended parking spot has a major impact on predicting parking trajectory; and 3) the semantic representation of the environment
arXiv Detail & Related papers (2020-04-21T20:46:32Z)
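The ParkPredict entry above contrasts learned sequence models (LSTM, CNN-LSTM) with a physics-based EKF baseline for parking-lot trajectory prediction. As a rough, hedged illustration of that kind of comparison, the sketch below pairs a minimal LSTM waypoint predictor with a constant-velocity extrapolation baseline; the class and function names, shapes, and horizon are assumptions for this example, not code from ParkPredict or ParkFormer.

```python
# Illustrative contrast between a learned LSTM waypoint predictor and a
# constant-velocity extrapolation baseline. Names, shapes, and the horizon are
# assumptions for this sketch, not code from ParkPredict or ParkFormer.
import torch
import torch.nn as nn


class LSTMWaypointPredictor(nn.Module):
    """Predicts a short horizon of future (x, y) positions from a position history."""

    def __init__(self, hidden=64, horizon=10):
        super().__init__()
        self.horizon = horizon
        self.encoder = nn.LSTM(input_size=2, hidden_size=hidden, batch_first=True)
        self.decoder = nn.Linear(hidden, 2 * horizon)

    def forward(self, history):                 # history: (B, T, 2) past positions
        _, (h, _) = self.encoder(history)       # final hidden state: (1, B, hidden)
        return self.decoder(h[-1]).view(-1, self.horizon, 2)


def constant_velocity_baseline(history, horizon=10):
    """Physics-style baseline: extrapolate the last observed per-step displacement."""
    vel = history[:, -1, :] - history[:, -2, :]                     # (B, 2)
    steps = torch.arange(1, horizon + 1, dtype=history.dtype).view(1, horizon, 1)
    return history[:, -1:, :] + steps * vel.unsqueeze(1)            # (B, horizon, 2)
```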