Exploring Transformer-Augmented LSTM for Temporal and Spatial Feature Learning in Trajectory Prediction
- URL: http://arxiv.org/abs/2412.13419v1
- Date: Wed, 18 Dec 2024 01:31:08 GMT
- Title: Exploring Transformer-Augmented LSTM for Temporal and Spatial Feature Learning in Trajectory Prediction
- Authors: Chandra Raskoti, Weizi Li
- Abstract summary: This work explores the integration of a Transformer-based model with a Long Short-Term Memory (LSTM) based technique.
The proposed model is benchmarked against predecessor LSTM-based methods, including STA-LSTM, SA-LSTM, CS-LSTM, and NaiveLSTM.
- Score: 1.7273380623090846
- License:
- Abstract: Accurate vehicle trajectory prediction is crucial for ensuring safe and efficient autonomous driving. This work explores the integration of a Transformer-based model with a Long Short-Term Memory (LSTM) based technique to enhance spatial and temporal feature learning in vehicle trajectory prediction. We propose a hybrid model that combines LSTMs for temporal encoding with a Transformer encoder for capturing complex interactions between vehicles. Spatial trajectory features of the neighboring vehicles are processed through a masked scatter mechanism in a grid-based environment and then combined with the temporal trajectories of the vehicles. This combined trajectory data is learned through sequential LSTM encoding and Transformer-based attention layers. The proposed model is benchmarked against predecessor LSTM-based methods, including STA-LSTM, SA-LSTM, CS-LSTM, and NaiveLSTM. Our results, while not outperforming its predecessors, demonstrate the potential of integrating Transformers with LSTM-based techniques to build interpretable trajectory prediction models. Future work will explore alternative Transformer-based architectures to further enhance performance. This study provides a promising direction for improving trajectory prediction models by leveraging Transformer-based architectures, paving the way for more robust and interpretable vehicle trajectory prediction systems.
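As a reading aid, the pipeline described in the abstract (per-vehicle LSTM temporal encoding, a masked scatter of neighbor encodings into a spatial grid, and a Transformer encoder attending over that grid) can be sketched roughly as follows. This is a minimal illustration under assumed layer sizes, grid dimensions, and tensor layouts; it is not the authors' reference implementation.

```python
# Minimal sketch of the described LSTM + Transformer hybrid.
# Layer sizes, the 13x3 grid, and the tensor layout are hypothetical choices.
import torch
import torch.nn as nn

class HybridLSTMTransformer(nn.Module):
    def __init__(self, in_dim=2, enc_dim=64, grid_size=(13, 3),
                 n_heads=4, n_layers=2, pred_len=25):
        super().__init__()
        self.grid_size = grid_size
        self.pred_len = pred_len
        # Shared LSTM encodes each vehicle's history (temporal features).
        self.temporal_lstm = nn.LSTM(in_dim, enc_dim, batch_first=True)
        # Transformer encoder attends over grid cells (spatial interactions).
        layer = nn.TransformerEncoderLayer(d_model=enc_dim, nhead=n_heads,
                                           batch_first=True)
        self.interaction_encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        # Decoder maps the fused feature to future (x, y) positions.
        self.decoder = nn.Linear(2 * enc_dim, pred_len * 2)

    def encode(self, hist):
        # hist: (N, T, 2) -> last LSTM hidden state as a (N, enc_dim) summary.
        _, (h_n, _) = self.temporal_lstm(hist)
        return h_n[-1]

    def forward(self, target_hist, nbr_hist, nbr_cell_idx):
        # target_hist: (B, T, 2); nbr_hist: (M, T, 2) for all neighbors in the
        # batch; nbr_cell_idx: (M, 2) with (batch index, flat grid-cell index).
        B = target_hist.size(0)
        n_cells = self.grid_size[0] * self.grid_size[1]
        target_enc = self.encode(target_hist)            # (B, enc_dim)
        nbr_enc = self.encode(nbr_hist)                  # (M, enc_dim)

        # Masked scatter: drop each neighbor encoding into its grid cell,
        # leaving unoccupied cells at zero.
        grid = target_enc.new_zeros(B, n_cells, nbr_enc.size(-1))
        grid[nbr_cell_idx[:, 0], nbr_cell_idx[:, 1]] = nbr_enc

        # Self-attention over grid cells captures vehicle-to-vehicle interactions.
        social = self.interaction_encoder(grid).mean(dim=1)   # (B, enc_dim)

        fused = torch.cat([target_enc, social], dim=-1)
        return self.decoder(fused).view(B, self.pred_len, 2)
```

The masked scatter places each neighbor's LSTM summary into its occupancy-grid cell, so the Transformer's self-attention operates over spatially arranged interaction slots rather than an unordered set of neighbors.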
Related papers
- ConvLSTMTransNet: A Hybrid Deep Learning Approach for Internet Traffic Telemetry [0.0]
We present a novel hybrid deep learning model, named ConvLSTMTransNet, designed for time series prediction.
Our findings demonstrate that ConvLSTMTransNet significantly outperforms the baseline models by approximately 10% in terms of prediction accuracy.
arXiv Detail & Related papers (2024-09-20T03:12:57Z)
- Crossfusor: A Cross-Attention Transformer Enhanced Conditional Diffusion Model for Car-Following Trajectory Prediction [10.814758830775727]
This study introduces a Cross-Attention Transformer Enhanced Diffusion Model (Crossfusor) specifically designed for car-following trajectory prediction.
It integrates detailed inter-vehicular interactions and car-following dynamics into a robust diffusion framework, improving both the accuracy and realism of predicted trajectories.
Experimental results on the NGSIM dataset demonstrate that Crossfusor outperforms state-of-the-art models, particularly in long-term predictions.
arXiv Detail & Related papers (2024-06-17T17:35:47Z) - TrTr: A Versatile Pre-Trained Large Traffic Model based on Transformer
for Capturing Trajectory Diversity in Vehicle Population [13.75828180340772]
In this study, we apply the Transformer architecture to traffic tasks, aiming to learn the diversity of trajectories within vehicle populations.
We create a data structure tailored to the attention mechanism and introduce a set of noises that correspond to recurrent-temporal demands.
The designed pre-training model demonstrates excellent performance in capturing the spatial distribution of the vehicle population.
arXiv Detail & Related papers (2023-09-22T07:36:22Z)
- Transformers versus LSTMs for electronic trading [0.0]
This study investigates whether Transformer-based models can be applied to financial time series prediction and outperform LSTMs.
A new LSTM-based model called DLSTM is built, and a new architecture for the Transformer-based model is designed and adapted for financial prediction.
The experimental results show that the Transformer-based model has only a limited advantage in absolute price sequence prediction.
arXiv Detail & Related papers (2023-09-20T15:25:43Z)
- Towards Long-Term Time-Series Forecasting: Feature, Pattern, and Distribution [57.71199089609161]
Long-term time-series forecasting (LTTF) has become a pressing demand in many applications, such as wind power supply planning.
Transformer models have been adopted for their high prediction capacity, which stems from the computationally expensive self-attention mechanism.
We propose an efficient Transformer-based model, named Conformer, which differentiates itself from existing methods for LTTF in three aspects.
arXiv Detail & Related papers (2023-01-05T13:59:29Z)
- Joint Spatial-Temporal and Appearance Modeling with Transformer for Multiple Object Tracking [59.79252390626194]
We propose a novel solution named TransSTAM, which leverages Transformer to model both the appearance features of each object and the spatial-temporal relationships among objects.
The proposed method is evaluated on multiple public benchmarks including MOT16, MOT17, and MOT20, and it achieves a clear performance improvement in both IDF1 and HOTA.
arXiv Detail & Related papers (2022-05-31T01:19:18Z)
- TransMOT: Spatial-Temporal Graph Transformer for Multiple Object Tracking [74.82415271960315]
We propose a solution named TransMOT to efficiently model the spatial and temporal interactions among objects in a video.
TransMOT is not only more computationally efficient than the traditional Transformer, but it also achieves better tracking accuracy.
The proposed method is evaluated on multiple benchmark datasets including MOT15, MOT16, MOT17, and MOT20.
arXiv Detail & Related papers (2021-04-01T01:49:05Z)
- A Driving Behavior Recognition Model with Bi-LSTM and Multi-Scale CNN [59.57221522897815]
We propose a neural network model based on trajectory information for driving behavior recognition.
We evaluate the proposed model on the public BLVD dataset, achieving satisfactory performance.
arXiv Detail & Related papers (2021-03-01T06:47:29Z)
- Haar Wavelet based Block Autoregressive Flows for Trajectories [129.37479472754083]
Prediction of trajectories, such as those of pedestrians, is crucial to the performance of autonomous agents.
We introduce a novel Haar wavelet based block autoregressive model leveraging split couplings.
We illustrate the advantages of our approach for generating diverse and accurate trajectories on two real-world datasets.
arXiv Detail & Related papers (2020-09-21T13:57:10Z)
- Traffic Agent Trajectory Prediction Using Social Convolution and Attention Mechanism [57.68557165836806]
We propose a model to predict the trajectories of target agents around an autonomous vehicle.
We encode the target agent's history trajectories as an attention mask and construct a social map to encode the interactive relationship between the target agent and its surrounding agents.
To verify the effectiveness of our method, we compare extensively with several methods on a public dataset, achieving a 20% error decrease.
arXiv Detail & Related papers (2020-07-06T03:48:08Z)
- A Multi-Modal States based Vehicle Descriptor and Dilated Convolutional Social Pooling for Vehicle Trajectory Prediction [3.131740922192114]
We propose a vehicle-descriptor based LSTM model with dilated convolutional social pooling (VD+DCS-LSTM) to cope with the above issues.
Each vehicle's multi-modal state information is employed as our model's input.
The validity of the overall model was verified on the NGSIM US-101 and I-80 datasets.
arXiv Detail & Related papers (2020-03-07T01:23:20Z)
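Since convolutional social pooling appears both in this last entry and in CS-LSTM, one of the baselines of the main paper, a minimal sketch of a dilated convolutional social pooling layer over a grid of neighbor encodings is given below. Grid dimensions, channel sizes, and dilation rates are illustrative assumptions, not the published configurations.

```python
# Illustrative dilated convolutional social pooling over a grid of neighbor
# encodings (hypothetical sizes; not the configuration from the cited papers).
import torch
import torch.nn as nn

class DilatedSocialPooling(nn.Module):
    def __init__(self, enc_dim=64, grid_size=(13, 3), out_dim=32):
        super().__init__()
        self.grid_size = grid_size
        # Dilated convolutions widen the receptive field over the social grid
        # without adding parameters, reaching farther-away neighbors.
        self.net = nn.Sequential(
            nn.Conv2d(enc_dim, out_dim, kernel_size=3, padding=1, dilation=1),
            nn.ReLU(),
            nn.Conv2d(out_dim, out_dim, kernel_size=3, padding=2, dilation=2),
            nn.ReLU(),
            nn.AdaptiveMaxPool2d(1),  # pool the whole grid to one social vector
        )

    def forward(self, nbr_enc, nbr_cell_idx, batch_size):
        # nbr_enc: (M, enc_dim) neighbor LSTM encodings;
        # nbr_cell_idx: (M, 3) with (batch index, grid row, grid col).
        H, W = self.grid_size
        grid = nbr_enc.new_zeros(batch_size, nbr_enc.size(-1), H, W)
        grid[nbr_cell_idx[:, 0], :, nbr_cell_idx[:, 1], nbr_cell_idx[:, 2]] = nbr_enc
        return self.net(grid).flatten(1)  # (batch_size, out_dim)
```

In a CS-LSTM-style pipeline, the resulting social vector would be concatenated with the target vehicle's own LSTM encoding before decoding future positions; the dilation is simply one way to enlarge the spatial context each convolution sees over the grid.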