Related papers: MSCMHMST: A traffic flow prediction model based on Transformer

URL: http://arxiv.org/abs/2503.13540v1
Date: Sun, 16 Mar 2025 03:40:32 GMT
Title: MSCMHMST: A traffic flow prediction model based on Transformer
Authors: Weiyang Geng, Yiming Pan, Zhecong Xing, Dongyu Liu, Rui Liu, Yuan Zhu
Abstract summary: This study proposes a hybrid model based on Transformers, named MSCMHMST, aimed at addressing key challenges in traffic flow prediction. The MSCMHMST model introduces a multi-head, multi-scale attention mechanism, allowing the model to process different parts of the data in parallel and learn its intrinsic representations from multiple perspectives. Verified through experiments on the PeMS04/08 dataset with specific experimental settings, the MSCMHMST model demonstrated excellent robustness and accuracy in long, medium, and short-term traffic flow predictions.
Score: 7.350117994428983
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: This study proposes a hybrid model based on Transformers, named MSCMHMST, aimed at addressing key challenges in traffic flow prediction. Traditional single-method approaches show limitations in traffic prediction tasks, whereas hybrid methods, by integrating the strengths of different models, can provide more accurate and robust predictions. The MSCMHMST model introduces a multi-head, multi-scale attention mechanism, allowing the model to process different parts of the data in parallel and learn its intrinsic representations from multiple perspectives, thereby enhancing the model's ability to handle complex situations. This mechanism enables the model to capture features at various scales effectively, covering both short-term changes and long-term trends. Verified through experiments on the PeMS04/08 dataset with specific experimental settings, the MSCMHMST model demonstrated excellent robustness and accuracy in long, medium, and short-term traffic flow predictions. The results indicate that this model has significant potential, offering a new and effective solution for the field of traffic flow prediction.
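
Illustrative sketch (not taken from the paper): the abstract describes a multi-head, multi-scale attention mechanism but gives no architectural details, so the following is only one plausible reading. It assumes that each head attends from the full-resolution series to an average-pooled copy of it at a different time scale, and that head outputs are concatenated. The function names, the pooling choice, and the random projection weights are all hypothetical.

```python
import math
import random

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def average_pool(series, scale):
    """Average non-overlapping windows of `scale` time steps (assumed way to form a coarser scale)."""
    pooled = []
    for start in range(0, len(series) - scale + 1, scale):
        window = series[start:start + scale]
        pooled.append([sum(col) / scale for col in zip(*window)])
    return pooled

def matmul(a, b):
    """Plain list-of-lists matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def random_matrix(rows, cols, rng):
    """Random projection standing in for learned weights (hypothetical)."""
    return [[rng.gauss(0.0, 1.0) / math.sqrt(rows) for _ in range(cols)] for _ in range(rows)]

def attention_head(series, context, rng):
    """Scaled dot-product attention: queries from `series`, keys and values from `context`."""
    d = len(series[0])
    wq, wk, wv = (random_matrix(d, d, rng) for _ in range(3))
    q, k, v = matmul(series, wq), matmul(context, wk), matmul(context, wv)
    k_transposed = [list(col) for col in zip(*k)]
    scores = matmul(q, k_transposed)
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, v), weights

def multi_scale_attention(series, scales, seed=0):
    """Run one head per scale and concatenate the per-time-step outputs."""
    rng = random.Random(seed)
    head_outputs, head_weights = [], []
    for scale in scales:
        out, w = attention_head(series, average_pool(series, scale), rng)
        head_outputs.append(out)
        head_weights.append(w)
    combined = [sum(rows, []) for rows in zip(*head_outputs)]
    return combined, head_weights

# Toy series: 12 time steps, 2 features per step (made-up values, not PeMS data).
series = [[math.sin(t / 2.0), t / 12.0] for t in range(12)]
combined, weights = multi_scale_attention(series, scales=[1, 3, 6])
print(len(combined), len(combined[0]))  # time steps, concatenated feature width
```

In this sketch the head with scale 1 sees every time step, while the heads with scales 3 and 6 see progressively smoothed summaries, which is one simple way a model could attend to short-term changes and longer-term trends at once.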

Related papers

FlowMo: Variance-Based Flow Guidance for Coherent Motion in Video Generation [51.110607281391154]
FlowMo is a training-free guidance method for enhancing motion coherence in text-to-video models. It estimates motion coherence by measuring the patch-wise variance across the temporal dimension and guides the model to reduce this variance dynamically during sampling.
arXiv Detail & Related papers (2025-06-01T19:55:33Z)
MSTIM: A MindSpore-Based Model for Traffic Flow Prediction [2.4604039212534508]
This paper proposes MSTIM, a multi-scale time series information model built on the MindSpore framework. It integrates long short-term memory networks (LSTMs), convolutional neural networks (CNNs), and the attention mechanism to improve modelling accuracy and stability. The experimental results show that the MSTIM model achieves better results on the Mean Absolute Error (MAE), Mean Square Error (MSE), and Root Mean Square Error (RMSE) metrics.
arXiv Detail & Related papers (2025-04-18T09:19:51Z)
Merging Models on the Fly Without Retraining: A Sequential Approach to Scalable Continual Model Merging [75.93960998357812]
Deep model merging represents an emerging research direction that combines multiple fine-tuned models to harness their capabilities across different tasks and domains. Current model merging techniques focus on merging all available models simultaneously, with weight matrices-based methods being the predominant approaches. We propose a training-free projection-based continual merging method that processes models sequentially.
arXiv Detail & Related papers (2025-01-16T13:17:24Z)
MITA: Bridging the Gap between Model and Data for Test-time Adaptation [68.62509948690698]
Test-Time Adaptation (TTA) has emerged as a promising paradigm for enhancing the generalizability of models. We propose Meet-In-The-Middle based MITA, which introduces energy-based optimization to encourage mutual adaptation of the model and data from opposing directions.
arXiv Detail & Related papers (2024-10-12T07:02:33Z)
Enhanced Prediction of Multi-Agent Trajectories via Control Inference and State-Space Dynamics [14.694200929205975]
This paper introduces a novel methodology for trajectory forecasting based on state-space dynamic system modeling. To enhance the precision of state estimations within the dynamic system, the paper also presents a novel modeling technique for control variables. The proposed approach ingeniously integrates graph neural networks with state-space models, effectively capturing the complexities of multi-agent interactions.
arXiv Detail & Related papers (2024-08-08T08:33:02Z)
Diffusion-Based Environment-Aware Trajectory Prediction [3.1406146587437904]
The ability to predict the future trajectories of traffic participants is crucial for the safe and efficient operation of autonomous vehicles. In this paper, a diffusion-based generative model for multi-agent trajectory prediction is proposed. The model is capable of capturing the complex interactions between traffic participants and the environment, accurately learning the multimodal nature of the data.
arXiv Detail & Related papers (2024-03-18T10:35:15Z)
Towards Generalizable and Interpretable Motion Prediction: A Deep Variational Bayes Approach [54.429396802848224]
This paper proposes an interpretable generative model for motion prediction with robust generalizability to out-of-distribution cases. For interpretability, the model achieves the target-driven motion prediction by estimating the spatial distribution of long-term destinations. Experiments on motion prediction datasets validate that the fitted model can be interpretable and generalizable.
arXiv Detail & Related papers (2024-03-10T04:16:04Z)
Predictive Churn with the Set of Good Models [64.05949860750235]
We study the effect of conflicting predictions over the set of near-optimal machine learning models. We present theoretical results on the expected churn between models within the Rashomon set. We show how our approach can be used to better anticipate, reduce, and avoid churn in consumer-facing applications.
arXiv Detail & Related papers (2024-02-12T16:15:25Z)
Hybrid hidden Markov LSTM for short-term traffic flow prediction [0.0]
We propose a hybrid hidden Markov-LSTM model that is capable of learning complementary features in traffic data. Results indicate significant performance gains in using hybrid architecture compared to conventional methods.
arXiv Detail & Related papers (2023-07-11T00:56:44Z)
MTP-GO: Graph-Based Probabilistic Multi-Agent Trajectory Prediction with Neural ODEs [2.4169078025984825]
We introduce our model titled MTP-GO. It encodes the scene using temporal graph neural networks to produce the inputs to an underlying motion model. Results illustrate the predictive capabilities of the proposed model across various data sets.
arXiv Detail & Related papers (2023-02-01T20:03:47Z)
Multi-scale Attention Flow for Probabilistic Time Series Forecasting [68.20798558048678]
We propose a novel non-autoregressive deep learning model, called Multi-scale Attention Normalizing Flow (MANF). Our model avoids the influence of cumulative error and does not increase the time complexity. Our model achieves state-of-the-art performance on many popular multivariate datasets.
arXiv Detail & Related papers (2022-05-16T07:53:42Z)
Generative Temporal Difference Learning for Infinite-Horizon Prediction [101.59882753763888]
We introduce the $\gamma$-model, a predictive model of environment dynamics with an infinite probabilistic horizon. We discuss how its training reflects an inescapable tradeoff between training-time and testing-time compounding errors.
arXiv Detail & Related papers (2020-10-27T17:54:12Z)
Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors. We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method. Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)

This list is automatically generated from the titles and abstracts of the papers in this site.