A Time Series is Worth Five Experts: Heterogeneous Mixture of Experts for Traffic Flow Prediction
- URL: http://arxiv.org/abs/2409.17440v1
- Date: Thu, 26 Sep 2024 00:26:47 GMT
- Title: A Time Series is Worth Five Experts: Heterogeneous Mixture of Experts for Traffic Flow Prediction
- Authors: Guangyu Wang, Yujie Chen, Ming Gao, Zhiqiao Wu, Jiafu Tang, Jiabi Zhao
- Abstract summary: We propose a Heterogeneous Mixture of Experts (TITAN) model for traffic flow prediction.
Experiments on two public traffic network datasets, METR-LA and PEMS-BAY, demonstrate that TITAN effectively captures variable-centric dependencies.
It achieves improvements in all evaluation metrics, ranging from approximately 4.37% to 11.53%, compared to previous state-of-the-art (SOTA) models.
- Score: 9.273632869779929
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurate traffic prediction faces significant challenges, necessitating a deep understanding of both temporal and spatial cues and their complex interactions across multiple variables. Recent advancements in traffic prediction systems are primarily due to the development of complex sequence-centric models. However, existing approaches often embed multiple variables and spatial relationships at each time step, which may hinder effective variable-centric learning, ultimately leading to performance degradation in traditional traffic prediction tasks. To overcome these limitations, we introduce variable-centric and prior knowledge-centric modeling techniques. Specifically, we propose a Heterogeneous Mixture of Experts (TITAN) model for traffic flow prediction. TITAN initially consists of three experts focused on sequence-centric modeling. Then, using a low-rank adaptive method, TITAN simultaneously enables variable-centric modeling. Furthermore, we supervise the gating process using a prior knowledge-centric modeling strategy to ensure accurate routing. Experiments on two public traffic network datasets, METR-LA and PEMS-BAY, demonstrate that TITAN effectively captures variable-centric dependencies while ensuring accurate routing. Consequently, it achieves improvements in all evaluation metrics, ranging from approximately 4.37% to 11.53%, compared to previous state-of-the-art (SOTA) models. The code is available at https://github.com/sqlcow/TITAN.
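The gating idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the TITAN architecture: the three toy experts, the function names, and the hand-picked gate logits below are all hypothetical stand-ins, assumed only for illustration. In the paper the experts are learned models and the gate is trained, with supervision from prior knowledge. The sketch only shows how a softmax gate blends the predictions of several different experts into one forecast.

```python
import math
from typing import Callable, List, Sequence

Expert = Callable[[Sequence[float]], float]


def softmax(logits: Sequence[float]) -> List[float]:
    """Turn gate logits into routing weights that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical toy experts; each predicts the next value of a series.
def last_value_expert(history: Sequence[float]) -> float:
    return history[-1]


def mean_expert(history: Sequence[float]) -> float:
    return sum(history) / len(history)


def trend_expert(history: Sequence[float]) -> float:
    # Extrapolate the most recent step linearly (needs >= 2 points).
    return history[-1] + (history[-1] - history[-2])


EXPERTS: List[Expert] = [last_value_expert, mean_expert, trend_expert]


def mixture_predict(
    history: Sequence[float],
    experts: Sequence[Expert],
    gate_logits: Sequence[float],
) -> float:
    """Blend expert predictions using softmax gate weights."""
    weights = softmax(gate_logits)
    predictions = [expert(history) for expert in experts]
    return sum(w * p for w, p in zip(weights, predictions))


if __name__ == "__main__":
    speeds = [60.0, 58.0, 56.0]  # made-up speed readings
    # Equal logits give every expert the same weight.
    print(mixture_predict(speeds, EXPERTS, [0.0, 0.0, 0.0]))
```

With equal logits the gate averages the experts; a gate that strongly favors one expert makes the output approach that expert's prediction, which is why accurate routing matters.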
Related papers
- Merging Models on the Fly Without Retraining: A Sequential Approach to Scalable Continual Model Merging [75.93960998357812]
Deep model merging represents an emerging research direction that combines multiple fine-tuned models to harness their capabilities across different tasks and domains.
Current model merging techniques focus on merging all available models simultaneously, with weight matrices-based methods being the predominant approaches.
We propose a training-free projection-based continual merging method that processes models sequentially.
arXiv Detail & Related papers (2025-01-16T13:17:24Z) - FRTP: Federating Route Search Records to Enhance Long-term Traffic Prediction [1.5728609542259502]
We propose a federated architecture capable of learning from raw data with varying features and time granularities or lengths.
Our experiments focus on federating route search records and begin by processing raw data within the model framework.
The accuracy of the proposed model is demonstrated through evaluations using diverse learning patterns and parameter settings.
arXiv Detail & Related papers (2024-12-23T08:14:20Z) - Open-Set Deepfake Detection: A Parameter-Efficient Adaptation Method with Forgery Style Mixture [58.60915132222421]
We introduce an approach that is both general and parameter-efficient for face forgery detection.
We design a forgery-style mixture formulation that augments the diversity of forgery source domains.
We show that the designed model achieves state-of-the-art generalizability with significantly reduced trainable parameters.
arXiv Detail & Related papers (2024-08-23T01:53:36Z) - A Multi-Graph Convolutional Neural Network Model for Short-Term Prediction of Turning Movements at Signalized Intersections [0.6215404942415159]
This study introduces a novel deep learning architecture, referred to as the multigraph convolution neural network (MGCNN), for turning movement prediction at intersections.
The proposed architecture combines a multigraph structure, built to model temporal variations in traffic data, with a spectral convolution operation to support modeling the spatial variations in traffic data over the graphs.
The model's ability to perform short-term predictions over 1, 2, 3, 4, and 5 minutes into the future was evaluated against four baseline state-of-the-art models.
arXiv Detail & Related papers (2024-06-02T05:41:25Z) - Deciphering Movement: Unified Trajectory Generation Model for Multi-Agent [53.637837706712794]
We propose a Unified Trajectory Generation model, UniTraj, that processes arbitrary trajectories as masked inputs.
Specifically, we introduce a Ghost Spatial Masking (GSM) module embedded within a Transformer encoder for spatial feature extraction.
We benchmark three practical sports game datasets, Basketball-U, Football-U, and Soccer-U, for evaluation.
arXiv Detail & Related papers (2024-05-27T22:15:23Z) - TESTAM: A Time-Enhanced Spatio-Temporal Attention Model with Mixture of
Experts [6.831798156287652]
We propose a novel deep learning model named TESTAM, which individually models recurring and non-recurring traffic patterns.
We show that TESTAM achieves a better indication and modeling of recurring and non-recurring traffic.
arXiv Detail & Related papers (2024-03-05T02:27:52Z) - Multi-scale Attention Flow for Probabilistic Time Series Forecasting [68.20798558048678]
We propose a novel non-autoregressive deep learning model, called Multi-scale Attention Normalizing Flow (MANF).
Our model avoids the influence of cumulative error and does not increase the time complexity.
Our model achieves state-of-the-art performance on many popular multivariate datasets.
arXiv Detail & Related papers (2022-05-16T07:53:42Z) - SMART: Simultaneous Multi-Agent Recurrent Trajectory Prediction [72.37440317774556]
We propose advances that address two key challenges in future trajectory prediction: multimodality in both training data and predictions, and constant-time inference regardless of the number of agents.
arXiv Detail & Related papers (2020-07-26T08:17:10Z) - Traffic congestion anomaly detection and prediction using deep learning [6.370406399003785]
Congestion prediction is a major priority for traffic management centres around the world to ensure timely incident response handling.
The increasing amounts of generated traffic data have been used to train machine learning predictors for traffic, but this is a challenging task due to inter-dependencies of traffic flow both in time and space.
We show that our deep learning models consistently outperform traditional methods, and we conduct a comparative analysis of the optimal time horizon of historical data required to predict traffic flow at different time points in the future.
arXiv Detail & Related papers (2020-06-23T08:49:46Z) - Forecast Network-Wide Traffic States for Multiple Steps Ahead: A Deep Learning Approach Considering Dynamic Non-Local Spatial Correlation and Non-Stationary Temporal Dependency [6.019104024723682]
This research studies two particular problems in traffic forecasting: (1) capturing the dynamic and non-local spatial correlation between traffic links and (2) modeling the dynamics of temporal dependency for accurate multi-step-ahead predictions.
We propose a deep learning framework named Spatial-Temporal Sequence to Sequence model (STSeq2Seq) to address these issues.
This model builds on a sequence-to-sequence (seq2seq) architecture to capture temporal features and relies on graph convolution to aggregate spatial information.
arXiv Detail & Related papers (2020-04-06T03:40:56Z) - A Spatial-Temporal Attentive Network with Spatial Continuity for Trajectory Prediction [74.00750936752418]
We propose a novel model named spatial-temporal attentive network with spatial continuity (STAN-SC).
First, a spatial-temporal attention mechanism is presented to explore the most useful and important information.
Second, we construct a joint feature sequence based on the sequence and instant state information so that the generated trajectories keep spatial continuity.
arXiv Detail & Related papers (2020-03-13T04:35:50Z)