Cross-Modal Reconstruction Pretraining for Ramp Flow Prediction at Highway Interchanges
- URL: http://arxiv.org/abs/2510.03381v1
- Date: Fri, 03 Oct 2025 15:26:56 GMT
- Title: Cross-Modal Reconstruction Pretraining for Ramp Flow Prediction at Highway Interchanges
- Authors: Yongchao Li, Jun Chen, Zhuoxuan Li, Chao Gao, Yang Li, Chu Zhang, Changyin Dong
- Abstract summary: STDAE is a two-stage framework that leverages cross-modal reconstruction pretraining. STDAE-GWNET consistently outperforms thirteen state-of-the-art baselines. This demonstrates its effectiveness in overcoming detector scarcity and its plug-and-play potential for diverse forecasting pipelines.
- Score: 30.274689865122056
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Interchanges are crucial nodes for vehicle transfers between highways, yet the lack of real-time ramp detectors creates blind spots in traffic prediction. To address this, we propose a Spatio-Temporal Decoupled Autoencoder (STDAE), a two-stage framework that leverages cross-modal reconstruction pretraining. In the first stage, STDAE reconstructs historical ramp flows from mainline data, forcing the model to capture intrinsic spatio-temporal relations. Its decoupled architecture with parallel spatial and temporal autoencoders efficiently extracts heterogeneous features. In the prediction stage, the learned representations are integrated with models such as GWNet to enhance accuracy. Experiments on three real-world interchange datasets show that STDAE-GWNET consistently outperforms thirteen state-of-the-art baselines and achieves performance comparable to models using historical ramp data. This demonstrates its effectiveness in overcoming detector scarcity and its plug-and-play potential for diverse forecasting pipelines.
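The two-stage data flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' architecture: the toy sizes, the random untrained linear encoders, and every name below (`encode`, `reconstruction_loss`, `W_time`, `W_space`, `W_dec`) are assumptions made purely for illustration. It only shows how parallel temporal and spatial encoders over mainline data could feed a ramp-flow reconstruction loss (stage one) and then augment a downstream predictor's input (stage two).

```python
import math
import random

random.seed(0)

# Hypothetical toy sizes: time steps, mainline detectors, ramps, latent width.
T, N, R, D = 12, 6, 2, 4

def rand_matrix(rows, cols, scale=0.1):
    """Random Gaussian matrix as a list of rows."""
    return [[random.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

def matmul(a, b):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def tanh(m):
    return [[math.tanh(v) for v in row] for row in m]

mainline = rand_matrix(T, N, scale=1.0)  # stand-in for mainline flows, shape (T, N)
ramp = rand_matrix(T, R, scale=1.0)      # stand-in for historical ramp flows, shape (T, R)

W_time = rand_matrix(T, T)     # temporal encoder: mixes along the time axis
W_space = rand_matrix(N, D)    # spatial encoder: mixes along the detector axis
W_dec = rand_matrix(N + D, R)  # decoder: joint representation -> ramp flows

def encode(x):
    """Run the two encoders in parallel and concatenate their outputs per time step."""
    h_time = tanh(matmul(W_time, x))    # (T, N)
    h_space = tanh(matmul(x, W_space))  # (T, D)
    return [a + b for a, b in zip(h_time, h_space)]  # (T, N + D)

def reconstruction_loss(x, target):
    """Stage one: mean squared error between decoded ramp flows and the target."""
    pred = matmul(encode(x), W_dec)  # (T, R)
    errs = [(p - t) ** 2 for pr, tr in zip(pred, target) for p, t in zip(pr, tr)]
    return sum(errs) / len(errs)

stage1_loss = reconstruction_loss(mainline, ramp)

# Stage two: append the learned representation to the raw mainline features
# before handing them to any downstream predictor (GWNet in the paper).
predictor_input = [x + h for x, h in zip(mainline, encode(mainline))]  # (T, 2N + D)
```

In an actual pipeline the encoder weights would be trained by minimizing the stage-one loss before being reused in stage two; here they stay random so the sketch remains short.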
Related papers
- GEnSHIN: Graphical Enhanced Spatio-temporal Hierarchical Inference Network for Traffic Flow Prediction [0.7605656525323705]
This paper proposes a Graphical Enhanced Spatio-temporal Hierarchical Inference Network (GEnSHIN) to handle the complex spatio-temporal dependencies in traffic flow prediction. Experiments on the public dataset METR-LA show that GEnSHIN surpasses the performance of comparative models across multiple metrics.
arXiv Detail & Related papers (2026-01-08T03:27:10Z) - Towards Resilient Transportation: A Conditional Transformer for Accident-Informed Traffic Forecasting [16.242959582777797]
Accurate forecasting is hindered by the complex influence of external factors such as traffic accidents and regulations. We propose ConFormer, a framework that integrates graph propagation with a guided normalization layer. Our model surpasses the state-of-the-art STAEFormer in both predictive performance and efficiency.
arXiv Detail & Related papers (2025-12-10T07:50:20Z) - A Retrieval Augmented Spatio-Temporal Framework for Traffic Prediction [33.28893562327803]
RAST achieves superior performance while maintaining efficiency on large-scale datasets. Our framework consists of three key designs, among them a Decoupled and Query Retriever that captures decoupled temporal features and constructs residual fusion via Retrieval-Augmented Generation (RAG), and a Universal Backbone Predictor that accommodates pre-trained ST-GNNs or simple predictors.
arXiv Detail & Related papers (2025-08-14T10:11:39Z) - LMPOcc: 3D Semantic Occupancy Prediction Utilizing Long-Term Memory Prior from Historical Traversals [4.970345700893879]
Long-term Memory Prior Occupancy (LMPOcc) is the first 3D occupancy prediction methodology that exploits long-term memory priors derived from historical perceptual outputs. We introduce a plug-and-play architecture that integrates long-term memory priors to enhance local perception while simultaneously constructing global occupancy representations.
arXiv Detail & Related papers (2025-04-18T09:58:48Z) - SuperFlow++: Enhanced Spatiotemporal Consistency for Cross-Modal Data Pretraining [62.433137130087445]
SuperFlow++ is a novel framework that integrates pretraining and downstream tasks using consecutive camera pairs. We show that SuperFlow++ outperforms state-of-the-art methods across diverse tasks and driving conditions. With strong generalizability and computational efficiency, SuperFlow++ establishes a new benchmark for data-efficient LiDAR-based perception in autonomous driving.
arXiv Detail & Related papers (2025-03-25T17:59:57Z) - ALOcc: Adaptive Lifting-Based 3D Semantic Occupancy and Cost Volume-Based Flow Predictions [91.55655961014027]
3D semantic occupancy and flow prediction are fundamental to scene understanding. This paper proposes a vision-based framework with three targeted improvements. Our purely convolutional architecture establishes new SOTA performance on multiple benchmarks for both semantic occupancy and joint semantic-flow prediction.
arXiv Detail & Related papers (2024-11-12T11:32:56Z) - OPUS: Occupancy Prediction Using a Sparse Set [64.60854562502523]
We present a framework to simultaneously predict occupied locations and classes using a set of learnable queries.
OPUS incorporates a suite of non-trivial strategies to enhance model performance.
Our lightest model achieves superior RayIoU on the Occ3D-nuScenes dataset at near 2x FPS, while our heaviest model surpasses previous best results by 6.1 RayIoU.
arXiv Detail & Related papers (2024-09-14T07:44:22Z) - AdaOcc: Adaptive Forward View Transformation and Flow Modeling for 3D Occupancy and Flow Prediction [56.72301849123049]
We present our solution for the Vision-Centric 3D Occupancy and Flow Prediction track in the nuScenes Open-Occ dataset challenge at CVPR 2024.
Our innovative approach involves a dual-stage framework that enhances 3D occupancy and flow predictions by incorporating adaptive forward view transformation and flow modeling.
Our method combines regression with classification to address scale variations in different scenes, and leverages predicted flow to warp current voxel features to future frames, guided by future frame ground truth.
arXiv Detail & Related papers (2024-07-01T16:32:15Z) - A Multi-Channel Spatial-Temporal Transformer Model for Traffic Flow Forecasting [0.0]
We propose a multi-channel spatial-temporal transformer model for traffic flow forecasting.
It improves the accuracy of the prediction by fusing results from different channels of traffic data.
Experimental results on six real-world datasets demonstrate that introducing a multi-channel mechanism into the temporal model enhances performance.
arXiv Detail & Related papers (2024-05-10T06:37:07Z) - HPNet: Dynamic Trajectory Forecasting with Historical Prediction Attention [76.37139809114274]
HPNet is a novel dynamic trajectory forecasting method.
We propose a Historical Prediction Attention module to automatically encode the dynamic relationship between successive predictions.
Our code is available at https://github.com/XiaolongTang23/HPNet.
arXiv Detail & Related papers (2024-04-09T14:42:31Z) - SEPT: Towards Efficient Scene Representation Learning for Motion Prediction [19.111948522155004]
This paper presents SEPT, a modeling framework that leverages self-supervised learning to develop powerful models for complex traffic scenes.
Experiments demonstrate that SEPT, without elaborate architectural design or feature engineering, achieves state-of-the-art performance on the Argoverse 1 and Argoverse 2 motion forecasting benchmarks.
arXiv Detail & Related papers (2023-09-26T21:56:03Z) - An End-to-End Vehicle Trajcetory Prediction Framework [3.7311680121118345]
An accurate prediction of a future trajectory does not just rely on the previous trajectory, but also a simulation of the complex interactions between other vehicles nearby.
Most state-of-the-art networks built to tackle the problem assume readily available past trajectory points.
We propose a novel end-to-end architecture that takes raw video inputs and outputs future trajectory predictions.
arXiv Detail & Related papers (2023-04-19T15:42:03Z) - PSTN: Periodic Spatial-temporal Deep Neural Network for Traffic Condition Prediction [8.255993195520306]
We propose a periodic spatial-temporal deep neural network (PSTN) with three modules to improve the forecasting performance of traffic conditions.
First, the historical traffic information is folded and fed into a module consisting of a graph convolutional network and a temporal convolutional network.
arXiv Detail & Related papers (2021-08-05T07:42:43Z) - Spatio-temporal Modeling for Large-scale Vehicular Networks Using Graph Convolutional Networks [110.80088437391379]
A graph-based framework called SMART is proposed to model and keep track of the statistics of vehicle-to-infrastructure (V2I) communication latency across a large geographical area.
We develop a graph reconstruction-based approach using a graph convolutional network integrated with a deep Q-networks algorithm.
Our results show that the proposed method can significantly improve both the accuracy and efficiency for modeling and the latency performance of large vehicular networks.
arXiv Detail & Related papers (2021-03-13T06:56:29Z)