Multimodal Trajectory Representation Learning for Travel Time Estimation
- URL: http://arxiv.org/abs/2510.05840v1
- Date: Tue, 07 Oct 2025 12:04:16 GMT
- Title: Multimodal Trajectory Representation Learning for Travel Time Estimation
- Authors: Zhi Liu, Xuyuan Hu, Xiao Han, Zhehao Dai, Zhaolin Deng, Guojiang Shen, Xiangjie Kong
- Abstract summary: This paper introduces the Multimodal Dynamic Trajectory Integration framework. It integrates GPS sequences, grid trajectories, and road network constraints to enhance TTE accuracy. It consistently outperforms state-of-the-art baselines in experiments on three real-world datasets.
- Score: 15.25848441558445
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate travel time estimation (TTE) plays a crucial role in intelligent transportation systems. However, it remains challenging due to heterogeneous data sources and complex traffic dynamics. Moreover, conventional approaches typically convert trajectories into fixed-length representations, neglecting the inherent variability of real-world trajectories, which often leads to information loss or feature redundancy. To address these challenges, this paper introduces the Multimodal Dynamic Trajectory Integration (MDTI) framework--a novel multimodal trajectory representation learning approach that integrates GPS sequences, grid trajectories, and road network constraints to enhance TTE accuracy. MDTI employs modality-specific encoders and a cross-modal interaction module to capture complementary spatial, temporal, and topological semantics, while a dynamic trajectory modeling mechanism adaptively regulates information density for trajectories of varying lengths. Two self-supervised pretraining objectives, named contrastive alignment and masked language modeling, further strengthen multimodal consistency and contextual understanding. Extensive experiments on three real-world datasets demonstrate that MDTI consistently outperforms state-of-the-art baselines, confirming its robustness and strong generalization abilities. The code is publicly available at: https://github.com/freshhxy/MDTI/
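As an illustration of the "grid trajectory" modality mentioned in the abstract, here is a minimal sketch of one common way to turn a GPS sequence into a sequence of grid-cell ids. It is not taken from the MDTI code; the function name `gps_to_grid`, the uniform square grid, and the parameters `min_lat`, `min_lon`, `cell_deg`, and `n_cols` are all assumptions made for this example. Note that the output length varies with the route, which is the kind of variability the abstract says fixed-length representations neglect.

```python
from typing import List, Tuple


def gps_to_grid(
    points: List[Tuple[float, float]],
    min_lat: float,
    min_lon: float,
    cell_deg: float,
    n_cols: int,
) -> List[int]:
    """Map (lat, lon) points to grid-cell ids on a hypothetical uniform grid.

    Consecutive points that fall in the same cell are merged, so the
    result is a variable-length sequence of visited cells.
    """
    cells: List[int] = []
    for lat, lon in points:
        row = int((lat - min_lat) // cell_deg)  # cell index along latitude
        col = int((lon - min_lon) // cell_deg)  # cell index along longitude
        cell_id = row * n_cols + col            # row-major cell id
        if not cells or cells[-1] != cell_id:   # drop consecutive repeats
            cells.append(cell_id)
    return cells


# Four GPS points on a toy grid with 0.5-degree cells and 4 columns.
track = [(0.1, 0.1), (0.2, 0.3), (0.7, 0.2), (0.7, 0.9)]
print(gps_to_grid(track, 0.0, 0.0, 0.5, 4))  # [0, 4, 5]
```

A real pipeline would more likely use a projected coordinate system and a city-specific bounding box; the degree-based grid here only keeps the sketch short.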
Related papers
- Empower Low-Altitude Economy: A Reliability-Aware Dynamic Weighting Allocation for Multi-modal UAV Beam Prediction [57.04985443535312]
Low-altitude economy (LAE) is rapidly expanding driven by urban air mobility, logistics drones, and aerial sensing. Current research is shifting from single-signal to multi-modal collaborative approaches. We propose a reliability-aware dynamic weighting scheme applied to a semantic-aware multi-modal beam prediction framework, named SaM2B.
arXiv Detail & Related papers (2025-12-30T16:24:34Z)
- Wireless Traffic Prediction with Large Language Model [54.07581399989292]
TIDES is a novel framework that captures spatial-temporal correlations for wireless traffic prediction. TIDES achieves efficient adaptation to domain-specific patterns without incurring excessive training overhead. Our results indicate that integrating spatial awareness into LLM-based predictors is the key to unlocking scalable and intelligent network management in future 6G systems.
arXiv Detail & Related papers (2025-12-19T04:47:40Z)
- Spatial-Temporal-Spectral Unified Modeling for Remote Sensing Dense Prediction [20.1863553357121]
Current deep learning architectures for remote sensing are fundamentally rigid. We introduce the Spatial-Temporal-Spectral Unified Network (STSUN) for unified modeling. STSUN can adapt to input and output data with arbitrary spatial sizes, temporal lengths, and spectral bands. It unifies various dense prediction tasks and diverse semantic class predictions.
arXiv Detail & Related papers (2025-05-18T07:39:17Z)
- DG-STMTL: A Novel Graph Convolutional Network for Multi-Task Spatio-Temporal Traffic Forecasting [0.0]
A key challenge to accurate prediction is how to model the complex spatio-temporal dependencies and adapt to the inherent dynamics in data. Traditional Graph Convolutional Networks (GCNs) often struggle with static adjacency matrices that introduce bias or learnable patterns. This study introduces a novel MTL framework, Dynamic Group-wise Spatio-Temporal Multi-Task Learning (DG-STMTL).
arXiv Detail & Related papers (2025-04-10T15:00:20Z)
- Resource-Efficient Beam Prediction in mmWave Communications with Multimodal Realistic Simulation Framework [57.994965436344195]
Beamforming is a key technology in millimeter-wave (mmWave) communications that improves signal transmission by optimizing directionality and intensity. Multimodal sensing-aided beam prediction has gained significant attention, using various sensing data to predict user locations or network conditions. Despite its promising potential, the adoption of multimodal sensing-aided beam prediction is hindered by high computational complexity, high costs, and limited datasets.
arXiv Detail & Related papers (2025-04-07T15:38:25Z)
- Exploiting Multimodal Spatial-temporal Patterns for Video Object Tracking [53.33637391723555]
We propose a unified multimodal spatial-temporal tracking approach named STTrack. In contrast to previous paradigms, we introduced a temporal state generator (TSG) that continuously generates a sequence of tokens containing multimodal temporal information. These temporal information tokens are used to guide the localization of the target in the next time state, establish long-range contextual relationships between video frames, and capture the temporal trajectory of the target.
arXiv Detail & Related papers (2024-12-20T09:10:17Z)
- Context-Enhanced Multi-View Trajectory Representation Learning: Bridging the Gap through Self-Supervised Models [27.316692263196277]
MVTraj is a novel multi-view modeling method for trajectory representation learning.
It integrates diverse contextual knowledge, from GPS to road network and points-of-interest to provide a more comprehensive understanding of trajectory data.
Extensive experiments on real-world datasets demonstrate that MVTraj significantly outperforms existing baselines in tasks associated with various spatial views.
arXiv Detail & Related papers (2024-10-17T03:56:12Z)
- FusionTransNet for Smart Urban Mobility: Spatiotemporal Traffic Forecasting Through Multimodal Network Integration [11.77238206600775]
FusionTransNet is a framework designed for Origin-Destination (OD) flow predictions within smart urban transportation systems.
The framework contains three core components: the Intra-modal Learning Module, the Inter-modal Learning Module, and the Prediction Decoder.
arXiv Detail & Related papers (2024-05-09T14:09:36Z)
- Spatiotemporal Implicit Neural Representation as a Generalized Traffic Data Learner [46.866240648471894]
Spatiotemporal Traffic Data (STTD) measures the complex dynamical behaviors of the multiscale transportation system.
We present a novel paradigm to address the STTD learning problem by parameterizing STTD as an implicit neural representation.
We validate its effectiveness through extensive experiments in real-world scenarios, showcasing applications from corridor to network scales.
arXiv Detail & Related papers (2024-05-06T06:23:06Z)
- Fine-Grained Trajectory-based Travel Time Estimation for Multi-city Scenarios Based on Deep Meta-Learning [18.786481521834762]
Travel Time Estimation (TTE) is indispensable in intelligent transportation systems (ITS).
It is significant to achieve the fine-grained Trajectory-based Travel Time Estimation (TTTE) for multi-city scenarios.
We propose a meta learning based framework, MetaTTE, to continuously provide accurate travel time estimation over time.
arXiv Detail & Related papers (2022-01-20T06:35:51Z)
- Aerial Images Meet Crowdsourced Trajectories: A New Approach to Robust Road Extraction [110.61383502442598]
We introduce a novel neural network framework termed Cross-Modal Message Propagation Network (CMMPNet).
CMMPNet is composed of two deep Auto-Encoders for modality-specific representation learning and a tailor-designed Dual Enhancement Module for cross-modal representation refinement.
Experiments on three real-world benchmarks demonstrate the effectiveness of our CMMPNet for robust road extraction.
arXiv Detail & Related papers (2021-11-30T04:30:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.