GPS-MTM: Capturing Pattern of Normalcy in GPS-Trajectories with self-supervised learning
- URL: http://arxiv.org/abs/2509.24031v2
- Date: Wed, 08 Oct 2025 08:21:22 GMT
- Title: GPS-MTM: Capturing Pattern of Normalcy in GPS-Trajectories with self-supervised learning
- Authors: Umang Garg, Bowen Zhang, Anantajit Subrahmanya, Chandrakanth Gudavalli, BS Manjunath
- Abstract summary: We introduce the GPS Masked Trajectory Transformer (GPS-MTM), a foundation model for large-scale mobility data. GPS-MTM decomposes mobility into two complementary modalities: states (point-of-interest categories) and actions (agent transitions).
- Score: 7.240185261197756
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Foundation models have driven remarkable progress in text, vision, and video understanding, and are now poised to unlock similar breakthroughs in trajectory modeling. We introduce the GPS Masked Trajectory Transformer (GPS-MTM), a foundation model for large-scale mobility data that captures patterns of normalcy in human movement. Unlike prior approaches that flatten trajectories into coordinate streams, GPS-MTM decomposes mobility into two complementary modalities: states (point-of-interest categories) and actions (agent transitions). Leveraging a bi-directional Transformer with a self-supervised masked modeling objective, the model reconstructs missing segments across modalities, enabling it to learn rich semantic correlations without manual labels. Across benchmark datasets, including Numosim-LA, Urban Anomalies, and Geolife, GPS-MTM consistently outperforms baselines on downstream tasks such as trajectory infilling and next-stop prediction. Its advantages are most pronounced in dynamic tasks (inverse and forward dynamics), where contextual reasoning is critical. These results establish GPS-MTM as a robust foundation model for trajectory analytics, positioning mobility data as a first-class modality for large-scale representation learning. Code is released for reference.
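The masked-modeling objective described in the abstract can be illustrated with a small sketch: a fraction of positions is hidden in both the state and action streams, and the model is trained to reconstruct the hidden tokens from the surrounding context. All names and the uniform-masking scheme below are illustrative assumptions, not taken from the GPS-MTM codebase; the paper's actual masking strategy (span lengths, ratios, per-modality masking) may differ.

```python
import random

MASK = "<mask>"

def mask_trajectory(states, actions, mask_ratio=0.3, seed=0):
    """Mask a fraction of positions in both modality streams and
    return the masked inputs plus the reconstruction targets."""
    rng = random.Random(seed)
    n = len(states)
    k = max(1, int(n * mask_ratio))
    masked = set(rng.sample(range(n), k))
    states_in = [MASK if i in masked else s for i, s in enumerate(states)]
    actions_in = [MASK if i in masked else a for i, a in enumerate(actions)]
    targets = {i: (states[i], actions[i]) for i in sorted(masked)}
    return states_in, actions_in, targets

# toy trajectory: visited place categories and the transitions between them
states = ["home", "cafe", "office", "gym", "home"]
actions = ["walk", "bus", "walk", "bus", "stay"]
states_in, actions_in, targets = mask_trajectory(states, actions)
```

A bi-directional Transformer would then consume `states_in` and `actions_in` jointly and be scored on how well it recovers the pairs in `targets`, which is what lets the model learn cross-modal correlations without labels.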
Related papers
- Detecting Transportation Mode Using Dense Smartphone GPS Trajectories and Transformer Models [11.280640663443826]
We introduce SpeedTransformer, a novel Transformer-based model that relies solely on speed inputs to infer transportation modes from dense smartphone GPS trajectories.
In benchmark experiments, SpeedTransformer outperformed traditional deep learning models, such as the Long Short-Term Memory (LSTM) network.
We deployed the model in a real-world experiment, where it consistently outperformed baseline models under complex built environments and high data uncertainty.
arXiv Detail & Related papers (2026-02-27T22:20:29Z)
- Wireless Traffic Prediction with Large Language Model [54.07581399989292]
TIDES is a novel framework that captures spatial-temporal correlations for wireless traffic prediction.
TIDES achieves efficient adaptation to domain-specific patterns without incurring excessive training overhead.
Our results indicate that integrating spatial awareness into LLM-based predictors is the key to unlocking scalable and intelligent network management in future 6G systems.
arXiv Detail & Related papers (2025-12-19T04:47:40Z)
- Multimodal Trajectory Representation Learning for Travel Time Estimation [15.25848441558445]
This paper introduces the Multimodal Dynamic Trajectory Integration framework.
It integrates GPS sequences, grid trajectories, and road network constraints to enhance TTE accuracy.
It consistently outperforms state-of-the-art baselines in experiments on three real-world datasets.
arXiv Detail & Related papers (2025-10-07T12:04:16Z)
- TrajSceneLLM: A Multimodal Perspective on Semantic GPS Trajectory Analysis [0.0]
We propose TrajSceneLLM, a multimodal perspective for enhancing semantic understanding of GPS trajectories.
We validate the proposed framework on Travel Mode Identification (TMI), a critical task for analyzing travel choices and understanding mobility behavior.
This semantic enhancement promises significant potential for diverse downstream applications and future research in artificial intelligence.
arXiv Detail & Related papers (2025-06-19T15:31:40Z)
- Learning Spatio-Temporal Dynamics for Trajectory Recovery via Time-Aware Transformer [9.812530969395906]
In real-world applications, GPS trajectories often suffer from low sampling rates, with large and irregular intervals between consecutive points.
This paper addresses the task of map-constrained trajectory recovery, aiming to enhance trajectory sampling rates.
arXiv Detail & Related papers (2025-05-20T03:09:17Z)
- Dynamic Intent Queries for Motion Transformer-based Trajectory Prediction [36.287188668060075]
In autonomous driving, accurately predicting the movements of other traffic participants is crucial.
Our research addresses this limitation by integrating scene-specific dynamic intention points into the MTR model.
Our findings demonstrate that incorporating dynamic intention points has a significant positive impact on trajectory accuracy.
arXiv Detail & Related papers (2025-04-22T10:20:35Z)
- Unified Human Localization and Trajectory Prediction with Monocular Vision [64.19384064365431]
MonoTransmotion is a Transformer-based framework that uses only a monocular camera to jointly solve localization and prediction tasks.
We show that by jointly training both tasks with our unified framework, our method is more robust in real-world scenarios made of noisy inputs.
arXiv Detail & Related papers (2025-03-05T14:18:39Z)
- Multi-Transmotion: Pre-trained Model for Human Motion Prediction [68.87010221355223]
Multi-Transmotion is an innovative transformer-based model designed for cross-modality pre-training.
Our methodology demonstrates competitive performance across various datasets on several downstream tasks.
arXiv Detail & Related papers (2024-11-04T23:15:21Z)
- More Than Routing: Joint GPS and Route Modeling for Refine Trajectory Representation Learning [26.630640299709114]
We propose Joint GPS and Route Modelling based on self-supervised technology, namely JGRM.
We develop two encoders, each tailored to capture representations of route and GPS trajectories respectively.
The representations from the two modalities are fed into a shared transformer for inter-modal information interaction.
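As a rough illustration of this dual-encoder design, the sketch below encodes each modality with its own encoder and mixes the pooled representations in a shared layer. All shapes, weight names, and the mean-pooling choice are illustrative assumptions; JGRM's actual encoders and shared transformer are more elaborate than this single mixing layer.

```python
import numpy as np

def encode(seq, W):
    """Toy modality-specific encoder: project token embeddings, then mean-pool."""
    return np.tanh(seq @ W).mean(axis=0)

rng = np.random.default_rng(0)
d = 8
route_seq = rng.normal(size=(6, d))    # embeddings of matched road segments
gps_seq = rng.normal(size=(20, d))     # embeddings of raw GPS points

W_route = rng.normal(size=(d, d))      # route-encoder weights
W_gps = rng.normal(size=(d, d))        # GPS-encoder weights
W_shared = rng.normal(size=(2 * d, d)) # stand-in for the shared transformer

# each modality gets its own encoder; the pooled representations are then
# concatenated and combined by the shared cross-modal layer
route_repr = encode(route_seq, W_route)
gps_repr = encode(gps_seq, W_gps)
traj_repr = np.tanh(np.concatenate([route_repr, gps_repr]) @ W_shared)
```

The point of the design is that each encoder can specialize (routes are short discrete sequences, GPS streams are long and noisy) while the shared module forces the two views of the same trajectory into a common representation.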
arXiv Detail & Related papers (2024-02-25T18:27:25Z)
- MobilityGPT: Enhanced Human Mobility Modeling with a GPT model [12.01839817432357]
We reformat human mobility modeling as an autoregressive generation task to address these issues.
We propose a geospatially-aware generative model, MobilityGPT, to ensure its controllable generation.
Experiments on real-world datasets demonstrate MobilityGPT's superior performance over state-of-the-art methods.
arXiv Detail & Related papers (2024-02-05T18:22:21Z)
- MotionTrack: Learning Motion Predictor for Multiple Object Tracking [68.68339102749358]
We introduce a novel motion-based tracker, MotionTrack, centered around a learnable motion predictor.
Our experimental results demonstrate that MotionTrack yields state-of-the-art performance on datasets such as DanceTrack and SportsMOT.
arXiv Detail & Related papers (2023-06-05T04:24:11Z)
- Motion Transformer with Global Intention Localization and Local Movement Refinement [103.75625476231401]
Motion TRansformer (MTR) models motion prediction as the joint optimization of global intention localization and local movement refinement.
MTR achieves state-of-the-art performance on both the marginal and joint motion prediction challenges.
arXiv Detail & Related papers (2022-09-27T16:23:14Z)
- Transforming Model Prediction for Tracking [109.08417327309937]
Transformers capture global relations with little inductive bias, allowing them to learn the prediction of more powerful target models.
We train the proposed tracker end-to-end and validate its performance by conducting comprehensive experiments on multiple tracking datasets.
Our tracker sets a new state of the art on three benchmarks, achieving an AUC of 68.5% on the challenging LaSOT dataset.
arXiv Detail & Related papers (2022-03-21T17:59:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.