GPS-MTM: Capturing Pattern of Normalcy in GPS-Trajectories with self-supervised learning
- URL: http://arxiv.org/abs/2509.24031v2
- Date: Wed, 08 Oct 2025 08:21:22 GMT
- Title: GPS-MTM: Capturing Pattern of Normalcy in GPS-Trajectories with self-supervised learning
- Authors: Umang Garg, Bowen Zhang, Anantajit Subrahmanya, Chandrakanth Gudavalli, BS Manjunath
- Abstract summary: We introduce the GPS Masked Trajectory Transformer (GPS-MTM), a foundation model for large-scale mobility data. GPS-MTM decomposes mobility into two complementary modalities: states (point-of-interest categories) and actions (agent transitions).
- Score: 7.240185261197756
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Foundation models have driven remarkable progress in text, vision, and video understanding, and are now poised to unlock similar breakthroughs in trajectory modeling. We introduce the GPS Masked Trajectory Transformer (GPS-MTM), a foundation model for large-scale mobility data that captures patterns of normalcy in human movement. Unlike prior approaches that flatten trajectories into coordinate streams, GPS-MTM decomposes mobility into two complementary modalities: states (point-of-interest categories) and actions (agent transitions). Leveraging a bi-directional Transformer with a self-supervised masked modeling objective, the model reconstructs missing segments across modalities, enabling it to learn rich semantic correlations without manual labels. Across benchmark datasets, including Numosim-LA, Urban Anomalies, and Geolife, GPS-MTM consistently outperforms baselines on downstream tasks such as trajectory infilling and next-stop prediction. Its advantages are most pronounced in dynamic tasks (inverse and forward dynamics), where contextual reasoning is critical. These results establish GPS-MTM as a robust foundation model for trajectory analytics, positioning mobility data as a first-class modality for large-scale representation learning. Code is released for reference.
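The masked-modeling objective described in the abstract can be illustrated with a small sketch: a fraction of positions is hidden in both the state and action streams, and the model is trained to reconstruct the hidden tokens from the surrounding context. All names and the uniform-masking scheme below are illustrative assumptions, not taken from the GPS-MTM codebase; the paper's actual masking strategy (span lengths, ratios, per-modality masking) may differ.

```python
import random

MASK = "<mask>"

def mask_trajectory(states, actions, mask_ratio=0.3, seed=0):
    """Mask a fraction of positions in both modality streams and
    return the masked inputs plus the reconstruction targets."""
    rng = random.Random(seed)
    n = len(states)
    k = max(1, int(n * mask_ratio))
    masked = set(rng.sample(range(n), k))
    states_in = [MASK if i in masked else s for i, s in enumerate(states)]
    actions_in = [MASK if i in masked else a for i, a in enumerate(actions)]
    targets = {i: (states[i], actions[i]) for i in sorted(masked)}
    return states_in, actions_in, targets

# toy trajectory: visited place categories and the transitions between them
states = ["home", "cafe", "office", "gym", "home"]
actions = ["walk", "bus", "walk", "bus", "stay"]
states_in, actions_in, targets = mask_trajectory(states, actions)
```

A bi-directional Transformer would then consume `states_in` and `actions_in` jointly and be scored on how well it recovers the pairs in `targets`, which is what lets the model learn cross-modal correlations without labels.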
Related papers
- Detecting Transportation Mode Using Dense Smartphone GPS Trajectories and Transformer Models [11.280640663443826]
We introduce SpeedTransformer, a novel Transformer-based model that relies solely on speed inputs to infer transportation modes from dense smartphone GPS trajectories.
In benchmark experiments, SpeedTransformer outperformed traditional deep learning models, such as the Long Short-Term Memory (LSTM) network.
We deployed the model in a real-world experiment, where it consistently outperformed baseline models under complex built environments and high data uncertainty.
arXiv Detail & Related papers (2026-02-27T22:20:29Z)
- Wireless Traffic Prediction with Large Language Model [54.07581399989292]
TIDES is a novel framework that captures spatial-temporal correlations for wireless traffic prediction.
TIDES achieves efficient adaptation to domain-specific patterns without incurring excessive training overhead.
Our results indicate that integrating spatial awareness into LLM-based predictors is the key to unlocking scalable and intelligent network management in future 6G systems.
arXiv Detail & Related papers (2025-12-19T04:47:40Z)
- Multimodal Trajectory Representation Learning for Travel Time Estimation [15.25848441558445]
This paper introduces the Multimodal Dynamic Trajectory Integration framework.
It integrates GPS sequences, grid trajectories, and road network constraints to enhance TTE accuracy.
It consistently outperforms state-of-the-art baselines in experiments on three real-world datasets.
arXiv Detail & Related papers (2025-10-07T12:04:16Z)
- TrajSceneLLM: A Multimodal Perspective on Semantic GPS Trajectory Analysis [0.0]
We propose TrajSceneLLM, a multimodal perspective for enhancing semantic understanding of GPS trajectories.
We validate the proposed framework on Travel Mode Identification (TMI), a critical task for analyzing travel choices and understanding mobility behavior.
This semantic enhancement promises significant potential for diverse downstream applications and future research in artificial intelligence.
arXiv Detail & Related papers (2025-06-19T15:31:40Z)
- Learning Spatio-Temporal Dynamics for Trajectory Recovery via Time-Aware Transformer [9.812530969395906]
In real-world applications, GPS trajectories often suffer from low sampling rates, with large and irregular intervals between consecutive points.
This paper addresses the task of map-constrained trajectory recovery, aiming to enhance trajectory sampling rates.
arXiv Detail & Related papers (2025-05-20T03:09:17Z)
- Dynamic Intent Queries for Motion Transformer-based Trajectory Prediction [36.287188668060075]
In autonomous driving, accurately predicting the movements of other traffic participants is crucial.
Our research addresses this limitation by integrating scene-specific dynamic intention points into the MTR model.
Our findings demonstrate that incorporating dynamic intention points has a significant positive impact on trajectory accuracy.
arXiv Detail & Related papers (2025-04-22T10:20:35Z)
- Unified Human Localization and Trajectory Prediction with Monocular Vision [64.19384064365431]
MonoTransmotion is a Transformer-based framework that uses only a monocular camera to jointly solve localization and prediction tasks.
We show that by jointly training both tasks with our unified framework, our method is more robust in real-world scenarios made of noisy inputs.
arXiv Detail & Related papers (2025-03-05T14:18:39Z)
- Multi-Transmotion: Pre-trained Model for Human Motion Prediction [68.87010221355223]
Multi-Transmotion is an innovative transformer-based model designed for cross-modality pre-training.
Our methodology demonstrates competitive performance across various datasets on several downstream tasks.
arXiv Detail & Related papers (2024-11-04T23:15:21Z)
- More Than Routing: Joint GPS and Route Modeling for Refine Trajectory Representation Learning [26.630640299709114]
We propose Joint GPS and Route Modelling based on self-supervised technology, namely JGRM.
We develop two encoders, each tailored to capture representations of route and GPS trajectories respectively.
The representations from the two modalities are fed into a shared transformer for inter-modal information interaction.
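As a rough illustration of this dual-encoder design, the sketch below encodes each modality with its own encoder and mixes the pooled representations in a shared layer. All shapes, weight names, and the mean-pooling choice are illustrative assumptions; JGRM's actual encoders and shared transformer are more elaborate than this single mixing layer.

```python
import numpy as np

def encode(seq, W):
    """Toy modality-specific encoder: project token embeddings, then mean-pool."""
    return np.tanh(seq @ W).mean(axis=0)

rng = np.random.default_rng(0)
d = 8
route_seq = rng.normal(size=(6, d))    # embeddings of matched road segments
gps_seq = rng.normal(size=(20, d))     # embeddings of raw GPS points

W_route = rng.normal(size=(d, d))      # route-encoder weights
W_gps = rng.normal(size=(d, d))        # GPS-encoder weights
W_shared = rng.normal(size=(2 * d, d)) # stand-in for the shared transformer

# each modality gets its own encoder; the pooled representations are then
# concatenated and combined by the shared cross-modal layer
route_repr = encode(route_seq, W_route)
gps_repr = encode(gps_seq, W_gps)
traj_repr = np.tanh(np.concatenate([route_repr, gps_repr]) @ W_shared)
```

The point of the design is that each encoder can specialize (routes are short discrete sequences, GPS streams are long and noisy) while the shared module forces the two views of the same trajectory into a common representation.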
arXiv Detail & Related papers (2024-02-25T18:27:25Z)
- MobilityGPT: Enhanced Human Mobility Modeling with a GPT model [12.01839817432357]
We reformat human mobility modeling as an autoregressive generation task to address these issues.
We propose a geospatially-aware generative model, MobilityGPT, to ensure its controllable generation.
Experiments on real-world datasets demonstrate MobilityGPT's superior performance over state-of-the-art methods.
arXiv Detail & Related papers (2024-02-05T18:22:21Z)
- MotionTrack: Learning Motion Predictor for Multiple Object Tracking [68.68339102749358]
We introduce a novel motion-based tracker, MotionTrack, centered around a learnable motion predictor.
Our experimental results demonstrate that MotionTrack yields state-of-the-art performance on datasets such as DanceTrack and SportsMOT.
arXiv Detail & Related papers (2023-06-05T04:24:11Z)
- Motion Transformer with Global Intention Localization and Local Movement Refinement [103.75625476231401]
Motion TRansformer (MTR) models motion prediction as the joint optimization of global intention localization and local movement refinement.
MTR achieves state-of-the-art performance on both the marginal and joint motion prediction challenges.
arXiv Detail & Related papers (2022-09-27T16:23:14Z)
- Transforming Model Prediction for Tracking [109.08417327309937]
Transformers capture global relations with little inductive bias, allowing them to learn the prediction of more powerful target models.
We train the proposed tracker end-to-end and validate its performance by conducting comprehensive experiments on multiple tracking datasets.
Our tracker sets a new state of the art on three benchmarks, achieving an AUC of 68.5% on the challenging LaSOT dataset.
arXiv Detail & Related papers (2022-03-21T17:59:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.