Related papers: Detecting Transportation Mode Using Dense Smartphone GPS Trajectories and Transformer Models

URL: http://arxiv.org/abs/2603.00340v1
Date: Fri, 27 Feb 2026 22:20:29 GMT
Title: Detecting Transportation Mode Using Dense Smartphone GPS Trajectories and Transformer Models
Authors: Yuandong Zhang, Othmane Echchabi, Tianshu Feng, Wenyi Zhang, Hsuai-Kai Liao, Charles Chang
Abstract summary: We introduce SpeedTransformer, a novel Transformer-based model that relies solely on speed inputs to infer transportation modes from dense smartphone GPS trajectories. In benchmark experiments, SpeedTransformer outperformed traditional deep learning models, such as the Long Short-Term Memory (LSTM) network. We deployed the model in a real-world experiment, where it consistently outperformed baseline models under complex built environments and high data uncertainty.
Score: 11.280640663443826
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Transportation mode detection is an important topic within GeoAI and transportation research. In this study, we introduce SpeedTransformer, a novel Transformer-based model that relies solely on speed inputs to infer transportation modes from dense smartphone GPS trajectories. In benchmark experiments, SpeedTransformer outperformed traditional deep learning models, such as the Long Short-Term Memory (LSTM) network. Moreover, the model demonstrated strong flexibility in transfer learning, achieving high accuracy across geographical regions after fine-tuning with small datasets. Finally, we deployed the model in a real-world experiment, where it consistently outperformed baseline models under complex built environments and high data uncertainty. These findings suggest that Transformer architectures, when combined with dense GPS trajectories, hold substantial potential for advancing transportation mode detection and broader mobility-related research.

Related papers

Wireless Traffic Prediction with Large Language Model [54.07581399989292]
TIDES is a novel framework that captures spatial-temporal correlations for wireless traffic prediction. TIDES achieves efficient adaptation to domain-specific patterns without incurring excessive training overhead. Our results indicate that integrating spatial awareness into LLM-based predictors is the key to unlocking scalable and intelligent network management in future 6G systems.
arXiv Detail & Related papers (2025-12-19T04:47:40Z)
Multimodal Trajectory Representation Learning for Travel Time Estimation [15.25848441558445]
This paper introduces the Multimodal Dynamic Trajectory Integration framework. It integrates GPS sequences, grid trajectories, and road network constraints to enhance TTE accuracy. It consistently outperforms state-of-the-art baselines in experiments on three real-world datasets.
arXiv Detail & Related papers (2025-10-07T12:04:16Z)
Traj-Transformer: Diffusion Models with Transformer for GPS Trajectory Generation [15.689474391811734]
We propose Trajectory Transformer, a novel model that employs a transformer backbone for both conditional information embedding and noise prediction. Experiments on two real-world datasets demonstrate that Traj-Transformer significantly enhances generation quality and effectively alleviates the issues observed in prior approaches.
arXiv Detail & Related papers (2025-10-07T05:41:09Z)
GPS-MTM: Capturing Pattern of Normalcy in GPS-Trajectories with self-supervised learning [7.240185261197756]
We introduce the GPS-Masked Trajectory Transformer (GPS-MTM), a foundation model for large-scale mobility data. GPS-MTM decomposes mobility into two complementary modalities: states (point-of-interest categories) and actions (agent transitions).
arXiv Detail & Related papers (2025-09-28T19:00:50Z)
Resource-Efficient Beam Prediction in mmWave Communications with Multimodal Realistic Simulation Framework [57.994965436344195]
Beamforming is a key technology in millimeter-wave (mmWave) communications that improves signal transmission by optimizing directionality and intensity. Multimodal sensing-aided beam prediction has gained significant attention, using various sensing data to predict user locations or network conditions. Despite its promising potential, the adoption of multimodal sensing-aided beam prediction is hindered by high computational complexity, high costs, and limited datasets.
arXiv Detail & Related papers (2025-04-07T15:38:25Z)
Gaussian Splatting to Real World Flight Navigation Transfer with Liquid Networks [93.38375271826202]
We present a method to improve generalization and robustness to distribution shifts in sim-to-real visual quadrotor navigation tasks. We first build a simulator by integrating Gaussian splatting with quadrotor flight dynamics, and then, train robust navigation policies using Liquid neural networks. In this way, we obtain a full-stack imitation learning protocol that combines advances in 3D Gaussian splatting radiance field rendering, programming of expert demonstration training data, and the task understanding capabilities of Liquid networks.
arXiv Detail & Related papers (2024-06-21T13:48:37Z)
MobilityGPT: Enhanced Human Mobility Modeling with a GPT model [15.16172813601417]
We reformat human mobility modeling as an autoregressive generation task, leveraging the Generative Pre-trained Transformer architecture. We propose a geospatially-aware generative model, MobilityGPT, to ensure its controllable generation. Experiments on real-world datasets demonstrate MobilityGPT's superior performance over state-of-the-art methods.
arXiv Detail & Related papers (2024-02-05T18:22:21Z)
CTIN: Robust Contextual Transformer Network for Inertial Navigation [20.86392550313961]
We propose a robust Contextual Transformer-based network for Inertial Navigation (CTIN) to accurately predict velocity and trajectory. CTIN is very robust and outperforms state-of-the-art models.
arXiv Detail & Related papers (2021-12-03T19:57:34Z)
TransMOT: Spatial-Temporal Graph Transformer for Multiple Object Tracking [74.82415271960315]
We propose a solution named TransMOT to efficiently model the spatial and temporal interactions among objects in a video. TransMOT is not only more computationally efficient than the traditional Transformer, but it also achieves better tracking accuracy. The proposed method is evaluated on multiple benchmark datasets including MOT15, MOT16, MOT17, and MOT20.
arXiv Detail & Related papers (2021-04-01T01:49:05Z)
A Driving Behavior Recognition Model with Bi-LSTM and Multi-Scale CNN [59.57221522897815]
We propose a neural network model based on trajectory information for driving behavior recognition. We evaluate the proposed model on the public BLVD dataset, achieving satisfactory performance.
arXiv Detail & Related papers (2021-03-01T06:47:29Z)
Multi-intersection Traffic Optimisation: A Benchmark Dataset and a Strong Baseline [85.9210953301628]
Control of traffic signals is fundamental and critical to alleviate traffic congestion in urban areas. Because of the high complexity of modelling the problem, experimental settings of current works are often inconsistent. We propose a novel and strong baseline model based on deep reinforcement learning with the encoder-decoder structure.
arXiv Detail & Related papers (2021-01-24T03:55:39Z)

This list is automatically generated from the titles and abstracts of the papers on this site.