Unifying Environment Perception and Route Choice Modeling for Trajectory Representation Learning
- URL: http://arxiv.org/abs/2510.14819v1
- Date: Thu, 16 Oct 2025 15:55:28 GMT
- Title: Unifying Environment Perception and Route Choice Modeling for Trajectory Representation Learning
- Authors: Ji Cao, Yu Wang, Tongya Zheng, Zujie Ren, Canghong Jin, Gang Chen, Mingli Song,
- Abstract summary: Trajectory Representation Learning (TRL) aims to encode raw trajectories into low-dimensional vectors, which can be leveraged in various downstream tasks, including travel time estimation, location prediction, and trajectory similarity analysis. We propose a framework that unifies comprehensive environment \textbf{P}erception and explicit \textbf{R}oute choice modeling for effective \textbf{Traj}ectory representation learning, dubbed \textbf{PRTraj}.
- Score: 47.00223863430964
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Trajectory Representation Learning (TRL) aims to encode raw trajectories into low-dimensional vectors, which can then be leveraged in various downstream tasks, including travel time estimation, location prediction, and trajectory similarity analysis. However, existing TRL methods suffer from a key oversight: treating trajectories as isolated spatio-temporal sequences, without considering the external environment and internal route choice behavior that govern their formation. To bridge this gap, we propose a novel framework that unifies comprehensive environment \textbf{P}erception and explicit \textbf{R}oute choice modeling for effective \textbf{Traj}ectory representation learning, dubbed \textbf{PRTraj}. Specifically, PRTraj first introduces an Environment Perception Module to enhance the road network by capturing multi-granularity environmental semantics from surrounding POI distributions. Building on this environment-aware backbone, a Route Choice Encoder then captures the route choice behavior inherent in each trajectory by modeling its constituent road segment transitions as a sequence of decisions. These route-choice-aware representations are finally aggregated to form the global trajectory embedding. Extensive experiments on 3 real-world datasets across 5 downstream tasks validate the effectiveness and generalizability of PRTraj. Moreover, PRTraj demonstrates strong data efficiency, maintaining robust performance under few-shot scenarios. Our code is available at: https://anonymous.4open.science/r/PRTraj.
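The pipeline the abstract describes — POI-based environment perception enriching road-segment embeddings, a route-choice encoder over segment transitions, and aggregation into a global trajectory embedding — can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: all names, dimensions, and the mean-pooling aggregation are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; none of these come from the PRTraj code.
NUM_SEGMENTS, POI_CATS, DIM = 6, 4, 8

# Environment perception: enrich each road-segment embedding with the
# POI-category distribution observed around it (a stand-in for the paper's
# multi-granularity environmental semantics).
segment_emb = rng.standard_normal((NUM_SEGMENTS, DIM))
poi_counts = rng.integers(0, 10, size=(NUM_SEGMENTS, POI_CATS)) + 1.0  # Laplace smoothing
poi_dist = poi_counts / poi_counts.sum(axis=1, keepdims=True)
W_poi = rng.standard_normal((POI_CATS, DIM))
env_aware = segment_emb + poi_dist @ W_poi  # environment-aware segment embeddings

def encode_trajectory(segment_ids):
    """Treat each consecutive segment transition as a route-choice decision,
    embed each decision as the concatenated pair of environment-aware segment
    embeddings, and mean-pool the decisions into one trajectory vector."""
    pairs = zip(segment_ids[:-1], segment_ids[1:])
    decisions = [np.concatenate([env_aware[a], env_aware[b]]) for a, b in pairs]
    return np.mean(decisions, axis=0)  # global trajectory embedding

traj = encode_trajectory([0, 2, 3, 5])
print(traj.shape)  # → (16,)
```

In the paper the transition sequence is presumably processed by a learned sequence encoder rather than mean pooling; the sketch only mirrors the data flow of the three stages.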
Related papers
- Streaming Real-Time Trajectory Prediction Using Endpoint-Aware Modeling [54.94692733670454]
Future trajectories of neighboring traffic agents have a significant influence on the path planning and decision-making of autonomous vehicles. We propose a lightweight yet highly accurate streaming-based trajectory forecasting approach. Our approach significantly reduces inference latency, making it well-suited for real-world deployment.
arXiv Detail & Related papers (2026-03-02T13:44:23Z) - TopoCurate: Modeling Interaction Topology for Tool-Use Agent Training [53.93696896939915]
Training tool-use agents typically relies on Supervised Fine-Tuning (SFT) on successful trajectories and Reinforcement Learning (RL) on pass-rate-selected tasks. We propose TopoCurate, an interaction-aware framework that projects multi-trial rollouts from the same task into a unified semantic quotient topology. TopoCurate achieves consistent gains of 4.2% (SFT) and 6.9% (RL) over state-of-the-art baselines.
arXiv Detail & Related papers (2026-03-02T10:38:54Z) - PathFinder: Advancing Path Loss Prediction for Single-to-Multi-Transmitter Scenario [60.906711761476735]
PathFinder is a novel architecture that actively models buildings and transmitters via disentangled feature encoding. Tests show PathFinder outperforms state-of-the-art methods significantly, especially in challenging multi-transmitter scenarios.
arXiv Detail & Related papers (2025-12-16T07:15:15Z) - VideoGAN-based Trajectory Proposal for Automated Vehicles [1.693200946453174]
We investigate whether a generative adversarial network (GAN) trained on videos of bird's-eye view (BEV) traffic scenarios can generate statistically accurate trajectories. To this end, we propose a pipeline that uses low-resolution BEV occupancy grid videos as training data for a video generative model. We obtain our best results within 100 GPU hours of training, with inference times under 20 ms.
arXiv Detail & Related papers (2025-06-19T10:57:44Z) - Efficient Data Representation for Motion Forecasting: A Scene-Specific Trajectory Set Approach [12.335528093380631]
This study introduces a novel approach for generating scene-specific trajectory sets tailored to different contexts. A deterministic goal sampling algorithm identifies relevant map regions, while our Recursive In-Distribution Subsampling (RIDS) method enhances trajectory plausibility. Experiments on the Argoverse 2 dataset demonstrate that our method achieves up to a 10% improvement in Driving Area Compliance.
arXiv Detail & Related papers (2024-07-30T11:06:39Z) - SEPT: Towards Efficient Scene Representation Learning for Motion Prediction [19.111948522155004]
This paper presents SEPT, a modeling framework that leverages self-supervised learning to develop powerful models for complex traffic scenes.
Experiments demonstrate that SEPT, without elaborate architectural design or feature engineering, achieves state-of-the-art performance on the Argoverse 1 and Argoverse 2 motion forecasting benchmarks.
arXiv Detail & Related papers (2023-09-26T21:56:03Z) - Self-supervised Trajectory Representation Learning with Temporal Regularities and Travel Semantics [30.9735101687326]
Trajectory Representation Learning (TRL) is a powerful tool for spatial-temporal data analysis and management.
Existing TRL works usually treat trajectories as ordinary sequence data, while some important spatial-temporal characteristics, such as temporal regularities and travel semantics, are not fully exploited.
We propose a novel Self-supervised trajectory representation learning framework with TemporAl Regularities and Travel semantics, namely START.
arXiv Detail & Related papers (2022-11-17T13:14:47Z) - Neural Motion Fields: Encoding Grasp Trajectories as Implicit Value Functions [65.84090965167535]
We present Neural Motion Fields, a novel object representation which encodes both object point clouds and the relative task trajectories as an implicit value function parameterized by a neural network.
This object-centric representation models a continuous distribution over the SE(3) space and allows us to perform grasping reactively by leveraging sampling-based MPC to optimize this value function.
arXiv Detail & Related papers (2022-06-29T18:47:05Z) - Cycle and Semantic Consistent Adversarial Domain Adaptation for Reducing Simulation-to-Real Domain Shift in LiDAR Bird's Eye View [110.83289076967895]
We present a BEV domain adaptation method based on CycleGAN that uses prior semantic classification in order to preserve the information of small objects of interest during the domain adaptation process.
The quality of the generated BEVs has been evaluated using a state-of-the-art 3D object detection framework at KITTI 3D Object Detection Benchmark.
arXiv Detail & Related papers (2021-04-22T12:47:37Z) - Exploring Dynamic Context for Multi-path Trajectory Prediction [33.66335553588001]
We propose a novel framework, named Dynamic Context Network (DCENet).
In our framework, the spatial context between agents is explored by using self-attention architectures.
A set of future trajectories for each agent is predicted conditioned on the learned spatial-temporal context.
arXiv Detail & Related papers (2020-10-30T13:39:20Z) - A Deep Learning Framework for Generation and Analysis of Driving Scenario Trajectories [2.908482270923597]
We propose a unified deep learning framework for the generation and analysis of driving scenario trajectories.
We experimentally investigate the performance of the proposed framework on real-world scenario trajectories obtained from in-field data collection.
arXiv Detail & Related papers (2020-07-28T23:33:05Z) - Risk-Averse MPC via Visual-Inertial Input and Recurrent Networks for Online Collision Avoidance [95.86944752753564]
We propose an online path planning architecture that extends the model predictive control (MPC) formulation to consider future location uncertainties.
Our algorithm combines an object detection pipeline with a recurrent neural network (RNN) which infers the covariance of state estimates.
The robustness of our methods is validated on complex quadruped robot dynamics and can be generally applied to most robotic platforms.
arXiv Detail & Related papers (2020-07-28T07:34:30Z) - Counterfactual Vision-and-Language Navigation via Adversarial Path Sampling [65.99956848461915]
Vision-and-Language Navigation (VLN) is a task where agents must decide how to move through a 3D environment to reach a goal. One of the problems of the VLN task is data scarcity, since it is difficult to collect enough navigation paths with human-annotated instructions for interactive environments. We propose an adversarial-driven counterfactual reasoning model that can consider effective conditions instead of low-quality augmented data.
arXiv Detail & Related papers (2019-11-17T18:02:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.