V2X-RECT: An Efficient V2X Trajectory Prediction Framework via Redundant Interaction Filtering and Tracking Error Correction
- URL: http://arxiv.org/abs/2511.17941v1
- Date: Sat, 22 Nov 2025 06:50:47 GMT
- Title: V2X-RECT: An Efficient V2X Trajectory Prediction Framework via Redundant Interaction Filtering and Tracking Error Correction
- Authors: Xiangyan Kong, Xuecheng Wu, Xiongwei Zhao, Xiaodong Li, Yunyun Shi, Gang Wang, Dingkang Yang, Yang Liu, Hong Chen, Yulong Gao
- Abstract summary: V2X-RECT is a trajectory prediction framework designed for high-density environments. It enhances data association consistency, reduces redundant interactions, and reuses historical information to enable more efficient and accurate prediction.
- Score: 30.222991833643785
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: V2X prediction can alleviate perception incompleteness caused by limited line of sight by fusing trajectory data from infrastructure and vehicles, which is crucial to traffic safety and efficiency. However, in dense traffic scenarios, frequent identity switching of targets hinders cross-view association and fusion. Meanwhile, multi-source information tends to generate redundant interactions during the encoding stage, and traditional vehicle-centric encoding repeatedly encodes the same historical trajectory features, degrading real-time inference performance. To address these challenges, we propose V2X-RECT, a trajectory prediction framework designed for high-density environments. It enhances data association consistency, reduces redundant interactions, and reuses historical information to enable more efficient and accurate prediction. Specifically, we design a multi-source identity matching and correction module that leverages multi-view spatiotemporal relationships to achieve stable and consistent target association, mitigating the adverse effects of mismatches on trajectory encoding and cross-view feature fusion. We then introduce a traffic signal-guided interaction module that encodes the trend of traffic light changes as features and exploits their role in constraining spatiotemporal right of way to accurately filter key interacting vehicles, while capturing the dynamic impact of signal changes on interaction patterns. Furthermore, a local spatiotemporal coordinate encoding makes the features of historical trajectories and the map reusable, supporting parallel decoding and significantly improving inference efficiency. Extensive experiments on the V2X-Seq and V2X-Traj datasets demonstrate that V2X-RECT achieves significant improvements over SOTA methods, while also enhancing robustness and inference efficiency across diverse traffic densities.
Related papers
- Cross-Modal Reconstruction Pretraining for Ramp Flow Prediction at Highway Interchanges [30.274689865122056]
STDAE is a two-stage framework that leverages cross-modal reconstruction pretraining. STDAE-GWNET consistently outperforms thirteen state-of-the-art baselines. This demonstrates its effectiveness in overcoming detector scarcity and its plug-and-play potential for diverse forecasting pipelines.
arXiv Detail & Related papers (2025-10-03T15:26:56Z) - CARE Transformer: Mobile-Friendly Linear Visual Transformer via Decoupled Dual Interaction [77.8576094863446]
We propose a new deCoupled duAl-interactive lineaR attEntion (CARE) mechanism.
We first propose an asymmetrical feature decoupling strategy that asymmetrically decouples the learning process for local inductive bias and long-range dependencies.
By adopting a decoupled learning approach and fully exploiting complementarity across features, our method can achieve both high efficiency and accuracy.
arXiv Detail & Related papers (2024-11-25T07:56:13Z) - Improving Traffic Flow Predictions with SGCN-LSTM: A Hybrid Model for Spatial and Temporal Dependencies [55.2480439325792]
This paper introduces the Signal-Enhanced Graph Convolutional Network Long Short Term Memory (SGCN-LSTM) model for predicting traffic speeds across road networks.
Experiments on the PEMS-BAY road network traffic dataset demonstrate the SGCN-LSTM model's effectiveness.
arXiv Detail & Related papers (2024-11-01T00:37:00Z) - Crossfusor: A Cross-Attention Transformer Enhanced Conditional Diffusion Model for Car-Following Trajectory Prediction [10.814758830775727]
This study introduces a Cross-Attention Transformer Enhanced Diffusion Model (Crossfusor) specifically designed for car-following trajectory prediction.
It integrates detailed inter-vehicular interactions and car-following dynamics into a robust diffusion framework, improving both the accuracy and realism of predicted trajectories.
Experimental results on the NGSIM dataset demonstrate that Crossfusor outperforms state-of-the-art models, particularly in long-term predictions.
arXiv Detail & Related papers (2024-06-17T17:35:47Z) - Graph Attention Network for Lane-Wise and Topology-Invariant Intersection Traffic Simulation [8.600701437207725]
We propose two efficient and accurate "Digital Twin" models for intersections.
These digital twins capture temporal, spatial, and contextual aspects of traffic within intersections.
Our study's applications extend to lane reconfiguration, driving behavior analysis, and facilitating informed decisions regarding intersection safety and efficiency enhancements.
arXiv Detail & Related papers (2024-04-11T03:02:06Z) - Graph-Based Interaction-Aware Multimodal 2D Vehicle Trajectory Prediction using Diffusion Graph Convolutional Networks [17.989423104706397]
This study presents the Graph-based Interaction-aware Multi-modal Trajectory Prediction framework.
Within this framework, vehicles' motions are conceptualized as nodes in a time-varying graph, and the traffic interactions are represented by a dynamic adjacency matrix.
We employ a driving intention-specific feature fusion, enabling the adaptive integration of historical and future embeddings.
arXiv Detail & Related papers (2023-09-05T06:28:13Z) - Cross-modal Orthogonal High-rank Augmentation for RGB-Event Transformer-trackers [58.802352477207094]
We explore the great potential of a pre-trained vision Transformer (ViT) to bridge the vast distribution gap between two modalities.
We propose a mask modeling strategy that randomly masks a specific modality of some tokens to enforce proactive interaction between tokens from different modalities.
Experiments demonstrate that our plug-and-play training augmentation techniques can significantly boost state-of-the-art one-stream and two-stream trackers in terms of both tracking precision and success rate.
arXiv Detail & Related papers (2023-07-09T08:58:47Z) - Towards better traffic volume estimation: Jointly addressing the underdetermination and nonequilibrium problems with correlation-adaptive GNNs [47.18837782862979]
This paper studies two key problems with regard to traffic volume estimation: (1) underdetermined traffic flows caused by undetected movements, and (2) non-equilibrium traffic flows arising from congestion propagation.
We demonstrate a graph-based deep learning method that can offer a data-driven, model-free and correlation adaptive approach to tackle the above issues.
arXiv Detail & Related papers (2023-03-10T02:22:33Z) - PDFormer: Propagation Delay-Aware Dynamic Long-Range Transformer for Traffic Flow Prediction [78.05103666987655]
Spatial-temporal Graph Neural Network (GNN) models have emerged as one of the most promising methods to solve this problem.
We propose a novel propagation delay-aware dynamic long-range transFormer, namely PDFormer, for accurate traffic flow prediction.
Our method can not only achieve state-of-the-art performance but also exhibit competitive computational efficiency.
arXiv Detail & Related papers (2023-01-19T08:42:40Z) - D2-TPred: Discontinuous Dependency for Trajectory Prediction under Traffic Lights [68.76631399516823]
We present a trajectory prediction approach with respect to traffic lights, D2-TPred, using a spatial dynamic interaction graph (SDG) and a behavior dependency graph (BDG).
Our experimental results show that our model achieves improvements of more than 20.45% and 20.78% in terms of ADE and FDE, respectively, on VTP-TL.
arXiv Detail & Related papers (2022-07-21T10:19:07Z) - DynSTGAT: Dynamic Spatial-Temporal Graph Attention Network for Traffic Signal Control [19.0913165219654]
Adaptive traffic signal control plays a significant role in the construction of smart cities.
We propose a novel neural network framework named DynSTGAT, which integrates dynamic historical state into a new spatial-temporal graph attention network.
Our method can achieve superior performance in travel time and throughput against the state-of-the-art methods.
arXiv Detail & Related papers (2021-09-12T11:27:27Z) - Multi-intersection Traffic Optimisation: A Benchmark Dataset and a Strong Baseline [85.9210953301628]
Control of traffic signals is fundamental and critical to alleviate traffic congestion in urban areas.
Because of the high complexity of modelling the problem, experimental settings of current works are often inconsistent.
We propose a novel and strong baseline model based on deep reinforcement learning with the encoder-decoder structure.
arXiv Detail & Related papers (2021-01-24T03:55:39Z)