Related papers: TraF-Align: Trajectory-aware Feature Alignment for Asynchronous Multi-agent Perception

URL: http://arxiv.org/abs/2503.19391v1
Date: Tue, 25 Mar 2025 06:56:35 GMT
Title: TraF-Align: Trajectory-aware Feature Alignment for Asynchronous Multi-agent Perception
Authors: Zhiying Song, Lei Yang, Fuxi Wen, Jun Li
Abstract summary: TraF-Align learns the flow path of features by predicting the feature-level trajectory of objects from past observations up to the ego vehicle's current time. This approach corrects spatial misalignment and ensures semantic consistency across agents, effectively compensating for motion. Experiments on two real-world datasets, V2V4Real and DAIR-V2X-Seq, show that TraF-Align sets a new benchmark for asynchronous cooperative perception.
Score: 7.382491303268417
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Cooperative perception presents significant potential for enhancing the sensing capabilities of individual vehicles; however, inter-agent latency remains a critical challenge. Latencies cause misalignments in both spatial and semantic features, complicating the fusion of real-time observations from the ego vehicle with delayed data from others. To address these issues, we propose TraF-Align, a novel framework that learns the flow path of features by predicting the feature-level trajectory of objects from past observations up to the ego vehicle's current time. By generating temporally ordered sampling points along these paths, TraF-Align directs attention from the current-time query to relevant historical features along each trajectory, supporting the reconstruction of current-time features and promoting semantic interaction across multiple frames. This approach corrects spatial misalignment and ensures semantic consistency across agents, effectively compensating for motion and achieving coherent feature fusion. Experiments on two real-world datasets, V2V4Real and DAIR-V2X-Seq, show that TraF-Align sets a new benchmark for asynchronous cooperative perception.
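Illustrative sketch (not from the paper): the abstract describes attention from a current-time query to historical features sampled along a predicted trajectory. The NumPy code below is a minimal sketch of that general idea under loud assumptions. The function names (`constant_velocity_points`, `bilinear_sample`, `trajectory_attention`) are hypothetical, the trajectory is a constant-velocity stand-in rather than TraF-Align's learned prediction, and a single dot-product attention head replaces whatever attention design the authors actually use.

```python
import numpy as np


def constant_velocity_points(current, velocity, delays):
    """Hypothetical stand-in for a learned trajectory predictor.

    current: (2,) location (y, x) of the query at the ego vehicle's current time.
    velocity: (2,) assumed displacement per unit of delay.
    delays: (T,) age of each historical frame, ordered oldest to newest.
    Returns (T, 2) sampling points, one per historical frame.
    """
    return current[None, :] - delays[:, None] * velocity[None, :]


def bilinear_sample(feat, y, x):
    """Bilinearly sample a feature map feat of shape (H, W, C) at fractional (y, x)."""
    H, W, _ = feat.shape
    y = float(np.clip(y, 0, H - 1))
    x = float(np.clip(x, 0, W - 1))
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, H - 1), min(x0 + 1, W - 1)
    wy, wx = y - y0, x - x0
    top = (1 - wx) * feat[y0, x0] + wx * feat[y0, x1]
    bottom = (1 - wx) * feat[y1, x0] + wx * feat[y1, x1]
    return (1 - wy) * top + wy * bottom


def trajectory_attention(query, history, points):
    """Attend from one current-time query to features sampled along a trajectory.

    query: (C,) query vector at the current time.
    history: (T, H, W, C) delayed feature maps, ordered oldest to newest.
    points: (T, 2) sampling locations (y, x), one per historical frame.
    Returns the aggregated (C,) feature and the (T,) attention weights.
    """
    keys = np.stack(
        [bilinear_sample(history[t], *points[t]) for t in range(len(history))]
    )
    scores = keys @ query / np.sqrt(query.shape[0])
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights /= weights.sum()
    return weights @ keys, weights
```

In this toy setting, if an object moves one cell per frame and the assumed velocity matches that motion, every sampling point lands on the object's feature, so the aggregated output recovers it at the current-time location despite the delay.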

Related papers

TaCo: Capturing Spatio-Temporal Semantic Consistency in Remote Sensing Change Detection [54.22717266034045]
TaCo is a consistent semantic network for temporal semantic transitions. We show that TaCo consistently achieves SOTA performance on remote sensing detection tasks. This design can yield substantial gains without any additional computational overhead during inference.
arXiv Detail & Related papers (2025-11-25T13:44:29Z)
V2X-RECT: An Efficient V2X Trajectory Prediction Framework via Redundant Interaction Filtering and Tracking Error Correction [30.222991833643785]
V2X-RECT is a trajectory prediction framework designed for high-density environments. It enhances data association consistency, reduces redundant interactions, and reuses historical information to enable more efficient and accurate prediction.
arXiv Detail & Related papers (2025-11-22T06:50:47Z)
Cross-modal Offset-guided Dynamic Alignment and Fusion for Weakly Aligned UAV Object Detection [0.0]
Unmanned aerial vehicle (UAV) object detection plays a vital role in applications such as environmental monitoring and urban security. Due to UAV platform motion and asynchronous imaging, spatial misalignment frequently occurs between modalities, leading to weak alignment. We propose Cross-modal Offset-guided Dynamic Alignment and Fusion (CoDAF) to address these issues.
arXiv Detail & Related papers (2025-06-20T04:11:39Z)
STAMImputer: Spatio-Temporal Attention MoE for Traffic Data Imputation [36.880711201508085]
Existing time-to-space methods often fail to effectively extract features in block-wise missing data scenarios. This paper proposes a Spatiotemporal Attention Mixture-of-Experts network named STAMImputer for traffic data imputation. The results show that STAMImputer achieves significant performance improvements compared with existing SOTA approaches.
arXiv Detail & Related papers (2025-06-09T04:05:00Z)
Contrast & Compress: Learning Lightweight Embeddings for Short Trajectories [11.6132604160666]
We propose a novel framework for learning fixed-dimensional embeddings for short trajectories by leveraging a Transformer encoder. We analyze the influence of Cosine and FFT-based similarity metrics within the contrastive learning paradigm. Our empirical evaluation on the Argoverse 2 dataset demonstrates that embeddings shaped by Cosine similarity objectives yield superior clustering of trajectories.
arXiv Detail & Related papers (2025-06-03T07:53:04Z)
SuperFlow++: Enhanced Spatiotemporal Consistency for Cross-Modal Data Pretraining [62.433137130087445]
SuperFlow++ is a novel framework that integrates pretraining and downstream tasks using consecutive camera pairs. We show that SuperFlow++ outperforms state-of-the-art methods across diverse tasks and driving conditions. With strong generalizability and computational efficiency, SuperFlow++ establishes a new benchmark for data-efficient LiDAR-based perception in autonomous driving.
arXiv Detail & Related papers (2025-03-25T17:59:57Z)
Cross Space and Time: A Spatio-Temporal Unitized Model for Traffic Flow Forecasting [16.782154479264126]
Predicting spatio-temporal traffic flow presents challenges due to complex interactions between spatial and temporal factors. Existing approaches address these dimensions in isolation, neglecting their critical interdependencies. In this paper, we introduce the Adaptive Spatio-Temporal Unitized Cell (ASTUC), a unified framework designed to capture both spatial and temporal dependencies.
arXiv Detail & Related papers (2024-11-14T07:34:31Z)
StreamLTS: Query-based Temporal-Spatial LiDAR Fusion for Cooperative Object Detection [0.552480439325792]
We propose Time-Aligned COoperative Object Detection (TA-COOD), for which we adapt the widely used datasets OPV2V and DairV2X. Experimental results confirm the superior efficiency of our fully sparse framework compared to state-of-the-art dense models.
arXiv Detail & Related papers (2024-07-04T10:56:10Z)
Bidirectional Progressive Transformer for Interaction Intention Anticipation [20.53329698350243]
We introduce a Bidirectional Progressive mechanism into the anticipation of interaction intention. We employ a Trajectory Unit and a C-VAE to introduce appropriate uncertainty to trajectories and interaction hotspots. Our method achieves state-of-the-art results on three benchmark datasets.
arXiv Detail & Related papers (2024-05-09T05:22:18Z)
Triplet Attention Transformer for Spatiotemporal Predictive Learning [9.059462850026216]
We propose an innovative triplet attention transformer designed to capture both inter-frame dynamics and intra-frame static features. The model incorporates the Triplet Attention Module (TAM), which replaces traditional recurrent units by exploring self-attention mechanisms in temporal, spatial, and channel dimensions.
arXiv Detail & Related papers (2023-10-28T12:49:33Z)
Modeling Continuous Motion for 3D Point Cloud Object Tracking [54.48716096286417]
This paper presents a novel approach that views each tracklet as a continuous stream. At each timestamp, only the current frame is fed into the network to interact with multi-frame historical features stored in a memory bank. To enhance the utilization of multi-frame features for robust tracking, a contrastive sequence enhancement strategy is proposed.
arXiv Detail & Related papers (2023-03-14T02:58:27Z)
Joint Spatial-Temporal and Appearance Modeling with Transformer for Multiple Object Tracking [59.79252390626194]
We propose a novel solution named TransSTAM, which leverages Transformer to model both the appearance features of each object and the spatial-temporal relationships among objects. The proposed method is evaluated on multiple public benchmarks including MOT16, MOT17, and MOT20, and it achieves a clear performance improvement in both IDF1 and HOTA.
arXiv Detail & Related papers (2022-05-31T01:19:18Z)
Continuity-Discrimination Convolutional Neural Network for Visual Object Tracking [150.51667609413312]
This paper proposes a novel model, named Continuity-Discrimination Convolutional Neural Network (CD-CNN) for visual object tracking. To address this problem, CD-CNN models temporal appearance continuity based on the idea of temporal slowness. In order to alleviate inaccurate target localization and drifting, we propose a novel notion, object-centroid.
arXiv Detail & Related papers (2021-04-18T06:35:03Z)
A Spatial-Temporal Attentive Network with Spatial Continuity for Trajectory Prediction [74.00750936752418]
We propose a novel model named spatial-temporal attentive network with spatial continuity (STAN-SC). First, a spatial-temporal attention mechanism is presented to explore the most useful and important information. Second, we construct a joint feature sequence based on the sequence and instant state information so that the generated trajectories keep spatial continuity.
arXiv Detail & Related papers (2020-03-13T04:35:50Z)
Spatial-Temporal Transformer Networks for Traffic Flow Forecasting [74.76852538940746]
We propose a novel paradigm of Spatial-Temporal Transformer Networks (STTNs) to improve the accuracy of long-term traffic forecasting. Specifically, we present a new variant of graph neural networks, named spatial transformer, by dynamically modeling directed spatial dependencies. The proposed model enables fast and scalable training over long-range spatial-temporal dependencies.
arXiv Detail & Related papers (2020-01-09T10:21:04Z)

This list is automatically generated from the titles and abstracts of the papers on this site.