Learning Cooperative Trajectory Representations for Motion Forecasting
- URL: http://arxiv.org/abs/2311.00371v1
- Date: Wed, 1 Nov 2023 08:53:05 GMT
- Title: Learning Cooperative Trajectory Representations for Motion Forecasting
- Authors: Hongzhi Ruan, Haibao Yu, Wenxian Yang, Siqi Fan, Yingjuan Tang,
Zaiqing Nie
- Abstract summary: We present V2X-Graph, the first interpretable and end-to-end learning framework for cooperative motion forecasting.
V2X-Graph employs an interpretable graph to fully leverage the cooperative motion and interaction contexts.
We construct the first real-world vehicle-to-everything (V2X) motion forecasting dataset.
- Score: 4.380073528690906
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Motion forecasting is an essential task for autonomous driving, and the
effective information utilization from infrastructure and other vehicles can
enhance motion forecasting capabilities. Existing research has primarily
focused on leveraging single-frame cooperative information to enhance the
limited perception capability of the ego vehicle, while underutilizing the
motion and interaction information of traffic participants observed from
cooperative devices. In this paper, we first propose the cooperative trajectory
representation learning paradigm. Specifically, we present V2X-Graph, the
first interpretable and end-to-end learning framework for cooperative motion
forecasting. V2X-Graph employs an interpretable graph to fully leverage the
cooperative motion and interaction contexts. Experimental results on the
vehicle-to-infrastructure (V2I) motion forecasting dataset, V2X-Seq,
demonstrate the effectiveness of V2X-Graph. To further evaluate in V2X
scenarios, we construct the first real-world vehicle-to-everything (V2X) motion
forecasting dataset, V2X-Traj, on which the results show the advantage of our
method. We hope both V2X-Graph and V2X-Traj can facilitate the further
development of cooperative motion forecasting. The project is available at
https://github.com/AIR-THU/V2X-Graph and the data at
https://github.com/AIR-THU/DAIR-V2X-Seq.
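To make the cooperative-graph idea above concrete, here is a minimal sketch of how trajectories observed from the ego vehicle and the infrastructure could be encoded as graph nodes and fused by message passing over cross-view association edges. All names and design choices below (the GRU encoder, the single fusion layer, the hand-built edge list) are illustrative assumptions, not the authors' V2X-Graph implementation.

```python
# Illustrative sketch only: graph fusion of ego- and infrastructure-view
# trajectories, loosely in the spirit of cooperative trajectory
# representation learning. Not the authors' code.
import torch
import torch.nn as nn

class TrajEncoder(nn.Module):
    """Encode past trajectories of shape (N, T, 2) into node features."""
    def __init__(self, hidden=64):
        super().__init__()
        self.gru = nn.GRU(input_size=2, hidden_size=hidden, batch_first=True)

    def forward(self, traj):
        _, h = self.gru(traj)        # h: (1, N, hidden)
        return h.squeeze(0)          # (N, hidden)

class GraphFusionLayer(nn.Module):
    """One round of message passing over the cooperative graph."""
    def __init__(self, hidden=64):
        super().__init__()
        self.msg = nn.Linear(2 * hidden, hidden)
        self.upd = nn.GRUCell(hidden, hidden)

    def forward(self, x, edges):     # x: (N, hidden), edges: (E, 2) long
        src, dst = edges[:, 0], edges[:, 1]
        m = torch.relu(self.msg(torch.cat([x[src], x[dst]], dim=-1)))
        agg = torch.zeros_like(x).index_add_(0, dst, m)  # sum messages per node
        return self.upd(agg, x)      # GRU-style node update

# Usage: 5 agents seen from both views; association edges link the two
# observations of the same agent so motion cues flow across views.
enc, fuse = TrajEncoder(), GraphFusionLayer()
ego_traj = torch.randn(5, 20, 2)     # 20 past steps of (x, y) per agent
inf_traj = torch.randn(5, 20, 2)
x = torch.cat([enc(ego_traj), enc(inf_traj)], dim=0)     # 10 nodes
assoc = torch.tensor([[i, i + 5] for i in range(5)])
edges = torch.cat([assoc, assoc.flip(1)], dim=0)         # bidirectional
fused = fuse(x, edges)               # (10, 64) cross-view-fused features
```

A decoder head over the fused node features would then regress future waypoints; interaction edges between nearby agents within a view could be added to the same edge list in the same way.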
Related papers
- End-to-End Autonomous Driving through V2X Cooperation [23.44597411612664]
We introduce UniV2X, a pioneering cooperative autonomous driving framework.
UniV2X seamlessly integrates all key driving modules across diverse views into a unified network.
arXiv Detail & Related papers (2024-03-31T15:22:11Z)
- You Only Transfer What You Share: Intersection-Induced Graph Transfer Learning for Link Prediction [79.15394378571132]
We investigate a previously overlooked phenomenon: in many cases, a densely connected, complementary graph can be found for the original graph.
The denser graph may share nodes with the original graph, which offers a natural bridge for transferring selective, meaningful knowledge.
We identify this setting as Graph Intersection-induced Transfer Learning (GITL), which is motivated by practical applications in e-commerce or academic co-authorship predictions.
arXiv Detail & Related papers (2023-02-27T22:56:06Z)
- D2-TPred: Discontinuous Dependency for Trajectory Prediction under Traffic Lights [68.76631399516823]
We present D2-TPred, a trajectory prediction approach that accounts for traffic lights, using a spatial dynamic interaction graph (SDG) and a behavior dependency graph (BDG).
Our experimental results show that our model achieves improvements of more than 20.45% and 20.78% in terms of ADE and FDE, respectively, on VTP-TL.
arXiv Detail & Related papers (2022-07-21T10:19:07Z)
- CoBEVT: Cooperative Bird's Eye View Semantic Segmentation with Sparse Transformers [36.838065731893735]
CoBEVT is the first generic multi-agent perception framework that can cooperatively generate BEV map predictions.
CoBEVT achieves state-of-the-art performance for cooperative BEV semantic segmentation.
arXiv Detail & Related papers (2022-07-05T17:59:28Z)
- V2X-ViT: Vehicle-to-Everything Cooperative Perception with Vision Transformer [58.71845618090022]
We build a holistic attention model, namely V2X-ViT, to fuse information across on-road agents.
V2X-ViT consists of alternating layers of heterogeneous multi-agent self-attention and multi-scale window self-attention.
To validate our approach, we create a large-scale V2X perception dataset.
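As a shape-level illustration of that alternating design, the sketch below interleaves spatial self-attention within each agent's feature map with attention across agents at each location. It deliberately simplifies the paper's heterogeneous multi-agent attention and multi-scale window attention to plain multi-head attention, so treat it as an assumption-laden sketch, not V2X-ViT's code.

```python
# Shape-level sketch of alternating per-agent spatial attention and
# cross-agent fusion attention. Simplified stand-in, not V2X-ViT itself.
import torch
import torch.nn as nn

class AlternatingAttentionBlock(nn.Module):
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.spatial = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.agent = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):            # x: (A agents, N tokens, D dim)
        s, _ = self.spatial(x, x, x) # attend over BEV tokens per agent
        x = x + s                    # residual connection
        xa = x.transpose(0, 1)       # (N, A, D): same location, all agents
        a, _ = self.agent(xa, xa, xa)
        return (xa + a).transpose(0, 1)

# Usage: fuse BEV features from one ego vehicle and two cooperators.
block = AlternatingAttentionBlock()
feats = torch.randn(3, 256, 64)      # 3 agents, 16x16 BEV grid flattened
fused = block(feats)                 # (3, 256, 64)
```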
arXiv Detail & Related papers (2022-03-20T20:18:25Z)
- V2X-Sim: A Virtual Collaborative Perception Dataset for Autonomous Driving [26.961213523096948]
Vehicle-to-everything (V2X) denotes the collaboration between a vehicle and any entity in its surroundings.
We present the V2X-Sim dataset, the first public large-scale collaborative perception dataset in autonomous driving.
arXiv Detail & Related papers (2022-02-17T05:14:02Z)
- Unboxing the graph: Neural Relational Inference for Mobility Prediction [15.4049962498675]
Graph Neural Networks (GNNs) have been widely applied to non-Euclidean spatial data.
In this paper, we use Neural Relational Inference to dynamically learn the optimal graph model.
arXiv Detail & Related papers (2022-01-25T13:26:35Z)
- Visual Relationship Forecasting in Videos [56.122037294234865]
We present a new task named Visual Relationship Forecasting (VRF) in videos, which explores the prediction of visual relationships in a reasoning manner.
Given a subject-object pair with H existing frames, VRF aims to predict their future interactions for the next T frames without visual evidence.
To evaluate the VRF task, we introduce two video datasets named VRF-AG and VRF-VidOR, with a series of temporally localized visual relation annotations in a video.
arXiv Detail & Related papers (2021-07-02T16:43:19Z)
- Implicit Latent Variable Model for Scene-Consistent Motion Forecasting [78.74510891099395]
In this paper, we aim to learn scene-consistent motion forecasts of complex urban traffic directly from sensor data.
We model the scene as an interaction graph and employ powerful graph neural networks to learn a distributed latent representation of the scene.
arXiv Detail & Related papers (2020-07-23T14:31:25Z)
- Graph Convolution Machine for Context-aware Recommender System [59.50474932860843]
We extend the advantages of graph convolutions to context-aware recommender systems (CARS).
We propose Graph Convolution Machine (GCM), an end-to-end framework that consists of three components: an encoder, graph convolution layers, and a decoder.
We conduct experiments on three real-world datasets from Yelp and Amazon, validating the effectiveness of GCM and the benefits of performing graph convolutions for CARS.
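The encoder / graph-convolution / decoder split can be pictured with the toy sketch below; the embedding sizes, the item-side adjacency, and the scoring head are all illustrative assumptions rather than the GCM paper's actual design.

```python
# Toy sketch of an encoder -> graph convolution -> decoder pipeline for
# context-aware recommendation. Illustrative only, not the GCM code.
import torch
import torch.nn as nn

class GCMSketch(nn.Module):
    def __init__(self, n_users, n_items, n_contexts, dim=32):
        super().__init__()
        # Encoder: embed user, item, and context IDs.
        self.user = nn.Embedding(n_users, dim)
        self.item = nn.Embedding(n_items, dim)
        self.ctx = nn.Embedding(n_contexts, dim)
        self.gc = nn.Linear(dim, dim)          # one graph-convolution step
        self.decoder = nn.Linear(3 * dim, 1)   # score from concatenation

    def forward(self, u, i, c, adj):
        # Propagate item embeddings over a normalized item adjacency.
        items = torch.relu(self.gc(adj @ self.item.weight))
        z = torch.cat([self.user(u), items[i], self.ctx(c)], dim=-1)
        return self.decoder(z).squeeze(-1)     # predicted interaction score

# Usage with toy sizes; an identity adjacency means no propagation.
model = GCMSketch(n_users=100, n_items=50, n_contexts=8)
scores = model(torch.tensor([3, 7]), torch.tensor([10, 42]),
               torch.tensor([0, 2]), torch.eye(50))   # -> shape (2,)
```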
arXiv Detail & Related papers (2020-01-30T15:32:08Z)