End-to-end Contextual Perception and Prediction with Interaction Transformer
- URL: http://arxiv.org/abs/2008.05927v1
- Date: Thu, 13 Aug 2020 14:30:12 GMT
- Title: End-to-end Contextual Perception and Prediction with Interaction Transformer
- Authors: Lingyun Luke Li, Bin Yang, Ming Liang, Wenyuan Zeng, Mengye Ren, Sean Segal, Raquel Urtasun
- Abstract summary: We tackle the problem of detecting objects in 3D and forecasting their future motion in the context of self-driving.
To capture their spatial-temporal dependencies, we propose a recurrent neural network with a novel Transformer architecture.
Our model can be trained end-to-end, and runs in real-time.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In this paper, we tackle the problem of detecting objects in 3D and
forecasting their future motion in the context of self-driving. Towards this
goal, we design a novel approach that explicitly takes into account the
interactions between actors. To capture their spatial-temporal dependencies, we
propose a recurrent neural network with a novel Transformer architecture, which
we call the Interaction Transformer. Importantly, our model can be trained
end-to-end, and runs in real-time. We validate our approach on two challenging
real-world datasets: ATG4D and nuScenes. We show that our approach can
outperform the state-of-the-art on both datasets. In particular, we
significantly improve the social compliance between the estimated future
trajectories, resulting in far fewer collisions between the predicted actors.
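The paper does not include implementation details here, but the core idea of attending across actors to capture their interactions can be sketched with single-head scaled dot-product attention. The helper name `interaction_attention`, the shapes, and the random projections below are illustrative assumptions, not the paper's actual architecture.

```python
# Hypothetical sketch: attention across N detected actors, so each actor's
# context feature mixes in information from the others. Not the paper's code.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def interaction_attention(actor_feats, d_k=16, seed=0):
    """Single-head scaled dot-product attention over actors.

    actor_feats: (N, D) array, one feature vector per detected actor.
    Returns (N, d_k) context features blended across all actors.
    """
    rng = np.random.default_rng(seed)
    D = actor_feats.shape[1]
    W_q = rng.standard_normal((D, d_k)) / np.sqrt(D)
    W_k = rng.standard_normal((D, d_k)) / np.sqrt(D)
    W_v = rng.standard_normal((D, d_k)) / np.sqrt(D)

    Q, K, V = actor_feats @ W_q, actor_feats @ W_k, actor_feats @ W_v
    scores = Q @ K.T / np.sqrt(d_k)     # (N, N) pairwise interaction scores
    weights = softmax(scores, axis=-1)  # each actor attends to every actor
    return weights @ V

feats = np.random.default_rng(1).standard_normal((5, 32))  # 5 actors
ctx = interaction_attention(feats)
print(ctx.shape)  # (5, 16)
```

Because the attention weights are computed between every pair of actors, each predicted trajectory can depend on what the other actors are doing, which is what drives the reported reduction in predicted collisions.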
Related papers
- InTraGen: Trajectory-controlled Video Generation for Object Interactions [100.79494904451246]
InTraGen is a pipeline for improved trajectory-based generation of object interaction scenarios.
Our results demonstrate improvements in both visual fidelity and quantitative performance.
arXiv Detail & Related papers (2024-11-25T14:27:50Z)
- SocialFormer: Social Interaction Modeling with Edge-enhanced Heterogeneous Graph Transformers for Trajectory Prediction [3.733790302392792]
SocialFormer is an agent interaction-aware trajectory prediction method.
We present a temporal encoder based on gated recurrent units (GRU) to model the temporal social behavior of agent movements.
We evaluate SocialFormer for the trajectory prediction task on the popular nuScenes benchmark and achieve state-of-the-art performance.
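The SocialFormer summary mentions a GRU-based temporal encoder for agent movements. A minimal version of that idea, rolling a single GRU cell over a sequence of observed positions to produce one trajectory embedding, might look like the following; the class name, dimensions, and initialization are assumptions for illustration only.

```python
# Minimal GRU cell rolled over a 2-D trajectory; a sketch of a temporal
# encoder for agent movements, not SocialFormer's actual implementation.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class MinimalGRU:
    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = np.random.default_rng(seed)
        s = 1.0 / np.sqrt(hidden_dim)
        # weights for the update (z), reset (r), and candidate (h) gates
        self.Wz = rng.uniform(-s, s, (input_dim + hidden_dim, hidden_dim))
        self.Wr = rng.uniform(-s, s, (input_dim + hidden_dim, hidden_dim))
        self.Wh = rng.uniform(-s, s, (input_dim + hidden_dim, hidden_dim))
        self.hidden_dim = hidden_dim

    def encode(self, xs):
        h = np.zeros(self.hidden_dim)
        for x in xs:                     # one step per observed position
            xh = np.concatenate([x, h])
            z = sigmoid(xh @ self.Wz)    # how much to update the state
            r = sigmoid(xh @ self.Wr)    # how much history to keep
            h_cand = np.tanh(np.concatenate([x, r * h]) @ self.Wh)
            h = (1 - z) * h + z * h_cand
        return h                         # summary of the whole trajectory

traj = np.array([[0.0, 0.0], [0.5, 0.1], [1.0, 0.3]])  # observed (x, y)
enc = MinimalGRU(input_dim=2, hidden_dim=8)
h = enc.encode(traj)
print(h.shape)  # (8,)
```

The final hidden state serves as a fixed-size summary of the agent's motion history, which a downstream interaction module can then combine across agents.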
arXiv Detail & Related papers (2024-05-06T19:47:23Z)
- Social-Transmotion: Promptable Human Trajectory Prediction [65.80068316170613]
Social-Transmotion is a generic Transformer-based model that exploits diverse and numerous visual cues to predict human behavior.
Our approach is validated on multiple datasets, including JTA, JRDB, Pedestrians and Cyclists in Road Traffic, and ETH-UCY.
arXiv Detail & Related papers (2023-12-26T18:56:49Z)
- Qualitative Prediction of Multi-Agent Spatial Interactions [5.742409080817885]
We present and benchmark three new approaches to model and predict multi-agent interactions in dense scenes.
The proposed solutions take into account static and dynamic context to predict individual interactions.
They exploit an input- and a temporal-attention mechanism, and are tested on medium and long-term time horizons.
arXiv Detail & Related papers (2023-06-30T18:08:25Z)
- Goal-driven Self-Attentive Recurrent Networks for Trajectory Prediction [31.02081143697431]
Human trajectory forecasting is a key component of autonomous vehicles, social-aware robots and video-surveillance applications.
We propose a lightweight attention-based recurrent backbone that acts solely on past observed positions.
We employ a common goal module, based on a U-Net architecture, which additionally extracts semantic information to predict scene-compliant destinations.
arXiv Detail & Related papers (2022-04-25T11:12:37Z)
- Pedestrian Trajectory Prediction via Spatial Interaction Transformer Network [7.150832716115448]
In traffic scenes, pedestrians may turn suddenly or stop immediately when encountering oncoming people.
Predicting such abrupt trajectories requires modeling the interactions between pedestrians.
We present a novel generative method named Spatial Interaction Transformer (SIT), which learns the correlation of pedestrian trajectories through attention mechanisms.
arXiv Detail & Related papers (2021-12-13T13:08:04Z)
- Scene Synthesis via Uncertainty-Driven Attribute Synchronization [52.31834816911887]
This paper introduces a novel neural scene synthesis approach that can capture diverse feature patterns of 3D scenes.
Our method combines the strength of both neural network-based and conventional scene synthesis approaches.
arXiv Detail & Related papers (2021-08-30T19:45:07Z)
- Spatial-Channel Transformer Network for Trajectory Prediction on the Traffic Scenes [2.7955111755177695]
We present a Spatial-Channel Transformer Network for trajectory prediction with attention functions.
A channel-wise module is inserted to measure the social interaction between agents.
We find that the network achieves promising results on real-world trajectory prediction datasets on the traffic scenes.
arXiv Detail & Related papers (2021-01-27T15:03:42Z)
- Implicit Latent Variable Model for Scene-Consistent Motion Forecasting [78.74510891099395]
In this paper, we aim to learn scene-consistent motion forecasts of complex urban traffic directly from sensor data.
We model the scene as an interaction graph and employ powerful graph neural networks to learn a distributed latent representation of the scene.
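The summary above describes modeling the scene as an interaction graph processed by graph neural networks. One round of mean-aggregation message passing over such a graph can be sketched as follows; the function name and weight matrices are hypothetical, not the paper's actual layer.

```python
# Illustrative sketch: one message-passing round on an actor interaction
# graph, where each node averages transformed messages from its neighbors.
import numpy as np

def message_passing_step(node_feats, adjacency, W_msg, W_upd):
    """node_feats: (N, D) per-actor features; adjacency: (N, N) 0/1 matrix."""
    deg = adjacency.sum(axis=1, keepdims=True).clip(min=1)
    messages = (adjacency @ (node_feats @ W_msg)) / deg  # mean over neighbors
    return np.tanh(node_feats @ W_upd + messages)        # updated features

rng = np.random.default_rng(0)
N, D = 4, 8
feats = rng.standard_normal((N, D))
adj = np.array([[0, 1, 1, 0],
                [1, 0, 0, 1],
                [1, 0, 0, 1],
                [0, 1, 1, 0]], dtype=float)
W_msg = rng.standard_normal((D, D)) / np.sqrt(D)
W_upd = rng.standard_normal((D, D)) / np.sqrt(D)
out = message_passing_step(feats, adj, W_msg, W_upd)
print(out.shape)  # (4, 8)
```

Stacking several such rounds lets information propagate between actors that are not directly connected, yielding the distributed scene representation the abstract refers to.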
arXiv Detail & Related papers (2020-07-23T14:31:25Z)
- A Spatial-Temporal Attentive Network with Spatial Continuity for Trajectory Prediction [74.00750936752418]
We propose a novel model named Spatial-Temporal Attentive Network with Spatial Continuity (STAN-SC).
First, a spatial-temporal attention mechanism is presented to extract the most useful and important information.
Second, we build a joint feature sequence from the sequence and instantaneous state information so that the generated trajectories maintain spatial continuity.
arXiv Detail & Related papers (2020-03-13T04:35:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.