Latent Variable Nested Set Transformers & AutoBots
- URL: http://arxiv.org/abs/2104.00563v1
- Date: Fri, 19 Feb 2021 18:53:26 GMT
- Title: Latent Variable Nested Set Transformers & AutoBots
- Authors: Roger Girgis, Florian Golemo, Felipe Codevilla, Jim Aldon D'Souza,
Samira Ebrahimi Kahou, Felix Heide, Christopher Pal
- Abstract summary: We propose a theoretical framework for forecasting how nearby agents may behave, based on autoregressively modelling sequences of nested sets.
We present a new model architecture which employs multi-head self-attention blocks over sets of sets that serve as a form of social attention between the elements of the sets at every timestep.
We validate the Nested Set Transformer in autonomous driving settings, where we refer to it as "AutoBot" and model the trajectory of an ego-agent based on the sequential observations of key attributes of multiple agents in a scene.
- Score: 25.194344543085005
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Humans have the innate ability to attend to the most relevant actors in their
vicinity and can forecast how they may behave in the future. This ability will
be crucial for the deployment of safety-critical agents such as robots or
vehicles which interact with humans. We propose a theoretical framework for
this problem setting based on autoregressively modelling sequences of nested
sets, using latent variables to better capture multimodal distributions over
future sets of sets. We present a new model architecture which we call a Nested
Set Transformer which employs multi-head self-attention blocks over sets of
sets that serve as a form of social attention between the elements of the sets
at every timestep. Our approach can produce a distribution over future
trajectories for all agents under consideration, or focus upon the trajectory
of an ego-agent. We validate the Nested Set Transformer in autonomous driving
settings, where we refer to it as "AutoBot" and model the trajectory of an
ego-agent based on the sequential observations of key attributes of multiple
agents in a scene. AutoBot produces results better than state-of-the-art
published prior work on the challenging nuScenes vehicle trajectory modeling
benchmark. We also examine the multi-agent prediction version of our model and
jointly forecast an ego-agent's future trajectory along with the other agents
in the scene. We validate the behavior of our proposed Nested Set Transformer
for scene level forecasting with a pedestrian trajectory dataset.
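
The abstract describes self-attention applied across the set of agents at every timestep as a form of social attention. The snippet below is a minimal sketch of that idea only, not the authors' architecture: it uses a single attention head, random untrained weights, and hypothetical names (`social_attention`, `scene`, `w_q`, `w_k`, `w_v`) chosen for illustration. It also checks a property that makes attention suitable for sets: without positional encodings, reordering the agents simply reorders the outputs.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def social_attention(scene, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention across agents.

    scene: array of shape (T, M, d) with T timesteps, M agents, d features.
    Attention is computed independently at each timestep, over the M agents.
    """
    q, k, v = scene @ w_q, scene @ w_k, scene @ w_v
    # (T, M, d) @ (T, d, M) -> (T, M, M): one agent-to-agent score matrix per timestep.
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v


# Illustrative sizes and random (untrained) projection weights.
rng = np.random.default_rng(0)
T, M, d = 4, 3, 8
scene = rng.normal(size=(T, M, d))
w_q, w_k, w_v = (0.1 * rng.normal(size=(d, d)) for _ in range(3))

out = social_attention(scene, w_q, w_k, w_v)

# Permuting the agents permutes the outputs in the same way (set equivariance).
perm = [2, 0, 1]
out_perm = social_attention(scene[:, perm], w_q, w_k, w_v)
equivariant = np.allclose(out[:, perm], out_perm)
```

A multi-head version would run several such projections in parallel and concatenate the results; the latent variables mentioned in the abstract are not modelled in this sketch.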
Related papers
- Trajeglish: Traffic Modeling as Next-Token Prediction [67.28197954427638]
A longstanding challenge for self-driving development is simulating dynamic driving scenarios seeded from recorded driving logs.
We apply tools from discrete sequence modeling to model how vehicles, pedestrians and cyclists interact in driving scenarios.
Our model tops the Sim Agents Benchmark, surpassing prior work along the realism meta metric by 3.3% and along the interaction metric by 9.9%.
Date: 2023-12-07T18:53:27Z
- JRDB-Traj: A Dataset and Benchmark for Trajectory Forecasting in Crowds [79.00975648564483]
Trajectory forecasting models, employed in fields such as robotics, autonomous vehicles, and navigation, face challenges in real-world scenarios.
This dataset provides comprehensive data, including the locations of all agents, scene images, and point clouds, all from the robot's perspective.
The objective is to predict the future positions of agents relative to the robot using raw sensory input data.
Date: 2023-11-05T18:59:31Z
- Robot Learning with Sensorimotor Pre-training [98.7755895548928]
We present a self-supervised sensorimotor pre-training approach for robotics.
Our model, called RPT, is a Transformer that operates on sequences of sensorimotor tokens.
We find that sensorimotor pre-training consistently outperforms training from scratch, has favorable scaling properties, and enables transfer across different tasks, environments, and robots.
Date: 2023-06-16T17:58:10Z
- Scene Transformer: A unified multi-task model for behavior prediction and planning [42.758178896204036]
We formulate a model for predicting the behavior of all agents jointly in real-world driving environments.
Inspired by recent language modeling approaches, we use a masking strategy as the query to our model.
We evaluate our approach on autonomous driving datasets for behavior prediction, and achieve state-of-the-art performance.
Date: 2021-06-15T20:20:44Z
- Time-series Imputation of Temporally-occluded Multiagent Trajectories [18.862173210927658]
We study the problem of multiagent time-series imputation, where available past and future observations of subsets of agents are used to estimate missing observations for other agents.
Our approach, called the Graph Imputer, uses forward- and backward-information in combination with graph networks and variational autoencoders.
We evaluate our approach on a dataset of football matches, using a projective camera module to train and evaluate our model for the off-screen player state estimation setting.
Date: 2021-06-08T09:58:43Z
- Future Frame Prediction for Robot-assisted Surgery [57.18185972461453]
We propose a ternary prior guided variational autoencoder (TPG-VAE) model for future frame prediction in robotic surgical video sequences.
Besides the content distribution, our model learns a motion distribution, a novel way to handle the small movements of surgical tools.
Date: 2021-03-18T15:12:06Z
- Instance-Aware Predictive Navigation in Multi-Agent Environments [93.15055834395304]
We propose an Instance-Aware Predictive Control (IPC) approach, which forecasts interactions between agents as well as future scene structures.
We adopt a novel multi-instance event prediction module to estimate the possible interaction among agents in the ego-centric view.
We design a sequential action sampling strategy to better leverage predicted states on both scene-level and instance-level.
Date: 2021-01-14T22:21:25Z
- Traffic Agent Trajectory Prediction Using Social Convolution and Attention Mechanism [57.68557165836806]
We propose a model to predict the trajectories of target agents around an autonomous vehicle.
We encode the target agent history trajectories as an attention mask and construct a social map to encode the interactive relationship between the target agent and its surrounding agents.
To verify the effectiveness of our method, we compare it extensively against several methods on a public dataset and achieve a 20% reduction in error.
Date: 2020-07-06T03:48:08Z
- Trajectory Prediction for Autonomous Driving based on Multi-Head Attention with Joint Agent-Map Representation [8.203012391711932]
Future trajectories of agents can be inferred using two important cues: the locations and past motion of agents, and the static scene structure.
We propose a novel approach applying multi-head attention by considering a joint representation of the static scene and surrounding agents.
Our model achieves results on the nuScenes prediction benchmark and generates diverse future trajectories compliant with scene structure and agent configuration.
Date: 2020-05-06T00:39:45Z
- Diverse and Admissible Trajectory Forecasting through Multimodal Context Understanding [46.52703817997932]
Multi-agent trajectory forecasting in autonomous driving requires an agent to accurately anticipate the behaviors of the surrounding vehicles and pedestrians.
We propose a model that synthesizes multiple input signals from the multimodal world.
We show a significant performance improvement over previous state-of-the-art methods.
Date: 2020-03-06T13:59:39Z
- Trajectron++: Dynamically-Feasible Trajectory Forecasting With Heterogeneous Data [37.176411554794214]
Reasoning about human motion is an important prerequisite to safe and socially-aware robotic navigation.
We present Trajectron++, a modular, graph-structured recurrent model that forecasts the trajectories of a general number of diverse agents.
We demonstrate its performance on several challenging real-world trajectory forecasting datasets.
Date: 2020-01-09T16:47:17Z
This list is automatically generated from the titles and abstracts of the papers on this site.