Related papers: Latent Variable Nested Set Transformers & AutoBots

URL: http://arxiv.org/abs/2104.00563v1
Date: Fri, 19 Feb 2021 18:53:26 GMT
Title: Latent Variable Nested Set Transformers & AutoBots
Authors: Roger Girgis, Florian Golemo, Felipe Codevilla, Jim Aldon D'Souza, Samira Ebrahimi Kahou, Felix Heide, Christopher Pal
Abstract summary: We propose a theoretical framework for forecasting the behavior of nearby actors, based on autoregressively modelling sequences of nested sets. We present a new model architecture that employs multi-head self-attention blocks over sets of sets, which serve as a form of social attention between the elements of the sets at every timestep. We validate the Nested Set Transformer, which we refer to as "AutoBot", in autonomous driving settings, where we model the trajectory of an ego-agent based on the sequential observations of key attributes of multiple agents in a scene.
Score: 25.194344543085005
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Humans have the innate ability to attend to the most relevant actors in their vicinity and can forecast how they may behave in the future. This ability will be crucial for the deployment of safety-critical agents such as robots or vehicles which interact with humans. We propose a theoretical framework for this problem setting based on autoregressively modelling sequences of nested sets, using latent variables to better capture multimodal distributions over future sets of sets. We present a new model architecture, which we call a Nested Set Transformer, that employs multi-head self-attention blocks over sets of sets, which serve as a form of social attention between the elements of the sets at every timestep. Our approach can produce a distribution over future trajectories for all agents under consideration, or focus upon the trajectory of an ego-agent. We validate the Nested Set Transformer, which we refer to as "AutoBot", in autonomous driving settings, where we model the trajectory of an ego-agent based on the sequential observations of key attributes of multiple agents in a scene. AutoBot produces results better than state-of-the-art published prior work on the challenging nuScenes vehicle trajectory modeling benchmark. We also examine the multi-agent prediction version of our model and jointly forecast an ego-agent's future trajectory along with the other agents in the scene. We validate the behavior of our proposed Nested Set Transformer for scene-level forecasting with a pedestrian trajectory dataset.
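
Illustrative sketch (not the authors' implementation): the abstract describes self-attention applied across the agents in a scene at every timestep. The plain-Python example below shows that idea in its simplest form, assuming a single attention head with no learned projections and toy 2-D features. The function names `social_attention` and `attend_over_scene` and the sample data are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def social_attention(agents):
    """Single-head scaled dot-product self-attention over one set of agents.

    `agents` is a list of feature vectors, one per agent, at one timestep.
    Queries, keys and values are the raw features (an assumption made to
    keep the sketch short; a real model would use learned projections).
    """
    dim = len(agents[0])
    attended = []
    for query in agents:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in agents
        ]
        weights = softmax(scores)
        attended.append(
            [sum(w * value[i] for w, value in zip(weights, agents))
             for i in range(dim)]
        )
    return attended


def attend_over_scene(scene):
    """Apply agent-level attention independently at every timestep.

    `scene` is a sequence of sets: a list over timesteps, each a list of
    per-agent feature vectors.
    """
    return [social_attention(agents) for agents in scene]


# Toy scene: 2 timesteps, 3 agents, 2 features per agent (hypothetical data).
scene = [
    [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]],
    [[0.5, 1.0], [1.0, 0.5], [2.0, 2.0]],
]
attended = attend_over_scene(scene)
```

Because the attention weights depend only on pairwise feature products, reordering the agents within a timestep reorders the outputs in the same way, which is the property that makes attention a natural fit for set-valued inputs.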

Related papers

Trajeglish: Traffic Modeling as Next-Token Prediction [67.28197954427638]
A longstanding challenge for self-driving development is simulating dynamic driving scenarios seeded from recorded driving logs. We apply tools from discrete sequence modeling to model how vehicles, pedestrians and cyclists interact in driving scenarios. Our model tops the Sim Agents Benchmark, surpassing prior work along the realism meta metric by 3.3% and along the interaction metric by 9.9%.
arXiv Detail & Related papers (2023-12-07T18:53:27Z)
JRDB-Traj: A Dataset and Benchmark for Trajectory Forecasting in Crowds [79.00975648564483]
Trajectory forecasting models, employed in fields such as robotics, autonomous vehicles, and navigation, face challenges in real-world scenarios. This dataset provides comprehensive data, including the locations of all agents, scene images, and point clouds, all from the robot's perspective. The objective is to predict the future positions of agents relative to the robot using raw sensory input data.
arXiv Detail & Related papers (2023-11-05T18:59:31Z)
Robot Learning with Sensorimotor Pre-training [98.7755895548928]
We present a self-supervised sensorimotor pre-training approach for robotics. Our model, called RPT, is a Transformer that operates on sequences of sensorimotor tokens. We find that sensorimotor pre-training consistently outperforms training from scratch, has favorable scaling properties, and enables transfer across different tasks, environments, and robots.
arXiv Detail & Related papers (2023-06-16T17:58:10Z)
Scene Transformer: A unified multi-task model for behavior prediction and planning [42.758178896204036]
We formulate a model for predicting the behavior of all agents jointly in real-world driving environments. Inspired by recent language modeling approaches, we use a masking strategy as the query to our model. We evaluate our approach on autonomous driving datasets for behavior prediction, and achieve state-of-the-art performance.
arXiv Detail & Related papers (2021-06-15T20:20:44Z)
Time-series Imputation of Temporally-occluded Multiagent Trajectories [18.862173210927658]
We study the problem of multiagent time-series imputation, where available past and future observations of subsets of agents are used to estimate missing observations for other agents. Our approach, called the Graph Imputer, uses forward- and backward-information in combination with graph networks and variational autoencoders. We evaluate our approach on a dataset of football matches, using a projective camera module to train and evaluate our model for the off-screen player state estimation setting.
arXiv Detail & Related papers (2021-06-08T09:58:43Z)
Future Frame Prediction for Robot-assisted Surgery [57.18185972461453]
We propose a ternary prior guided variational autoencoder (TPG-VAE) model for future frame prediction in robotic surgical video sequences. Besides the content distribution, our model learns a motion distribution, which is a novel way to handle the small movements of surgical tools.
arXiv Detail & Related papers (2021-03-18T15:12:06Z)
Instance-Aware Predictive Navigation in Multi-Agent Environments [93.15055834395304]
We propose an Instance-Aware Predictive Control (IPC) approach, which forecasts interactions between agents as well as future scene structures. We adopt a novel multi-instance event prediction module to estimate the possible interaction among agents in the ego-centric view. We design a sequential action sampling strategy to better leverage predicted states on both scene-level and instance-level.
arXiv Detail & Related papers (2021-01-14T22:21:25Z)
Traffic Agent Trajectory Prediction Using Social Convolution and Attention Mechanism [57.68557165836806]
We propose a model to predict the trajectories of target agents around an autonomous vehicle. We encode the target agent's history trajectories as an attention mask and construct a social map to encode the interactive relationship between the target agent and its surrounding agents. To verify the effectiveness of our method, we compare it extensively with several methods on a public dataset, achieving a 20% decrease in error.
arXiv Detail & Related papers (2020-07-06T03:48:08Z)
Trajectory Prediction for Autonomous Driving based on Multi-Head Attention with Joint Agent-Map Representation [8.203012391711932]
Future trajectories of agents can be inferred using two important cues: the locations and past motion of agents, and the static scene structure. We propose a novel approach applying multi-head attention by considering a joint representation of the static scene and surrounding agents. Our model achieves results on the nuScenes prediction benchmark and generates diverse future trajectories compliant with scene structure and agent configuration.
arXiv Detail & Related papers (2020-05-06T00:39:45Z)
Diverse and Admissible Trajectory Forecasting through Multimodal Context Understanding [46.52703817997932]
Multi-agent trajectory forecasting in autonomous driving requires an agent to accurately anticipate the behaviors of the surrounding vehicles and pedestrians. We propose a model that synthesizes multiple input signals from the multimodal world. We show a significant performance improvement over previous state-of-the-art methods.
arXiv Detail & Related papers (2020-03-06T13:59:39Z)
Trajectron++: Dynamically-Feasible Trajectory Forecasting With Heterogeneous Data [37.176411554794214]
Reasoning about human motion is an important prerequisite to safe and socially-aware robotic navigation. We present Trajectron++, a modular, graph-structured recurrent model that forecasts the trajectories of a general number of diverse agents. We demonstrate its performance on several challenging real-world trajectory forecasting datasets.
arXiv Detail & Related papers (2020-01-09T16:47:17Z)

This list is automatically generated from the titles and abstracts of the papers on this site.