MATRIX: Multi-Agent Trajectory Generation with Diverse Contexts
- URL: http://arxiv.org/abs/2403.06041v1
- Date: Sat, 9 Mar 2024 23:28:54 GMT
- Title: MATRIX: Multi-Agent Trajectory Generation with Diverse Contexts
- Authors: Zhuo Xu, Rui Zhou, Yida Yin, Huidong Gao, Masayoshi Tomizuka, Jiachen Li
- Abstract summary: We study trajectory-level data generation for multi-human or human-robot interaction scenarios.
We propose a learning-based automatic trajectory generation model, which we call Multi-Agent TRajectory generation with dIverse conteXts (MATRIX).
- Score: 47.12378253630105
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data-driven methods are well suited to modeling complicated human
behavioral dynamics and support many human-robot interaction applications.
However, collecting large, annotated real-world human datasets is laborious,
especially for highly interactive scenarios. Algorithmic data generation
methods, on the other hand, are usually limited by their model capacity and
thus cannot offer the realistic and diverse data that applications require. In
this work, we study trajectory-level data generation for multi-human and
human-robot interaction scenarios and propose a learning-based automatic
trajectory generation model, which we call Multi-Agent TRajectory generation
with dIverse conteXts (MATRIX). MATRIX generates interactive human behaviors
in realistic and diverse contexts. We achieve this by modeling explicit and
interpretable objectives, so that MATRIX can generate human motions
conditioned on diverse destinations and heterogeneous behaviors. We carry out
extensive comparison and ablation studies to illustrate the effectiveness of
our approach across various metrics, and we present experiments demonstrating
that MATRIX can serve as data augmentation for imitation-based motion
planning.
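The abstract names the key mechanism (conditioning generation on explicit, interpretable objectives such as destinations and behavior styles) without implementation details. The following is a minimal sketch of what such a conditional generator's interface could look like; the architecture, module names, and dimensions are all illustrative assumptions, not the authors' design.

```python
# Illustrative sketch (NOT the MATRIX architecture): a trajectory generator
# conditioned on explicit, interpretable objectives -- a 2-D destination and
# a behavior-style vector per agent.
import torch
import torch.nn as nn

class ConditionalTrajectoryGenerator(nn.Module):
    def __init__(self, hist_len=8, fut_len=12, style_dim=4, hidden=128):
        super().__init__()
        self.fut_len = fut_len
        # Inputs: flattened (x, y) history, 2-D goal, behavior-style vector.
        in_dim = hist_len * 2 + 2 + style_dim
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, fut_len * 2),  # future (x, y) step offsets
        )

    def forward(self, history, goal, style):
        # history: (B, hist_len, 2), goal: (B, 2), style: (B, style_dim)
        z = torch.cat([history.flatten(1), goal, style], dim=-1)
        offsets = self.net(z).view(-1, self.fut_len, 2)
        # Accumulate step offsets from the last observed position.
        return history[:, -1:, :] + offsets.cumsum(dim=1)

gen = ConditionalTrajectoryGenerator()
hist = torch.randn(16, 8, 2)
goals = torch.randn(16, 2)    # sampled destinations
styles = torch.randn(16, 4)   # heterogeneous behavior styles
futures = gen(hist, goals, styles)  # (16, 12, 2)
```

Varying the sampled `goals` and `styles` across generation calls is what would yield the diverse contexts the title refers to.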
Related papers
- COLLAGE: Collaborative Human-Agent Interaction Generation using Hierarchical Latent Diffusion and Language Models [14.130327598928778]
The framework combines large language models (LLMs) with hierarchical motion-specific vector-quantized variational autoencoders (VQ-VAEs).
Our framework generates realistic and diverse collaborative human-object-human interactions, outperforming state-of-the-art methods.
Our work opens up new possibilities for modeling complex interactions in various domains, such as robotics, graphics and computer vision.
arXiv Detail & Related papers (2024-09-30T17:02:13Z)
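COLLAGE's hierarchical, motion-specific VQ-VAE design is not detailed in the summary above; as background, the sketch below shows only the generic vector-quantization bottleneck that any VQ-VAE builds on. The codebook size and latent dimension are arbitrary placeholders.

```python
# Generic VQ-VAE quantization bottleneck (background illustration only).
import torch
import torch.nn as nn

class VectorQuantizer(nn.Module):
    def __init__(self, num_codes=512, dim=64):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)

    def forward(self, z):
        # z: (B, dim) continuous encoder outputs
        d = torch.cdist(z, self.codebook.weight)  # distance to every code
        idx = d.argmin(dim=-1)                    # nearest-code indices
        zq = self.codebook(idx)                   # quantized latents
        # Straight-through estimator: gradients flow from zq back to z.
        return z + (zq - z).detach(), idx

vq = VectorQuantizer()
zq, codes = vq(torch.randn(10, 64))  # codes act as discrete motion tokens
```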
- Multi-Agent Dynamic Relational Reasoning for Social Robot Navigation [50.01551945190676]
Social robot navigation can be helpful in various contexts of daily life but requires safe human-robot interactions and efficient trajectory planning.
We propose a systematic relational reasoning approach with explicit inference of the underlying dynamically evolving relational structures.
We demonstrate its effectiveness for multi-agent trajectory prediction and social robot navigation.
arXiv Detail & Related papers (2024-01-22T18:58:22Z)
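The "explicit inference of relational structures" in the entry above can be illustrated with a generic pairwise-relation module: score every ordered pair of agents, normalize the scores, and aggregate neighbor features under the inferred weights. This is a sketch of the general idea only, not the paper's model; every name and dimension is assumed.

```python
# Generic pairwise relational-inference sketch (not the paper's model).
import torch
import torch.nn as nn

class RelationalReasoner(nn.Module):
    def __init__(self, state_dim=4, hidden=64):
        super().__init__()
        self.embed = nn.Linear(state_dim, hidden)
        self.rel_score = nn.Linear(2 * hidden, 1)  # score per ordered pair

    def forward(self, states):
        # states: (N, state_dim) for N agents at one timestep
        h = self.embed(states)  # (N, H)
        n = h.size(0)
        pairs = torch.cat(
            [h.unsqueeze(1).expand(n, n, -1),
             h.unsqueeze(0).expand(n, n, -1)], dim=-1)  # (N, N, 2H)
        # Explicit relation weights: who influences whom, how strongly.
        relations = self.rel_score(pairs).squeeze(-1).softmax(dim=-1)
        # Aggregate neighbor features under the inferred relations.
        return relations @ h, relations

reasoner = RelationalReasoner()
context, rel = reasoner(torch.randn(5, 4))  # rel: 5x5 relation matrix
```

Re-running such a module at every timestep is what would let the inferred relation matrix evolve dynamically.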
- Large Language Models as Zero-Shot Human Models for Human-Robot Interaction [12.455647753787442]
Large language models (LLMs) can act as zero-shot human models for human-robot interaction.
LLMs achieve performance comparable to purpose-built models.
We present one case study on a simulated trust-based table-clearing task.
arXiv Detail & Related papers (2023-03-06T23:16:24Z)
- MILD: Multimodal Interactive Latent Dynamics for Learning Human-Robot Interaction [34.978017200500005]
We propose Multimodal Interactive Latent Dynamics (MILD) to address the problem of two-party physical Human-Robot Interactions (HRIs).
We learn the interaction dynamics from demonstrations, using Hidden Semi-Markov Models (HSMMs) to model the joint distribution of the interacting agents in the latent space of a Variational Autoencoder (VAE).
MILD generates more accurate trajectories for the controlled agent (robot) when conditioned on the observed agent's (human) trajectory.
arXiv Detail & Related papers (2022-10-22T11:25:11Z)
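As a rough illustration of the MILD setup above, the sketch below decodes the controlled agent's (robot's) trajectory from a latent inferred from the observed agent's (human's) trajectory via a VAE. The HSMM over latent segments is omitted entirely, and all names and sizes are invented for illustration.

```python
# Simplified illustration of conditioning a robot trajectory on a human
# trajectory through a shared VAE latent space (HSMM segmenting omitted).
import torch
import torch.nn as nn

class LatentDynamicsVAE(nn.Module):
    def __init__(self, traj_dim=24, latent=8, hidden=64):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(traj_dim, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, latent)
        self.logvar = nn.Linear(hidden, latent)
        self.dec = nn.Sequential(
            nn.Linear(latent, hidden), nn.ReLU(), nn.Linear(hidden, traj_dim))

    def forward(self, human_traj):
        # Infer a latent from the observed (human) trajectory ...
        h = self.enc(human_traj)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        # ... and decode the controlled (robot) trajectory from it.
        return self.dec(z), mu, logvar

model = LatentDynamicsVAE()
robot_traj, mu, logvar = model(torch.randn(4, 24))
```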
- DIME: Fine-grained Interpretations of Multimodal Models via Disentangled Local Explanations [119.1953397679783]
We focus on advancing the state-of-the-art in interpreting multimodal models.
Our proposed approach, DIME, enables accurate and fine-grained analysis of multimodal models.
arXiv Detail & Related papers (2022-03-03T20:52:47Z)
- Multi-Agent Imitation Learning with Copulas [102.27052968901894]
Multi-agent imitation learning aims to train multiple agents to perform tasks from demonstrations by learning a mapping between observations and actions.
In this paper, we propose to use the copula, a powerful statistical tool for capturing dependence among random variables, to explicitly model the correlation and coordination in multi-agent systems.
Our proposed model is able to separately learn marginals that capture the local behavioral patterns of each individual agent, as well as a copula function that solely and fully captures the dependence structure among agents.
arXiv Detail & Related papers (2021-07-10T03:49:41Z)
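The marginal-plus-copula decomposition described in the entry above can be made concrete with a standard Gaussian copula: sample correlated Gaussians, push them through the normal CDF to get coordinated uniforms, then map each coordinate through its agent's marginal quantile function. The correlation matrix and marginals below are invented for illustration and are not from the paper.

```python
# Gaussian-copula sampling: per-agent marginals model local behavior,
# the copula's correlation matrix models coordination among agents.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Dependence structure among three agents (illustrative).
corr = np.array([[1.0, 0.8, 0.2],
                 [0.8, 1.0, 0.2],
                 [0.2, 0.2, 1.0]])

# Per-agent marginal action distributions (illustrative).
marginals = [stats.norm(0.0, 1.0),
             stats.expon(scale=2.0),
             stats.uniform(-1.0, 2.0)]

# Correlated Gaussians -> uniforms (the copula) -> marginal quantiles.
z = rng.multivariate_normal(np.zeros(3), corr, size=1000)
u = stats.norm.cdf(z)  # coordinated samples in the unit cube
actions = np.column_stack([m.ppf(u[:, i]) for i, m in enumerate(marginals)])
```

Because the marginals and the copula factor apart, either piece can be swapped independently, which is exactly the separation the summary highlights.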
- Multimodal Deep Generative Models for Trajectory Prediction: A Conditional Variational Autoencoder Approach [34.70843462687529]
We provide a self-contained tutorial on a conditional variational autoencoder approach to human behavior prediction.
The goals of this tutorial paper are to review and build a taxonomy of state-of-the-art methods in human behavior prediction.
arXiv Detail & Related papers (2020-08-10T03:18:27Z)
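Since the entry above is a tutorial on conditional variational autoencoders for behavior prediction, a bare-bones CVAE may help fix ideas: a latent z, conditioned on the observed history x, accounts for the multimodality of future trajectories y. The architecture and dimensions below are illustrative assumptions, not the tutorial's reference implementation.

```python
# Bare-bones CVAE for trajectory prediction (illustrative only).
import torch
import torch.nn as nn

class TrajectoryCVAE(nn.Module):
    def __init__(self, x_dim=16, y_dim=24, z_dim=8, hidden=64):
        super().__init__()
        self.post = nn.Sequential(  # approximate posterior q(z | x, y)
            nn.Linear(x_dim + y_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * z_dim))
        self.dec = nn.Sequential(   # decoder p(y | x, z)
            nn.Linear(x_dim + z_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, y_dim))
        self.z_dim = z_dim

    def forward(self, x, y):
        mu, logvar = self.post(torch.cat([x, y], -1)).chunk(2, -1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        return self.dec(torch.cat([x, z], -1)), mu, logvar

    @torch.no_grad()
    def predict(self, x, k=5):
        # Draw k diverse futures per history by sampling z from N(0, I).
        z = torch.randn(k, x.size(0), self.z_dim)
        return self.dec(torch.cat([x.expand(k, -1, -1), z], -1))

model = TrajectoryCVAE()
futures = model.predict(torch.randn(4, 16))  # (5, 4, 24): 5 modes per input
```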
- Human Trajectory Forecasting in Crowds: A Deep Learning Perspective [89.4600982169]
We present an in-depth analysis of existing deep learning-based methods for modelling social interactions.
We propose two knowledge-based data-driven methods to effectively capture these social interactions.
We develop TrajNet++, a large-scale, interaction-centric benchmark that has been a significant yet missing component in the field of human trajectory forecasting.
arXiv Detail & Related papers (2020-07-07T17:19:56Z)
- Learning Predictive Models From Observation and Interaction [137.77887825854768]
Learning predictive models from interaction with the world allows an agent, such as a robot, to learn about how the world works.
However, learning a model that captures the dynamics of complex skills represents a major challenge.
We propose a method to augment the training set with observational data of other agents, such as humans.
arXiv Detail & Related papers (2019-12-30T01:10:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.