LEMMA: A Multi-view Dataset for Learning Multi-agent Multi-task Activities
- URL: http://arxiv.org/abs/2007.15781v1
- Date: Fri, 31 Jul 2020 00:13:54 GMT
- Title: LEMMA: A Multi-view Dataset for Learning Multi-agent Multi-task Activities
- Authors: Baoxiong Jia, Yixin Chen, Siyuan Huang, Yixin Zhu, Song-Chun Zhu
- Abstract summary: We introduce the LEMMA dataset to provide a single home to address missing dimensions of daily activities with meticulously designed settings.
We densely annotate atomic actions with human-object interactions to provide ground truths for the compositionality, scheduling, and assignment of daily activities.
We hope this effort will drive the machine vision community to examine goal-directed human activities and to further study task scheduling and assignment in the real world.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding and interpreting human actions is a long-standing challenge and
a critical indicator of perception in artificial intelligence. However, several
imperative components of daily human activities are largely missing from prior
literature, including goal-directed actions, concurrent multi-tasking, and
collaboration among multiple agents. We introduce the LEMMA dataset to provide a
single home to address these missing dimensions with meticulously designed
settings, wherein the number of tasks and agents varies to highlight different
learning objectives. We densely annotate atomic actions with human-object
interactions to provide ground truths for the compositionality, scheduling, and
assignment of daily activities. We further devise challenging compositional
action recognition and action/task anticipation benchmarks with baseline models
to measure the capability of compositional action understanding and temporal
reasoning. We hope this effort will drive the machine vision community to
examine goal-directed human activities and to further study task scheduling
and assignment in the real world.
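To make the annotation scheme described above more concrete, here is a minimal sketch of what a densely annotated atomic action with human-object interactions might look like. LEMMA's actual file format is not specified in this abstract, so the record structure, field names, labels, and helper function below are illustrative assumptions rather than the dataset's real schema.

```python
# Hypothetical sketch of a densely annotated atomic action with
# human-object interaction, in the spirit of LEMMA's description.
# Field names and structure are illustrative assumptions, NOT the
# dataset's actual annotation schema.
from dataclasses import dataclass
from typing import Dict, List, Set

@dataclass
class AtomicAction:
    agent_id: int       # which agent performs the action (multi-agent)
    task: str           # the goal-directed task this action serves
    verb: str           # atomic-action verb, e.g. "pour"
    objects: List[str]  # interacted objects, e.g. ["kettle", "cup"]
    start_frame: int    # temporal extent, needed for scheduling
    end_frame: int      # and for action/task anticipation

# A toy two-agent, two-task segment: compositionality comes from the
# (verb, objects) pairing; scheduling and assignment come from the
# (agent_id, task, time) fields.
segment = [
    AtomicAction(0, "make-tea",    "pour", ["kettle", "cup"],  120, 180),
    AtomicAction(1, "clean-table", "wipe", ["cloth", "table"], 130, 220),
]

def tasks_per_agent(actions: List[AtomicAction]) -> Dict[int, Set[str]]:
    """Group task labels by agent: a minimal view of task assignment."""
    out: Dict[int, Set[str]] = {}
    for a in actions:
        out.setdefault(a.agent_id, set()).add(a.task)
    return out

print(tasks_per_agent(segment))  # {0: {'make-tea'}, 1: {'clean-table'}}
```

Under this kind of structure, a compositional action recognition benchmark would score the (verb, objects) pairing, while an anticipation benchmark would predict the next record given the earlier ones; again, this framing is an assumption drawn only from the abstract.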
Related papers
- Understanding the Human-LLM Dynamic: A Literature Survey of LLM Use in Programming Tasks (2024-10-01)
Large Language Models (LLMs) are transforming programming practices, offering significant capabilities for code generation.
This paper focuses on their use in programming tasks, drawing insights from user studies that assess their impact.
- A Survey on Complex Tasks for Goal-Directed Interactive Agents (2024-09-27)
This survey compiles relevant tasks and environments for evaluating goal-directed interactive agents.
An up-to-date compilation of relevant resources can be found on our project website.
arXiv Detail & Related papers (2024-09-27T08:17:53Z) - CooHOI: Learning Cooperative Human-Object Interaction with Manipulated Object Dynamics [44.30880626337739]
CooHOI is a framework designed to tackle the multi-humanoid object transportation problem.
A single humanoid character learns to interact with objects through imitation learning from human motion priors.
Then, the humanoid learns to collaborate with others by considering the shared dynamics of the manipulated object.
arXiv Detail & Related papers (2024-06-20T17:59:22Z) - Continual Robot Learning using Self-Supervised Task Inference [19.635428830237842]
We propose a self-supervised task inference approach to continually learn new tasks.
We use a behavior-matching self-supervised learning objective to train a novel Task Inference Network (TINet).
A multi-task policy is built on top of the TINet and trained with reinforcement learning to optimize performance over tasks.
arXiv Detail & Related papers (2023-09-10T09:32:35Z) - Object-Centric Multi-Task Learning for Human Instances [8.035105819936808]
We explore a compact multi-task network architecture that maximally shares parameters across multiple tasks via object-centric learning.
We propose a novel query design, the human-centric query (HCQ), to encode human instance information effectively.
Experimental results show that the proposed multi-task network achieves comparable accuracy to state-of-the-art task-specific models.
arXiv Detail & Related papers (2023-03-13T01:10:50Z) - Exploring the Role of Task Transferability in Large-Scale Multi-Task
Learning [28.104054292437525]
We disentangle the effect of scale and relatedness of tasks in multi-task representation learning.
If the target tasks are known ahead of time, then training on a smaller set of related tasks is competitive with large-scale multi-task training.
arXiv Detail & Related papers (2022-04-23T18:11:35Z) - Variational Multi-Task Learning with Gumbel-Softmax Priors [105.22406384964144]
Multi-task learning aims to explore task relatedness to improve individual tasks.
We propose variational multi-task learning (VMTL), a general probabilistic inference framework for learning multiple related tasks.
arXiv Detail & Related papers (2021-11-09T18:49:45Z) - Towards More Generalizable One-shot Visual Imitation Learning [81.09074706236858]
A general-purpose robot should be able to master a wide range of tasks and quickly learn a novel one by leveraging past experiences.
One-shot imitation learning (OSIL) approaches this goal by training an agent with (pairs of) expert demonstrations.
We push for a higher level of generalization ability by investigating a more ambitious multi-task setup.
arXiv Detail & Related papers (2021-10-26T05:49:46Z) - Distribution Matching for Heterogeneous Multi-Task Learning: a
Large-scale Face Study [75.42182503265056]
Multi-Task Learning has emerged as a methodology in which multiple tasks are jointly learned by a shared learning algorithm.
We deal with heterogeneous MTL, simultaneously addressing detection, classification, and regression problems.
We build FaceBehaviorNet, the first framework for large-scale face analysis, by jointly learning all facial behavior tasks.
arXiv Detail & Related papers (2021-05-08T22:26:52Z) - The IKEA ASM Dataset: Understanding People Assembling Furniture through
Actions, Objects and Pose [108.21037046507483]
IKEA ASM is a three-million-frame, multi-view furniture assembly video dataset that includes depth, atomic actions, object segmentation, and human pose.
We benchmark prominent methods for video action recognition, object segmentation and human pose estimation tasks on this challenging dataset.
The dataset enables the development of holistic methods, which integrate multi-modal and multi-view data to better perform on these tasks.
This list is automatically generated from the titles and abstracts of the papers on this site.