ENIGMA-51: Towards a Fine-Grained Understanding of Human-Object
Interactions in Industrial Scenarios
- URL: http://arxiv.org/abs/2309.14809v2
- Date: Mon, 27 Nov 2023 16:09:03 GMT
- Title: ENIGMA-51: Towards a Fine-Grained Understanding of Human-Object
Interactions in Industrial Scenarios
- Authors: Francesco Ragusa and Rosario Leonardi and Michele Mazzamuto and
Claudia Bonanno and Rosario Scavo and Antonino Furnari and Giovanni Maria
Farinella
- Abstract summary: ENIGMA-51 is a new egocentric dataset acquired in an industrial scenario by 19 subjects who followed instructions to complete the repair of electrical boards using industrial tools.
The 51 egocentric video sequences are densely annotated with a rich set of labels that enable the systematic study of human behavior in the industrial domain.
- Score: 11.424643984957118
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: ENIGMA-51 is a new egocentric dataset acquired in an industrial scenario by
19 subjects who followed instructions to complete the repair of electrical
boards using industrial tools (e.g., electric screwdriver) and equipment
(e.g., oscilloscope). The 51 egocentric video sequences are densely annotated
with a rich set of labels that enable the systematic study of human behavior in
the industrial domain. We provide benchmarks on four tasks related to human
behavior: 1) untrimmed temporal detection of human-object interactions, 2)
egocentric human-object interaction detection, 3) short-term object interaction
anticipation and 4) natural language understanding of intents and entities.
Baseline results show that the ENIGMA-51 dataset poses a challenging benchmark
to study human behavior in industrial scenarios. We publicly release the
dataset at https://iplab.dmi.unict.it/ENIGMA-51.
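As a rough, hypothetical illustration of how dense annotations spanning the four benchmark tasks might be organized, the Python sketch below shows one possible per-video record; the field names and JSON layout are assumptions made for illustration, not the dataset's actual release format.

```python
# Hypothetical per-video annotation record for the four ENIGMA-51 benchmark
# tasks; every field name here is an assumption, not the released schema.
import json

example = {
    "video_id": "subject_07_seq_02",
    "interactions": [   # task 1: untrimmed temporal detection of interactions
        {"start_s": 12.4, "end_s": 15.1, "verb": "take", "object": "screwdriver"},
    ],
    "frames": [         # task 2: egocentric human-object interaction detection
        {"t_s": 13.0,
         "hands": [[310, 220, 420, 330]],
         "objects": [{"box": [400, 210, 520, 300], "label": "screwdriver", "in_contact": True}]},
    ],
    "anticipation": [   # task 3: short-term object interaction anticipation
        {"t_s": 11.9, "next_object": "screwdriver", "next_verb": "take", "time_to_contact_s": 0.5},
    ],
    "utterances": [     # task 4: natural language intents and entities
        {"text": "turn on the oscilloscope",
         "intent": "activate_device",
         "entities": [{"span": "oscilloscope", "type": "equipment"}]},
    ],
}
print(json.dumps(example, indent=2)[:120])
```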
Related papers
- EgoPet: Egomotion and Interaction Data from an Animal's Perspective [82.7192364237065]
We introduce a dataset of pet egomotion imagery with diverse examples of simultaneous egomotion and multi-agent interaction.
EgoPet offers a radically distinct perspective from existing egocentric datasets of humans or vehicles.
We define two in-domain benchmark tasks that capture animal behavior, and a third benchmark to assess the utility of EgoPet as a pretraining resource for robotic quadruped locomotion.
arXiv Detail & Related papers (2024-04-15T17:59:47Z)
- Inter-X: Towards Versatile Human-Human Interaction Analysis [100.254438708001]
We propose Inter-X, a dataset with accurate body movements and diverse interaction patterns.
The dataset includes 11K interaction sequences and more than 8.1M frames.
We also equip Inter-X with versatile annotations of more than 34K fine-grained human part-level textual descriptions.
arXiv Detail & Related papers (2023-12-26T13:36:05Z)
- HoloAssist: an Egocentric Human Interaction Dataset for Interactive AI Assistants in the Real World [48.90399899928823]
This work is part of a broader research effort to develop intelligent agents that can interactively guide humans through performing tasks in the physical world.
We introduce HoloAssist, a large-scale egocentric human interaction dataset.
We present key insights into how human assistants correct mistakes, intervene in the task completion procedure, and ground their instructions to the environment.
arXiv Detail & Related papers (2023-09-29T07:17:43Z)
- HODN: Disentangling Human-Object Feature for HOI Detection [51.48164941412871]
We propose a Human and Object Disentangling Network (HODN) to model the Human-Object Interaction (HOI) relationships explicitly.
Considering that human features contribute more to the interaction, we propose a Human-Guide Linking method to ensure that the interaction decoder focuses on human-centric regions (a minimal sketch follows this entry).
Our proposed method achieves competitive performance on both the V-COCO and HICO-DET datasets.
arXiv Detail & Related papers (2023-08-20T04:12:50Z)
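The human-guided linking intuition above, that interaction evidence should be read off human-centric regions, can be illustrated with a small sketch. This is not the authors' HODN implementation: the Gaussian weighting, the function name, and all shapes are assumptions.

```python
# A hypothetical "human-guided" weighting of backbone features before
# interaction decoding; NOT the HODN paper's actual mechanism.
import numpy as np

def human_guided_attention(feature_map, human_boxes, sigma=0.25):
    """Down-weight features far from detected humans.

    feature_map: (H, W, C) backbone features.
    human_boxes: iterable of (x0, y0, x1, y1) in normalized [0, 1] coordinates.
    """
    H, W, _ = feature_map.shape
    ys = (np.arange(H) + 0.5) / H
    xs = (np.arange(W) + 0.5) / W
    grid_y, grid_x = np.meshgrid(ys, xs, indexing="ij")
    mask = np.zeros((H, W))
    for x0, y0, x1, y1 in human_boxes:
        cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
        # Gaussian bump centered on each human; taking the max over humans
        # keeps every human-centric region visible to the decoder.
        bump = np.exp(-((grid_x - cx) ** 2 + (grid_y - cy) ** 2) / (2 * sigma ** 2))
        mask = np.maximum(mask, bump)
    return feature_map * mask[..., None]

feats = np.random.rand(32, 32, 256)
guided = human_guided_attention(feats, [(0.1, 0.2, 0.4, 0.9)])
print(guided.shape)  # (32, 32, 256)
```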
- Discovering a Variety of Objects in Spatio-Temporal Human-Object Interactions [45.92485321148352]
In daily HOIs, humans often interact with a variety of objects, e.g., holding and touching dozens of household items while cleaning.
Here, we introduce a new benchmark based on AVA: Discovering Interacted Objects (DIO), including 51 interactions and 1,000+ objects.
An ST-HOI learning task is proposed in which vision systems are expected to track human actors, detect interactions, and simultaneously discover the interacted objects (a minimal data-structure sketch follows this entry).
arXiv Detail & Related papers (2022-11-14T16:33:54Z)
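As referenced above, here is a minimal sketch of how a single ST-HOI record (human track, interaction label, discovered object track) could be represented; the schema is assumed for illustration and is not DIO's actual annotation format.

```python
# Hypothetical container for one spatio-temporal HOI instance; the field
# names are assumptions, not the DIO benchmark's released schema.
from dataclasses import dataclass, field

@dataclass
class STHOIRecord:
    interaction: str                                  # e.g., "hold", "touch"
    person_track: dict = field(default_factory=dict)  # frame -> (x0, y0, x1, y1) human box
    object_track: dict = field(default_factory=dict)  # frame -> box of the discovered object
    object_label: str = "unknown"                     # objects may be unnamed when first discovered

record = STHOIRecord(
    interaction="hold",
    person_track={0: (10, 20, 80, 200), 1: (12, 21, 82, 201)},
    object_track={0: (60, 90, 95, 130), 1: (62, 91, 97, 131)},
)
print(record.interaction, len(record.person_track))  # hold 2
```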
- MECCANO: A Multimodal Egocentric Dataset for Humans Behavior Understanding in the Industrial-like Domain [23.598727613908853]
We present MECCANO, a dataset of egocentric videos for studying human behavior understanding in industrial-like settings.
The multimodality is characterized by the presence of gaze signals, depth maps and RGB videos acquired simultaneously with a custom headset.
The dataset has been explicitly labeled for fundamental tasks in the context of human behavior understanding from a first-person view.
arXiv Detail & Related papers (2022-09-19T00:52:42Z)
- Learn to Predict How Humans Manipulate Large-sized Objects from Interactive Motions [82.90906153293585]
We propose a graph neural network, HO-GCN, to fuse motion data and dynamic descriptors for the prediction task.
We show that the proposed network, by consuming dynamic descriptors, achieves state-of-the-art prediction results and generalizes better to unseen objects (a minimal message-passing sketch follows this entry).
arXiv Detail & Related papers (2022-06-25T09:55:39Z)
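In the spirit of a graph network that fuses per-node motion features with dynamic descriptors, the sketch below shows a single normalized message-passing step in NumPy; it is an assumption-laden illustration, not the authors' HO-GCN architecture.

```python
# One graph-convolution step over nodes whose features concatenate motion
# data with per-node dynamic descriptors; illustrative only, not HO-GCN.
import numpy as np

def gcn_layer(node_feats, adjacency, weight):
    """Aggregate neighbor features with symmetric normalization, then transform."""
    a_hat = adjacency + np.eye(adjacency.shape[0])   # add self-loops
    deg = a_hat.sum(axis=1)
    norm = a_hat / np.sqrt(np.outer(deg, deg))       # D^{-1/2} A D^{-1/2}
    return np.maximum(norm @ node_feats @ weight, 0.0)  # ReLU

n_nodes = 5
motion = np.random.rand(n_nodes, 8)        # e.g., per-joint motion features
dynamic_desc = np.random.rand(n_nodes, 8)  # e.g., per-node object dynamics
node_feats = np.concatenate([motion, dynamic_desc], axis=1)  # naive fusion

adjacency = (np.random.rand(n_nodes, n_nodes) > 0.5).astype(float)
adjacency = np.maximum(adjacency, adjacency.T)  # make the graph undirected

W = np.random.randn(node_feats.shape[1], 32) * 0.1
print(gcn_layer(node_feats, adjacency, W).shape)  # (5, 32)
```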
- Egocentric Human-Object Interaction Detection Exploiting Synthetic Data [19.220651860718892]
We consider the problem of detecting Egocentric Human-Object Interactions (EHOIs) in industrial contexts.
We propose a pipeline and a tool to generate photo-realistic synthetic First Person Vision (FPV) images automatically labeled for EHOI detection.
arXiv Detail & Related papers (2022-04-14T15:59:15Z)
- The MECCANO Dataset: Understanding Human-Object Interactions from Egocentric Videos in an Industrial-like Domain [20.99718135562034]
We introduce MECCANO, the first dataset of egocentric videos to study human-object interactions in industrial-like settings.
The dataset has been explicitly labeled for the task of recognizing human-object interactions from an egocentric perspective.
Baseline results show that the MECCANO dataset is a challenging benchmark to study egocentric human-object interactions in industrial-like scenarios.
arXiv Detail & Related papers (2020-10-12T12:50:30Z)
- Learning Human-Object Interaction Detection using Interaction Points [140.0200950601552]
We propose a novel fully-convolutional approach that directly detects the interactions between human-object pairs.
Our network predicts interaction points, which directly localize and classify the interaction (a simplified pairing sketch follows this entry).
Experiments are performed on two popular benchmarks: V-COCO and HICO-DET.
arXiv Detail & Related papers (2020-03-31T08:42:06Z)
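As referenced above, the interaction-point idea can be sketched as follows: an interaction is keyed to a point, roughly the midpoint between a human and an object, and detected human/object pairs are matched to predicted points. The greedy matching and distance threshold below are illustrative assumptions, not the paper's exact pairing rule.

```python
# Match detected human/object pairs to predicted interaction points by
# midpoint distance; a simplified stand-in for the paper's pairing scheme.
import numpy as np

def pair_by_interaction_points(human_centers, object_centers, interaction_points, max_dist=20.0):
    """Return (human_idx, object_idx, point_idx) triplets for matched pairs."""
    matches = []
    for p_idx, point in enumerate(interaction_points):
        best, best_d = None, max_dist
        for h_idx, h in enumerate(human_centers):
            for o_idx, o in enumerate(object_centers):
                midpoint = (np.asarray(h) + np.asarray(o)) / 2.0
                d = np.linalg.norm(midpoint - np.asarray(point))
                if d < best_d:  # keep the closest pair within the threshold
                    best, best_d = (h_idx, o_idx, p_idx), d
        if best is not None:
            matches.append(best)
    return matches

humans = [(100.0, 120.0)]
objects = [(160.0, 80.0), (400.0, 300.0)]
points = [(130.0, 100.0)]  # e.g., peak locations from a predicted heatmap
print(pair_by_interaction_points(humans, objects, points))  # [(0, 0, 0)]
```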