MECCANO: A Multimodal Egocentric Dataset for Humans Behavior
Understanding in the Industrial-like Domain
- URL: http://arxiv.org/abs/2209.08691v1
- Date: Mon, 19 Sep 2022 00:52:42 GMT
- Title: MECCANO: A Multimodal Egocentric Dataset for Humans Behavior
Understanding in the Industrial-like Domain
- Authors: Francesco Ragusa and Antonino Furnari and Giovanni Maria Farinella
- Abstract summary: We present MECCANO, a dataset of egocentric videos to study human behavior understanding in industrial-like settings.
The multimodality is characterized by the presence of gaze signals, depth maps and RGB videos acquired simultaneously with a custom headset.
The dataset has been explicitly labeled for fundamental tasks in the context of human behavior understanding from a first-person view.
- Score: 23.598727613908853
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Wearable cameras allow images and videos to be acquired from the user's
perspective. These data can be processed to understand human behavior. Although
human behavior analysis has been thoroughly investigated in third-person
vision, it remains understudied in egocentric settings, and in industrial
scenarios in particular. To encourage research in this field, we present MECCANO,
a multimodal dataset of egocentric videos to study human behavior
understanding in industrial-like settings. The multimodality is characterized
by the presence of gaze signals, depth maps and RGB videos acquired
simultaneously with a custom headset. The dataset has been explicitly labeled
for fundamental tasks in the context of human behavior understanding from a
first-person view, such as recognizing and anticipating human-object
interactions. With the MECCANO dataset, we explored five different tasks:
1) Action Recognition, 2) Active Objects Detection and Recognition,
3) Egocentric Human-Objects Interaction Detection, 4) Action Anticipation and
5) Next-Active Objects Detection. We propose a benchmark aimed at studying human
behavior in the considered industrial-like scenario, which demonstrates that the
investigated tasks and the considered scenario are challenging for
state-of-the-art algorithms. To support research in this field, we publicly
release the dataset at https://iplab.dmi.unict.it/MECCANO/.
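The abstract does not prescribe an access API for the released data, so the sketch below only illustrates how the three synchronized modalities might be time-aligned on the consumer side. The MultimodalFrame container, the align_by_timestamp helper, and the sampling rates and resolutions are assumptions for illustration, not the authors' tooling.

```python
"""Minimal sketch: time-aligning MECCANO-style multimodal streams.

All names, rates, and resolutions here are illustrative assumptions;
the dataset release may ship pre-aligned frames, in which case this
step is unnecessary.
"""
from dataclasses import dataclass

import numpy as np


@dataclass
class MultimodalFrame:
    """One time-aligned sample: RGB image, depth map, and 2D gaze point."""
    rgb: np.ndarray      # (H, W, 3) uint8 image
    depth: np.ndarray    # (H, W) float32 depth map
    gaze_xy: np.ndarray  # (2,) gaze fixation in RGB pixel coordinates


def align_by_timestamp(rgb_ts, depth_ts, gaze_ts):
    """For each RGB timestamp, pick the nearest depth and gaze sample.

    Headset streams rarely tick at identical rates, so nearest-neighbour
    matching on timestamps is a common alignment baseline.
    """
    depth_idx = np.abs(depth_ts[None, :] - rgb_ts[:, None]).argmin(axis=1)
    gaze_idx = np.abs(gaze_ts[None, :] - rgb_ts[:, None]).argmin(axis=1)
    return depth_idx, gaze_idx


if __name__ == "__main__":
    # Synthetic timestamp streams standing in for real recordings
    # (assumed rates: 30 Hz RGB, 15 Hz depth, 60 Hz gaze).
    rgb_ts = np.arange(0.0, 1.0, 1 / 30)
    depth_ts = np.arange(0.0, 1.0, 1 / 15)
    gaze_ts = np.arange(0.0, 1.0, 1 / 60)
    d_idx, g_idx = align_by_timestamp(rgb_ts, depth_ts, gaze_ts)

    # Assemble one aligned sample from dummy arrays.
    frame = MultimodalFrame(
        rgb=np.zeros((1080, 1920, 3), dtype=np.uint8),
        depth=np.zeros((480, 640), dtype=np.float32),
        gaze_xy=np.array([960.0, 540.0]),
    )
    print(f"RGB frame 10 pairs with depth {d_idx[10]} and gaze {g_idx[10]}")
```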
Related papers
- HabitAction: A Video Dataset for Human Habitual Behavior Recognition [3.7478789114676108]
Human habitual behaviors (HHBs) are important for analyzing a person's personality, habits, and psychological changes.
In this work, we build a novel video dataset to demonstrate various HHBs.
The dataset contains 30 categories of habitual behaviors including more than 300,000 frames and 6,899 action instances.
arXiv Detail & Related papers (2024-08-24T04:40:31Z) - ENIGMA-51: Towards a Fine-Grained Understanding of Human-Object
Interactions in Industrial Scenarios [11.424643984957118]
ENIGMA-51 is a new egocentric dataset acquired in an industrial scenario by 19 subjects who followed instructions to complete the repair of electrical boards using industrial tools.
The 51 egocentric video sequences are densely annotated with a rich set of labels that enable the systematic study of human behavior in the industrial domain.
arXiv Detail & Related papers (2023-09-26T10:14:44Z) - HODN: Disentangling Human-Object Feature for HOI Detection [51.48164941412871]
We propose a Human and Object Disentangling Network (HODN) to model the Human-Object Interaction (HOI) relationships explicitly.
Considering that human features contribute more to the interaction, we propose a Human-Guide Linking method to ensure that the interaction decoder focuses on human-centric regions.
Our proposed method achieves competitive performance on both the V-COCO and HICO-DET datasets.
arXiv Detail & Related papers (2023-08-20T04:12:50Z) - Towards Continual Egocentric Activity Recognition: A Multi-modal
Egocentric Activity Dataset for Continual Learning [21.68009790164824]
We present a multi-modal egocentric activity dataset for continual learning named UESTC-MMEA-CL.
It contains synchronized data of videos, accelerometers, and gyroscopes, for 32 types of daily activities, performed by 10 participants.
Egocentric activity recognition results are reported for the three modalities (RGB, acceleration, and gyroscope) used both separately and jointly.
arXiv Detail & Related papers (2023-01-26T04:32:00Z) - BEHAVE: Dataset and Method for Tracking Human Object Interactions [105.77368488612704]
We present the first full-body human-object interaction dataset with multi-view RGBD frames and corresponding 3D SMPL and object fits, along with the annotated contacts between them.
We use this data to learn a model that can jointly track humans and objects in natural environments with an easy-to-use portable multi-camera setup.
arXiv Detail & Related papers (2022-04-14T13:21:19Z) - Overcoming the Domain Gap in Neural Action Representations [60.47807856873544]
3D pose data can now be reliably extracted from multi-view video sequences without manual intervention.
We propose to use it to guide the encoding of neural action representations together with a set of neural and behavioral augmentations.
To reduce the domain gap, during training, we swap neural and behavioral data across animals that seem to be performing similar actions.
arXiv Detail & Related papers (2021-12-02T12:45:46Z) - What Can You Learn from Your Muscles? Learning Visual Representation
from Human Interactions [50.435861435121915]
We use human interaction and attention cues to investigate whether we can learn better representations compared to visual-only representations.
Our experiments show that our "muscly-supervised" representation outperforms MoCo, a visual-only state-of-the-art method.
arXiv Detail & Related papers (2020-10-16T17:46:53Z) - The MECCANO Dataset: Understanding Human-Object Interactions from
Egocentric Videos in an Industrial-like Domain [20.99718135562034]
We introduce MECCANO, the first dataset of egocentric videos to study human-object interactions in industrial-like settings.
The dataset has been explicitly labeled for the task of recognizing human-object interactions from an egocentric perspective.
Baseline results show that the MECCANO dataset is a challenging benchmark to study egocentric human-object interactions in industrial-like scenarios.
arXiv Detail & Related papers (2020-10-12T12:50:30Z) - Human-Object Interaction Detection:A Quick Survey and Examination of
Methods [17.8805983491991]
This is the first general survey of the state-of-the-art and milestone works in this field.
We provide a basic survey of the developments in the field of human-object interaction detection.
We examine the HORCNN architecture as it is a foundational work in the field.
arXiv Detail & Related papers (2020-09-27T20:58:39Z) - The IKEA ASM Dataset: Understanding People Assembling Furniture through
Actions, Objects and Pose [108.21037046507483]
IKEA ASM is a three million frame, multi-view, furniture assembly video dataset that includes depth, atomic actions, object segmentation, and human pose.
We benchmark prominent methods for video action recognition, object segmentation and human pose estimation tasks on this challenging dataset.
The dataset enables the development of holistic methods, which integrate multi-modal and multi-view data to better perform on these tasks.
arXiv Detail & Related papers (2020-07-01T11:34:46Z) - Learning Human-Object Interaction Detection using Interaction Points [140.0200950601552]
We propose a novel fully-convolutional approach that directly detects the interactions between human-object pairs.
Our network predicts interaction points, which directly localize and classify the interaction (a toy construction of such a point is sketched after this list).
Experiments are performed on two popular benchmarks: V-COCO and HICO-DET.
arXiv Detail & Related papers (2020-03-31T08:42:06Z)