Human-Object Interaction Detection:A Quick Survey and Examination of
Methods
- URL: http://arxiv.org/abs/2009.12950v1
- Date: Sun, 27 Sep 2020 20:58:39 GMT
- Title: Human-Object Interaction Detection:A Quick Survey and Examination of
Methods
- Authors: Trevor Bergstrom, Humphrey Shi
- Abstract summary: This is the first general survey of the state-of-the-art and milestone works in this field.
We provide a basic survey of the developments in the field of human-object interaction detection.
We examine the HORCNN architecture as it is a foundational work in the field.
- Score: 17.8805983491991
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human-object interaction detection is a relatively new task in the world of
computer vision and visual semantic information extraction. With the goal of
machines identifying interactions that humans perform on objects, there are
many real-world use cases for the research in this field. To our knowledge,
this is the first general survey of the state-of-the-art and milestone works in
this field. We provide a basic survey of the developments in the field of
human-object interaction detection. Many works in this field use multi-stream
convolutional neural network architectures, which combine features from
multiple sources in the input image. Most commonly these are the humans and
objects in question, as well as the spatial quality of the two. As far as we
are aware, there have not been in-depth studies performed that look into the
performance of each component individually. In order to provide insight to
future researchers, we perform an individualized study that examines the
performance of each component of a multi-stream convolutional neural network
architecture for human-object interaction detection. Specifically, we examine
the HORCNN architecture as it is a foundational work in the field. In addition,
we provide an in-depth look at the HICO-DET dataset, a popular benchmark in the
field of human-object interaction detection. Code and papers can be found at
https://github.com/SHI-Labs/Human-Object-Interaction-Detection.
Related papers
- Enhancing HOI Detection with Contextual Cues from Large Vision-Language Models [56.257840490146]
ConCue is a novel approach for improving visual feature extraction in HOI detection.
We develop a transformer-based feature extraction module with a multi-tower architecture that integrates contextual cues into both instance and interaction detectors.
arXiv Detail & Related papers (2023-11-26T09:11:32Z) - HODN: Disentangling Human-Object Feature for HOI Detection [51.48164941412871]
We propose a Human and Object Disentangling Network (HODN) to model the Human-Object Interaction (HOI) relationships explicitly.
Considering that human features are more contributive to interaction, we propose a Human-Guide Linking method to make sure the interaction decoder focuses on the human-centric regions.
Our proposed method achieves competitive performance on both the V-COCO and the HICO-Det Linking datasets.
arXiv Detail & Related papers (2023-08-20T04:12:50Z) - MECCANO: A Multimodal Egocentric Dataset for Humans Behavior
Understanding in the Industrial-like Domain [23.598727613908853]
We present MECCANO, a dataset of egocentric videos to study humans behavior understanding in industrial-like settings.
The multimodality is characterized by the presence of gaze signals, depth maps and RGB videos acquired simultaneously with a custom headset.
The dataset has been explicitly labeled for fundamental tasks in the context of human behavior understanding from a first person view.
arXiv Detail & Related papers (2022-09-19T00:52:42Z) - Learn to Predict How Humans Manipulate Large-sized Objects from
Interactive Motions [82.90906153293585]
We propose a graph neural network, HO-GCN, to fuse motion data and dynamic descriptors for the prediction task.
We show the proposed network that consumes dynamic descriptors can achieve state-of-the-art prediction results and help the network better generalize to unseen objects.
arXiv Detail & Related papers (2022-06-25T09:55:39Z) - Detecting Human-to-Human-or-Object (H2O) Interactions with DIABOLO [29.0200561485714]
We propose a new interaction dataset to deal with both types of human interactions: Human-to-Human-or-Object (H2O)
In addition, we introduce a novel taxonomy of verbs, intended to be closer to a description of human body attitude in relation to the surrounding targets of interaction.
We propose DIABOLO, an efficient subject-centric single-shot method to detect all interactions in one forward pass.
arXiv Detail & Related papers (2022-01-07T11:00:11Z) - Detecting Human-Object Interaction via Fabricated Compositional Learning [106.37536031160282]
Human-Object Interaction (HOI) detection is a fundamental task for high-level scene understanding.
Human has extremely powerful compositional perception ability to cognize rare or unseen HOI samples.
We propose Fabricated Compositional Learning (FCL) to address the problem of open long-tailed HOI detection.
arXiv Detail & Related papers (2021-03-15T08:52:56Z) - The MECCANO Dataset: Understanding Human-Object Interactions from
Egocentric Videos in an Industrial-like Domain [20.99718135562034]
We introduce MECCANO, the first dataset of egocentric videos to study human-object interactions in industrial-like settings.
The dataset has been explicitly labeled for the task of recognizing human-object interactions from an egocentric perspective.
Baseline results show that the MECCANO dataset is a challenging benchmark to study egocentric human-object interactions in industrial-like scenarios.
arXiv Detail & Related papers (2020-10-12T12:50:30Z) - DRG: Dual Relation Graph for Human-Object Interaction Detection [65.50707710054141]
We tackle the challenging problem of human-object interaction (HOI) detection.
Existing methods either recognize the interaction of each human-object pair in isolation or perform joint inference based on complex appearance-based features.
In this paper, we leverage an abstract spatial-semantic representation to describe each human-object pair and aggregate the contextual information of the scene via a dual relation graph.
arXiv Detail & Related papers (2020-08-26T17:59:40Z) - Learning Human-Object Interaction Detection using Interaction Points [140.0200950601552]
We propose a novel fully-convolutional approach that directly detects the interactions between human-object pairs.
Our network predicts interaction points, which directly localize and classify the inter-action.
Experiments are performed on two popular benchmarks: V-COCO and HICO-DET.
arXiv Detail & Related papers (2020-03-31T08:42:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.