Learning Human-Object Interaction Detection using Interaction Points
- URL: http://arxiv.org/abs/2003.14023v1
- Date: Tue, 31 Mar 2020 08:42:06 GMT
- Title: Learning Human-Object Interaction Detection using Interaction Points
- Authors: Tiancai Wang and Tong Yang and Martin Danelljan and Fahad Shahbaz Khan
and Xiangyu Zhang and Jian Sun
- Abstract summary: We propose a novel fully-convolutional approach that directly detects the interactions between human-object pairs.
Our network predicts interaction points, which directly localize and classify the interaction.
Experiments are performed on two popular benchmarks: V-COCO and HICO-DET.
- Score: 140.0200950601552
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding interactions between humans and objects is one of the
fundamental problems in visual classification and an essential step towards
detailed scene understanding. Human-object interaction (HOI) detection strives
to localize both the human and the object and to identify the
complex interactions between them. Most existing HOI detection approaches are
instance-centric where interactions between all possible human-object pairs are
predicted based on appearance features and coarse spatial information. We argue
that appearance features alone are insufficient to capture complex human-object
interactions. In this paper, we therefore propose a novel fully-convolutional
approach that directly detects the interactions between human-object pairs. Our
network predicts interaction points, which directly localize and classify the
interaction. Paired with the densely predicted interaction vectors, the
interactions are associated with human and object detections to obtain final
predictions. To the best of our knowledge, we are the first to propose an
approach where HOI detection is posed as a keypoint detection and grouping
problem. Experiments are performed on two popular benchmarks: V-COCO and
HICO-DET. Our approach sets a new state-of-the-art on both datasets. Code is
available at https://github.com/vaesl/IP-Net.
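The grouping step described in the abstract, where each interaction point and its interaction vector are associated with human and object detections, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the interaction point sits at the midpoint between the human and object box centers and that the interaction vector points from that midpoint toward the human center. The function names, the `max_dist` threshold, and the nearest-center matching rule are all hypothetical.

```python
from math import hypot


def box_center(box):
    """Return the center (cx, cy) of a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def group_interaction(point, vector, human_boxes, object_boxes, max_dist=20.0):
    """Pair one interaction point with its nearest human and object boxes.

    Assumption (hypothetical, not taken from the abstract):
    point + vector approximates the human center and
    point - vector approximates the object center.
    Returns (human_index, object_index), or None if either side has
    no detection within max_dist pixels of its expected center.
    """
    px, py = point
    vx, vy = vector
    human_target = (px + vx, py + vy)
    object_target = (px - vx, py - vy)

    def nearest(target, boxes):
        # Linear scan for the box whose center is closest to the target.
        best_idx, best_dist = None, max_dist
        for idx, box in enumerate(boxes):
            cx, cy = box_center(box)
            dist = hypot(cx - target[0], cy - target[1])
            if dist <= best_dist:
                best_idx, best_dist = idx, dist
        return best_idx

    h = nearest(human_target, human_boxes)
    o = nearest(object_target, object_boxes)
    if h is None or o is None:
        return None
    return (h, o)


# Toy detections: human 0 is centered at (10, 20), object 0 at (70, 20).
humans = [(0, 0, 20, 40), (100, 0, 120, 40)]
objects = [(60, 10, 80, 30), (200, 200, 220, 220)]

# Their midpoint is (40, 20); the vector toward the human center is (-30, 0).
pair = group_interaction((40, 20), (-30, 0), humans, objects)
print(pair)
```

In this toy case the point is matched to the first human and the first object. A full system would also carry the interaction class and confidence predicted at the point, which this sketch omits.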
Related papers
- Visual-Geometric Collaborative Guidance for Affordance Learning [63.038406948791454]
We propose a visual-geometric collaborative guided affordance learning network that incorporates visual and geometric cues.
Our method outperforms the representative models regarding objective metrics and visual quality.
arXiv Detail & Related papers (2024-10-15T07:35:51Z) - UnionDet: Union-Level Detector Towards Real-Time Human-Object
Interaction Detection [35.2385914946471]
We propose a one-stage meta-architecture for HOI detection powered by a novel union-level detector.
Our one-stage detector for human-object interaction shows a significant reduction in interaction prediction time (4x to 14x).
arXiv Detail & Related papers (2023-12-19T23:34:43Z) - Disentangled Interaction Representation for One-Stage Human-Object
Interaction Detection [70.96299509159981]
Human-Object Interaction (HOI) detection is a core task for human-centric image understanding.
Recent one-stage methods adopt a transformer decoder to collect image-wide cues that are useful for interaction prediction.
Traditional two-stage methods benefit significantly from their ability to compose interaction features in a disentangled and explainable manner.
arXiv Detail & Related papers (2023-12-04T08:02:59Z) - HODN: Disentangling Human-Object Feature for HOI Detection [51.48164941412871]
We propose a Human and Object Disentangling Network (HODN) to model the Human-Object Interaction (HOI) relationships explicitly.
Considering that human features are more contributive to interaction, we propose a Human-Guide Linking method to make sure the interaction decoder focuses on the human-centric regions.
Our proposed method achieves competitive performance on both the V-COCO and the HICO-Det datasets.
arXiv Detail & Related papers (2023-08-20T04:12:50Z) - Detecting Human-to-Human-or-Object (H2O) Interactions with DIABOLO [29.0200561485714]
We propose a new interaction dataset to deal with both types of human interactions: Human-to-Human-or-Object (H2O).
In addition, we introduce a novel taxonomy of verbs, intended to be closer to a description of human body attitude in relation to the surrounding targets of interaction.
We propose DIABOLO, an efficient subject-centric single-shot method to detect all interactions in one forward pass.
arXiv Detail & Related papers (2022-01-07T11:00:11Z) - Exploiting Scene Graphs for Human-Object Interaction Detection [81.49184987430333]
Human-Object Interaction (HOI) detection is a fundamental visual task aiming at localizing and recognizing interactions between humans and objects.
We propose a novel method to exploit this information, through the scene graph, for the Human-Object Interaction (SG2HOI) detection task.
Our method, SG2HOI, incorporates the SG information in two ways: (1) we embed a scene graph into a global context clue, serving as the scene-specific environmental context; and (2) we build a relation-aware message-passing module to gather relationships from objects' neighborhood and transfer them into interactions.
arXiv Detail & Related papers (2021-08-19T09:40:50Z) - GTNet: Guided Transformer Network for Detecting Human-Object Interactions [10.809778265707916]
The human-object interaction (HOI) detection task refers to localizing humans, localizing objects, and predicting the interactions between each human-object pair.
For detecting HOI, it is important to utilize relative spatial configurations and object semantics to find salient spatial regions of images.
This issue is addressed by the novel self-attention based guided transformer network, GTNet.
arXiv Detail & Related papers (2021-08-02T02:06:33Z) - Transferable Interactiveness Knowledge for Human-Object Interaction
Detection [46.89715038756862]
We explore interactiveness knowledge which indicates whether a human and an object interact with each other or not.
We found that interactiveness knowledge can be learned across HOI datasets and bridge the gap between diverse HOI category settings.
Our core idea is to exploit an interactiveness network to learn the general interactiveness knowledge from multiple HOI datasets.
arXiv Detail & Related papers (2021-01-25T18:21:07Z) - DRG: Dual Relation Graph for Human-Object Interaction Detection [65.50707710054141]
We tackle the challenging problem of human-object interaction (HOI) detection.
Existing methods either recognize the interaction of each human-object pair in isolation or perform joint inference based on complex appearance-based features.
In this paper, we leverage an abstract spatial-semantic representation to describe each human-object pair and aggregate the contextual information of the scene via a dual relation graph.
arXiv Detail & Related papers (2020-08-26T17:59:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.