GID-Net: Detecting Human-Object Interaction with Global and Instance
Dependency
- URL: http://arxiv.org/abs/2003.05242v1
- Date: Wed, 11 Mar 2020 11:58:43 GMT
- Title: GID-Net: Detecting Human-Object Interaction with Global and Instance
Dependency
- Authors: Dongming Yang, YueXian Zou, Jian Zhang, Ge Li
- Abstract summary: We introduce a two-stage trainable reasoning mechanism, referred to as GID block.
GID-Net is a human-object interaction detection framework consisting of a human branch, an object branch and an interaction branch.
We have compared our proposed GID-Net with existing state-of-the-art methods on two public benchmarks, including V-COCO and HICO-DET.
- Score: 67.95192190179975
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Since detecting and recognizing individual humans or objects is not
adequate for understanding the visual world, learning how humans interact with
surrounding objects has become a core technology. However, convolution
operations are weak at depicting visual interactions between instances, since
they only process one local neighborhood at a time. To address this problem, we
draw on how humans perceive HOIs and introduce a two-stage trainable reasoning
mechanism, referred to as the GID block. The GID block breaks through local
neighborhoods and captures long-range pixel dependencies at both the global
level and the instance level of the scene, helping to detect interactions
between instances. Furthermore, we construct a multi-stream network called
GID-Net, a human-object interaction detection framework consisting of a human
branch, an object branch and an interaction branch. Global-level and
local-level semantic information is efficiently reasoned over and aggregated in
each of the branches. We have compared the proposed GID-Net with existing
state-of-the-art methods on two public benchmarks, V-COCO and HICO-DET. The
results show that GID-Net outperforms the best-performing existing methods on
both benchmarks, validating its efficacy in detecting human-object
interactions.
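The abstract describes the GID block only at a high level. As a rough, non-authoritative illustration, the PyTorch sketch below assumes a non-local-style attention formulation for the long-range pixel dependency and a simple crop-and-resize in place of RoIAlign for the instance-level stage; the class names, channel sizes, and exact two-stage wiring are assumptions, not the authors' implementation. In GID-Net, blocks of this kind would be embedded in each of the human, object, and interaction branches before aggregation.

```python
# Hypothetical sketch only: module names and wiring are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LongRangeBlock(nn.Module):
    """Non-local-style attention over all spatial positions (assumed form of the
    'long-range dependency' reasoning inside a GID block)."""

    def __init__(self, channels, reduced=None):
        super().__init__()
        reduced = reduced or channels // 2
        self.query = nn.Conv2d(channels, reduced, 1)
        self.key = nn.Conv2d(channels, reduced, 1)
        self.value = nn.Conv2d(channels, reduced, 1)
        self.out = nn.Conv2d(reduced, channels, 1)

    def forward(self, x):                                  # x: (B, C, H, W)
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)       # (B, HW, C')
        k = self.key(x).flatten(2)                          # (B, C', HW)
        v = self.value(x).flatten(2).transpose(1, 2)        # (B, HW, C')
        attn = F.softmax(q @ k, dim=-1)                      # pairwise pixel affinities
        y = (attn @ v).transpose(1, 2).reshape(b, -1, h, w)
        return x + self.out(y)                               # residual connection


class GIDBlockSketch(nn.Module):
    """Two-stage reasoning: stage 1 over the whole scene (global level),
    stage 2 over an instance crop (instance level)."""

    def __init__(self, channels, roi_size=7):
        super().__init__()
        self.global_stage = LongRangeBlock(channels)
        self.instance_stage = LongRangeBlock(channels)
        self.roi_size = roi_size

    def forward(self, feat, instance_box):
        # Stage 1: global-level dependency over the full feature map.
        g = self.global_stage(feat)
        # Stage 2: instance-level dependency inside the given box
        # (feature-map coordinates; crop + resize stands in for RoIAlign here).
        x1, y1, x2, y2 = instance_box
        crop = g[:, :, y1:y2, x1:x2]
        crop = F.interpolate(crop, size=(self.roi_size, self.roi_size),
                             mode="bilinear", align_corners=False)
        return self.instance_stage(crop)                     # (B, C, roi, roi)


if __name__ == "__main__":
    feat = torch.randn(1, 256, 32, 32)                       # backbone feature map
    block = GIDBlockSketch(256)
    inst = block(feat, instance_box=(4, 4, 20, 20))
    print(inst.shape)                                        # torch.Size([1, 256, 7, 7])
```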
Related papers
- Hierarchical Graph Interaction Transformer with Dynamic Token Clustering for Camouflaged Object Detection [57.883265488038134]
We propose a hierarchical graph interaction network termed HGINet for camouflaged object detection.
The network is capable of discovering imperceptible objects via effective graph interaction among the hierarchical tokenized features.
Our experiments demonstrate the superior performance of HGINet compared to existing state-of-the-art methods.
arXiv Detail & Related papers (2024-08-27T12:53:25Z)
- Effective Actor-centric Human-object Interaction Detection [20.564689533862524]
We propose a novel actor-centric framework to detect Human-Object Interaction in images.
Our method achieves the state-of-the-art on the challenging V-COCO and HICO-DET benchmarks.
arXiv Detail & Related papers (2022-02-24T10:24:44Z)
- Exploiting Scene Graphs for Human-Object Interaction Detection [81.49184987430333]
Human-Object Interaction (HOI) detection is a fundamental visual task aiming at localizing and recognizing interactions between humans and objects.
We propose a novel method to exploit this information, through the scene graph, for the Human-Object Interaction (SG2HOI) detection task.
Our method, SG2HOI, incorporates the SG information in two ways: (1) we embed a scene graph into a global context clue, serving as the scene-specific environmental context; and (2) we build a relation-aware message-passing module to gather relationships from objects' neighborhood and transfer them into interactions.
arXiv Detail & Related papers (2021-08-19T09:40:50Z)
- HOTR: End-to-End Human-Object Interaction Detection with Transformers [26.664864824357164]
We present a novel framework, referred to as HOTR, which directly predicts a set of ⟨human, object, interaction⟩ triplets from an image.
Our proposed algorithm achieves the state-of-the-art performance in two HOI detection benchmarks with an inference time under 1 ms after object detection.
arXiv Detail & Related papers (2021-04-28T10:10:29Z)
- DRG: Dual Relation Graph for Human-Object Interaction Detection [65.50707710054141]
We tackle the challenging problem of human-object interaction (HOI) detection.
Existing methods either recognize the interaction of each human-object pair in isolation or perform joint inference based on complex appearance-based features.
In this paper, we leverage an abstract spatial-semantic representation to describe each human-object pair and aggregate the contextual information of the scene via a dual relation graph.
arXiv Detail & Related papers (2020-08-26T17:59:40Z)
- A Graph-based Interactive Reasoning for Human-Object Interaction Detection [71.50535113279551]
We present a novel graph-based interactive reasoning model called Interactive Graph (abbr. in-Graph) to infer HOIs.
We construct a new framework to assemble in-Graph models for detecting HOIs, namely in-GraphNet.
Our framework is end-to-end trainable and free from costly annotations like human pose.
arXiv Detail & Related papers (2020-07-14T09:29:03Z)
- Learning Human-Object Interaction Detection using Interaction Points [140.0200950601552]
We propose a novel fully-convolutional approach that directly detects the interactions between human-object pairs.
Our network predicts interaction points, which directly localize and classify the interaction.
Experiments are performed on two popular benchmarks: V-COCO and HICO-DET.
arXiv Detail & Related papers (2020-03-31T08:42:06Z)