Knowledge Guided Bidirectional Attention Network for Human-Object
Interaction Detection
- URL: http://arxiv.org/abs/2207.07979v1
- Date: Sat, 16 Jul 2022 16:42:49 GMT
- Title: Knowledge Guided Bidirectional Attention Network for Human-Object
Interaction Detection
- Authors: Jingjia Huang and Baixiang Yang
- Abstract summary: We argue that the independent use of the bottom-up parsing strategy in HOI is counter-intuitive and could lead to the diffusion of attention.
We introduce a novel knowledge-guided top-down attention into HOI, and propose to model the relation parsing as a "look and search" process.
We implement the process via unifying the bottom-up and top-down attention in a single encoder-decoder based model.
- Score: 3.0915392100355192
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human-Object Interaction (HOI) detection is a challenging task that requires
distinguishing the interaction between a human-object pair. Attention-based
relation parsing is a popular and effective strategy utilized in HOI. However,
current methods execute relation parsing in a "bottom-up" manner. We argue that
the independent use of the bottom-up parsing strategy in HOI is
counter-intuitive and could lead to the diffusion of attention. Therefore, we
introduce a novel knowledge-guided top-down attention into HOI, and propose to
model the relation parsing as a "look and search" process: execute
scene-context modeling (i.e. look), and then, given the knowledge of the target
pair, search visual clues for the discrimination of the interaction between the
pair. We implement the process via unifying the bottom-up and top-down
attention in a single encoder-decoder based model. The experimental results
show that our model achieves competitive performance on the V-COCO and HICO-DET
datasets.
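The "look and search" process described in the abstract can be illustrated with a minimal sketch. This is not the authors' model: it assumes plain scaled dot-product attention with no learned projections, and the toy features, the `look` and `search` function names, and the `pair_query` vector are all hypothetical stand-ins. It only shows the order of operations: self-attention over scene tokens first (bottom-up, "look"), then a query derived from the target pair attending over that context (top-down, "search").

```python
import math


def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    # Scaled dot-product attention for a single query vector.
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    out = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return out, weights


def look(scene_tokens):
    # Bottom-up step (assumed form): each scene token attends to all tokens.
    return [attend(tok, scene_tokens, scene_tokens)[0] for tok in scene_tokens]


def search(pair_query, context_tokens):
    # Top-down step (assumed form): a pair-specific query attends to the context.
    return attend(pair_query, context_tokens, context_tokens)


# Toy 2-D features (hypothetical values) for three scene regions.
scene = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
context = look(scene)

# Hypothetical query standing in for knowledge about the target human-object pair.
pair_query = [2.0, 0.0]
feature, weights = search(pair_query, context)
print(weights)
```

In this toy setup the attention weights returned by `search` depend on the pair query, so different target pairs in the same scene would draw on different regions of the shared context.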
Related papers
- A Review of Human-Object Interaction Detection [6.1941885271010175]
Human-object interaction (HOI) detection plays a key role in high-level visual understanding.
This paper systematically summarizes and discusses the recent work in image-based HOI detection.
arXiv Detail & Related papers (2024-08-20T08:32:39Z)
- Exploring Self- and Cross-Triplet Correlations for Human-Object Interaction Detection [38.86053346974547]
We propose to explore Self- and Cross-Triplet Correlations for HOI detection.
Specifically, we regard each triplet proposal as a graph in which Human and Object are the nodes and Action is the edge.
Also, we try to explore cross-triplet dependencies by jointly considering instance-level, semantic-level, and layout-level relations.
arXiv Detail & Related papers (2024-01-11T05:38:24Z)
- Disentangled Interaction Representation for One-Stage Human-Object Interaction Detection [70.96299509159981]
Human-Object Interaction (HOI) detection is a core task for human-centric image understanding.
Recent one-stage methods adopt a transformer decoder to collect image-wide cues that are useful for interaction prediction.
Traditional two-stage methods benefit significantly from their ability to compose interaction features in a disentangled and explainable manner.
arXiv Detail & Related papers (2023-12-04T08:02:59Z)
- Exploring Predicate Visual Context in Detecting Human-Object Interactions [44.937383506126274]
We study how best to re-introduce image features via cross-attention.
Our model with enhanced predicate visual context (PViC) outperforms state-of-the-art methods on the HICO-DET and V-COCO benchmarks.
arXiv Detail & Related papers (2023-08-11T15:57:45Z)
- DRG: Dual Relation Graph for Human-Object Interaction Detection [65.50707710054141]
We tackle the challenging problem of human-object interaction (HOI) detection.
Existing methods either recognize the interaction of each human-object pair in isolation or perform joint inference based on complex appearance-based features.
In this paper, we leverage an abstract spatial-semantic representation to describe each human-object pair and aggregate the contextual information of the scene via a dual relation graph.
arXiv Detail & Related papers (2020-08-26T17:59:40Z)
- ConsNet: Learning Consistency Graph for Zero-Shot Human-Object Interaction Detection [101.56529337489417]
We consider the problem of Human-Object Interaction (HOI) Detection, which aims to locate and recognize HOI instances in the form of <human, action, object> in images.
We argue that multi-level consistencies among objects, actions and interactions are strong cues for generating semantic representations of rare or previously unseen HOIs.
Our model takes visual features of candidate human-object pairs and word embeddings of HOI labels as inputs, maps them into visual-semantic joint embedding space and obtains detection results by measuring their similarities.
arXiv Detail & Related papers (2020-08-14T09:11:18Z)
- A Graph-based Interactive Reasoning for Human-Object Interaction Detection [71.50535113279551]
We present a novel graph-based interactive reasoning model called Interactive Graph (abbr. in-Graph) to infer HOIs.
We construct a new framework to assemble in-Graph models for detecting HOIs, namely in-GraphNet.
Our framework is end-to-end trainable and free from costly annotations like human pose.
arXiv Detail & Related papers (2020-07-14T09:29:03Z)
- Learning Human-Object Interaction Detection using Interaction Points [140.0200950601552]
We propose a novel fully-convolutional approach that directly detects the interactions between human-object pairs.
Our network predicts interaction points, which directly localize and classify the interaction.
Experiments are performed on two popular benchmarks: V-COCO and HICO-DET.
arXiv Detail & Related papers (2020-03-31T08:42:06Z)
- Cascaded Human-Object Interaction Recognition [175.60439054047043]
We introduce a cascade architecture for a multi-stage, coarse-to-fine HOI understanding.
At each stage, an instance localization network progressively refines HOI proposals and feeds them into an interaction recognition network.
With our carefully-designed human-centric relation features, these two modules work collaboratively towards effective interaction understanding.
arXiv Detail & Related papers (2020-03-09T17:05:04Z)