Mining Cross-Person Cues for Body-Part Interactiveness Learning in HOI
Detection
- URL: http://arxiv.org/abs/2207.14192v1
- Date: Thu, 28 Jul 2022 15:57:51 GMT
- Title: Mining Cross-Person Cues for Body-Part Interactiveness Learning in HOI
Detection
- Authors: Xiaoqian Wu, Yong-Lu Li, Xinpeng Liu, Junyi Zhang, Yuzhe Wu, Cewu Lu
- Abstract summary: Human-Object Interaction (HOI) detection plays a crucial role in activity understanding.
Previous works focus only on the target person (i.e., a local perspective) and overlook information from the other persons in the image.
In this paper, we argue that comparing the body parts of multiple persons simultaneously affords more useful and supplementary interactiveness cues.
We construct body-part saliency maps based on self-attention to mine cross-person informative cues and learn the holistic relationships among all body parts.
- Score: 39.61023122191333
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human-Object Interaction (HOI) detection plays a crucial role in activity
understanding. Though significant progress has been made, interactiveness
learning remains a challenging problem in HOI detection: existing methods
usually generate redundant negative H-O pair proposals and fail to effectively
extract interactive pairs. Though interactiveness has been studied at both the
whole-body and part levels and facilitates H-O pairing, previous works focus
only on the target person (i.e., a local perspective) and overlook information
from the other persons. In this paper, we argue that comparing the body parts
of multiple persons simultaneously affords more useful and supplementary
interactiveness cues. That is, we learn body-part interactiveness from a
global perspective: when classifying a target person's body-part
interactiveness, visual cues are explored not only from that person but also
from the other persons in the image. We construct body-part saliency maps
based on self-attention to mine cross-person informative cues and learn the
holistic relationships among all body parts. We evaluate the proposed method
on the widely used benchmarks HICO-DET and V-COCO. With our new perspective,
the holistic global-local body-part interactiveness learning achieves
significant improvements over the state of the art. Our code is available at
https://github.com/enlighten0707/Body-Part-Map-for-Interactiveness.
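The abstract above describes attending across persons when predicting body-part
interactiveness. The following is a minimal, hypothetical sketch (not the
authors' released code) of that idea: body-part features from every detected
person are flattened into one token sequence and passed through a self-attention
encoder, so each part's interactiveness logit can draw on cross-person cues. The
module name, number of parts, feature dimension, and tensor layout are
illustrative assumptions.

# Minimal sketch, assuming per-person body-part features are already pooled
# (e.g., from part-level saliency regions); all shapes and names are assumptions.
import torch
import torch.nn as nn

class CrossPersonPartAttention(nn.Module):
    def __init__(self, feat_dim: int = 256, num_parts: int = 6, num_heads: int = 8):
        super().__init__()
        # Learnable embedding marking which body part each token represents.
        self.part_embed = nn.Embedding(num_parts, feat_dim)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=feat_dim, nhead=num_heads, batch_first=True
        )
        # Self-attention over the body-part tokens of every person in the image.
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)
        # One binary interactiveness logit per body part.
        self.cls_head = nn.Linear(feat_dim, 1)

    def forward(self, part_feats: torch.Tensor) -> torch.Tensor:
        # part_feats: (num_persons, num_parts, feat_dim) body-part features of
        # all persons detected in one image.
        num_persons, num_parts, dim = part_feats.shape
        part_ids = torch.arange(num_parts, device=part_feats.device)
        tokens = part_feats + self.part_embed(part_ids)           # add part identity
        tokens = tokens.reshape(1, num_persons * num_parts, dim)  # one global sequence
        tokens = self.encoder(tokens)                             # cross-person attention
        logits = self.cls_head(tokens).reshape(num_persons, num_parts)
        return logits  # body-part interactiveness logits for every person

# Usage: 3 detected persons, 6 body parts each, 256-d part features.
model = CrossPersonPartAttention()
print(model(torch.randn(3, 6, 256)).shape)  # torch.Size([3, 6])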
Related papers
- Visual-Geometric Collaborative Guidance for Affordance Learning [63.038406948791454]
We propose a visual-geometric collaboratively guided affordance learning network that incorporates visual and geometric cues.
Our method outperforms representative models in terms of objective metrics and visual quality.
arXiv Detail & Related papers (2024-10-15T07:35:51Z)
- Disentangled Interaction Representation for One-Stage Human-Object Interaction Detection [70.96299509159981]
Human-Object Interaction (HOI) detection is a core task for human-centric image understanding.
Recent one-stage methods adopt a transformer decoder to collect image-wide cues that are useful for interaction prediction.
Traditional two-stage methods benefit significantly from their ability to compose interaction features in a disentangled and explainable manner.
arXiv Detail & Related papers (2023-12-04T08:02:59Z)
- Skeleton-Based Mutually Assisted Interacted Object Localization and Human Action Recognition [111.87412719773889]
We propose a joint learning framework for "interacted object localization" and "human action recognition" based on skeleton data.
Our method achieves the best or competitive performance compared with state-of-the-art methods for human action recognition.
arXiv Detail & Related papers (2021-10-28T10:09:34Z)
- Transferable Interactiveness Knowledge for Human-Object Interaction Detection [46.89715038756862]
We explore interactiveness knowledge, which indicates whether a human and an object interact with each other or not.
We found that interactiveness knowledge can be learned across HOI datasets and can bridge the gap between diverse HOI category settings.
Our core idea is to exploit an interactiveness network to learn general interactiveness knowledge from multiple HOI datasets.
arXiv Detail & Related papers (2021-01-25T18:21:07Z)
- Human Interaction Recognition Framework based on Interacting Body Part Attention [24.913372626903648]
We propose a novel framework that simultaneously considers both implicit and explicit representations of human interactions.
The proposed method captures the subtle difference between different interactions using interacting body part attention.
We validate the effectiveness of the proposed method using four widely used public datasets.
arXiv Detail & Related papers (2021-01-22T06:52:42Z)
- Learning Human-Object Interaction Detection using Interaction Points [140.0200950601552]
We propose a novel fully-convolutional approach that directly detects the interactions between human-object pairs.
Our network predicts interaction points, which directly localize and classify the interaction; a minimal illustrative sketch follows this list.
Experiments are performed on two popular benchmarks: V-COCO and HICO-DET.
arXiv Detail & Related papers (2020-03-31T08:42:06Z)
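The Interaction Points entry above describes a fully-convolutional network whose
predicted points localize and classify interactions. Below is a minimal,
hypothetical sketch of such a prediction head (not the paper's actual
architecture): a small convolutional branch that outputs one heatmap per
interaction class, whose peaks serve as candidate interaction points. The
class count of 117 (HICO-DET verbs) and all names are assumptions.

# Minimal sketch of an interaction-point-style heatmap head; shapes and the
# class count are assumptions for illustration only.
import torch
import torch.nn as nn

class InteractionPointHead(nn.Module):
    def __init__(self, in_channels: int = 256, num_classes: int = 117):
        super().__init__()
        self.head = nn.Sequential(
            nn.Conv2d(in_channels, in_channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            # One heatmap per interaction class; local maxima are interaction points.
            nn.Conv2d(in_channels, num_classes, kernel_size=1),
        )

    def forward(self, feature_map: torch.Tensor) -> torch.Tensor:
        # feature_map: (batch, in_channels, H, W) backbone features.
        return self.head(feature_map).sigmoid()  # (batch, num_classes, H, W) heatmaps

# Usage: peaks in the class heatmaps give candidate interaction points.
heatmaps = InteractionPointHead()(torch.randn(1, 256, 64, 64))
print(heatmaps.shape)  # torch.Size([1, 117, 64, 64])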