UnionDet: Union-Level Detector Towards Real-Time Human-Object
Interaction Detection
- URL: http://arxiv.org/abs/2312.12664v1
- Date: Tue, 19 Dec 2023 23:34:43 GMT
- Title: UnionDet: Union-Level Detector Towards Real-Time Human-Object
Interaction Detection
- Authors: Bumsoo Kim, Taeho Choi, Jaewoo Kang, Hyunwoo J. Kim
- Abstract summary: We propose a one-stage meta-architecture for HOI detection powered by a novel union-level detector.
Our one-stage detector for human-object interaction shows a significant reduction in interaction prediction time (4x~14x).
- Score: 35.2385914946471
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in deep neural networks have achieved significant progress in
detecting individual objects from an image. However, object detection is not
sufficient to fully understand a visual scene. Towards a deeper visual
understanding, the interactions between objects, especially humans and objects
are essential. Most prior works have obtained this information with a bottom-up
approach, where the objects are first detected and the interactions are
predicted sequentially by pairing the objects. This is a major bottleneck in
HOI detection inference time. To tackle this problem, we propose UnionDet, a
one-stage meta-architecture for HOI detection powered by a novel union-level
detector that eliminates this additional inference stage by directly capturing
the region of interaction. Our one-stage detector for human-object interaction
shows a significant reduction in interaction prediction time 4x~14x while
outperforming state-of-the-art methods on two public datasets: V-COCO and
HICO-DET.
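The abstract does not define the "region of interaction" beyond saying that a union-level detector captures it directly. As a minimal sketch, assuming the union region is the smallest axis-aligned box enclosing one human box and one object box, the geometry can be illustrated in Python. The `(x1, y1, x2, y2)` box convention and the names `union_box` and `iou` are hypothetical choices for this illustration, not the authors' implementation.

```python
from typing import Tuple

# Assumed convention for this sketch: (x1, y1, x2, y2) with x1 <= x2, y1 <= y2.
Box = Tuple[float, float, float, float]


def union_box(human: Box, obj: Box) -> Box:
    """Return the smallest axis-aligned box enclosing both input boxes."""
    return (
        min(human[0], obj[0]),
        min(human[1], obj[1]),
        max(human[2], obj[2]),
        max(human[3], obj[3]),
    )


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes; 0.0 when they do not overlap."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    total = area_a + area_b - inter
    return inter / total if total > 0 else 0.0


# Made-up coordinates for one human box and one object box.
human = (10.0, 20.0, 60.0, 120.0)
obj = (50.0, 80.0, 110.0, 130.0)
region = union_box(human, obj)
print(region)
```

Under this assumption, a detector that predicts such a region in a single pass would not need to enumerate every human-object pair after detection, which is the sequential pairing step the abstract identifies as the bottleneck.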
Related papers
- A Review of Human-Object Interaction Detection [6.1941885271010175]
Human-object interaction (HOI) detection plays a key role in high-level visual understanding.
This paper systematically summarizes and discusses the recent work in image-based HOI detection.
arXiv Detail & Related papers (2024-08-20T08:32:39Z)
- Disentangled Interaction Representation for One-Stage Human-Object Interaction Detection [70.96299509159981]
Human-Object Interaction (HOI) detection is a core task for human-centric image understanding.
Recent one-stage methods adopt a transformer decoder to collect image-wide cues that are useful for interaction prediction.
Traditional two-stage methods benefit significantly from their ability to compose interaction features in a disentangled and explainable manner.
arXiv Detail & Related papers (2023-12-04T08:02:59Z)
- HODN: Disentangling Human-Object Feature for HOI Detection [51.48164941412871]
We propose a Human and Object Disentangling Network (HODN) to model the Human-Object Interaction (HOI) relationships explicitly.
Considering that human features are more contributive to interaction, we propose a Human-Guide Linking method to make sure the interaction decoder focuses on the human-centric regions.
Our proposed method achieves competitive performance on both the V-COCO and the HICO-Det datasets.
arXiv Detail & Related papers (2023-08-20T04:12:50Z)
- Distance Matters in Human-Object Interaction Detection [22.3445174577181]
We propose a novel two-stage method for better handling distant interactions in HOI detection.
One essential component in our method is a novel Far Near Distance Attention module.
Besides, we devise a novel Distance-Aware loss function which leads the model to focus more on distant yet rare interactions.
arXiv Detail & Related papers (2022-07-05T08:06:05Z)
- Exploiting Scene Graphs for Human-Object Interaction Detection [81.49184987430333]
Human-Object Interaction (HOI) detection is a fundamental visual task aiming at localizing and recognizing interactions between humans and objects.
We propose a novel method to exploit this information, through the scene graph, for the Human-Object Interaction (SG2HOI) detection task.
Our method, SG2HOI, incorporates the SG information in two ways: (1) we embed a scene graph into a global context clue, serving as the scene-specific environmental context; and (2) we build a relation-aware message-passing module to gather relationships from objects' neighborhood and transfer them into interactions.
arXiv Detail & Related papers (2021-08-19T09:40:50Z)
- HOTR: End-to-End Human-Object Interaction Detection with Transformers [26.664864824357164]
We present a novel framework, referred to by HOTR, which directly predicts a set of <human, object, interaction> triplets from an image.
Our proposed algorithm achieves the state-of-the-art performance in two HOI detection benchmarks with an inference time under 1 ms after object detection.
arXiv Detail & Related papers (2021-04-28T10:10:29Z)
- Detecting Human-Object Interaction via Fabricated Compositional Learning [106.37536031160282]
Human-Object Interaction (HOI) detection is a fundamental task for high-level scene understanding.
Humans have an extremely powerful compositional perception ability to cognize rare or unseen HOI samples.
We propose Fabricated Compositional Learning (FCL) to address the problem of open long-tailed HOI detection.
arXiv Detail & Related papers (2021-03-15T08:52:56Z)
- DRG: Dual Relation Graph for Human-Object Interaction Detection [65.50707710054141]
We tackle the challenging problem of human-object interaction (HOI) detection.
Existing methods either recognize the interaction of each human-object pair in isolation or perform joint inference based on complex appearance-based features.
In this paper, we leverage an abstract spatial-semantic representation to describe each human-object pair and aggregate the contextual information of the scene via a dual relation graph.
arXiv Detail & Related papers (2020-08-26T17:59:40Z)
- Learning Human-Object Interaction Detection using Interaction Points [140.0200950601552]
We propose a novel fully-convolutional approach that directly detects the interactions between human-object pairs.
Our network predicts interaction points, which directly localize and classify the interaction.
Experiments are performed on two popular benchmarks: V-COCO and HICO-DET.
arXiv Detail & Related papers (2020-03-31T08:42:06Z)