Distance Matters in Human-Object Interaction Detection
- URL: http://arxiv.org/abs/2207.01869v1
- Date: Tue, 5 Jul 2022 08:06:05 GMT
- Title: Distance Matters in Human-Object Interaction Detection
- Authors: Guangzhi Wang, Yangyang Guo, Yongkang Wong, Mohan Kankanhalli
- Abstract summary: We propose a novel two-stage method for better handling distant interactions in HOI detection.
One essential component in our method is a novel Far Near Distance Attention module.
Besides, we devise a novel Distance-Aware loss function which leads the model to focus more on distant yet rare interactions.
- Score: 22.3445174577181
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Human-Object Interaction (HOI) detection has received considerable attention
in the context of scene understanding. Despite the growing progress on
benchmarks, we realize that existing methods often perform unsatisfactorily on
distant interactions, where the leading causes are two-fold: 1) Distant
interactions are by nature more difficult to recognize than close ones. A
natural scene often involves multiple humans and objects with intricate spatial
relations, making interaction recognition for distant human-object pairs largely
affected by complex visual context. 2) An insufficient number of distant
interactions in benchmark datasets results in under-fitting on these instances.
To address these problems, in this paper, we propose a novel two-stage method
for better handling distant interactions in HOI detection. One essential
component in our method is a novel Far Near Distance Attention module. It
enables information propagation between humans and objects, whereby the spatial
distance is skillfully taken into consideration. Besides, we devise a novel
Distance-Aware loss function which leads the model to focus more on distant yet
rare interactions. We conduct extensive experiments on two challenging datasets,
HICO-DET and V-COCO. The results demonstrate that the proposed method can
surpass existing approaches by a large margin, resulting in new
state-of-the-art performance.
Related papers
- UnionDet: Union-Level Detector Towards Real-Time Human-Object
Interaction Detection [35.2385914946471]
We propose a one-stage meta-architecture for HOI detection powered by a novel union-level detector.
Our one-stage detector for human-object interaction shows a significant reduction in interaction prediction time, by 4x to 14x.
arXiv Detail & Related papers (2023-12-19T23:34:43Z)
- Enhancing HOI Detection with Contextual Cues from Large Vision-Language Models [56.257840490146]
ConCue is a novel approach for improving visual feature extraction in HOI detection.
We develop a transformer-based feature extraction module with a multi-tower architecture that integrates contextual cues into both instance and interaction detectors.
arXiv Detail & Related papers (2023-11-26T09:11:32Z)
- HODN: Disentangling Human-Object Feature for HOI Detection [51.48164941412871]
We propose a Human and Object Disentangling Network (HODN) to model the Human-Object Interaction (HOI) relationships explicitly.
Considering that human features are more contributive to interaction, we propose a Human-Guide Linking method to make sure the interaction decoder focuses on the human-centric regions.
Our proposed method achieves competitive performance on both the V-COCO and the HICO-Det datasets.
arXiv Detail & Related papers (2023-08-20T04:12:50Z)
- InterTracker: Discovering and Tracking General Objects Interacting with
Hands in the Wild [40.489171608114574]
Existing methods rely on frame-based detectors to locate interacting objects.
We propose to leverage hand-object interaction to track interactive objects.
Our proposed method outperforms the state-of-the-art methods.
arXiv Detail & Related papers (2023-08-06T09:09:17Z)
- Chairs Can be Stood on: Overcoming Object Bias in Human-Object
Interaction Detection [22.3445174577181]
Detecting Human-Object Interaction (HOI) in images is an important step towards high-level visual comprehension.
We propose a novel plug-and-play Object-wise Debiasing Memory (ODM) method for re-balancing the distribution of interactions under detected objects.
Our method brings consistent and significant improvements over baselines, especially on rare interactions under each object.
arXiv Detail & Related papers (2022-07-06T01:55:28Z)
- Interactiveness Field in Human-Object Interactions [89.13149887013905]
We introduce a previously overlooked interactiveness bimodal prior: given an object in an image, after pairing it with the humans, the generated pairs are either mostly non-interactive, or mostly interactive.
We propose new energy constraints based on the cardinality and difference in the inherent "interactiveness field" underlying interactive versus non-interactive pairs.
Our method can detect more precise pairs and thus significantly boost HOI detection performance.
arXiv Detail & Related papers (2022-04-16T05:09:25Z)
- DIRV: Dense Interaction Region Voting for End-to-End Human-Object
Interaction Detection [53.40028068801092]
We propose a novel one-stage HOI detection approach based on a new concept called interaction region for the HOI problem.
Unlike previous methods, our approach concentrates on the densely sampled interaction regions across different scales for each human-object pair.
In order to compensate for the detection flaws of a single interaction region, we introduce a novel voting strategy.
arXiv Detail & Related papers (2020-10-02T13:57:58Z)
- DRG: Dual Relation Graph for Human-Object Interaction Detection [65.50707710054141]
We tackle the challenging problem of human-object interaction (HOI) detection.
Existing methods either recognize the interaction of each human-object pair in isolation or perform joint inference based on complex appearance-based features.
In this paper, we leverage an abstract spatial-semantic representation to describe each human-object pair and aggregate the contextual information of the scene via a dual relation graph.
arXiv Detail & Related papers (2020-08-26T17:59:40Z)
- Learning Human-Object Interaction Detection using Interaction Points [140.0200950601552]
We propose a novel fully-convolutional approach that directly detects the interactions between human-object pairs.
Our network predicts interaction points, which directly localize and classify the interaction.
Experiments are performed on two popular benchmarks: V-COCO and HICO-DET.
arXiv Detail & Related papers (2020-03-31T08:42:06Z)