RR-Net: Injecting Interactive Semantics in Human-Object Interaction
Detection
- URL: http://arxiv.org/abs/2104.15015v1
- Date: Fri, 30 Apr 2021 14:03:10 GMT
- Title: RR-Net: Injecting Interactive Semantics in Human-Object Interaction
Detection
- Authors: Dongming Yang, Yuexian Zou, Can Zhang, Meng Cao, Jie Chen
- Abstract summary: The latest end-to-end HOI detectors lack relation reasoning, which leaves them unable to learn HOI-specific interactive semantics for predictions.
We first present a progressive Relation-aware Frame, which brings a new structure and parameter sharing pattern for interaction inference.
Based on the modules above, we construct an end-to-end trainable framework named Relation Reasoning Network (abbr. RR-Net).
- Score: 40.65483058890176
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human-Object Interaction (HOI) detection aims to learn how humans interact
with surrounding objects. The latest end-to-end HOI detectors lack relation
reasoning, which leaves them unable to learn HOI-specific interactive semantics
for predictions. In this paper, we therefore propose novel relation reasoning
for HOI detection. We first present a progressive Relation-aware Frame, which
brings a new structure and parameter sharing pattern for interaction inference.
Upon the frame, an Interaction Intensifier Module and a Correlation Parsing
Module are carefully designed, where: a) interactive semantics from humans can
be exploited and passed to objects to intensify interactions, b) interactive
correlations among humans, objects and interactions are integrated to promote
predictions. Based on the modules above, we construct an end-to-end trainable
framework named Relation Reasoning Network (abbr. RR-Net). Extensive
experiments show that our proposed RR-Net sets a new state of the art on both
the V-COCO and HICO-DET benchmarks, improving on the baseline by about 5.5% and
9.8% in relative terms. This validates that this first effort in exploring
relation reasoning and integrating interactive semantics brings a clear
improvement to end-to-end HOI detection.
Related papers
- SSL-Interactions: Pretext Tasks for Interactive Trajectory Prediction [4.286256266868156]
We present SSL-Interactions that proposes pretext tasks to enhance interaction modeling for trajectory prediction.
We introduce four interaction-aware pretext tasks to encapsulate various aspects of agent interactions.
We also propose an approach to curate interaction-heavy scenarios from datasets.
arXiv Detail & Related papers (2024-01-15T14:43:40Z)
- Disentangled Interaction Representation for One-Stage Human-Object Interaction Detection [70.96299509159981]
Human-Object Interaction (HOI) detection is a core task for human-centric image understanding.
Recent one-stage methods adopt a transformer decoder to collect image-wide cues that are useful for interaction prediction.
Traditional two-stage methods benefit significantly from their ability to compose interaction features in a disentangled and explainable manner.
arXiv Detail & Related papers (2023-12-04T08:02:59Z)
- Parallel Reasoning Network for Human-Object Interaction Detection [53.422076419484945]
We propose a new transformer-based method named Parallel Reasoning Network (PR-Net).
PR-Net constructs two independent predictors for instance-level localization and relation-level understanding.
Our PR-Net has achieved competitive results on HICO-DET and V-COCO benchmarks.
arXiv Detail & Related papers (2023-01-09T17:00:34Z)
- DIDER: Discovering Interpretable Dynamically Evolving Relations [14.69985920418015]
This paper introduces DIDER, Discovering Interpretable Dynamically Evolving Relations, a generic end-to-end interaction modeling framework with intrinsic interpretability.
We evaluate DIDER on both synthetic and real-world datasets.
arXiv Detail & Related papers (2022-08-22T20:55:56Z)
- Reformulating HOI Detection as Adaptive Set Prediction [25.44630995307787]
We reformulate HOI detection as an adaptive set prediction problem.
We propose an Adaptive Set-based one-stage framework (AS-Net) with parallel instance and interaction branches.
Our method outperforms previous state-of-the-art methods without any extra human pose and language features.
arXiv Detail & Related papers (2021-03-10T10:40:33Z)
- A Graph-based Interactive Reasoning for Human-Object Interaction Detection [71.50535113279551]
We present a novel graph-based interactive reasoning model called Interactive Graph (abbr. in-Graph) to infer HOIs.
We construct a new framework to assemble in-Graph models for detecting HOIs, namely in-GraphNet.
Our framework is end-to-end trainable and free from costly annotations like human pose.
arXiv Detail & Related papers (2020-07-14T09:29:03Z)
- Learning Human-Object Interaction Detection using Interaction Points [140.0200950601552]
We propose a novel fully-convolutional approach that directly detects the interactions between human-object pairs.
Our network predicts interaction points, which directly localize and classify the interaction.
Experiments are performed on two popular benchmarks: V-COCO and HICO-DET.
arXiv Detail & Related papers (2020-03-31T08:42:06Z)
- Cascaded Human-Object Interaction Recognition [175.60439054047043]
We introduce a cascade architecture for a multi-stage, coarse-to-fine HOI understanding.
At each stage, an instance localization network progressively refines HOI proposals and feeds them into an interaction recognition network.
With our carefully-designed human-centric relation features, these two modules work collaboratively towards effective interaction understanding.
arXiv Detail & Related papers (2020-03-09T17:05:04Z)