Related papers: Geometric Features Informed Multi-person Human-object Interaction Recognition in Videos

Geometric Features Informed Multi-person Human-object Interaction Recognition in Videos

URL: http://arxiv.org/abs/2207.09425v1
Date: Tue, 19 Jul 2022 17:36:55 GMT
Title: Geometric Features Informed Multi-person Human-object Interaction Recognition in Videos
Authors: Tanqiu Qiao and Qianhui Men and Frederick W. B. Li and Yoshiki Kubotani and Shigeo Morishima and Hubert P. H. Shum
Abstract summary: We argue to combine the benefits of both visual and geometric features in HOI recognition. We propose a novel Two-level Geometric feature-informed Graph Convolutional Network (2G-GCN) To demonstrate the novelty and effectiveness of our method in challenging scenarios, we propose a new multi-person HOI dataset (MPHOI-72)
Score: 19.64072251418535
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Human-Object Interaction (HOI) recognition in videos is important for analyzing human activity. Most existing work focusing on visual features usually suffer from occlusion in the real-world scenarios. Such a problem will be further complicated when multiple people and objects are involved in HOIs. Consider that geometric features such as human pose and object position provide meaningful information to understand HOIs, we argue to combine the benefits of both visual and geometric features in HOI recognition, and propose a novel Two-level Geometric feature-informed Graph Convolutional Network (2G-GCN). The geometric-level graph models the interdependency between geometric features of humans and objects, while the fusion-level graph further fuses them with visual features of humans and objects. To demonstrate the novelty and effectiveness of our method in challenging scenarios, we propose a new multi-person HOI dataset (MPHOI-72). Extensive experiments on MPHOI-72 (multi-person HOI), CAD-120 (single-human HOI) and Bimanual Actions (two-hand HOI) datasets demonstrate our superior performance compared to state-of-the-arts.

Related papers

Egocentric Human-Object Interaction Detection: A New Benchmark and Method [14.765419467710812]
Ego-HOIBench is a new dataset to promote the benchmarking and development of Ego-HOI detection.<n>Our approach is lightweight and effective, and it can be easily applied to HOI baselines in a plug-and-play manner.
arXiv Detail & Related papers (2025-06-17T05:03:42Z)
Geometric Visual Fusion Graph Neural Networks for Multi-Person Human-Object Interaction Recognition in Videos [14.198003271084799]
Human-Object Interaction (HOI) recognition in videos requires understanding both visual patterns and geometric relationships as they evolve over time.<n>We propose the Geometric Visual Fusion Graph Neural Network (GeoVis-GNN), which uses dual-attention feature fusion combined with interdependent entity graph learning.<n>To advance HOI recognition to real-world scenarios, we introduce the Concurrent Partial Interaction dataset.
arXiv Detail & Related papers (2025-06-03T22:51:44Z)
GRACE: Estimating Geometry-level 3D Human-Scene Contact from 2D Images [54.602947113980655]
Estimating the geometry level of human-scene contact aims to ground specific contact surface points at 3D human geometries.<n> GRACE (Geometry-level Reasoning for 3D Human-scene Contact Estimation) is a new paradigm for 3D human contact estimation.<n>It incorporates a point cloud encoder-decoder architecture along with a hierarchical feature extraction and fusion module.
arXiv Detail & Related papers (2025-05-10T09:25:46Z)
Visual-Geometric Collaborative Guidance for Affordance Learning [63.038406948791454]
We propose a visual-geometric collaborative guided affordance learning network that incorporates visual and geometric cues. Our method outperforms the representative models regarding objective metrics and visual quality.
arXiv Detail & Related papers (2024-10-15T07:35:51Z)
From Category to Scenery: An End-to-End Framework for Multi-Person Human-Object Interaction Recognition in Videos [9.159660801125812]
Video-based Human-Object Interaction (HOI) recognition explores the intricate dynamics between humans and objects. In this work, we propose a novel end-to-end category to scenery framework, CATS. We construct a scenery interactive graph with these enhanced geometric-visual features as nodes to learn the relationships among human and object categories.
arXiv Detail & Related papers (2024-07-01T02:42:55Z)
Full-Body Articulated Human-Object Interaction [61.01135739641217]
CHAIRS is a large-scale motion-captured f-AHOI dataset consisting of 16.2 hours of versatile interactions. CHAIRS provides 3D meshes of both humans and articulated objects during the entire interactive process. By learning the geometrical relationships in HOI, we devise the very first model that leverage human pose estimation.
arXiv Detail & Related papers (2022-12-20T19:50:54Z)
A Skeleton-aware Graph Convolutional Network for Human-Object Interaction Detection [14.900704382194013]
We propose a skeleton-aware graph convolutional network for human-object interaction detection, named SGCN4HOI. Our network exploits the spatial connections between human keypoints and object keypoints to capture their fine-grained structural interactions via graph convolutions. It fuses such geometric features with visual features and spatial configuration features obtained from human-object pairs.
arXiv Detail & Related papers (2022-07-11T15:20:18Z)
Learn to Predict How Humans Manipulate Large-sized Objects from Interactive Motions [82.90906153293585]
We propose a graph neural network, HO-GCN, to fuse motion data and dynamic descriptors for the prediction task. We show the proposed network that consumes dynamic descriptors can achieve state-of-the-art prediction results and help the network better generalize to unseen objects.
arXiv Detail & Related papers (2022-06-25T09:55:39Z)
Spatio-Temporal Interaction Graph Parsing Networks for Human-Object Interaction Recognition [55.7731053128204]
In given video-based Human-Object Interaction scene, modeling thetemporal relationship between humans and objects are the important cue to understand the contextual information presented in the video. With the effective-temporal relationship modeling, it is possible not only to uncover contextual information in each frame but also directly capture inter-time dependencies. The full use of appearance features, spatial location and the semantic information are also the key to improve the video-based Human-Object Interaction recognition performance.
arXiv Detail & Related papers (2021-08-19T11:57:27Z)
LIGHTEN: Learning Interactions with Graph and Hierarchical TEmporal Networks for HOI in videos [13.25502885135043]
Analyzing the interactions between humans and objects from a video includes identification of relationships between humans and the objects present in the video. We present a hierarchical approach, LIGHTEN, to learn visual features to effectively capture truth at multiple granularities in a video. We achieve state-of-the-art results in human-object interaction detection (88.9% and 92.6%) and anticipation tasks of CAD-120 and competitive results on image based HOI detection in V-COCO.
arXiv Detail & Related papers (2020-12-17T05:44:07Z)
DRG: Dual Relation Graph for Human-Object Interaction Detection [65.50707710054141]
We tackle the challenging problem of human-object interaction (HOI) detection. Existing methods either recognize the interaction of each human-object pair in isolation or perform joint inference based on complex appearance-based features. In this paper, we leverage an abstract spatial-semantic representation to describe each human-object pair and aggregate the contextual information of the scene via a dual relation graph.
arXiv Detail & Related papers (2020-08-26T17:59:40Z)
Learning Human-Object Interaction Detection using Interaction Points [140.0200950601552]
We propose a novel fully-convolutional approach that directly detects the interactions between human-object pairs. Our network predicts interaction points, which directly localize and classify the inter-action. Experiments are performed on two popular benchmarks: V-COCO and HICO-DET.
arXiv Detail & Related papers (2020-03-31T08:42:06Z)

This list is automatically generated from the titles and abstracts of the papers in this site.