HICO-DET-SG and V-COCO-SG: New Data Splits for Evaluating the Systematic Generalization Performance of Human-Object Interaction Detection Models
- URL: http://arxiv.org/abs/2305.09948v5
- Date: Fri, 12 Apr 2024 00:46:26 GMT
- Title: HICO-DET-SG and V-COCO-SG: New Data Splits for Evaluating the Systematic Generalization Performance of Human-Object Interaction Detection Models
- Authors: Kentaro Takemoto, Moyuru Yamada, Tomotake Sasaki, Hisanao Akima
- Abstract summary: Human-Object Interaction (HOI) detection is a task to localize humans and objects in an image and predict the interactions in human-object pairs.
We created two new sets of HOI detection data splits named HICO-DET-SG and V-COCO-SG based on the HICO-DET and V-COCO datasets.
When evaluated on the new data splits, HOI detection models with various characteristics performed much more poorly than when evaluated on the original splits.
- Score: 1.9374282535132379
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human-Object Interaction (HOI) detection is a task to localize humans and objects in an image and predict the interactions in human-object pairs. In real-world scenarios, HOI detection models need systematic generalization, i.e., generalization to novel combinations of objects and interactions, because the training data are expected to cover only a limited portion of all possible combinations. To evaluate the systematic generalization performance of HOI detection models, we created two new sets of HOI detection data splits named HICO-DET-SG and V-COCO-SG based on the HICO-DET and V-COCO datasets, respectively. When evaluated on the new data splits, HOI detection models with various characteristics performed much more poorly than when evaluated on the original splits. This shows that systematic generalization is a challenging goal in HOI detection. By analyzing the evaluation results, we also gain insights for improving the systematic generalization performance and identify four possible future research directions. We hope that our new data splits and presented analysis will encourage further research on systematic generalization in HOI detection.
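To make the idea of a systematic-generalization split concrete, the following is a minimal sketch of how train and test sets can be built so that no object-interaction combination appears in both. It is an illustration only, not the authors' procedure for creating HICO-DET-SG or V-COCO-SG: the function name `make_sg_split`, the `(image_id, object, interaction)` annotation format, and the `test_fraction` parameter are all assumptions made for this example.

```python
import random
from collections import defaultdict


def make_sg_split(annotations, test_fraction=0.3, seed=0):
    """Split images so train and test share no (object, interaction) combination.

    annotations: iterable of (image_id, object, interaction) tuples
    (a hypothetical format chosen for this sketch).
    Returns (train_image_ids, test_image_ids, held_out_combinations).
    """
    annotations = list(annotations)
    rng = random.Random(seed)

    # All distinct combinations, shuffled reproducibly.
    combos = sorted({(obj, act) for _, obj, act in annotations})
    rng.shuffle(combos)
    n_test = int(len(combos) * test_fraction)

    # How many not-yet-held-out combinations each object / interaction has.
    obj_left = defaultdict(int)
    act_left = defaultdict(int)
    for obj, act in combos:
        obj_left[obj] += 1
        act_left[act] += 1

    # Hold out a combination only if its object and its interaction each
    # still occur in at least one other combination kept for training.
    test_combos = set()
    for obj, act in combos:
        if len(test_combos) >= n_test:
            break
        if obj_left[obj] > 1 and act_left[act] > 1:
            test_combos.add((obj, act))
            obj_left[obj] -= 1
            act_left[act] -= 1

    # Group combinations by image.
    image_combos = defaultdict(set)
    for image_id, obj, act in annotations:
        image_combos[image_id].add((obj, act))

    train_ids, test_ids = [], []
    for image_id, found in sorted(image_combos.items()):
        if found <= test_combos:
            test_ids.append(image_id)       # only held-out combinations
        elif found.isdisjoint(test_combos):
            train_ids.append(image_id)      # only training combinations
        # Images mixing both kinds are dropped to keep the split clean.
    return train_ids, test_ids, test_combos
```

Because images that mix held-out and training combinations are dropped, an object or interaction class can still end up with no training images after this step; a real split would need an extra check for that case.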
Related papers
- Evaluating the Predictive Features of Person-Centric Knowledge Graph Embeddings: Unfolding Ablation Studies [0.757843972001219]
We propose a systematic approach to examine the results of GNN models trained with structured and unstructured information from the MIMIC-III dataset.
We show the robustness of this approach in identifying predictive features in PKGs for the task of readmission prediction.
arXiv Detail & Related papers (2024-08-27T09:48:25Z)
- Physics Inspired Hybrid Attention for SAR Target Recognition [61.01086031364307]
We propose a physics inspired hybrid attention (PIHA) mechanism and the once-for-all (OFA) evaluation protocol to address the issues.
PIHA leverages the high-level semantics of physical information to activate and guide the feature group that is aware of the local semantics of the target.
Our method outperforms other state-of-the-art approaches in 12 test scenarios with the same ASC parameters.
arXiv Detail & Related papers (2023-09-27T14:39:41Z)
- Parallel Reasoning Network for Human-Object Interaction Detection [53.422076419484945]
We propose a new transformer-based method named Parallel Reasoning Network (PR-Net).
PR-Net constructs two independent predictors for instance-level localization and relation-level understanding.
Our PR-Net has achieved competitive results on the HICO-DET and V-COCO benchmarks.
arXiv Detail & Related papers (2023-01-09T17:00:34Z)
- A Skeleton-aware Graph Convolutional Network for Human-Object Interaction Detection [14.900704382194013]
We propose a skeleton-aware graph convolutional network for human-object interaction detection, named SGCN4HOI.
Our network exploits the spatial connections between human keypoints and object keypoints to capture their fine-grained structural interactions via graph convolutions.
It fuses such geometric features with visual features and spatial configuration features obtained from human-object pairs.
arXiv Detail & Related papers (2022-07-11T15:20:18Z)
- On Generalisability of Machine Learning-based Network Intrusion Detection Systems [0.0]
In this paper, we evaluate seven supervised and unsupervised learning models on four benchmark NIDS datasets.
Our investigation indicates that none of the considered models is able to generalise over all studied datasets.
Our investigation also indicates that overall, unsupervised learning methods generalise better than supervised learning models in our considered scenarios.
arXiv Detail & Related papers (2022-05-09T08:26:48Z)
- Adversarial Feature Augmentation and Normalization for Visual Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models.
Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings.
We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
arXiv Detail & Related papers (2021-03-22T20:36:34Z)
- DecAug: Augmenting HOI Detection via Decomposition [54.65572599920679]
Current algorithms suffer from insufficient training samples and category imbalance within datasets.
We propose an efficient and effective data augmentation method called DecAug for HOI detection.
Experiments show that our method brings improvements of up to 3.3 mAP and 1.6 mAP on the V-COCO and HICO-DET datasets.
arXiv Detail & Related papers (2020-10-02T13:59:05Z)
- DRG: Dual Relation Graph for Human-Object Interaction Detection [65.50707710054141]
We tackle the challenging problem of human-object interaction (HOI) detection.
Existing methods either recognize the interaction of each human-object pair in isolation or perform joint inference based on complex appearance-based features.
In this paper, we leverage an abstract spatial-semantic representation to describe each human-object pair and aggregate the contextual information of the scene via a dual relation graph.
arXiv Detail & Related papers (2020-08-26T17:59:40Z)
- Novel Human-Object Interaction Detection via Adversarial Domain Generalization [103.55143362926388]
We study the problem of novel human-object interaction (HOI) detection, aiming at improving the generalization ability of the model to unseen scenarios.
The challenge mainly stems from the large compositional space of objects and predicates, which leads to the lack of sufficient training data for all the object-predicate combinations.
We propose a unified framework of adversarial domain generalization to learn object-invariant features for predicate prediction.
arXiv Detail & Related papers (2020-05-22T22:02:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.