DecAug: Augmenting HOI Detection via Decomposition
- URL: http://arxiv.org/abs/2010.01007v1
- Date: Fri, 2 Oct 2020 13:59:05 GMT
- Title: DecAug: Augmenting HOI Detection via Decomposition
- Authors: Yichen Xie, Hao-Shu Fang, Dian Shao, Yong-Lu Li, Cewu Lu
- Abstract summary: Current algorithms suffer from insufficient training samples and category imbalance within datasets.
We propose an efficient and effective data augmentation method called DecAug for HOI detection.
Experiments show that our method brings improvements of up to 3.3 mAP and 1.6 mAP on the V-COCO and HICO-DET datasets, respectively.
- Score: 54.65572599920679
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human-object interaction (HOI) detection requires a large amount of annotated
data. Current algorithms suffer from insufficient training samples and category
imbalance within datasets. To increase data efficiency, in this paper, we
propose an efficient and effective data augmentation method called DecAug for
HOI detection. Based on our proposed object state similarity metric, object
patterns across different HOIs are shared to augment local object appearance
features without changing their state. Further, we shift spatial correlation
between humans and objects to other feasible configurations with the aid of a
pose-guided Gaussian Mixture Model while preserving their interactions.
Experiments show that our method brings improvements of up to 3.3 mAP and 1.6 mAP
on the V-COCO and HICO-DET datasets, respectively, for two advanced models. Specifically,
interactions with fewer samples enjoy more notable improvement. Our method can
be easily integrated into various HOI detection models with negligible extra
computational consumption. Our code will be made publicly available.
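The spatial augmentation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Box` type, the `MIXTURE` parameters, and the helper names are hypothetical, and the pose conditioning from the paper is replaced here by a fixed two-component Gaussian mixture. The sketch samples a new object position relative to the human box and keeps the object's size unchanged.

```python
import random
from dataclasses import dataclass


@dataclass
class Box:
    cx: float  # center x in pixels
    cy: float  # center y in pixels
    w: float   # width in pixels
    h: float   # height in pixels


# Hypothetical mixture components: (weight, mean_dx, mean_dy, std_dx, std_dy).
# Offsets are expressed in units of the human box width and height.
# In the paper the mixture is guided by human pose; here it is fixed (assumption).
MIXTURE = [
    (0.6, 0.0, 0.3, 0.05, 0.05),
    (0.4, 0.4, 0.0, 0.08, 0.08),
]


def sample_offset(mixture, rng):
    """Pick one component by weight, then draw a 2D offset from its Gaussian."""
    weights = [component[0] for component in mixture]
    _, mean_dx, mean_dy, std_dx, std_dy = rng.choices(mixture, weights=weights, k=1)[0]
    return rng.gauss(mean_dx, std_dx), rng.gauss(mean_dy, std_dy)


def shift_object(human, obj, mixture, rng):
    """Return a copy of obj moved to a sampled position relative to the human box."""
    dx, dy = sample_offset(mixture, rng)
    return Box(human.cx + dx * human.w, human.cy + dy * human.h, obj.w, obj.h)


rng = random.Random(0)  # seeded for reproducibility
human = Box(100.0, 200.0, 50.0, 120.0)
obj = Box(110.0, 240.0, 20.0, 20.0)
new_obj = shift_object(human, obj, MIXTURE, rng)
print(new_obj)
```

Expressing offsets relative to the human box size keeps the sampled layout independent of image scale, which is one plausible reason to normalize this way.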
Related papers
- Data Augmentation for Image Classification using Generative AI [8.74488498507946]
Data augmentation is a promising solution to expanding the dataset size.
Recent approaches use generative AI models to improve dataset diversity.
We propose the Automated Generative Data Augmentation (AGA)
arXiv Detail & Related papers (2024-08-31T21:16:43Z)
- A Plug-and-Play Method for Rare Human-Object Interactions Detection by Bridging Domain Gap [50.079224604394]
We present a novel model-agnostic framework called Context-Enhanced Feature Alignment (CEFA).
CEFA consists of a feature alignment module and a context enhancement module.
Our method can serve as a plug-and-play module to improve the detection performance of HOI models on rare categories.
arXiv Detail & Related papers (2024-07-31T08:42:48Z)
- DiffusionEngine: Diffusion Model is Scalable Data Engine for Object Detection [41.436817746749384]
Diffusion Model is a scalable data engine for object detection.
DiffusionEngine (DE) provides high-quality detection-oriented training pairs in a single stage.
arXiv Detail & Related papers (2023-09-07T17:55:01Z)
- Boosting Human-Object Interaction Detection with Text-to-Image Diffusion Model [22.31860516617302]
We introduce DiffHOI, a novel HOI detection scheme grounded on a pre-trained text-image diffusion model.
To fill in the gaps of HOI datasets, we propose SynHOI, a class-balanced, large-scale, and high-diversity synthetic dataset.
Experiments demonstrate that DiffHOI significantly outperforms the state-of-the-art in regular detection (i.e., 41.50 mAP) and zero-shot detection.
arXiv Detail & Related papers (2023-05-20T17:59:23Z)
- Unified Visual Relationship Detection with Vision and Language Models [89.77838890788638]
This work focuses on training a single visual relationship detector predicting over the union of label spaces from multiple datasets.
We propose UniVRD, a novel bottom-up method for Unified Visual Relationship Detection by leveraging vision and language models.
Empirical results on both human-object interaction detection and scene-graph generation demonstrate the competitive performance of our model.
arXiv Detail & Related papers (2023-03-16T00:06:28Z)
- Robust and Accurate Object Detection via Adversarial Learning [111.36192453882195]
This work augments the fine-tuning stage for object detectors by exploring adversarial examples.
Our approach boosts the performance of state-of-the-art EfficientDets by +1.1 mAP on the object detection benchmark.
arXiv Detail & Related papers (2021-03-23T19:45:26Z)
- Detecting Human-Object Interaction via Fabricated Compositional Learning [106.37536031160282]
Human-Object Interaction (HOI) detection is a fundamental task for high-level scene understanding.
Humans have an extremely powerful compositional perception ability that lets them recognize rare or unseen HOI samples.
We propose Fabricated Compositional Learning (FCL) to address the problem of open long-tailed HOI detection.
arXiv Detail & Related papers (2021-03-15T08:52:56Z)
- Visual Compositional Learning for Human-Object Interaction Detection [111.05263071111807]
Human-Object interaction (HOI) detection aims to localize and infer relationships between human and objects in an image.
It is challenging because the enormous number of possible combinations of object and verb types forms a long-tail distribution.
We devise a deep Visual Compositional Learning framework, which is a simple yet efficient framework to effectively address this problem.
arXiv Detail & Related papers (2020-07-24T08:37:40Z)
- Asynchronous Interaction Aggregation for Action Detection [43.34864954534389]
We propose the Asynchronous Interaction Aggregation network (AIA) that leverages different interactions to boost action detection.
There are two key designs in it: one is the Interaction Aggregation structure (IA) adopting a uniform paradigm to model and integrate multiple types of interaction; the other is the Asynchronous Memory Update algorithm (AMU) that enables us to achieve better performance.
arXiv Detail & Related papers (2020-04-16T07:03:20Z)