Adversarial Semantic Data Augmentation for Human Pose Estimation
- URL: http://arxiv.org/abs/2008.00697v1
- Date: Mon, 3 Aug 2020 07:56:04 GMT
- Title: Adversarial Semantic Data Augmentation for Human Pose Estimation
- Authors: Yanrui Bin, Xuan Cao, Xinya Chen, Yanhao Ge, Ying Tai, Chengjie Wang,
Jilin Li, Feiyue Huang, Changxin Gao, Nong Sang
- Abstract summary: We propose Semantic Data Augmentation (SDA), a method that augments images by pasting segmented body parts with various semantic granularity.
We also propose Adversarial Semantic Data Augmentation (ASDA), which exploits a generative network to dynamically predict tailored pasting configurations.
State-of-the-art results are achieved on challenging benchmarks.
- Score: 96.75411357541438
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human pose estimation is the task of localizing body keypoints from still
images. The state-of-the-art methods suffer from insufficient examples of
challenging cases such as symmetric appearance, heavy occlusion, and nearby
persons. To enlarge the number of challenging cases, previous methods augmented
images by cropping and pasting image patches with weak semantics, which leads
to unrealistic appearance and limited diversity. We instead propose Semantic
Data Augmentation (SDA), a method that augments images by pasting segmented
body parts with various semantic granularity. Furthermore, we propose
Adversarial Semantic Data Augmentation (ASDA), which exploits a generative
network to dynamically predict tailored pasting configurations. Given an
off-the-shelf pose estimation network as the discriminator, the generator seeks the
most confusing transformation to increase the loss of the discriminator while
the discriminator takes the generated sample as input and learns from it. The
whole pipeline is optimized in an adversarial manner. State-of-the-art results
are achieved on challenging benchmarks.
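The abstract outlines an adversarial loop: a generator predicts a tailored pasting configuration and an off-the-shelf pose estimation network acts as the discriminator, with the generator trained to maximize the discriminator's loss. Below is a minimal PyTorch-style sketch of that idea; the ConfigGenerator architecture, the spatial-transformer-style paste_parts compositing, the part-bank layout, and the MSE heatmap loss are hypothetical stand-ins chosen for illustration, not the paper's released implementation.

```python
# Minimal sketch of the adversarial augmentation loop described in the abstract.
# Assumptions (not the authors' code): the pose network outputs keypoint heatmaps,
# part_bank is a (P, 4, H, W) tensor of canvas-sized RGBA body-part crops, and the
# paste is realized as a differentiable affine warp plus alpha compositing.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConfigGenerator(nn.Module):
    """Generator: predicts part-selection logits and pasting parameters
    (tx, ty, log-scale, rotation) from the input image. Illustrative only."""

    def __init__(self, num_parts: int = 16):
        super().__init__()
        self.num_parts = num_parts
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.head = nn.Linear(32, num_parts + 4)

    def forward(self, img):
        out = self.head(self.backbone(img))
        return out[:, :self.num_parts], out[:, self.num_parts:]


def paste_parts(images, part_logits, params, part_bank):
    """Paste one segmented RGBA part per image via an affine warp
    (spatial-transformer style), then alpha-composite it onto the image."""
    tx, ty = torch.tanh(params[:, 0]), torch.tanh(params[:, 1])
    scale = torch.exp(0.5 * torch.tanh(params[:, 2]))          # keep scale in a sane range
    rot = math.pi * torch.tanh(params[:, 3])
    cos, sin = torch.cos(rot), torch.sin(rot)
    theta = torch.stack([
        torch.stack([scale * cos, -scale * sin, tx], dim=1),
        torch.stack([scale * sin, scale * cos, ty], dim=1)], dim=1)   # (N, 2, 3)
    parts = part_bank[part_logits.argmax(dim=1)]                # hard (non-diff.) part choice
    grid = F.affine_grid(theta, parts.size(), align_corners=False)
    warped = F.grid_sample(parts, grid, align_corners=False)
    rgb, alpha = warped[:, :3], warped[:, 3:4]
    return alpha * rgb + (1.0 - alpha) * images


def train_step(pose_net, gen, opt_pose, opt_gen, images, gt_heatmaps, part_bank):
    mse = nn.MSELoss()

    # Generator step: seek the most confusing augmentation, i.e. maximize the
    # pose network's heatmap loss (equivalently, minimize its negative).
    logits, params = gen(images)
    aug = paste_parts(images, logits, params, part_bank)
    gen_loss = -mse(pose_net(aug), gt_heatmaps)
    opt_gen.zero_grad(); gen_loss.backward(); opt_gen.step()

    # Discriminator step: the pose network learns from the generated hard samples.
    with torch.no_grad():
        logits, params = gen(images)
        aug = paste_parts(images, logits, params, part_bank)
    pose_loss = mse(pose_net(aug), gt_heatmaps)
    opt_pose.zero_grad(); pose_loss.backward(); opt_pose.step()
    return pose_loss.item()
```

In this sketch the generator receives gradients only through the differentiable affine warp; the hard part selection stays non-differentiable, which is one of the simplifications relative to the method described in the paper.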
Related papers
- Exploring the Robustness of Human Parsers Towards Common Corruptions [99.89886010550836]
We construct three corruption robustness benchmarks, termed LIP-C, ATR-C, and Pascal-Person-Part-C, to assist us in evaluating the risk tolerance of human parsing models.
Inspired by the data augmentation strategy, we propose a novel heterogeneous augmentation-enhanced mechanism to bolster robustness under commonly corrupted conditions.
arXiv Detail & Related papers (2023-09-02T13:32:14Z)
- Fine-grained Recognition with Learnable Semantic Data Augmentation [68.48892326854494]
Fine-grained image recognition is a longstanding computer vision challenge.
We propose diversifying the training data at the feature-level to alleviate the discriminative region loss problem.
Our method significantly improves the generalization performance on several popular classification networks.
arXiv Detail & Related papers (2023-09-01T11:15:50Z)
- Training-free Diffusion Model Adaptation for Variable-Sized Text-to-Image Synthesis [45.19847146506007]
Diffusion models (DMs) have recently gained attention with state-of-the-art performance in text-to-image synthesis.
This paper focuses on adapting text-to-image diffusion models to handle images of varying sizes while maintaining visual fidelity.
arXiv Detail & Related papers (2023-06-14T17:23:07Z)
- Realistic Data Enrichment for Robust Image Segmentation in Histopathology [2.248423960136122]
We propose a new approach, based on diffusion models, which can enrich an imbalanced dataset with plausible examples from underrepresented groups.
Our method can expand limited clinical datasets, making them suitable for training machine learning pipelines.
arXiv Detail & Related papers (2023-04-19T09:52:50Z)
- DaliID: Distortion-Adaptive Learned Invariance for Identification Models [9.502663556403622]
We propose a methodology called Distortion-Adaptive Learned Invariance for Identification (DaliID) models.
DaliID models achieve state-of-the-art (SOTA) for both face recognition and person re-identification on seven benchmark datasets.
arXiv Detail & Related papers (2023-02-11T18:19:41Z)
- Effective Data Augmentation With Diffusion Models [65.09758931804478]
We address the lack of diversity in data augmentation with image-to-image transformations parameterized by pre-trained text-to-image diffusion models.
Our method edits images to change their semantics using an off-the-shelf diffusion model, and generalizes to novel visual concepts from a few labelled examples.
We evaluate our approach on few-shot image classification tasks and on a real-world weed recognition task, and observe an improvement in accuracy in the tested domains; a minimal sketch of this editing-based augmentation style appears after this list.
arXiv Detail & Related papers (2023-02-07T20:42:28Z)
- Consistency Regularisation in Varying Contexts and Feature Perturbations for Semi-Supervised Semantic Segmentation of Histology Images [14.005379068469361]
We present a consistency based semi-supervised learning (SSL) approach that can help mitigate this challenge.
SSL models can also be susceptible to changes in context and feature perturbations, exhibiting poor generalisation due to the limited training data.
We show that cross-consistency training makes the encoder features invariant to different perturbations and improves the prediction confidence.
arXiv Detail & Related papers (2023-01-30T18:21:57Z)
- Discriminative Residual Analysis for Image Set Classification with Posture and Age Variations [27.751472312581228]
Discriminant Residual Analysis (DRA) is proposed to improve the classification performance.
DRA attempts to obtain a powerful projection which casts the residual representations into a discriminant subspace.
Two regularization approaches are used to deal with the probable small sample size problem.
arXiv Detail & Related papers (2020-08-23T08:53:06Z)
- Transferring and Regularizing Prediction for Semantic Segmentation [115.88957139226966]
In this paper, we exploit the intrinsic properties of semantic segmentation to alleviate such problem for model transfer.
We present a Regularizer of Prediction Transfer (RPT) that imposes the intrinsic properties as constraints to regularize model transfer in an unsupervised fashion.
Extensive experiments are conducted to verify the proposal of RPT on the transfer of models trained on GTA5 and SYNTHIA (synthetic data) to the Cityscapes dataset (urban street scenes).
arXiv Detail & Related papers (2020-06-11T16:19:41Z)
- Pathological Retinal Region Segmentation From OCT Images Using Geometric Relation Based Augmentation [84.7571086566595]
We propose improvements over previous GAN-based medical image synthesis methods by jointly encoding the intrinsic relationship of geometry and shape.
The proposed method outperforms state-of-the-art segmentation methods on the public RETOUCH dataset having images captured from different acquisition procedures.
arXiv Detail & Related papers (2020-03-31T11:50:43Z)
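As a companion to the "Effective Data Augmentation With Diffusion Models" entry above, here is a minimal sketch of editing-based augmentation with the Hugging Face diffusers image-to-image pipeline. The model id, prompt, and strength value are illustrative assumptions, not that paper's actual configuration.

```python
# Hedged sketch: rewrite a training image with an off-the-shelf img2img diffusion
# pipeline to obtain a semantically edited augmentation. Model id, prompt, and
# strength are illustrative choices, not the referenced paper's setup.
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

src = Image.open("train_sample.jpg").convert("RGB").resize((512, 512))

# `strength` controls how far the edit may drift from the source image;
# small values keep the layout (and hence the original label) intact.
augmented = pipe(
    prompt="a photo of the same scene in harsh sunlight",
    image=src,
    strength=0.4,
    guidance_scale=7.5,
).images[0]

augmented.save("train_sample_aug.jpg")
```

Keeping the strength low preserves the spatial layout of the source image, which is what allows the original label to be reused for the edited sample.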