Exploring the Robustness of Human Parsers Towards Common Corruptions
- URL: http://arxiv.org/abs/2309.00938v2
- Date: Thu, 7 Sep 2023 02:30:16 GMT
- Title: Exploring the Robustness of Human Parsers Towards Common Corruptions
- Authors: Sanyi Zhang, Xiaochun Cao, Rui Wang, Guo-Jun Qi, Jie Zhou
- Abstract summary: We construct three corruption robustness benchmarks, termed LIP-C, ATR-C, and Pascal-Person-Part-C, to assist us in evaluating the risk tolerance of human parsing models.
Inspired by the data augmentation strategy, we propose a novel heterogeneous augmentation-enhanced mechanism to bolster robustness under commonly corrupted conditions.
- Score: 99.89886010550836
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Human parsing aims to segment each pixel of the human image with fine-grained
semantic categories. However, current human parsers trained with clean data are
easily confused by numerous image corruptions such as blur and noise. To
improve the robustness of human parsers, in this paper, we construct three
corruption robustness benchmarks, termed LIP-C, ATR-C, and
Pascal-Person-Part-C, to assist us in evaluating the risk tolerance of human
parsing models. Inspired by the data augmentation strategy, we propose a novel
heterogeneous augmentation-enhanced mechanism to bolster robustness under
commonly corrupted conditions. Specifically, two types of data augmentations
from different views, i.e., image-aware augmentation and model-aware
image-to-image transformation, are integrated in a sequential manner for
adapting to unforeseen image corruptions. The image-aware augmentation
increases the diversity of training images through common image operations.
The model-aware augmentation strategy further diversifies the input data by
exploiting the model's randomness. The proposed method is model-agnostic and
can be plugged into arbitrary state-of-the-art human parsing frameworks. The
experimental results show that the proposed method generalizes well: it
improves the robustness of human parsing models, and even semantic
segmentation models, under various common image corruptions, while
maintaining comparable performance on clean data.
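As a rough illustration of the sequential design described above, the sketch below chains an image-aware augmentation with a stand-in for the model-aware image-to-image transformation. The function names and the specific operations (brightness/contrast jitter, additive noise) are assumptions made for this sketch, not the paper's actual implementation.

```python
# Hypothetical sketch of a sequential "heterogeneous" augmentation pipeline.
# The operations below are placeholders; the paper's model-aware stage draws
# on the parsing model's own randomness rather than fixed Gaussian noise.
import numpy as np

def image_aware_augment(image: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Image-aware stage: enrich diversity with common image operations."""
    brightness = rng.uniform(-0.1, 0.1)
    contrast = rng.uniform(0.8, 1.2)
    return np.clip((image - 0.5) * contrast + 0.5 + brightness, 0.0, 1.0)

def model_aware_transform(image: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Model-aware stage (stand-in): a stochastic image-to-image perturbation."""
    noise = rng.normal(0.0, 0.02, size=image.shape)
    return np.clip(image + noise, 0.0, 1.0)

def heterogeneous_augment(image: np.ndarray, seed: int = 0) -> np.ndarray:
    """Apply the two augmentation views sequentially, as the abstract describes."""
    rng = np.random.default_rng(seed)
    return model_aware_transform(image_aware_augment(image, rng), rng)

if __name__ == "__main__":
    dummy = np.random.default_rng(1).random((128, 128, 3))  # stand-in RGB image in [0, 1]
    print(heterogeneous_augment(dummy).shape)
```

Because the method is model-agnostic, such a pipeline would sit in front of a human parsing framework's data loader, leaving the parser itself unchanged.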
Related papers
- MMAR: Towards Lossless Multi-Modal Auto-Regressive Probabilistic Modeling [64.09238330331195]
We propose a novel Multi-Modal Auto-Regressive (MMAR) probabilistic modeling framework.
Unlike the discretization line of methods, MMAR takes in continuous-valued image tokens to avoid information loss.
We show that MMAR achieves substantially better performance than other joint multi-modal models.
arXiv Detail & Related papers (2024-10-14T17:57:18Z) - Are They the Same Picture? Adapting Concept Bottleneck Models for Human-AI Collaboration in Image Retrieval [3.2495565849970016]
CHAIR enables humans to correct intermediate concepts, which helps improve the generated embeddings.
We show that our method performs better than similar models on image retrieval metrics without any external intervention.
arXiv Detail & Related papers (2024-07-12T00:59:32Z) - Diversity is Definitely Needed: Improving Model-Agnostic Zero-shot
Classification via Stable Diffusion [22.237426507711362]
Model-Agnostic Zero-Shot Classification (MA-ZSC) refers to training non-specific classification architectures to classify real images without using any real images during training.
Recent research has demonstrated that generating synthetic training images using diffusion models provides a potential solution to address MA-ZSC.
We propose modifications to the text-to-image generation process using a pre-trained diffusion model to enhance diversity.
arXiv Detail & Related papers (2023-02-07T07:13:53Z) - Traditional Classification Neural Networks are Good Generators: They are
Competitive with DDPMs and GANs [104.72108627191041]
We show that conventional neural network classifiers can generate high-quality images comparable to state-of-the-art generative models.
We propose a mask-based reconstruction module that makes the gradients semantics-aware so that plausible images can be synthesized.
We show that our method is also applicable to text-to-image generation by regarding image-text foundation models as generalized classifiers.
arXiv Detail & Related papers (2022-11-27T11:25:35Z) - Person Image Synthesis via Denoising Diffusion Model [116.34633988927429]
We show how denoising diffusion models can be applied for high-fidelity person image synthesis.
Our results on two large-scale benchmarks and a user study demonstrate the photorealism of our proposed approach under challenging scenarios.
arXiv Detail & Related papers (2022-11-22T18:59:50Z) - Adaptive Clustering of Robust Semantic Representations for Adversarial
Image Purification [0.9203366434753543]
We propose a robust defense against adversarial attacks, which is model agnostic and generalizable to unseen adversaries.
In this paper, we extract the latent representations for each class and adaptively cluster the latent representations that share a semantic similarity.
We adversarially train a new model constraining the latent space representation to minimize the distance between the adversarial latent representation and the true cluster distribution.
arXiv Detail & Related papers (2021-04-05T21:07:04Z) - Improving robustness against common corruptions with frequency biased
models [112.65717928060195]
Unseen image corruptions can cause a surprisingly large drop in performance.
Image corruption types have different characteristics in the frequency spectrum and would benefit from a targeted type of data augmentation.
We propose a new regularization scheme that minimizes the total variation (TV) of convolution feature maps to increase high-frequency robustness (a minimal sketch of such a TV penalty follows this list).
arXiv Detail & Related papers (2021-03-30T10:44:50Z) - Contextual Fusion For Adversarial Robustness [0.0]
Deep neural networks are usually designed to process one particular information stream and are susceptible to various types of adversarial perturbations.
We developed a fusion model using a combination of background and foreground features extracted in parallel from Places-CNN and Imagenet-CNN.
For gradient based attacks, our results show that fusion allows for significant improvements in classification without decreasing performance on unperturbed data.
arXiv Detail & Related papers (2020-11-18T20:13:23Z) - Adversarial Semantic Data Augmentation for Human Pose Estimation [96.75411357541438]
We propose Semantic Data Augmentation (SDA), a method that augments images by pasting segmented body parts with various semantic granularity.
We also propose Adversarial Semantic Data Augmentation (ASDA), which exploits a generative network to dynamically predict tailored pasting configurations.
State-of-the-art results are achieved on challenging benchmarks.
arXiv Detail & Related papers (2020-08-03T07:56:04Z)
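As referenced in the "Improving robustness against common corruptions with frequency biased models" entry above, the minimal sketch below shows one way a total-variation (TV) penalty on convolutional feature maps could be added to a training loss. The anisotropic TV form and the weighting are illustrative assumptions, not that paper's exact formulation.

```python
# Hypothetical sketch: total-variation (TV) penalty on a convolutional feature map.
import numpy as np

def feature_map_tv(features: np.ndarray) -> float:
    """Anisotropic TV of a (channels, height, width) feature map."""
    dh = np.abs(np.diff(features, axis=1)).sum()  # differences along height
    dw = np.abs(np.diff(features, axis=2)).sum()  # differences along width
    return float(dh + dw)

def regularized_loss(task_loss: float, features: np.ndarray, tv_weight: float = 1e-4) -> float:
    """Add the TV penalty to the task loss; tv_weight is an assumed hyperparameter."""
    return task_loss + tv_weight * feature_map_tv(features)

if __name__ == "__main__":
    fmap = np.random.default_rng(0).random((8, 16, 16))  # stand-in feature map
    print(regularized_loss(task_loss=1.0, features=fmap))
```

Minimizing the TV of feature maps suppresses high-frequency responses, which is the mechanism that entry credits for improved robustness to high-frequency corruptions; the weight would normally be tuned per dataset.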
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.