Exploring the Robustness of Human Parsers Towards Common Corruptions
- URL: http://arxiv.org/abs/2309.00938v2
- Date: Thu, 7 Sep 2023 02:30:16 GMT
- Title: Exploring the Robustness of Human Parsers Towards Common Corruptions
- Authors: Sanyi Zhang, Xiaochun Cao, Rui Wang, Guo-Jun Qi, Jie Zhou
- Abstract summary: We construct three corruption robustness benchmarks, termed LIP-C, ATR-C, and Pascal-Person-Part-C, to assist us in evaluating the risk tolerance of human parsing models.
Inspired by the data augmentation strategy, we propose a novel heterogeneous augmentation-enhanced mechanism to bolster robustness under commonly corrupted conditions.
- Score: 99.89886010550836
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Human parsing aims to segment each pixel of the human image with fine-grained
semantic categories. However, current human parsers trained with clean data are
easily confused by numerous image corruptions such as blur and noise. To
improve the robustness of human parsers, in this paper, we construct three
corruption robustness benchmarks, termed LIP-C, ATR-C, and
Pascal-Person-Part-C, to assist us in evaluating the risk tolerance of human
parsing models. Inspired by the data augmentation strategy, we propose a novel
heterogeneous augmentation-enhanced mechanism to bolster robustness under
commonly corrupted conditions. Specifically, two types of data augmentations
from different views, i.e., image-aware augmentation and model-aware
image-to-image transformation, are integrated in a sequential manner for
adapting to unforeseen image corruptions. The image-aware augmentation
increases the diversity of training images through common image operations.
The model-aware augmentation strategy further diversifies the input data by
exploiting the model's randomness. The proposed method is model-agnostic and
can be plugged into arbitrary state-of-the-art human parsing frameworks. The
experimental results show that the proposed method generalizes well: it
improves the robustness of human parsing models, and even semantic
segmentation models, under various common image corruptions, while
maintaining comparable performance on clean data.
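As a rough illustration of the sequential design described above, the sketch below chains an image-aware augmentation with a stand-in for the model-aware image-to-image transformation. The function names and the specific operations (brightness/contrast jitter, additive noise) are assumptions made for this sketch, not the paper's actual implementation.

```python
# Hypothetical sketch of a sequential "heterogeneous" augmentation pipeline.
# The operations below are placeholders; the paper's model-aware stage draws
# on the parsing model's own randomness rather than fixed Gaussian noise.
import numpy as np

def image_aware_augment(image: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Image-aware stage: enrich diversity with common image operations."""
    brightness = rng.uniform(-0.1, 0.1)
    contrast = rng.uniform(0.8, 1.2)
    return np.clip((image - 0.5) * contrast + 0.5 + brightness, 0.0, 1.0)

def model_aware_transform(image: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Model-aware stage (stand-in): a stochastic image-to-image perturbation."""
    noise = rng.normal(0.0, 0.02, size=image.shape)
    return np.clip(image + noise, 0.0, 1.0)

def heterogeneous_augment(image: np.ndarray, seed: int = 0) -> np.ndarray:
    """Apply the two augmentation views sequentially, as the abstract describes."""
    rng = np.random.default_rng(seed)
    return model_aware_transform(image_aware_augment(image, rng), rng)

if __name__ == "__main__":
    dummy = np.random.default_rng(1).random((128, 128, 3))  # stand-in RGB image in [0, 1]
    print(heterogeneous_augment(dummy).shape)
```

Because the method is model-agnostic, such a pipeline would sit in front of a human parsing framework's data loader, leaving the parser itself unchanged.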
Related papers
- MMAR: Towards Lossless Multi-Modal Auto-Regressive Probabilistic Modeling [64.09238330331195]
We propose a novel Multi-Modal Auto-Regressive (MMAR) probabilistic modeling framework.
Unlike the discretization line of methods, MMAR takes in continuous-valued image tokens to avoid information loss.
We show that MMAR achieves substantially better performance than other joint multi-modal models.
arXiv Detail & Related papers (2024-10-14T17:57:18Z) - Are They the Same Picture? Adapting Concept Bottleneck Models for Human-AI Collaboration in Image Retrieval [3.2495565849970016]
CHAIR enables humans to correct intermediate concepts, which helps improve the generated embeddings.
We show that our method performs better than similar models on image retrieval metrics without any external intervention.
arXiv Detail & Related papers (2024-07-12T00:59:32Z) - Diversity is Definitely Needed: Improving Model-Agnostic Zero-shot
Classification via Stable Diffusion [22.237426507711362]
Model-Agnostic Zero-Shot Classification (MA-ZSC) refers to training non-specific classification architectures to classify real images without using any real images during training.
Recent research has demonstrated that generating synthetic training images using diffusion models provides a potential solution to address MA-ZSC.
We propose modifications to the text-to-image generation process using a pre-trained diffusion model to enhance diversity.
arXiv Detail & Related papers (2023-02-07T07:13:53Z) - Traditional Classification Neural Networks are Good Generators: They are
Competitive with DDPMs and GANs [104.72108627191041]
We show that conventional neural network classifiers can generate high-quality images comparable to state-of-the-art generative models.
We propose a mask-based reconstruction module that makes the gradients semantics-aware so that plausible images can be synthesized.
We show that our method is also applicable to text-to-image generation by regarding image-text foundation models as generalized classifiers.
arXiv Detail & Related papers (2022-11-27T11:25:35Z) - Person Image Synthesis via Denoising Diffusion Model [116.34633988927429]
We show how denoising diffusion models can be applied for high-fidelity person image synthesis.
Our results on two large-scale benchmarks and a user study demonstrate the photorealism of our proposed approach under challenging scenarios.
arXiv Detail & Related papers (2022-11-22T18:59:50Z) - Adaptive Clustering of Robust Semantic Representations for Adversarial
Image Purification [0.9203366434753543]
We propose a robust defense against adversarial attacks, which is model agnostic and generalizable to unseen adversaries.
In this paper, we extract the latent representations for each class and adaptively cluster the latent representations that share a semantic similarity.
We adversarially train a new model constraining the latent space representation to minimize the distance between the adversarial latent representation and the true cluster distribution.
arXiv Detail & Related papers (2021-04-05T21:07:04Z) - Improving robustness against common corruptions with frequency biased
models [112.65717928060195]
Unseen image corruptions can cause a surprisingly large drop in performance.
Image corruption types have different characteristics in the frequency spectrum and would benefit from a targeted type of data augmentation.
We propose a new regularization scheme that minimizes the total variation (TV) of convolution feature maps to increase high-frequency robustness (a minimal sketch of such a TV penalty follows this list).
arXiv Detail & Related papers (2021-03-30T10:44:50Z) - Contextual Fusion For Adversarial Robustness [0.0]
Deep neural networks are usually designed to process one particular information stream and are susceptible to various types of adversarial perturbations.
We developed a fusion model using a combination of background and foreground features extracted in parallel from Places-CNN and Imagenet-CNN.
For gradient based attacks, our results show that fusion allows for significant improvements in classification without decreasing performance on unperturbed data.
arXiv Detail & Related papers (2020-11-18T20:13:23Z) - Adversarial Semantic Data Augmentation for Human Pose Estimation [96.75411357541438]
We propose Semantic Data Augmentation (SDA), a method that augments images by pasting segmented body parts with various semantic granularity.
We also propose Adversarial Semantic Data Augmentation (ASDA), which exploits a generative network to dynamically predict tailored pasting configurations.
State-of-the-art results are achieved on challenging benchmarks.
arXiv Detail & Related papers (2020-08-03T07:56:04Z)
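As referenced in the "Improving robustness against common corruptions with frequency biased models" entry above, the minimal sketch below shows one way a total-variation (TV) penalty on convolutional feature maps could be added to a training loss. The anisotropic TV form and the weighting are illustrative assumptions, not that paper's exact formulation.

```python
# Hypothetical sketch: total-variation (TV) penalty on a convolutional feature map.
import numpy as np

def feature_map_tv(features: np.ndarray) -> float:
    """Anisotropic TV of a (channels, height, width) feature map."""
    dh = np.abs(np.diff(features, axis=1)).sum()  # differences along height
    dw = np.abs(np.diff(features, axis=2)).sum()  # differences along width
    return float(dh + dw)

def regularized_loss(task_loss: float, features: np.ndarray, tv_weight: float = 1e-4) -> float:
    """Add the TV penalty to the task loss; tv_weight is an assumed hyperparameter."""
    return task_loss + tv_weight * feature_map_tv(features)

if __name__ == "__main__":
    fmap = np.random.default_rng(0).random((8, 16, 16))  # stand-in feature map
    print(regularized_loss(task_loss=1.0, features=fmap))
```

Minimizing the TV of feature maps suppresses high-frequency responses, which is the mechanism that entry credits for improved robustness to high-frequency corruptions; the weight would normally be tuned per dataset.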
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.