Benchmarking Robustness to Text-Guided Corruptions
- URL: http://arxiv.org/abs/2304.02963v2
- Date: Mon, 31 Jul 2023 09:08:49 GMT
- Title: Benchmarking Robustness to Text-Guided Corruptions
- Authors: Mohammadreza Mofayezi and Yasamin Medghalchi
- Abstract summary: We use diffusion models to edit images into different domains.
We define a prompt hierarchy based on the original ImageNet hierarchy to apply edits in different domains.
We observe that convolutional models are more robust than transformer architectures.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This study investigates the robustness of image classifiers to text-guided
corruptions. We utilize diffusion models to edit images into different domains.
Unlike other works that use synthetic or hand-picked data for benchmarking, we
use diffusion models as they are generative models capable of learning to edit
images while preserving their semantic content. Thus, the corruptions will be
more realistic and the comparison will be more informative. Also, there is no
need for manual labeling and we can create large-scale benchmarks with less
effort. We define a prompt hierarchy based on the original ImageNet hierarchy
to apply edits in different domains. In addition to introducing a new
benchmark, we investigate the robustness of different vision models. The
results of this study demonstrate that the performance of image classifiers
decreases significantly under different language-based corruptions and edit
domains. We also
observe that convolutional models are more robust than transformer
architectures. Additionally, we see that common data augmentation techniques
can improve the performance on both the original data and the edited images.
The findings of this research can help improve the design of image classifiers
and contribute to the development of more robust machine learning systems. The
code for generating the benchmark is available at
https://github.com/ckoorosh/RobuText.
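To make the approach concrete, here is a minimal sketch of the two ideas the abstract describes: a prompt hierarchy whose templates shift an image's domain while preserving the semantic class, and a per-domain robustness drop measured against clean accuracy. The domain names, templates, and accuracy figures below are illustrative assumptions, not the paper's actual hierarchy or results (those are in the linked repository).

```python
# Hypothetical prompt hierarchy: each edit domain maps to prompt
# templates that keep the semantic class ({cls}) but change the domain.
# Names and templates are assumptions for illustration only.
PROMPT_HIERARCHY = {
    "weather": ["a photo of a {cls} in heavy snow",
                "a photo of a {cls} in dense fog"],
    "style":   ["an oil painting of a {cls}",
                "a pencil sketch of a {cls}"],
}

def build_prompts(class_name: str) -> dict:
    """Expand every template in the hierarchy for one class label."""
    return {domain: [t.format(cls=class_name) for t in templates]
            for domain, templates in PROMPT_HIERARCHY.items()}

def robustness_drop(clean_acc: float, edited_acc: dict) -> dict:
    """Absolute accuracy drop per edit domain versus clean data;
    a larger drop means the classifier is less robust in that domain."""
    return {d: round(clean_acc - a, 4) for d, a in edited_acc.items()}

prompts = build_prompts("golden retriever")
# prompts["style"][0] -> "an oil painting of a golden retriever"

# Accuracy numbers below are made up to show the metric's shape.
drops = robustness_drop(0.82, {"weather": 0.61, "style": 0.55})
```

The edited images produced from these prompts would then be fed to each classifier under test, and the per-domain drops compared across architectures (e.g. convolutional networks versus transformers).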
Related papers
- ImageNet-D: Benchmarking Neural Network Robustness on Diffusion Synthetic Object [78.58860252442045]
We introduce generative models as a data source for hard images that benchmark deep models' robustness.
We are able to generate images with more diversified backgrounds, textures, and materials than any prior work, where we term this benchmark as ImageNet-D.
Our work suggests that diffusion models can be an effective source to test vision models.
arXiv Detail & Related papers (2024-03-27T17:23:39Z) - Diversified in-domain synthesis with efficient fine-tuning for few-shot
classification [64.86872227580866]
Few-shot image classification aims to learn an image classifier using only a small set of labeled examples per class.
We propose DISEF, a novel approach which addresses the generalization challenge in few-shot learning using synthetic data.
We validate our method in ten different benchmarks, consistently outperforming baselines and establishing a new state-of-the-art for few-shot classification.
arXiv Detail & Related papers (2023-12-05T17:18:09Z) - Hardware Resilience Properties of Text-Guided Image Classifiers [15.787551066303804]
We present a novel method to enhance the reliability of image classification models during deployment in the face of transient hardware errors.
Our approach achieves a remarkable $5.5\times$ average increase in hardware reliability.
arXiv Detail & Related papers (2023-11-23T15:38:13Z) - Discriminative Class Tokens for Text-to-Image Diffusion Models [107.98436819341592]
We propose a non-invasive fine-tuning technique that capitalizes on the expressive potential of free-form text.
Our method is fast compared to prior fine-tuning methods and does not require a collection of in-class images.
We evaluate our method extensively, showing that the generated images are: (i) more accurate and of higher quality than standard diffusion models, (ii) can be used to augment training data in a low-resource setting, and (iii) reveal information about the data used to train the guiding classifier.
arXiv Detail & Related papers (2023-03-30T05:25:20Z) - ImageNet-E: Benchmarking Neural Network Robustness via Attribute Editing [45.14977000707886]
Higher accuracy on ImageNet usually leads to better robustness against different corruptions.
We create a toolkit for object editing with controls of backgrounds, sizes, positions, and directions.
We evaluate the performance of current deep learning models, including both convolutional neural networks and vision transformers.
arXiv Detail & Related papers (2023-03-30T02:02:32Z) - Re-Imagen: Retrieval-Augmented Text-to-Image Generator [58.60472701831404]
Retrieval-Augmented Text-to-Image Generator (Re-Imagen)
arXiv Detail & Related papers (2022-09-29T00:57:28Z) - GIT: A Generative Image-to-text Transformer for Vision and Language [138.91581326369837]
We train a Generative Image-to-text Transformer, GIT, to unify vision-language tasks such as image/video captioning and question answering.
Our model surpasses human performance for the first time on TextCaps (138.2 vs. 125.5 in CIDEr).
arXiv Detail & Related papers (2022-05-27T17:03:38Z) - Deepfake Network Architecture Attribution [23.375381198124014]
Existing works on fake image attribution perform multi-class classification on several Generative Adversarial Network (GAN) models.
We present the first study on Deepfake Network Architecture Attribution, attributing fake images at the architecture level.
arXiv Detail & Related papers (2022-02-28T14:54:30Z) - InvGAN: Invertible GANs [88.58338626299837]
InvGAN, short for Invertible GAN, successfully embeds real images to the latent space of a high quality generative model.
This allows us to perform image inpainting, merging, and online data augmentation.
arXiv Detail & Related papers (2021-12-08T21:39:00Z) - RTIC: Residual Learning for Text and Image Composition using Graph
Convolutional Network [19.017377597937617]
We study the compositional learning of images and texts for image retrieval.
We introduce a novel method that combines the graph convolutional network (GCN) with existing composition methods.
arXiv Detail & Related papers (2021-04-07T09:41:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.