Hardware Resilience Properties of Text-Guided Image Classifiers
- URL: http://arxiv.org/abs/2311.14062v2
- Date: Tue, 5 Dec 2023 08:56:04 GMT
- Title: Hardware Resilience Properties of Text-Guided Image Classifiers
- Authors: Syed Talal Wasim, Kabila Haile Soboka, Abdulrahman Mahmoud, Salman
Khan, David Brooks, Gu-Yeon Wei
- Abstract summary: We present a novel method to enhance the reliability of image classification models during deployment in the face of transient hardware errors.
Our approach achieves a remarkable $5.5\times$ average increase in hardware reliability.
- Score: 15.787551066303804
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: This paper presents a novel method to enhance the reliability of image
classification models during deployment in the face of transient hardware
errors. By utilizing enriched text embeddings derived from GPT-3 question
prompts per class and a pretrained CLIP text encoder, we investigate their
impact as an initialization for the classification layer. Our approach achieves a
remarkable $5.5\times$ average increase in hardware reliability (and up to
$14\times$) across various architectures in the most critical layer, with
minimal accuracy drop ($0.3\%$ on average) compared to baseline PyTorch models.
Furthermore, our method seamlessly integrates with any image classification
backbone, showcases results across various network architectures, decreases
parameter and FLOPs overhead, and follows a consistent training recipe. This
research offers a practical and efficient solution to bolster the robustness of
image classification models against hardware failures, with potential
implications for future studies in this domain. Our code and models are
released at https://github.com/TalalWasim/TextGuidedResilience.
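The abstract's core idea, initializing the classification layer from class-wise text embeddings, can be sketched in PyTorch. This is a minimal illustration, not the released implementation: the random tensors stand in for the actual GPT-3/CLIP-derived embeddings, and the function name and dimensions are hypothetical.

```python
import torch
import torch.nn as nn

def init_classifier_from_text(backbone_dim, text_embeds):
    """Build a classification head whose final layer is initialized from
    per-class text embeddings (stand-in for the paper's GPT-3 + CLIP setup)."""
    num_classes, embed_dim = text_embeds.shape
    # Projects backbone features into the text-embedding space.
    proj = nn.Linear(backbone_dim, embed_dim, bias=False)
    # Final layer weights are set to the L2-normalized text embeddings, so
    # logits act as similarities between image features and class text vectors.
    classifier = nn.Linear(embed_dim, num_classes, bias=False)
    with torch.no_grad():
        classifier.weight.copy_(
            text_embeds / text_embeds.norm(dim=-1, keepdim=True))
    return nn.Sequential(proj, classifier)

# Random stand-ins for text embeddings of 10 classes (dim 512 as in CLIP).
embeds = torch.randn(10, 512)
model = init_classifier_from_text(2048, embeds)
logits = model(torch.randn(4, 2048))
print(logits.shape)  # torch.Size([4, 10])
```

In this sketch the text-derived weights serve only as an initialization; during training they could be fine-tuned or kept frozen, a choice the paper's recipe would determine.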
Related papers
- Reinforcing Pre-trained Models Using Counterfactual Images [54.26310919385808]
This paper proposes a novel framework to reinforce classification models using language-guided generated counterfactual images.
We identify model weaknesses by testing the model using the counterfactual image dataset.
We employ the counterfactual images as an augmented dataset to fine-tune and reinforce the classification model.
arXiv Detail & Related papers (2024-06-19T08:07:14Z) - Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model [80.61157097223058]
A prevalent strategy to bolster image classification performance is through augmenting the training set with synthetic images generated by T2I models.
In this study, we scrutinize the shortcomings of both current generative and conventional data augmentation techniques.
We introduce an innovative inter-class data augmentation method known as Diff-Mix, which enriches the dataset by performing image translations between classes.
arXiv Detail & Related papers (2024-03-28T17:23:45Z) - Leveraging Representations from Intermediate Encoder-blocks for Synthetic Image Detection [13.840950434728533]
State-of-the-art Synthetic Image Detection (SID) research has produced strong evidence of the advantages of feature extraction from foundation models.
We leverage the image representations extracted by intermediate Transformer blocks of CLIP's image-encoder via a lightweight network.
Our method is compared against the state-of-the-art by evaluating it on 20 test datasets and exhibits an average +10.6% absolute performance improvement.
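The idea of tapping intermediate Transformer blocks rather than only the final output can be sketched as follows. This is a generic illustration, not that paper's method: the toy encoder stands in for CLIP's image encoder, and the layer indices and dimensions are arbitrary.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for CLIP's image encoder: a stack of Transformer blocks.
blocks = nn.ModuleList(
    [nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)
     for _ in range(6)])

def intermediate_features(x, layers=(1, 3, 5)):
    """Collect the CLS-token representation from selected intermediate blocks
    and concatenate them into one multi-level feature vector."""
    feats = []
    for i, block in enumerate(blocks):
        x = block(x)
        if i in layers:
            feats.append(x[:, 0])        # CLS token after block i
    return torch.cat(feats, dim=-1)      # (batch, len(layers) * d_model)

tokens = torch.randn(2, 10, 64)          # (batch, sequence, dim)
f = intermediate_features(tokens)
print(f.shape)  # torch.Size([2, 192])
```

A lightweight head (e.g. a small MLP) trained on such concatenated features is one plausible reading of "via a lightweight network" in the summary above.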
arXiv Detail & Related papers (2024-02-29T12:18:43Z) - An Ordinal Regression Framework for a Deep Learning Based Severity Assessment for Chest Radiographs [50.285682227571996]
We propose a framework that divides the ordinal regression problem into three parts: a model, a target function, and a classification function.
We show that the choice of encoding has a strong impact on performance and that the best encoding depends on the chosen weighting of Cohen's kappa.
arXiv Detail & Related papers (2024-02-08T14:00:45Z) - Benchmarking Robustness to Text-Guided Corruptions [0.0]
We use diffusion models to edit images to different domains.
We define a prompt hierarchy based on the original ImageNet hierarchy to apply edits in different domains.
We observe that convolutional models are more robust than transformer architectures.
arXiv Detail & Related papers (2023-04-06T09:40:02Z) - Fine-Grained ImageNet Classification in the Wild [0.0]
Robustness tests can uncover several vulnerabilities and biases which go unnoticed during the typical model evaluation stage.
In our work, we perform fine-grained classification on closely related categories, which are identified with the help of hierarchical knowledge.
arXiv Detail & Related papers (2023-03-04T12:25:07Z) - Modeling Image Composition for Complex Scene Generation [77.10533862854706]
We present a method that achieves state-of-the-art results on layout-to-image generation tasks.
After compressing RGB images into patch tokens, we propose the Transformer with Focal Attention (TwFA) for exploring dependencies of object-to-object, object-to-patch and patch-to-patch.
arXiv Detail & Related papers (2022-06-02T08:34:25Z) - Simple Open-Vocabulary Object Detection with Vision Transformers [51.57562920090721]
We propose a strong recipe for transferring image-text models to open-vocabulary object detection.
We use a standard Vision Transformer architecture with minimal modifications, contrastive image-text pre-training, and end-to-end detection fine-tuning.
We provide the adaptation strategies and regularizations needed to attain very strong performance on zero-shot text-conditioned and one-shot image-conditioned object detection.
arXiv Detail & Related papers (2022-05-12T17:20:36Z) - DAFormer: Improving Network Architectures and Training Strategies for Domain-Adaptive Semantic Segmentation [99.88539409432916]
We study the unsupervised domain adaptation (UDA) process.
We propose a novel UDA method, DAFormer, based on the benchmark results.
DAFormer significantly improves the state-of-the-art performance by 10.8 mIoU for GTA->Cityscapes and 5.4 mIoU for Synthia->Cityscapes.
arXiv Detail & Related papers (2021-11-29T19:00:46Z) - A Closer Look at Few-Shot Video Classification: A New Baseline and Benchmark [33.86872697028233]
We present an in-depth study on few-shot video classification by making three contributions.
First, we perform a consistent comparative study on the existing metric-based methods to figure out their limitations in representation learning.
Second, we discover that there is a high correlation between the novel action class and the ImageNet object class, which is problematic in the few-shot recognition setting.
Third, we present a new benchmark with more base data to facilitate future few-shot video classification without pre-training.
arXiv Detail & Related papers (2021-10-24T06:01:46Z) - Reconciliation of Statistical and Spatial Sparsity For Robust Image and Image-Set Classification [27.319334479994787]
We propose a novel Joint Statistical and Spatial Sparse representation, dubbed J3S, to model image or image-set data for classification.
We propose to solve the joint sparse coding problem based on the J3S model, by coupling the local and global image representations using joint sparsity.
Experiments show that the proposed J3S-based image classification scheme outperforms popular and state-of-the-art competing methods on the FMD, UIUC, ETH-80 and YTC databases.
arXiv Detail & Related papers (2021-06-01T06:33:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.