Model Reprogramming Outperforms Fine-tuning on Out-of-distribution Data in Text-Image Encoders
- URL: http://arxiv.org/abs/2403.10800v2
- Date: Fri, 29 Mar 2024 20:34:26 GMT
- Title: Model Reprogramming Outperforms Fine-tuning on Out-of-distribution Data in Text-Image Encoders
- Authors: Andrew Geng, Pin-Yu Chen
- Abstract summary: In this paper, we unveil the hidden costs associated with intrusive fine-tuning techniques.
We introduce a new model reprogramming approach for fine-tuning, which we name Reprogrammer.
Our empirical evidence reveals that Reprogrammer is less intrusive and yields superior downstream models.
- Score: 56.47577824219207
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: When evaluating the performance of a pre-trained model transferred to a downstream task, it is imperative to assess not only the in-distribution (ID) accuracy of the downstream model but also its capacity to generalize and identify out-of-distribution (OOD) samples. In this paper, we unveil the hidden costs associated with intrusive fine-tuning techniques. Specifically, we demonstrate that commonly used fine-tuning methods not only distort the representations necessary for generalizing to covariate-shifted OOD samples (OOD generalization) but also distort the representations necessary for detecting semantically-shifted OOD samples (OOD detection). To address these challenges, we introduce a new model reprogramming approach for fine-tuning, which we name Reprogrammer. Reprogrammer aims to improve the holistic performance of the downstream model across ID, OOD generalization, and OOD detection tasks. Our empirical evidence reveals that Reprogrammer is less intrusive and yields superior downstream models. Furthermore, we demonstrate that by appending an additional representation residual connection to Reprogrammer, we can further preserve pre-training representations, resulting in an even safer and more robust downstream model capable of excelling in many ID classification, OOD generalization, and OOD detection settings.
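As a rough illustration of the recipe the abstract describes, the sketch below keeps a pre-trained image encoder frozen (e.g., the image tower of a CLIP-style text-image model), trains only a small input perturbation, and blends reprogrammed and pre-trained features through a residual connection. This is a minimal sketch of generic model reprogramming, not the authors' exact Reprogrammer; the module name, the `alpha` blend weight, and the image sizes are illustrative assumptions.

```python
# Minimal sketch of input-side model reprogramming around a frozen
# text-image encoder. NOT the authors' exact Reprogrammer: it shows
# the general recipe the abstract describes -- freeze the pre-trained
# weights, learn a small input perturbation, and optionally blend
# reprogrammed and pre-trained features via a residual connection.
import torch
import torch.nn as nn
import torch.nn.functional as F

class InputReprogrammer(nn.Module):
    """Pads a downsized image with a trainable border perturbation."""
    def __init__(self, out_size: int = 224, inner_size: int = 160):
        super().__init__()
        self.out_size, self.inner_size = out_size, inner_size
        # The only trainable parameters: a full-size perturbation,
        # masked so it lives on the border around the original image.
        self.delta = nn.Parameter(torch.zeros(3, out_size, out_size))
        mask = torch.ones(3, out_size, out_size)
        pad = (out_size - inner_size) // 2
        mask[:, pad:pad + inner_size, pad:pad + inner_size] = 0.0
        self.register_buffer("mask", mask)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = F.interpolate(x, size=self.inner_size, mode="bilinear",
                          align_corners=False)
        pad = (self.out_size - self.inner_size) // 2
        x = F.pad(x, [pad] * 4)                 # center the image
        return x + self.mask * torch.tanh(self.delta)  # learned border

def reprogrammed_features(encoder, reprogrammer, x, alpha: float = 0.5):
    """Residual blend of reprogrammed and pre-trained representations.

    The encoder's parameters are assumed to have requires_grad=False,
    so gradients reach only the reprogrammer's `delta`.
    """
    with torch.no_grad():
        base = encoder(x)                       # pre-trained features
    reprog = encoder(reprogrammer(x))           # gradients flow to delta
    return alpha * reprog + (1.0 - alpha) * base
```

Because only `delta` is trained while the encoder weights stay untouched, the pre-training representations are preserved, which is the sense in which reprogramming is "less intrusive" than conventional fine-tuning.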
Related papers
- Towards Calibrated Robust Fine-Tuning of Vision-Language Models [97.19901765814431]
This work proposes a robust fine-tuning method that improves both OOD accuracy and confidence calibration simultaneously in vision language models.
We show that both OOD classification and OOD calibration errors have a shared upper bound consisting of two terms computed on ID data.
Based on this insight, we design a novel framework that conducts fine-tuning with a constrained multimodal contrastive loss enforcing a larger smallest singular value.
arXiv Detail & Related papers (2023-11-03T05:41:25Z)
- Unsupervised Out-of-Distribution Detection by Restoring Lossy Inputs with Variational Autoencoder [3.498694457257263]
We propose a novel VAE-based score called Error Reduction (ER) for OOD detection.
ER is based on a VAE that takes a lossy version of the training set as inputs and the original set as targets.
arXiv Detail & Related papers (2023-09-05T09:42:15Z)
- Unleashing Mask: Explore the Intrinsic Out-of-Distribution Detection Capability [70.72426887518517]
Out-of-distribution (OOD) detection is an indispensable aspect of secure AI when deploying machine learning models in real-world applications.
We propose a novel method, Unleashing Mask, which aims to restore the OOD discriminative capabilities of the well-trained model with ID data.
Our method uses a mask to identify memorized atypical samples, and then fine-tunes the model or prunes it with the introduced mask to forget them.
arXiv Detail & Related papers (2023-06-06T14:23:34Z)
- Watermarking for Out-of-distribution Detection [76.20630986010114]
Out-of-distribution (OOD) detection aims to identify OOD data based on representations extracted from well-trained deep models.
We propose a general methodology named watermarking in this paper.
We learn a unified pattern that is superimposed onto features of original data, and the model's detection capability is largely boosted after watermarking.
arXiv Detail & Related papers (2022-10-27T06:12:32Z)
- Pseudo-OOD training for robust language models [78.15712542481859]
OOD detection is a key component of a reliable machine-learning model for any industry-scale application.
We propose POORE (POsthoc pseudo-Ood REgularization), which generates pseudo-OOD samples using in-distribution (IND) data.
We extensively evaluate our framework on three real-world dialogue systems, achieving new state-of-the-art in OOD detection.
arXiv Detail & Related papers (2022-10-17T14:32:02Z)
- How Useful are Gradients for OOD Detection Really? [5.459639971144757]
Out-of-distribution (OOD) detection is a critical challenge when deploying high-performing machine learning models in real-world applications.
We provide an in-depth analysis and comparison of gradient based methods for OOD detection.
We propose a general, non-gradient based method of OOD detection which improves over previous baselines in both performance and computational efficiency.
arXiv Detail & Related papers (2022-05-20T21:10:05Z)
- Energy-bounded Learning for Robust Models of Code [16.592638312365164]
Learned code representations have a variety of applications, including code classification, code search, comment generation, and bug prediction.
We propose an energy-bounded learning objective that assigns a higher score to in-distribution samples and a lower score to out-of-distribution samples, in order to incorporate such OOD samples into the training of source code models; a sketch of this style of objective appears after this list.
arXiv Detail & Related papers (2021-12-20T06:28:56Z)
- Robust Out-of-Distribution Detection on Deep Probabilistic Generative Models [0.06372261626436676]
Out-of-distribution (OOD) detection is an important task in machine learning systems.
Deep probabilistic generative models facilitate OOD detection by estimating the likelihood of a data sample.
We propose a new detection metric that operates without outlier exposure.
arXiv Detail & Related papers (2021-06-15T06:36:10Z)
- Unsupervised Anomaly Detection with Adversarial Mirrored AutoEncoders [51.691585766702744]
We propose a variant of the Adversarial Autoencoder that uses a mirrored Wasserstein loss in the discriminator to enforce better semantic-level reconstruction.
We put forward an alternative measure of anomaly score to replace the reconstruction-based metric.
Our method outperforms the current state-of-the-art methods for anomaly detection on several OOD detection benchmarks.
arXiv Detail & Related papers (2020-03-24T08:26:58Z)
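As referenced in the Energy-bounded Learning entry above, the following is a minimal sketch of a standard energy-bounded objective in the spirit of energy-based OOD detection (Liu et al., 2020), on which that entry builds. The margins `m_in`, `m_out` and the weight `lam` are illustrative assumptions, not values from the paper.

```python
# Sketch of an energy-bounded training objective: the usual
# cross-entropy on ID data plus a squared-hinge term that pushes ID
# energies below m_in and OOD energies above m_out. Energy is the
# standard score E(x) = -T * logsumexp(logits / T).
import torch
import torch.nn.functional as F

def energy_score(logits: torch.Tensor, temperature: float = 1.0) -> torch.Tensor:
    """Lower energy <=> more in-distribution-like."""
    return -temperature * torch.logsumexp(logits / temperature, dim=-1)

def energy_bounded_loss(logits_id, labels_id, logits_ood,
                        m_in: float = -25.0, m_out: float = -7.0,
                        lam: float = 0.1) -> torch.Tensor:
    ce = F.cross_entropy(logits_id, labels_id)   # usual ID objective
    e_id = energy_score(logits_id)               # penalize ID energy above m_in
    e_ood = energy_score(logits_ood)             # penalize OOD energy below m_out
    reg = (F.relu(e_id - m_in) ** 2).mean() + (F.relu(m_out - e_ood) ** 2).mean()
    return ce + lam * reg
```

At test time, thresholding `energy_score` on a model trained this way yields the OOD detector; the squared-hinge margins simply widen the energy gap between ID and OOD inputs during training.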