Model Reprogramming: Resource-Efficient Cross-Domain Machine Learning
- URL: http://arxiv.org/abs/2202.10629v4
- Date: Tue, 5 Dec 2023 21:01:05 GMT
- Title: Model Reprogramming: Resource-Efficient Cross-Domain Machine Learning
- Authors: Pin-Yu Chen
- Abstract summary: In data-rich domains such as vision, language, and speech, deep learning prevails in delivering high-performance task-specific models.
Deep learning in resource-limited domains still faces multiple challenges including (i) limited data, (ii) constrained model development cost, and (iii) lack of adequate pre-trained models for effective finetuning.
Model reprogramming enables resource-efficient cross-domain machine learning by repurposing a well-developed pre-trained model from a source domain to solve tasks in a target domain without model finetuning.
- Score: 65.268245109828
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In data-rich domains such as vision, language, and speech, deep learning
prevails in delivering high-performance task-specific models and can even learn
general task-agnostic representations for efficient finetuning to downstream
tasks. However, deep learning in resource-limited domains still faces multiple
challenges including (i) limited data, (ii) constrained model development cost,
and (iii) lack of adequate pre-trained models for effective finetuning. This
paper provides an overview of model reprogramming to bridge this gap. Model
reprogramming enables resource-efficient cross-domain machine learning by
repurposing and reusing a well-developed pre-trained model from a source domain
to solve tasks in a target domain without model finetuning, where the source
and target domains can be vastly different. In many applications, model
reprogramming outperforms transfer learning and training from scratch. This
paper elucidates the methodology of model reprogramming, summarizes existing
use cases, provides a theoretical explanation of the success of model
reprogramming, and concludes with a discussion on open-ended research questions
and opportunities. A list of model reprogramming studies is actively maintained
and updated at https://github.com/IBM/model-reprogramming.
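To make the methodology concrete, below is a minimal sketch of input reprogramming in PyTorch, assuming an image-classification setting where target inputs are smaller than the source inputs and a fixed many-to-one output label mapping from source classes to target classes; the `Reprogrammer` class, the zero-padding scheme, and the logit-averaging label mapping are illustrative assumptions, not the paper's reference implementation.

```python
import torch
import torch.nn as nn

class Reprogrammer(nn.Module):
    """Sketch of input reprogramming around a frozen pre-trained source model.

    A target input is zero-padded to the source input size, a trainable additive
    "program" delta is applied, and the frozen source model's logits are mapped
    to target classes through a fixed many-to-one label mapping.
    """

    def __init__(self, source_model, source_size, target_size, label_map):
        super().__init__()
        self.source_model = source_model
        for p in self.source_model.parameters():
            p.requires_grad = False  # the source model is never finetuned
        # trainable input perturbation, assuming 3-channel image inputs
        self.delta = nn.Parameter(torch.zeros(1, 3, source_size, source_size))
        pad = (source_size - target_size) // 2
        self.pad = nn.ZeroPad2d(pad)  # embed the target input into the source input space
        # label_map: LongTensor [num_target_classes, k] listing, for each target
        # class, the k source classes assigned to it (a hypothetical mapping)
        self.register_buffer("label_map", label_map)

    def forward(self, x_target):
        x_source = self.pad(x_target) + self.delta   # reprogrammed source-domain input
        source_logits = self.source_model(x_source)  # frozen source-domain prediction
        # aggregate the logits of the mapped source classes per target class
        return source_logits[:, self.label_map].mean(dim=-1)
```

Only `delta` is trained (e.g., with a cross-entropy loss on the target task), so the number of trainable parameters depends on the input size rather than on the source model, which is what makes reprogramming resource-efficient relative to finetuning or training from scratch.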
Related papers
- RedTest: Towards Measuring Redundancy in Deep Neural Networks Effectively [10.812755570974929]
We use Model Structural Redundancy Score (MSRS) to measure the degree of redundancy in a deep learning model structure.
MSRS is effective in both revealing and assessing the redundancy issues in many state-of-the-art models.
We design a novel redundancy-aware algorithm to guide the search for the optimal model structure.
arXiv Detail & Related papers (2024-11-15T14:36:07Z)
- On the Utility of Domain Modeling Assistance with Large Language Models [2.874893537471256]
This paper presents a study to evaluate the usefulness of a novel approach utilizing large language models (LLMs) and few-shot prompt learning to assist in domain modeling.
The aim of this approach is to overcome the need for extensive training of AI-based completion models on scarce domain-specific datasets.
arXiv Detail & Related papers (2024-10-16T13:55:34Z) - A Framework for Monitoring and Retraining Language Models in Real-World
Applications [3.566775910781198]
Continuous model monitoring and model retraining are required in many real-world applications.
There are multiple reasons for retraining, including data or concept drift, which may be reflected in the model performance as monitored by an appropriate metric.
We examine the impact of various retraining decision points on crucial factors, such as model performance and resource utilization, in the context of Multilabel Classification models.
arXiv Detail & Related papers (2023-11-16T14:32:18Z)
- Adapting Large Language Models for Content Moderation: Pitfalls in Data Engineering and Supervised Fine-tuning [79.53130089003986]
Large Language Models (LLMs) have become a feasible solution for handling tasks in various domains.
In this paper, we introduce how to fine-tune an LLM that can be privately deployed for content moderation.
arXiv Detail & Related papers (2023-10-05T09:09:44Z)
- Towards Efficient Task-Driven Model Reprogramming with Foundation Models [52.411508216448716]
Vision foundation models exhibit impressive power, benefiting from the extremely large model capacity and broad training data.
However, in practice, downstream scenarios may only support a small model due to limited computational resources or efficiency considerations.
This brings a critical challenge for the real-world application of foundation models: one has to transfer the knowledge of a foundation model to the downstream task.
arXiv Detail & Related papers (2023-04-05T07:28:33Z)
- Generalization Properties of Retrieval-based Models [50.35325326050263]
Retrieval-based machine learning methods have enjoyed success on a wide range of problems.
Despite growing literature showcasing the promise of these models, the theoretical underpinning for such models remains underexplored.
We present a formal treatment of retrieval-based models to characterize their generalization ability.
arXiv Detail & Related papers (2022-10-06T00:33:01Z)
- DST: Dynamic Substitute Training for Data-free Black-box Attack [79.61601742693713]
We propose a novel dynamic substitute training attack method to encourage the substitute model to learn better and faster from the target model.
We introduce a task-driven graph-based structure information learning constraint to improve the quality of the generated training data.
arXiv Detail & Related papers (2022-04-03T02:29:11Z)
- Goal-Aware Prediction: Learning to Model What Matters [105.43098326577434]
One of the fundamental challenges in using a learned forward dynamics model is the mismatch between the objective of the learned model and that of the downstream planner or policy.
We propose to direct prediction towards task relevant information, enabling the model to be aware of the current task and encouraging it to only model relevant quantities of the state space.
We find that our method more effectively models the relevant parts of the scene conditioned on the goal, and as a result outperforms standard task-agnostic dynamics models and model-free reinforcement learning.
arXiv Detail & Related papers (2020-07-14T16:42:59Z)
- CALM: Continuous Adaptive Learning for Language Modeling [18.72860206714457]
Training large language representation models has become a standard in the natural language processing community.
We demonstrate that in practice these pre-trained models exhibit performance deterioration in the form of catastrophic forgetting.
We propose CALM, Continuous Adaptive Learning for Language Modeling: techniques to render models that retain knowledge across multiple domains.
arXiv Detail & Related papers (2020-04-08T03:51:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.