Cost-Effective Retraining of Machine Learning Models
- URL: http://arxiv.org/abs/2310.04216v1
- Date: Fri, 6 Oct 2023 13:02:29 GMT
- Title: Cost-Effective Retraining of Machine Learning Models
- Authors: Ananth Mahadevan and Michael Mathioudakis
- Abstract summary: It is important to retrain a machine learning (ML) model in order to maintain its performance as the data changes over time.
This creates a trade-off between retraining too frequently, which leads to unnecessary computing costs, and not retraining often enough.
We propose ML systems that make automated and cost-effective decisions about when to retrain an ML model.
- Score: 2.9461360639852914
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It is important to retrain a machine learning (ML) model in order to maintain
its performance as the data changes over time. However, this can be costly as
it usually requires processing the entire dataset again. This creates a
trade-off between retraining too frequently, which leads to unnecessary
computing costs, and not retraining often enough, which results in stale and
inaccurate ML models. To address this challenge, we propose ML systems that
make automated and cost-effective decisions about when to retrain an ML model.
We aim to optimize the trade-off by considering the costs associated with each
decision. Our research focuses on determining whether to retrain or keep an
existing ML model based on various factors, including the data, the model, and
the predictive queries answered by the model. Our main contribution is a
Cost-Aware Retraining Algorithm called Cara, which optimizes the trade-off over
streams of data and queries. To evaluate the performance of Cara, we analyzed
synthetic datasets and demonstrated that Cara can adapt to different data
drifts and retraining costs while performing similarly to an optimal
retrospective algorithm. We also conducted experiments with real-world datasets
and showed that Cara achieves better accuracy than drift detection baselines
while making fewer retraining decisions, ultimately resulting in lower total
costs.
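To make the retrain-or-keep trade-off described in the abstract concrete, here is a minimal sketch of a simple threshold rule. It is not the Cara algorithm, whose details are not given on this page. The class name `RetrainPolicy`, the parameter `cost_per_error`, and the rule itself (retrain once the accumulated cost of answering queries with a stale model reaches the retraining cost) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RetrainPolicy:
    """Hypothetical threshold policy, NOT the paper's Cara algorithm.

    Retrains once the accumulated staleness cost reaches the retraining cost.
    """
    retrain_cost: float
    accumulated_staleness: float = 0.0
    decisions: List[str] = field(default_factory=list)

    def step(self, batch_error_rate: float, num_queries: int,
             cost_per_error: float = 1.0) -> str:
        # Estimated cost of answering this batch of queries with the existing model.
        self.accumulated_staleness += batch_error_rate * num_queries * cost_per_error
        if self.accumulated_staleness >= self.retrain_cost:
            # Keeping the stale model has cost as much as one retraining: retrain and reset.
            self.accumulated_staleness = 0.0
            decision = "retrain"
        else:
            decision = "keep"
        self.decisions.append(decision)
        return decision


if __name__ == "__main__":
    policy = RetrainPolicy(retrain_cost=50.0)
    # Assumed error rates of the current model on successive batches of 100 queries.
    for error_rate in [0.05, 0.10, 0.20, 0.30]:
        print(policy.step(error_rate, num_queries=100))
```

In this sketch a higher `retrain_cost` delays retraining and a faster-growing error rate triggers it sooner, which is the qualitative behavior the abstract asks of a cost-aware system. A real system would also have to estimate the error rate without labels for every query.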
Related papers
- What Do Learning Dynamics Reveal About Generalization in LLM Reasoning? [83.83230167222852]
We find that a model's generalization behavior can be effectively characterized by a training metric we call pre-memorization train accuracy.
By connecting a model's learning behavior to its generalization, pre-memorization train accuracy can guide targeted improvements to training strategies.
arXiv Detail & Related papers (2024-11-12T09:52:40Z) - Compute-Constrained Data Selection [77.06528009072967]
We formalize the problem of data selection with a cost-aware utility function, and model the problem as trading off initial-selection cost for training gain.
We run a comprehensive sweep of experiments across multiple tasks, varying compute budget by scaling finetuning tokens, model sizes, and data selection compute.
arXiv Detail & Related papers (2024-10-21T17:11:21Z) - Distilled Datamodel with Reverse Gradient Matching [74.75248610868685]
We introduce an efficient framework for assessing data impact, comprising offline training and online evaluation stages.
Our proposed method achieves comparable model behavior evaluation while significantly speeding up the process compared to the direct retraining method.
arXiv Detail & Related papers (2024-04-22T09:16:14Z) - Unlearnable Algorithms for In-context Learning [36.895152458323764]
In this paper, we focus on efficient unlearning methods for the task adaptation phase of a pretrained large language model.
We observe that an LLM's ability to do in-context learning for task adaptation allows for efficient exact unlearning of task adaptation training data.
We propose a new holistic measure of unlearning cost which accounts for varying inference costs.
arXiv Detail & Related papers (2024-02-01T16:43:04Z) - Task-Aware Machine Unlearning and Its Application in Load Forecasting [4.00606516946677]
This paper introduces the concept of machine unlearning which is specifically designed to remove the influence of part of the dataset on an already trained forecaster.
A performance-aware algorithm is proposed by evaluating the sensitivity of local model parameter change using influence function and sample re-weighting.
We tested the unlearning algorithms on linear, CNN, and Mixer based load forecasters with a realistic load dataset.
arXiv Detail & Related papers (2023-08-28T08:50:12Z) - Fast Machine Unlearning Without Retraining Through Selective Synaptic Dampening [51.34904967046097]
Selective Synaptic Dampening (SSD) is fast, performant, and does not require long-term storage of the training data.
We present a novel two-step, post hoc, retrain-free approach to machine unlearning which is fast, performant, and does not require long-term storage of the training data.
arXiv Detail & Related papers (2023-08-15T11:30:45Z) - Recommendation Unlearning via Influence Function [42.4931807753579]
We propose a new Influence Function-based Recommendation Unlearning (IFRU) framework, which efficiently updates the model without retraining.
IFRU achieves more than 250 times acceleration compared to retraining-based methods with recommendation performance comparable to full retraining.
arXiv Detail & Related papers (2023-07-05T09:42:51Z) - Mitigating ML Model Decay in Continuous Integration with Data Drift Detection: An Empirical Study [7.394099294390271]
This study investigates how well data drift detection techniques can automatically detect retraining points for ML models used for test case prioritization (TCP) in continuous integration (CI) environments.
We employed the Hellinger distance to identify changes in both the values and distribution of input data and leveraged these changes as retraining points for the ML model.
Our experimental evaluation of the Hellinger distance-based method demonstrated its efficacy and efficiency in detecting retraining points and reducing the associated costs.
arXiv Detail & Related papers (2023-05-22T05:55:23Z) - AI Model Disgorgement: Methods and Choices [127.54319351058167]
We introduce a taxonomy of possible disgorgement methods that are applicable to modern machine learning systems.
We investigate the meaning of "removing the effects" of data in the trained model in a way that does not require retraining from scratch.
arXiv Detail & Related papers (2023-04-07T08:50:18Z) - AutoSimulate: (Quickly) Learning Synthetic Data Generation [70.82315853981838]
We propose an efficient alternative for optimal synthetic data generation based on a novel differentiable approximation of the objective.
We demonstrate that the proposed method finds the optimal data distribution faster (up to $50\times$), with significantly reduced training data generation (up to $30\times$) and better accuracy ($+8.7\%$) on real-world test datasets than previous methods.
arXiv Detail & Related papers (2020-08-16T11:36:11Z) - Approximate Data Deletion from Machine Learning Models [31.689174311625084]
Deleting data from a trained machine learning (ML) model is a critical task in many applications.
We propose a new approximate deletion method for linear and logistic models.
We also develop a new feature-injection test to evaluate the thoroughness of data deletion from ML models.
arXiv Detail & Related papers (2020-02-24T05:12:03Z)