Aggregate Representation Measure for Predictive Model Reusability
- URL: http://arxiv.org/abs/2405.09600v1
- Date: Wed, 15 May 2024 14:14:34 GMT
- Title: Aggregate Representation Measure for Predictive Model Reusability
- Authors: Vishwesh Sangarya, Richard Bradford, Jung-Eun Kim
- Abstract summary: We propose a predictive quantifier to estimate the retraining cost of a trained model in distribution shifts.
The proposed Aggregated Representation Measure (ARM) quantifies the change in the model's representation from the old to new data distribution.
- Score: 2.93774265594295
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In this paper, we propose a predictive quantifier to estimate the retraining cost of a trained model in distribution shifts. The proposed Aggregated Representation Measure (ARM) quantifies the change in the model's representation from the old to new data distribution. It provides, before actually retraining the model, a single concise index of resources - epochs, energy, and carbon emissions - required for the retraining. This enables reuse of a model with a much lower cost than training a new model from scratch. The experimental results indicate that ARM reasonably predicts retraining costs for varying noise intensities and enables comparisons among multiple model architectures to determine the most cost-effective and sustainable option.
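The abstract does not give ARM's formula, so the following is only a minimal sketch of the general idea of collapsing a representation change into a single index. It is not the authors' method. The function names, the choice of cosine distance between per-layer mean activations, and the plain average over layers are all assumptions made for illustration.

```python
import math
from statistics import fmean


def mean_vector(rows):
    """Column-wise mean of a list of equal-length activation vectors."""
    return [fmean(col) for col in zip(*rows)]


def cosine_distance(u, v):
    """Return 1 - cosine similarity; 0.0 means the vectors point the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Convention for this sketch: two zero vectors count as identical.
        return 0.0 if norm_u == norm_v else 1.0
    return 1.0 - dot / (norm_u * norm_v)


def aggregate_shift(old_acts, new_acts):
    """Hypothetical aggregate index of representation change (NOT the paper's ARM).

    old_acts / new_acts map a layer name to the list of activation vectors
    recorded on the old / new data distribution. Returns the average per-layer
    cosine distance together with the per-layer values.
    """
    layers = sorted(old_acts.keys() & new_acts.keys())
    if not layers:
        raise ValueError("no layers shared between old and new activations")
    per_layer = {
        name: cosine_distance(mean_vector(old_acts[name]), mean_vector(new_acts[name]))
        for name in layers
    }
    return fmean(per_layer.values()), per_layer
```

Under these assumptions, a value near 0 would suggest the representation barely moved between the two distributions, and a larger value a bigger change. Whether such an index tracks retraining epochs, energy, or emissions is an empirical question that this sketch does not address.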
Related papers
- Quantile Regression for Distributional Reward Models in RLHF [1.8130068086063336]
We introduce Quantile Reward Models (QRMs), a novel approach to reward modeling that learns a distribution over rewards instead of a single scalar value.
Our method uses quantile regression to estimate a full, potentially multimodal distribution over preferences, providing a more powerful and nuanced representation of preferences.
Our experimental results show that QRM outperforms comparable traditional point-estimate models on RewardBench.
arXiv Detail & Related papers (2024-09-16T10:54:04Z)
- Semi-Supervised Reward Modeling via Iterative Self-Training [52.48668920483908]
We propose Semi-Supervised Reward Modeling (SSRM), an approach that enhances RM training using unlabeled data.
We demonstrate that SSRM significantly improves reward models without incurring additional labeling costs.
Overall, SSRM substantially reduces the dependency on large volumes of human-annotated data, thereby decreasing the overall cost and time involved in training effective reward models.
arXiv Detail & Related papers (2024-09-10T22:57:58Z)
- RewardBench: Evaluating Reward Models for Language Modeling [100.28366840977966]
We present RewardBench, a benchmark dataset and code-base for evaluation of reward models.
The dataset is a collection of prompt-chosen-rejected trios spanning chat, reasoning, and safety.
On the RewardBench leaderboard, we evaluate reward models trained with a variety of methods.
arXiv Detail & Related papers (2024-03-20T17:49:54Z)
- How to Estimate Model Transferability of Pre-Trained Speech Models? [84.11085139766108]
We introduce a "score-based assessment" framework for estimating the transferability of pre-trained speech models (PSMs).
We leverage two representation theories, Bayesian likelihood estimation and optimal transport, to generate rank scores for the PSM candidates.
Our framework efficiently computes transferability scores without actual fine-tuning of candidate models or layers.
arXiv Detail & Related papers (2023-06-01T04:52:26Z)
- Maintaining Stability and Plasticity for Predictive Churn Reduction [8.971668467496055]
We propose a solution called Accumulated Model Combination (AMC).
AMC is a general technique and we propose several instances of it, each having their own advantages depending on the model and data properties.
arXiv Detail & Related papers (2023-05-06T20:56:20Z)
- Measuring and Reducing Model Update Regression in Structured Prediction for NLP [31.86240946966003]
Backward compatibility requires that the new model does not regress on cases that were correctly handled by its predecessor.
This work studies model update regression in structured prediction tasks.
We propose a simple and effective method, Backward-Congruent Re-ranking (BCR), by taking into account the characteristics of structured output.
arXiv Detail & Related papers (2022-02-07T07:04:54Z)
- Model retraining and information sharing in a supply chain with long-term fluctuating demands [0.0]
This study examines the effects of updating models in a supply chain using a minimal setting.
We demonstrate that when each party in the supply chain has its own forecasting model, uncoordinated model retraining causes the bullwhip effect.
Our results also indicate that sharing the forecasting model among the parties involved significantly reduces the bullwhip effect.
arXiv Detail & Related papers (2021-09-04T04:16:04Z)
- Model-Augmented Q-learning [112.86795579978802]
We propose a model-free reinforcement learning (MFRL) framework that is augmented with the components of model-based RL.
Specifically, we propose to estimate not only the $Q$-values but also both the transition and the reward with a shared network.
We show that the proposed scheme, called Model-augmented $Q$-learning (MQL), obtains a policy-invariant solution which is identical to the solution obtained by learning with true reward.
arXiv Detail & Related papers (2021-02-07T17:56:50Z)
- Generative Temporal Difference Learning for Infinite-Horizon Prediction [101.59882753763888]
We introduce the $\gamma$-model, a predictive model of environment dynamics with an infinite probabilistic horizon.
We discuss how its training reflects an inescapable tradeoff between training-time and testing-time compounding errors.
arXiv Detail & Related papers (2020-10-27T17:54:12Z)
- Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)
- Model Repair: Robust Recovery of Over-Parameterized Statistical Models [24.319310729283636]
A new type of robust estimation problem is introduced where the goal is to recover a statistical model that has been corrupted after it has been estimated from data.
Methods are proposed for "repairing" the model using only the design and not the response values used to fit the model in a supervised learning setting.
arXiv Detail & Related papers (2020-05-20T08:41:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.