Model Stability with Continuous Data Updates
- URL: http://arxiv.org/abs/2201.05692v1
- Date: Fri, 14 Jan 2022 22:11:16 GMT
- Title: Model Stability with Continuous Data Updates
- Authors: Huiting Liu, Avinesh P.V.S., Siddharth Patwardhan, Peter Grasch,
Sachin Agarwal
- Abstract summary: We study the "stability" of machine learning (ML) models within the context of larger, complex NLP systems.
We find that model design choices, including network architecture and input representation, have a critical impact on stability.
We recommend ML model designers account for trade-offs in accuracy and jitter when making modeling choices.
- Score: 2.439909645714735
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In this paper, we study the "stability" of machine learning (ML) models
within the context of larger, complex NLP systems with continuous training data
updates. For this study, we propose a methodology for the assessment of model
stability (which we refer to as "jitter") under various experimental conditions.
Through experiments on four text classification tasks and two sequence labeling
tasks, we find that model design choices, including network architecture and
input representation, have a critical impact on stability. In classification
tasks, non-RNN-based models are observed to be more stable than RNN-based ones,
while the encoder-decoder model is less stable in sequence labeling tasks.
Moreover, input representations based on pre-trained fastText embeddings
contribute to more stability than other choices. We also show that two learning
strategies -- ensemble models and incremental training -- have a significant
influence on stability. We recommend ML model designers account for trade-offs
in accuracy and jitter when making modeling choices.
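As a rough illustration of the stability assessment described above, the snippet below computes a simple disagreement-based jitter score between two model versions trained before and after a data update. The function name and the exact definition are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def jitter(preds_before: np.ndarray, preds_after: np.ndarray) -> float:
    """Fraction of evaluation examples whose predicted label changes between
    two model versions; an illustrative proxy for the paper's jitter metric."""
    assert preds_before.shape == preds_after.shape
    return float(np.mean(preds_before != preds_after))

# Hypothetical usage: `model_v1` / `model_v2` are models trained before/after a
# training-data update and evaluated on the same held-out examples.
# preds_v1 = model_v1.predict(eval_texts)
# preds_v2 = model_v2.predict(eval_texts)
# print(f"jitter = {jitter(preds_v1, preds_v2):.3f}")
```

Under this reading, low jitter means retraining on updated data leaves most predictions unchanged, which is the property the paper trades off against accuracy.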
Related papers
- Uncertainty Aware Learning for Language Model Alignment [97.36361196793929]
We propose uncertainty-aware learning (UAL) to improve the model alignment of different task scenarios.
We implement UAL in a simple fashion -- adaptively setting the label smoothing value during training according to the uncertainty of individual samples (a minimal sketch follows this entry).
Experiments on widely used benchmarks demonstrate that our UAL significantly and consistently outperforms standard supervised fine-tuning.
arXiv Detail & Related papers (2024-06-07T11:37:45Z)
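The UAL entry above describes adapting the label-smoothing strength per sample according to its uncertainty. Below is a minimal sketch of that idea using predictive entropy as the uncertainty signal; the entropy-based mapping, the `max_eps` parameter, and all names are assumptions rather than the paper's actual method.

```python
import math
import torch
import torch.nn.functional as F

def uncertainty_adaptive_smoothing_loss(logits: torch.Tensor,
                                        targets: torch.Tensor,
                                        max_eps: float = 0.2) -> torch.Tensor:
    """Cross-entropy with a per-sample label-smoothing value that grows with
    predictive entropy (an illustrative stand-in for UAL's uncertainty signal)."""
    num_classes = logits.size(-1)
    probs = logits.softmax(dim=-1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)
    # Normalize entropy to [0, 1] and map it to a per-sample smoothing value.
    eps = max_eps * entropy / math.log(num_classes)
    one_hot = F.one_hot(targets, num_classes).float()
    smoothed = (1.0 - eps).unsqueeze(-1) * one_hot + (eps / num_classes).unsqueeze(-1)
    return -(smoothed * F.log_softmax(logits, dim=-1)).sum(dim=-1).mean()
```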
- Towards Stable Machine Learning Model Retraining via Slowly Varying Sequences [6.067007470552307]
We propose a methodology for finding sequences of machine learning models that are stable across retraining iterations.
We develop a mixed-integer optimization formulation that is guaranteed to recover optimal models.
Our method shows stronger stability than greedily trained models with a small, controllable sacrifice in predictive power.
arXiv Detail & Related papers (2024-03-28T22:45:38Z)
- QualEval: Qualitative Evaluation for Model Improvement [82.73561470966658]
We propose QualEval, which augments quantitative scalar metrics with automated qualitative evaluation as a vehicle for model improvement.
QualEval uses a powerful LLM reasoner and our novel flexible linear programming solver to generate human-readable insights.
We demonstrate that leveraging its insights improves, for example, the absolute performance of the Llama 2 model by up to 15 percentage points.
arXiv Detail & Related papers (2023-11-06T00:21:44Z)
- Towards Plastic and Stable Exemplar-Free Incremental Learning: A Dual-Learner Framework with Cumulative Parameter Averaging [12.168402195820649]
We propose a Dual-Learner framework with Cumulative Parameter Averaging (DLCPA); the averaging step is sketched after this entry.
We show that DLCPA outperforms several state-of-the-art exemplar-free baselines in both Task-IL and Class-IL settings.
arXiv Detail & Related papers (2023-10-28T08:48:44Z)
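Cumulative parameter averaging, as named in the DLCPA entry above, can be read as keeping a running average of learner parameters as new tasks are trained. The following sketch shows such a running average over PyTorch state dicts; the update rule and helper name are assumptions, not the authors' code.

```python
import copy
import torch

@torch.no_grad()
def cumulative_average(avg_state: dict, new_state: dict, num_models_seen: int) -> dict:
    """Fold a newly trained model's parameters into a cumulative average:
    avg <- (t * avg + new) / (t + 1), where t = num_models_seen."""
    out = copy.deepcopy(avg_state)
    for name, param in new_state.items():
        if param.is_floating_point():
            out[name] = (avg_state[name] * num_models_seen + param) / (num_models_seen + 1)
        else:
            out[name] = param  # e.g. integer buffers: keep the latest value
    return out
```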
- Learning Objective-Specific Active Learning Strategies with Attentive Neural Processes [72.75421975804132]
Learning Active Learning (LAL) suggests learning the active learning strategy itself, allowing it to adapt to the given setting.
We propose a novel LAL method for classification that exploits symmetry and independence properties of the active learning problem.
Our approach is based on learning from a myopic oracle, which gives our model the ability to adapt to non-standard objectives.
arXiv Detail & Related papers (2023-09-11T14:16:37Z)
- New metrics for analyzing continual learners [27.868967961503962]
Continual Learning (CL) poses challenges to standard learning algorithms.
This stability-plasticity dilemma remains central to CL and multiple metrics have been proposed to adequately measure stability and plasticity separately.
We propose new metrics that account for the task's increasing difficulty.
arXiv Detail & Related papers (2023-09-01T13:53:33Z)
- Online Continual Learning for Robust Indoor Object Recognition [24.316047317028143]
Vision systems mounted on home robots need to interact with unseen classes in changing environments.
We propose RobOCLe, which constructs an enriched feature space by computing high-order statistical moments (a minimal sketch follows this entry).
We show that different moments allow RobOCLe to capture different properties of deformations, providing higher robustness with no decrease of inference speed.
arXiv Detail & Related papers (2023-07-19T08:32:59Z)
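The RobOCLe entry above mentions enriching the feature space with high-order statistical moments. A minimal sketch of computing such moment features from a batch of embeddings is given below; which moments and which axis the paper actually uses are assumptions here.

```python
import torch

def moment_features(embeddings: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Concatenate mean, variance, skewness, and kurtosis computed over the
    feature dimension of a (batch, dim) tensor, yielding a (batch, 4) tensor."""
    mean = embeddings.mean(dim=-1)
    centered = embeddings - mean.unsqueeze(-1)
    var = centered.pow(2).mean(dim=-1)
    std = var.clamp_min(eps).sqrt()
    skew = centered.pow(3).mean(dim=-1) / std.pow(3)
    kurt = centered.pow(4).mean(dim=-1) / var.clamp_min(eps).pow(2)
    return torch.stack([mean, var, skew, kurt], dim=-1)
```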
- Robust Learning with Progressive Data Expansion Against Spurious Correlation [65.83104529677234]
We study the learning process of a two-layer nonlinear convolutional neural network in the presence of spurious features.
Our analysis suggests that imbalanced data groups and easily learnable spurious features can lead to the dominance of spurious features during the learning process.
We propose a new training algorithm called PDE that efficiently enhances the model's robustness for a better worst-group performance.
arXiv Detail & Related papers (2023-06-08T05:44:06Z)
- Entailment as Robust Self-Learner [14.86757876218415]
We design a prompting strategy that formulates a number of different NLU tasks as contextual entailment.
We propose the Simple Pseudo-Label Editing (SimPLE) algorithm for better pseudo-labeling quality in self-training.
arXiv Detail & Related papers (2023-05-26T18:41:23Z)
- On the Stability-Plasticity Dilemma of Class-Incremental Learning [50.863180812727244]
A primary goal of class-incremental learning is to strike a balance between stability and plasticity.
This paper aims to shed light on how effectively recent class-incremental learning algorithms address the stability-plasticity trade-off.
arXiv Detail & Related papers (2023-04-04T09:34:14Z)
- Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP).
What further sets these models apart is the massive amounts of world knowledge they internalize during pretraining.
How the model's world knowledge interacts with the factual information presented in the context remains underexplored.
arXiv Detail & Related papers (2022-11-09T18:58:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.