Measuring and Reducing Model Update Regression in Structured Prediction
for NLP
- URL: http://arxiv.org/abs/2202.02976v1
- Date: Mon, 7 Feb 2022 07:04:54 GMT
- Title: Measuring and Reducing Model Update Regression in Structured Prediction
for NLP
- Authors: Deng Cai and Elman Mansimov and Yi-An Lai and Yixuan Su and Lei Shu
and Yi Zhang
- Abstract summary: backward compatibility requires that the new model does not regress on cases that were correctly handled by its predecessor.
This work studies model update regression in structured prediction tasks.
We propose a simple and effective method, Backward-Congruent Re-ranking (BCR), by taking into account the characteristics of structured output.
- Score: 31.86240946966003
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advance in deep learning has led to rapid adoption of machine learning
based NLP models in a wide range of applications. Despite the continuous gain
in accuracy, backward compatibility is also an important aspect for industrial
applications, yet it received little research attention. Backward compatibility
requires that the new model does not regress on cases that were correctly
handled by its predecessor. This work studies model update regression in
structured prediction tasks. We choose syntactic dependency parsing and
conversational semantic parsing as representative examples of structured
prediction tasks in NLP. First, we measure and analyze model update regression
in different model update settings. Next, we explore and benchmark existing
techniques for reducing model update regression including model ensemble and
knowledge distillation. We further propose a simple and effective method,
Backward-Congruent Re-ranking (BCR), by taking into account the characteristics
of structured output. Experiments show that BCR can better mitigate model
update regression than model ensemble and knowledge distillation approaches.
Related papers
- Towards Stable Machine Learning Model Retraining via Slowly Varying Sequences [6.067007470552307]
We propose a methodology for finding sequences of machine learning models that are stable across retraining iterations.
We develop a mixed-integer optimization formulation that is guaranteed to recover optimal models.
Our method shows stronger stability than greedily trained models with a small, controllable sacrifice in predictive power.
arXiv Detail & Related papers (2024-03-28T22:45:38Z) - QualEval: Qualitative Evaluation for Model Improvement [82.73561470966658]
We propose QualEval, which augments quantitative scalar metrics with automated qualitative evaluation as a vehicle for model improvement.
QualEval uses a powerful LLM reasoner and our novel flexible linear programming solver to generate human-readable insights.
We demonstrate that leveraging its insights, for example, improves the absolute performance of the Llama 2 model by up to 15% points relative.
arXiv Detail & Related papers (2023-11-06T00:21:44Z) - Consensus-Adaptive RANSAC [104.87576373187426]
We propose a new RANSAC framework that learns to explore the parameter space by considering the residuals seen so far via a novel attention layer.
The attention mechanism operates on a batch of point-to-model residuals, and updates a per-point estimation state to take into account the consensus found through a lightweight one-step transformer.
arXiv Detail & Related papers (2023-07-26T08:25:46Z) - Predictable MDP Abstraction for Unsupervised Model-Based RL [93.91375268580806]
We propose predictable MDP abstraction (PMA)
Instead of training a predictive model on the original MDP, we train a model on a transformed MDP with a learned action space.
We theoretically analyze PMA and empirically demonstrate that PMA leads to significant improvements over prior unsupervised model-based RL approaches.
arXiv Detail & Related papers (2023-02-08T07:37:51Z) - ResMem: Learn what you can and memorize the rest [79.19649788662511]
We propose the residual-memorization (ResMem) algorithm to augment an existing prediction model.
By construction, ResMem can explicitly memorize the training labels.
We show that ResMem consistently improves the test set generalization of the original prediction model.
arXiv Detail & Related papers (2023-02-03T07:12:55Z) - When to Update Your Model: Constrained Model-based Reinforcement
Learning [50.74369835934703]
We propose a novel and general theoretical scheme for a non-decreasing performance guarantee of model-based RL (MBRL)
Our follow-up derived bounds reveal the relationship between model shifts and performance improvement.
A further example demonstrates that learning models from a dynamically-varying number of explorations benefit the eventual returns.
arXiv Detail & Related papers (2022-10-15T17:57:43Z) - Regression Bugs Are In Your Model! Measuring, Reducing and Analyzing
Regressions In NLP Model Updates [68.09049111171862]
This work focuses on quantifying, reducing and analyzing regression errors in the NLP model updates.
We formulate the regression-free model updates into a constrained optimization problem.
We empirically analyze how model ensemble reduces regression.
arXiv Detail & Related papers (2021-05-07T03:33:00Z) - Model Repair: Robust Recovery of Over-Parameterized Statistical Models [24.319310729283636]
A new type of robust estimation problem is introduced where the goal is to recover a statistical model that has been corrupted after it has been estimated from data.
Methods are proposed for "repairing" the model using only the design and not the response values used to fit the model in a supervised learning setting.
arXiv Detail & Related papers (2020-05-20T08:41:56Z) - A Locally Adaptive Interpretable Regression [7.4267694612331905]
Linear regression is one of the most interpretable prediction models.
In this work, we introduce a locally adaptive interpretable regression (LoAIR)
Our model achieves comparable or better predictive performance than the other state-of-the-art baselines.
arXiv Detail & Related papers (2020-05-07T09:26:14Z) - PrIU: A Provenance-Based Approach for Incrementally Updating Regression
Models [9.496524884855559]
This paper presents an efficient provenance-based approach, PrIU, for incrementally updating model parameters without sacrificing prediction accuracy.
We prove the correctness and convergence of the incrementally updated model parameters, and validate it experimentally.
Experimental results show that up to two orders of magnitude speed-ups can be achieved by PrIU-opt compared to simply retraining the model from scratch, yet obtaining highly similar models.
arXiv Detail & Related papers (2020-02-26T21:04:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.