Automatic Generation of Attention Rules For Containment of Machine
Learning Model Errors
- URL: http://arxiv.org/abs/2305.08115v1
- Date: Sun, 14 May 2023 10:15:35 GMT
- Title: Automatic Generation of Attention Rules For Containment of Machine
Learning Model Errors
- Authors: Samuel Ackerman, Axel Bendavid, Eitan Farchi, Orna Raz
- Abstract summary: We present several algorithms (`strategies') for determining optimal rules to separate observations.
In particular, we prefer strategies that use feature-based slicing because they are human-interpretable, model-agnostic, and require minimal supplementary inputs or knowledge.
To evaluate strategies, we introduce metrics to measure various desired qualities, such as their performance, stability, and generalizability to unseen data.
- Score: 1.4987559345379062
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning (ML) solutions are prevalent in many applications. However,
many challenges exist in making these solutions business-grade; for instance,
the error rate of the underlying ML models must be kept at an acceptably low
level. Typically, the true relationship between feature inputs and the target
feature to be predicted is uncertain, and hence statistical in nature. The
approach we propose is to separate the observations that are the most likely to
be predicted incorrectly into 'attention sets'. These can directly aid model
diagnosis and improvement, and be used to decide on alternative courses of
action for these problematic observations. We present several algorithms
(`strategies') for determining optimal rules to separate these observations. In
particular, we prefer strategies that use feature-based slicing because they
are human-interpretable, model-agnostic, and require minimal supplementary
inputs or knowledge. In addition, we show that these strategies outperform
several common baselines, such as selecting observations with prediction
confidence below a threshold. To evaluate strategies, we introduce metrics to
measure various desired qualities, such as their performance, stability, and
generalizability to unseen data; the strategies are evaluated on several
publicly-available datasets. We use TOPSIS, a Multiple Criteria Decision Making
method, to aggregate these metrics into a single quality score for each
strategy, to allow comparison.
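
To make the 'attention set' idea concrete, the sketch below searches simple one-feature slicing rules on a validation set and compares the best slice against a confidence-threshold baseline. This is an illustrative reconstruction under assumed interfaces, not the authors' implementation; the rule form, bin counts, and support threshold are hypothetical.

```python
# Illustrative sketch (not the paper's code): flag observations matching a
# human-readable feature slice whose validation error rate is unusually high,
# and compare it against a confidence-threshold baseline.
import numpy as np
import pandas as pd


def best_single_feature_slice(X_val: pd.DataFrame, errors: np.ndarray,
                              n_bins: int = 10, min_support: int = 30):
    """Search one-feature rules of the form `feature <= t` / `feature > t`
    (numeric features only) and return the rule whose validation slice has
    the highest error rate."""
    best = None
    for col in X_val.columns:
        thresholds = np.quantile(X_val[col], np.linspace(0.1, 0.9, n_bins))
        for t in thresholds:
            for op, mask in (("<=", X_val[col] <= t), (">", X_val[col] > t)):
                if mask.sum() < min_support:
                    continue
                err = errors[mask.values].mean()
                if best is None or err > best["error_rate"]:
                    best = {"rule": f"{col} {op} {t:.3g}",
                            "error_rate": float(err),
                            "support": int(mask.sum())}
    return best


def confidence_baseline(confidences: np.ndarray, errors: np.ndarray,
                        threshold: float = 0.6):
    """Baseline attention set: every observation whose prediction confidence
    falls below a fixed threshold."""
    mask = confidences < threshold
    return {"rule": f"confidence < {threshold}",
            "error_rate": float(errors[mask].mean()) if mask.any() else float("nan"),
            "support": int(mask.sum())}


# Here `errors` would be (y_pred != y_true).astype(float) on held-out data.
```

A slice rule such as `amount > 5000` (a hypothetical example) is human-interpretable and model-agnostic, whereas the confidence baseline depends on how well calibrated the model's scores are.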
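
The TOPSIS aggregation step can be sketched directly, since TOPSIS itself is a standard Multiple Criteria Decision Making procedure; the strategy names, metric values, and equal weights below are made up for illustration and do not come from the paper.

```python
# Sketch of TOPSIS aggregation of per-strategy quality metrics into one score.
import numpy as np


def topsis(matrix: np.ndarray, weights: np.ndarray, benefit: np.ndarray) -> np.ndarray:
    """Return TOPSIS closeness scores (higher is better) for each row (strategy).

    matrix  : (n_strategies, n_metrics) raw metric values
    weights : (n_metrics,) importance weights summing to 1
    benefit : (n_metrics,) True if larger is better, False if smaller is better
    """
    norm = matrix / np.linalg.norm(matrix, axis=0)        # vector normalization
    weighted = norm * weights
    ideal_best = np.where(benefit, weighted.max(axis=0), weighted.min(axis=0))
    ideal_worst = np.where(benefit, weighted.min(axis=0), weighted.max(axis=0))
    d_best = np.linalg.norm(weighted - ideal_best, axis=1)
    d_worst = np.linalg.norm(weighted - ideal_worst, axis=1)
    return d_worst / (d_best + d_worst)


# Hypothetical metric columns: [performance, stability, generalizability]
strategies = ["feature-slicing", "confidence-threshold", "random"]
metrics = np.array([[0.82, 0.75, 0.78],
                    [0.70, 0.80, 0.65],
                    [0.50, 0.90, 0.50]])
scores = topsis(metrics, weights=np.full(3, 1 / 3), benefit=np.array([True, True, True]))
for name, s in sorted(zip(strategies, scores), key=lambda p: -p[1]):
    print(f"{name}: {s:.3f}")
```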
Related papers
- Realistic Evaluation of Test-Time Adaptation Algorithms: Unsupervised Hyperparameter Selection [1.4530711901349282]
Test-Time Adaptation (TTA) has emerged as a promising strategy for tackling the problem of machine learning model robustness under distribution shifts.
We evaluate existing TTA methods using surrogate-based hyperparameter-selection strategies to obtain a more realistic evaluation of their performance.
arXiv Detail & Related papers (2024-07-19T11:58:30Z)
- Decoupling Decision-Making in Fraud Prevention through Classifier Calibration for Business Logic Action [1.8289218434318257]
We use calibration as a strategy for decoupling machine learning (ML) classifiers from score-based actions within business logic frameworks.
Our findings highlight the trade-offs and performance implications of the approach.
In particular, the Isotonic and Beta calibration methods stand out in scenarios where there is a shift between training and testing data.
arXiv Detail & Related papers (2024-01-10T16:13:21Z)
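
For context on the calibration step mentioned above, here is a minimal scikit-learn sketch of isotonic calibration (Beta calibration lives in third-party packages and is omitted); the synthetic data, base classifier, and 0.9 action threshold are assumptions, not the cited paper's setup.

```python
# Standard isotonic calibration of a classifier's scores with scikit-learn;
# a generic sketch of the calibration step, not the cited paper's pipeline.
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, weights=[0.95, 0.05], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Fit the base classifier and wrap it with isotonic calibration via cross-validation.
calibrated = CalibratedClassifierCV(LogisticRegression(max_iter=1000),
                                    method="isotonic", cv=5)
calibrated.fit(X_train, y_train)

# Business logic then acts on calibrated probabilities rather than raw scores,
# so an action threshold keeps its probabilistic meaning after retraining.
proba = calibrated.predict_proba(X_test)[:, 1]
flag_for_review = proba > 0.9
print(f"flagged {flag_for_review.sum()} of {len(proba)} transactions")
```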
- Bring Your Own Data! Self-Supervised Evaluation for Large Language Models [52.15056231665816]
We propose a framework for self-supervised evaluation of Large Language Models (LLMs).
We demonstrate self-supervised evaluation strategies for measuring closed-book knowledge, toxicity, and long-range context dependence.
We find strong correlations between self-supervised and human-supervised evaluations.
arXiv Detail & Related papers (2023-06-23T17:59:09Z)
- In Search of Insights, Not Magic Bullets: Towards Demystification of the Model Selection Dilemma in Heterogeneous Treatment Effect Estimation [92.51773744318119]
This paper empirically investigates the strengths and weaknesses of different model selection criteria.
We highlight that there is a complex interplay between selection strategies, candidate estimators and the data used for comparing them.
arXiv Detail & Related papers (2023-02-06T16:55:37Z)
- An Additive Instance-Wise Approach to Multi-class Model Interpretation [53.87578024052922]
Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system.
Existing methods mainly focus on selecting explanatory input features, which follow either locally additive or instance-wise approaches.
This work exploits the strengths of both methods and proposes a global framework for learning local explanations simultaneously for multiple target classes.
arXiv Detail & Related papers (2022-07-07T06:50:27Z)
- Selecting the suitable resampling strategy for imbalanced data classification regarding dataset properties [62.997667081978825]
In many application domains such as medicine, information retrieval, cybersecurity, social media, etc., datasets used for inducing classification models often have an unequal distribution of the instances of each class.
This situation, known as imbalanced data classification, causes low predictive performance for the minority class examples.
Oversampling and undersampling techniques are well-known strategies to deal with this problem by balancing the number of examples of each class.
arXiv Detail & Related papers (2021-12-15T18:56:39Z)
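
As a minimal illustration of the oversampling strategy mentioned above, the sketch below balances classes by random oversampling with scikit-learn's `resample`; the data are synthetic, and this does not reproduce the cited paper's procedure for choosing a strategy from dataset properties.

```python
# Minimal random-oversampling sketch with scikit-learn utilities; it shows the
# balancing step only, not how to select a resampling strategy.
import numpy as np
from sklearn.utils import resample

rng = np.random.default_rng(0)
# Hypothetical imbalanced data: 950 majority (class 0) vs. 50 minority (class 1).
X = rng.normal(size=(1000, 5))
y = np.array([0] * 950 + [1] * 50)

X_min, y_min = X[y == 1], y[y == 1]
X_maj, y_maj = X[y == 0], y[y == 0]

# Oversample the minority class with replacement until both classes are equal.
X_min_up, y_min_up = resample(X_min, y_min, replace=True,
                              n_samples=len(y_maj), random_state=0)
X_bal = np.vstack([X_maj, X_min_up])
y_bal = np.concatenate([y_maj, y_min_up])
print(np.bincount(y_bal))  # -> [950 950]
```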
- Feature Selection Using Reinforcement Learning [0.0]
The space of variables or features that can be used to characterize a particular predictor of interest continues to grow exponentially.
Identifying the most characterizing features that minimize the variance without jeopardizing the bias of our models is critical to successfully training a machine learning model.
arXiv Detail & Related papers (2021-01-23T09:24:37Z)
- Reducing Confusion in Active Learning for Part-Of-Speech Tagging [100.08742107682264]
Active learning (AL) uses a data selection algorithm to select useful training samples to minimize annotation cost.
We study the problem of selecting instances which maximally reduce the confusion between particular pairs of output tags.
Our proposed AL strategy outperforms other AL strategies by a significant margin.
arXiv Detail & Related papers (2020-11-02T06:24:58Z)
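
As background for the selection step described above, here is a generic least-confidence active-learning query sketch; the cited paper's confusion-reduction strategy between tag pairs is more specific, and the data and model here are made up.

```python
# Generic least-confidence active-learning selection step (a common baseline,
# not the cited paper's method), with synthetic data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_labeled = rng.normal(size=(100, 10))
y_labeled = rng.integers(0, 3, size=100)      # e.g. three POS-like classes
X_pool = rng.normal(size=(5000, 10))          # unlabeled pool

model = LogisticRegression(max_iter=1000).fit(X_labeled, y_labeled)

# Least-confidence score: 1 - max class probability; pick the top-k most
# uncertain pool items to send to annotators.
proba = model.predict_proba(X_pool)
uncertainty = 1.0 - proba.max(axis=1)
query_idx = np.argsort(-uncertainty)[:20]
print("indices to annotate:", query_idx[:5], "...")
```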
- Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries.
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
arXiv Detail & Related papers (2020-02-27T10:22:17Z)
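
The transductive prototype update described in this entry can be sketched as follows; the meta-learned per-query confidence of the cited paper is replaced here by a simple softmax-over-distances confidence, and all shapes and parameters are assumptions.

```python
# Sketch of transductive prototype refinement: start from support-set class
# means, then fold in the most confident query embeddings per class.
import numpy as np


def refine_prototypes(support_emb, support_y, query_emb, n_classes, top_k=5):
    # Initial prototypes: mean embedding of each class's support examples.
    protos = np.stack([support_emb[support_y == c].mean(axis=0)
                       for c in range(n_classes)])
    # Distances from each query to each prototype -> softmax confidences.
    d = np.linalg.norm(query_emb[:, None, :] - protos[None, :, :], axis=-1)
    conf = np.exp(-d) / np.exp(-d).sum(axis=1, keepdims=True)
    pred = conf.argmax(axis=1)
    # Update each prototype with the mean of its top-k most confident queries
    # together with that class's support examples.
    for c in range(n_classes):
        idx = np.where(pred == c)[0]
        if idx.size == 0:
            continue
        top = idx[np.argsort(-conf[idx, c])[:top_k]]
        protos[c] = np.vstack([support_emb[support_y == c],
                               query_emb[top]]).mean(axis=0)
    return protos
```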