Classification Performance Metric Elicitation and its Applications
- URL: http://arxiv.org/abs/2208.09142v1
- Date: Fri, 19 Aug 2022 03:57:17 GMT
- Title: Classification Performance Metric Elicitation and its Applications
- Authors: Gaurush Hiranandani
- Abstract summary: Despite its practical interest, there is limited formal guidance on how to select metrics for machine learning applications.
This thesis outlines metric elicitation as a principled framework for selecting the performance metric that best reflects implicit user preferences.
- Score: 5.5637552942511155
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Given a learning problem with real-world tradeoffs, which cost function
should the model be trained to optimize? This is the metric selection problem
in machine learning. Despite its practical interest, there is limited formal
guidance on how to select metrics for machine learning applications. This
thesis outlines metric elicitation as a principled framework for selecting the
performance metric that best reflects implicit user preferences. Once
specified, the evaluation metric can be used to compare and train models. In
this manuscript, we formalize the problem of Metric Elicitation and devise
novel strategies for eliciting classification performance metrics using
pairwise preference feedback over classifiers. Specifically, we provide novel
strategies for eliciting linear and linear-fractional metrics for binary and
multiclass classification problems, which are then extended to a framework that
elicits group-fair performance metrics in the presence of multiple sensitive
groups. All the elicitation strategies that we discuss are robust to both
finite-sample and feedback noise, and are thus useful in practice for
real-world applications. Using the tools and the geometric characterizations of the
feasible confusion statistics sets from the binary, multiclass, and
multiclass-multigroup classification setups, we further provide strategies to
elicit from a wider range of complex, modern multiclass metrics defined by
quadratic functions of confusion statistics by exploiting their local linear
structure. From an application perspective, we also propose to use the metric
elicitation framework to optimize complex black-box metrics in a manner
amenable to deep network training. Lastly, to bring theory closer to practice, we
conduct a preliminary real-user study that shows the efficacy of the metric
elicitation framework in recovering the users' preferred performance metric in
a binary classification setup.
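To make the elicitation idea concrete, here is a minimal toy sketch of eliciting a hidden linear binary-classification metric from pairwise preference feedback. It is not the thesis's actual procedure: it assumes an idealized quarter-circle feasible region of (TPR, TNR) rates and a noiseless preference oracle, and the hidden angle `theta_star` and the search routine are illustrative inventions.

```python
import math

# Hidden "true" metric: phi(c) = w . c over confusion rates c = (TPR, TNR).
# (Hypothetical toy setup; in practice w is unknown and only preferences are observed.)
theta_star = 1.1  # hidden angle of the metric's weight vector
w = (math.cos(theta_star), math.sin(theta_star))

def classifier_rates(t):
    # Feasible (TPR, TNR) boundary idealized as a quarter circle, a common
    # simplifying geometry in metric-elicitation analyses.
    return (math.cos(t), math.sin(t))

def oracle_prefers(t1, t2):
    # Pairwise preference feedback: does classifier t1 have utility >= t2?
    c1, c2 = classifier_rates(t1), classifier_rates(t2)
    return w[0]*c1[0] + w[1]*c1[1] >= w[0]*c2[0] + w[1]*c2[1]

def elicit(lo=0.0, hi=math.pi/2, tol=1e-6):
    # Utility along the boundary, t -> cos(t - theta_star), is unimodal, so a
    # ternary-style search with O(log(1/tol)) preference queries pins down the
    # oracle-optimal classifier, whose position reveals the metric's tradeoff.
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if oracle_prefers(m2, m1):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2

theta_hat = elicit()
```

The logarithmic query count is the point: each pairwise comparison halves (here, cuts by a third) the remaining uncertainty about the metric's weights, which is why elicitation stays practical even with a human in the loop.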
Related papers
- Optimal Baseline Corrections for Off-Policy Contextual Bandits [61.740094604552475]
We aim to learn decision policies that optimize an unbiased offline estimate of an online reward metric.
We propose a single framework built on their equivalence in learning scenarios.
Our framework enables us to characterize the variance-optimal unbiased estimator and provide a closed-form solution for it.
arXiv Detail & Related papers (2024-05-09T12:52:22Z)
- MISS: Multiclass Interpretable Scoring Systems [13.902264070785986]
We present a machine-learning approach for constructing Multiclass Interpretable Scoring Systems (MISS).
MISS is a fully data-driven methodology for single, sparse, and user-friendly scoring systems for multiclass classification problems.
Results indicate that our approach is competitive with other machine learning models in terms of classification performance metrics and provides well-calibrated class probabilities.
arXiv Detail & Related papers (2024-01-10T10:57:12Z)
- Adaptive Neural Ranking Framework: Toward Maximized Business Goal for Cascade Ranking Systems [33.46891569350896]
Cascade ranking is widely used for large-scale top-k selection problems in online advertising and recommendation systems.
Previous works on learning-to-rank usually focus on letting the model learn the complete order or top-k order.
We name this method the Adaptive Neural Ranking Framework (ARF).
arXiv Detail & Related papers (2023-10-16T14:43:02Z)
- Exploring validation metrics for offline model-based optimisation with diffusion models [50.404829846182764]
In model-based optimisation (MBO) we are interested in using machine learning to design candidates that maximise some measure of reward with respect to a black box function called the (ground truth) oracle.
While an approximation to the ground truth oracle can be trained and used in its place during model validation to measure the mean reward over generated candidates, this evaluation is approximate and vulnerable to adversarial examples.
This is encapsulated under our proposed evaluation framework which is also designed to measure extrapolation.
arXiv Detail & Related papers (2022-11-19T16:57:37Z)
- An Additive Instance-Wise Approach to Multi-class Model Interpretation [53.87578024052922]
Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system.
Existing methods mainly focus on selecting explanatory input features, which follow either locally additive or instance-wise approaches.
This work exploits the strengths of both methods and proposes a global framework for learning local explanations simultaneously for multiple target classes.
arXiv Detail & Related papers (2022-07-07T06:50:27Z)
- How Fine-Tuning Allows for Effective Meta-Learning [50.17896588738377]
We present a theoretical framework for analyzing representations derived from a MAML-like algorithm.
We provide risk bounds on the best predictor found by fine-tuning via gradient descent, demonstrating that the algorithm can provably leverage the shared structure.
These results underscore the benefit of fine-tuning-based methods, such as MAML, over methods with "frozen representation" objectives in few-shot learning.
arXiv Detail & Related papers (2021-05-05T17:56:00Z)
- Optimizing Black-box Metrics with Iterative Example Weighting [32.682652530189266]
We consider learning to optimize a classification metric defined by a black-box function of the confusion matrix.
Our approach is to adaptively learn example weights on the training dataset such that the resulting weighted objective best approximates the metric on the validation sample.
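The example-weighting idea can be caricatured in a few lines. This is a hedged sketch, not the paper's algorithm: a toy threshold "model", a stand-in F1 metric treated as a black box, and a grid search over a hypothetical positive-class weight `pos_weight` are all illustrative inventions.

```python
def confusion(preds, labels):
    # Binary confusion counts (tp, fp, fn, tn).
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    tn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 0)
    return tp, fp, fn, tn

def black_box_metric(tp, fp, fn, tn):
    # Stand-in metric (F1), pretending we can only evaluate it, not inspect it.
    return 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0

def train_weighted(xs, ys, pos_weight):
    # Toy "model": a threshold classifier whose threshold shifts with the
    # weight placed on positive examples (heavier positives -> lower threshold).
    thr = 0.5 / pos_weight
    return lambda x: 1 if x >= thr else 0

def tune_weight(xs, ys, candidates=(0.5, 1.0, 2.0, 4.0)):
    # Search over example weights so the weighted training objective best
    # approximates the black-box metric on held-out data.
    best_w, best_m = None, -1.0
    for w_pos in candidates:
        model = train_weighted(xs, ys, w_pos)
        preds = [model(x) for x in xs]
        m = black_box_metric(*confusion(preds, ys))
        if m > best_m:
            best_w, best_m = w_pos, m
    return best_w, best_m
```

The real method adapts weights iteratively from validation feedback rather than grid-searching, but the loop above captures the interface: the metric is only ever queried, never differentiated.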
arXiv Detail & Related papers (2021-02-18T17:19:09Z)
- Quadratic Metric Elicitation for Fairness and Beyond [28.1407078984806]
This paper develops a strategy for eliciting more flexible multiclass metrics defined by quadratic functions of rates.
We show its application in eliciting quadratic violation-based group-fair metrics.
arXiv Detail & Related papers (2020-11-03T07:22:15Z)
- On Learning Text Style Transfer with Direct Rewards [101.97136885111037]
The lack of parallel corpora makes it impossible to directly train supervised models for the text style transfer task.
We leverage semantic similarity metrics originally used for fine-tuning neural machine translation models.
Our model provides significant gains in both automatic and human evaluation over strong baselines.
arXiv Detail & Related papers (2020-10-24T04:30:02Z)
- Bridging the Gap: Unifying the Training and Evaluation of Neural Network Binary Classifiers [0.4893345190925178]
We propose a unifying approach to training neural network binary classifiers that combines a differentiable approximation of the Heaviside function with a probabilistic view of the typical confusion matrix values using soft sets.
Our theoretical analysis shows the benefit of using our method to optimize for a given evaluation metric, such as the $F$-Score, with soft sets.
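As a rough illustration of the soft-set idea, a differentiable F1 surrogate can replace the hard Heaviside threshold with a temperature-controlled sigmoid. This is a sketch under the common sigmoid-surrogate assumption, not necessarily the paper's exact approximation; the temperature `tau` is an illustrative parameter.

```python
import math

def sigmoid(z, tau=0.1):
    # Smooth Heaviside approximation: as tau -> 0 this approaches a hard step.
    return 1.0 / (1.0 + math.exp(-z / tau))

def soft_f1(scores, labels, threshold=0.5):
    # Soft confusion-matrix entries: each prediction contributes fractionally
    # via the smoothed step, so the F1 surrogate is differentiable in the
    # model scores and can be optimized by gradient descent.
    tp = sum(sigmoid(s - threshold) for s, y in zip(scores, labels) if y == 1)
    fp = sum(sigmoid(s - threshold) for s, y in zip(scores, labels) if y == 0)
    fn = sum(1 - sigmoid(s - threshold) for s, y in zip(scores, labels) if y == 1)
    return 2 * tp / (2 * tp + fp + fn + 1e-12)
```

On well-separated scores the surrogate is close to the hard F1, while near the threshold it degrades gracefully instead of jumping, which is exactly what makes gradient-based training on it workable.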
arXiv Detail & Related papers (2020-09-02T22:13:26Z)
- A Multilayer Framework for Online Metric Learning [71.31889711244739]
This paper proposes a multilayer framework for online metric learning to capture the nonlinear similarities among instances.
A new Mahalanobis-based Online Metric Learning (MOML) algorithm is presented based on the passive-aggressive strategy and one-pass triplet construction strategy.
The proposed multilayer framework, MLOML, enjoys several desirable properties: it learns a metric progressively and performs better on the benchmark datasets.
arXiv Detail & Related papers (2018-05-15T01:10:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.