Classification Performance Metric Elicitation and its Applications
- URL: http://arxiv.org/abs/2208.09142v1
- Date: Fri, 19 Aug 2022 03:57:17 GMT
- Title: Classification Performance Metric Elicitation and its Applications
- Authors: Gaurush Hiranandani
- Abstract summary: Despite its practical interest, there is limited formal guidance on how to select metrics for machine learning applications.
This thesis outlines metric elicitation as a principled framework for selecting the performance metric that best reflects implicit user preferences.
- Score: 5.5637552942511155
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Given a learning problem with real-world tradeoffs, which cost function
should the model be trained to optimize? This is the metric selection problem
in machine learning. Despite its practical interest, there is limited formal
guidance on how to select metrics for machine learning applications. This
thesis outlines metric elicitation as a principled framework for selecting the
performance metric that best reflects implicit user preferences. Once
specified, the evaluation metric can be used to compare and train models. In
this manuscript, we formalize the problem of Metric Elicitation and devise
novel strategies for eliciting classification performance metrics using
pairwise preference feedback over classifiers. Specifically, we provide novel
strategies for eliciting linear and linear-fractional metrics for binary and
multiclass classification problems, which are then extended to a framework that
elicits group-fair performance metrics in the presence of multiple sensitive
groups. All the elicitation strategies that we discuss are robust to both
finite-sample and feedback noise, and are thus useful in practice for
real-world applications. Using the tools and the geometric characterizations of the
feasible confusion statistics sets from the binary, multiclass, and
multiclass-multigroup classification setups, we further provide strategies to
elicit from a wider range of complex, modern multiclass metrics defined by
quadratic functions of confusion statistics by exploiting their local linear
structure. From an application perspective, we also propose to use the metric
elicitation framework to optimize complex black-box metrics in a manner
amenable to deep network training. Lastly, to bring theory closer to practice, we
conduct a preliminary real-user study that shows the efficacy of the metric
elicitation framework in recovering the users' preferred performance metric in
a binary classification setup.
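To make the elicitation idea concrete, here is a minimal toy sketch of eliciting a hidden linear binary-classification metric from pairwise preference feedback. It is not the thesis's actual procedure: it assumes an idealized quarter-circle feasible region of (TPR, TNR) rates and a noiseless preference oracle, and the hidden angle `theta_star` and the search routine are illustrative inventions.

```python
import math

# Hidden "true" metric: phi(c) = w . c over confusion rates c = (TPR, TNR).
# (Hypothetical toy setup; in practice w is unknown and only preferences are observed.)
theta_star = 1.1  # hidden angle of the metric's weight vector
w = (math.cos(theta_star), math.sin(theta_star))

def classifier_rates(t):
    # Feasible (TPR, TNR) boundary idealized as a quarter circle, a common
    # simplifying geometry in metric-elicitation analyses.
    return (math.cos(t), math.sin(t))

def oracle_prefers(t1, t2):
    # Pairwise preference feedback: does classifier t1 have utility >= t2?
    c1, c2 = classifier_rates(t1), classifier_rates(t2)
    return w[0]*c1[0] + w[1]*c1[1] >= w[0]*c2[0] + w[1]*c2[1]

def elicit(lo=0.0, hi=math.pi/2, tol=1e-6):
    # Utility along the boundary, t -> cos(t - theta_star), is unimodal, so a
    # ternary-style search with O(log(1/tol)) preference queries pins down the
    # oracle-optimal classifier, whose position reveals the metric's tradeoff.
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if oracle_prefers(m2, m1):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2

theta_hat = elicit()
```

The logarithmic query count is the point: each pairwise comparison halves (here, cuts by a third) the remaining uncertainty about the metric's weights, which is why elicitation stays practical even with a human in the loop.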
Related papers
- Optimal Baseline Corrections for Off-Policy Contextual Bandits [61.740094604552475]
We aim to learn decision policies that optimize an unbiased offline estimate of an online reward metric.
We propose a single framework built on their equivalence in learning scenarios.
Our framework enables us to characterize the variance-optimal unbiased estimator and provide a closed-form solution for it.
arXiv Detail & Related papers (2024-05-09T12:52:22Z)
- MISS: Multiclass Interpretable Scoring Systems [13.902264070785986]
We present a machine-learning approach for constructing Multiclass Interpretable Scoring Systems (MISS).
MISS is a fully data-driven methodology for single, sparse, and user-friendly scoring systems for multiclass classification problems.
Results indicate that our approach is competitive with other machine learning models in terms of classification performance metrics and provides well-calibrated class probabilities.
arXiv Detail & Related papers (2024-01-10T10:57:12Z)
- Adaptive Neural Ranking Framework: Toward Maximized Business Goal for Cascade Ranking Systems [33.46891569350896]
Cascade ranking is widely used for large-scale top-k selection problems in online advertising and recommendation systems.
Previous works on learning-to-rank usually focus on letting the model learn the complete order or top-k order.
We name this method the Adaptive Neural Ranking Framework (ARF).
arXiv Detail & Related papers (2023-10-16T14:43:02Z)
- Exploring validation metrics for offline model-based optimisation with diffusion models [50.404829846182764]
In model-based optimisation (MBO) we are interested in using machine learning to design candidates that maximise some measure of reward with respect to a black box function called the (ground truth) oracle.
While an approximation to the ground truth oracle can be trained and used in its place during model validation to measure the mean reward over generated candidates, this evaluation is approximate and vulnerable to adversarial examples.
This is encapsulated under our proposed evaluation framework which is also designed to measure extrapolation.
arXiv Detail & Related papers (2022-11-19T16:57:37Z)
- An Additive Instance-Wise Approach to Multi-class Model Interpretation [53.87578024052922]
Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system.
Existing methods mainly focus on selecting explanatory input features, which follow either locally additive or instance-wise approaches.
This work exploits the strengths of both methods and proposes a global framework for learning local explanations simultaneously for multiple target classes.
arXiv Detail & Related papers (2022-07-07T06:50:27Z)
- How Fine-Tuning Allows for Effective Meta-Learning [50.17896588738377]
We present a theoretical framework for analyzing representations derived from a MAML-like algorithm.
We provide risk bounds on the best predictor found by fine-tuning via gradient descent, demonstrating that the algorithm can provably leverage the shared structure.
These results underscore the benefit of fine-tuning-based methods, such as MAML, over methods with "frozen representation" objectives in few-shot learning.
arXiv Detail & Related papers (2021-05-05T17:56:00Z)
- Optimizing Black-box Metrics with Iterative Example Weighting [32.682652530189266]
We consider learning to optimize a classification metric defined by a black-box function of the confusion matrix.
Our approach is to adaptively learn example weights on the training dataset such that the resulting weighted objective best approximates the metric on the validation sample.
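The example-weighting idea can be caricatured in a few lines. This is a hedged sketch, not the paper's algorithm: a toy threshold "model", a stand-in F1 metric treated as a black box, and a grid search over a hypothetical positive-class weight `pos_weight` are all illustrative inventions.

```python
def confusion(preds, labels):
    # Binary confusion counts (tp, fp, fn, tn).
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    tn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 0)
    return tp, fp, fn, tn

def black_box_metric(tp, fp, fn, tn):
    # Stand-in metric (F1), pretending we can only evaluate it, not inspect it.
    return 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0

def train_weighted(xs, ys, pos_weight):
    # Toy "model": a threshold classifier whose threshold shifts with the
    # weight placed on positive examples (heavier positives -> lower threshold).
    thr = 0.5 / pos_weight
    return lambda x: 1 if x >= thr else 0

def tune_weight(xs, ys, candidates=(0.5, 1.0, 2.0, 4.0)):
    # Search over example weights so the weighted training objective best
    # approximates the black-box metric on held-out data.
    best_w, best_m = None, -1.0
    for w_pos in candidates:
        model = train_weighted(xs, ys, w_pos)
        preds = [model(x) for x in xs]
        m = black_box_metric(*confusion(preds, ys))
        if m > best_m:
            best_w, best_m = w_pos, m
    return best_w, best_m
```

The real method adapts weights iteratively from validation feedback rather than grid-searching, but the loop above captures the interface: the metric is only ever queried, never differentiated.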
arXiv Detail & Related papers (2021-02-18T17:19:09Z)
- Quadratic Metric Elicitation for Fairness and Beyond [28.1407078984806]
This paper develops a strategy for eliciting more flexible multiclass metrics defined by quadratic functions of rates.
We show its application in eliciting quadratic violation-based group-fair metrics.
arXiv Detail & Related papers (2020-11-03T07:22:15Z)
- On Learning Text Style Transfer with Direct Rewards [101.97136885111037]
The lack of parallel corpora makes it impossible to directly train supervised models for the text style transfer task.
We leverage semantic similarity metrics originally used for fine-tuning neural machine translation models.
Our model provides significant gains in both automatic and human evaluation over strong baselines.
arXiv Detail & Related papers (2020-10-24T04:30:02Z)
- Bridging the Gap: Unifying the Training and Evaluation of Neural Network Binary Classifiers [0.4893345190925178]
We propose a unifying approach to training neural network binary classifiers that combines a differentiable approximation of the Heaviside function with a probabilistic view of the typical confusion matrix values using soft sets.
Our theoretical analysis shows the benefit of using our method to optimize for a given evaluation metric, such as the $F$-Score, with soft sets.
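As a rough illustration of the soft-set idea, a differentiable F1 surrogate can replace the hard Heaviside threshold with a temperature-controlled sigmoid. This is a sketch under the common sigmoid-surrogate assumption, not necessarily the paper's exact approximation; the temperature `tau` is an illustrative parameter.

```python
import math

def sigmoid(z, tau=0.1):
    # Smooth Heaviside approximation: as tau -> 0 this approaches a hard step.
    return 1.0 / (1.0 + math.exp(-z / tau))

def soft_f1(scores, labels, threshold=0.5):
    # Soft confusion-matrix entries: each prediction contributes fractionally
    # via the smoothed step, so the F1 surrogate is differentiable in the
    # model scores and can be optimized by gradient descent.
    tp = sum(sigmoid(s - threshold) for s, y in zip(scores, labels) if y == 1)
    fp = sum(sigmoid(s - threshold) for s, y in zip(scores, labels) if y == 0)
    fn = sum(1 - sigmoid(s - threshold) for s, y in zip(scores, labels) if y == 1)
    return 2 * tp / (2 * tp + fp + fn + 1e-12)
```

On well-separated scores the surrogate is close to the hard F1, while near the threshold it degrades gracefully instead of jumping, which is exactly what makes gradient-based training on it workable.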
arXiv Detail & Related papers (2020-09-02T22:13:26Z)
- A Multilayer Framework for Online Metric Learning [71.31889711244739]
This paper proposes a multilayer framework for online metric learning to capture the nonlinear similarities among instances.
A new Mahalanobis-based Online Metric Learning (MOML) algorithm is presented based on the passive-aggressive strategy and one-pass triplet construction strategy.
The proposed multilayer framework, MLOML, enjoys several desirable properties: it learns a metric progressively and performs better on the benchmark datasets.
arXiv Detail & Related papers (2018-05-15T01:10:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.