Optimizing ROC Curves with a Sort-Based Surrogate Loss Function for
Binary Classification and Changepoint Detection
- URL: http://arxiv.org/abs/2107.01285v1
- Date: Fri, 2 Jul 2021 21:21:19 GMT
- Title: Optimizing ROC Curves with a Sort-Based Surrogate Loss Function for
Binary Classification and Changepoint Detection
- Authors: Jonathan Hillman and Toby Dylan Hocking
- Abstract summary: We propose a new convex loss function called the AUM, short for Area Under Min(FP, FN).
We show that our new AUM learning algorithm results in improved AUC and comparable speed relative to previous baselines.
- Score: 1.332560004325655
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Receiver Operating Characteristic (ROC) curves are plots of true positive
rate versus false positive rate which are useful for evaluating binary
classification models, but difficult to use for learning since the Area Under
the Curve (AUC) is non-convex. ROC curves can also be used in other problems
that have false positive and true positive rates such as changepoint detection.
We show that in this more general context, the ROC curve can have loops, points
with highly sub-optimal error rates, and AUC greater than one. This observation
motivates a new optimization objective: rather than maximizing the AUC, we
would like a monotonic ROC curve with AUC=1 that avoids points with large
values for Min(FP,FN). We propose a convex relaxation of this objective that
results in a new surrogate loss function called the AUM, short for Area Under
Min(FP, FN). Whereas previous loss functions are based on summing over all
labeled examples or pairs, the AUM requires a sort and a sum over the sequence
of points on the ROC curve. We show that AUM directional derivatives can be
efficiently computed and used in a gradient descent learning algorithm. In our
empirical study of supervised binary classification and changepoint detection
problems, we show that our new AUM minimization learning algorithm results in
improved AUC and comparable speed relative to previous baselines.
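To make the sort-based construction concrete, here is a minimal numpy sketch of an AUM-style quantity for the binary classification case: examples are sorted by predicted score, total false positives and false negatives are tracked as the decision threshold moves across the sorted scores, and min(FP, FN) is integrated over the intervals between consecutive scores. The function name and the (simplistic) handling of tied scores are illustrative assumptions, not the paper's reference implementation.

```python
import numpy as np

def area_under_min(scores, labels):
    """Sketch of an AUM-style loss for binary classification.

    scores: real-valued predictions; labels: 0/1.  An example is predicted
    positive when its score exceeds a threshold t.  FP(t) and FN(t) only
    change at the sorted scores, so integrating min(FP, FN) over t reduces
    to a sort plus a sum over the intervals between consecutive scores."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    order = np.argsort(scores)
    s, y = scores[order], labels[order]

    neg = 1 - y
    # fp_suffix[k]: negatives with index >= k (predicted positive for small t)
    fp_suffix = np.concatenate((np.cumsum(neg[::-1])[::-1], [0]))
    # fn_prefix[k]: positives with index < k (predicted negative)
    fn_prefix = np.concatenate(([0], np.cumsum(y)))

    aum = 0.0
    for k in range(len(s) - 1):
        width = s[k + 1] - s[k]   # length of the threshold interval
        fp = fp_suffix[k + 1]     # thresholds in (s[k], s[k+1]) predict k+1.. positive
        fn = fn_prefix[k + 1]     # examples 0..k predicted negative
        aum += width * min(fp, fn)
    return aum

# Example: the mis-ranked negatives produce a nonzero AUM of 0.4.
print(area_under_min([0.2, 0.6, 0.4, 0.9], [1, 0, 0, 1]))
```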
Related papers
- Efficient line search for optimizing Area Under the ROC Curve in gradient descent [2.094821665776961]
We study the piecewise linear/constant nature of the Area Under Min (AUM) of false positive and false negative rates.
We propose new efficient path-following algorithms for choosing the learning rate that is optimal for each step of gradient descent.
arXiv Detail & Related papers (2024-10-11T08:59:06Z)
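A rough illustration of the line-search idea in the entry above: because a loss such as the AUM is piecewise linear in the step size along a fixed descent direction, a candidate step can be chosen by evaluating the loss at several step sizes and keeping the best. The grid search below is a deliberate simplification of the paper's exact path-following algorithm; `loss_fn`, the toy hinge loss, and the candidate grid are all assumptions for illustration.

```python
import numpy as np

def best_step_size(loss_fn, weights, direction, step_grid):
    """Evaluate a piecewise-linear loss along a fixed descent direction and
    return the candidate step size with the lowest loss.  A crude stand-in
    for an exact path-following line search, which would instead enumerate
    the breakpoints where the slope changes."""
    losses = [loss_fn(weights - step * direction) for step in step_grid]
    best = int(np.argmin(losses))
    return step_grid[best], losses[best]

# Example usage with a toy piecewise-linear (hinge) loss.
x = np.array([[1.0, 2.0], [2.0, 1.0], [-1.0, -1.5]])
y = np.array([1, 1, -1])
def hinge_loss(w):
    return np.maximum(0.0, 1.0 - y * (x @ w)).sum()

w = np.zeros(2)
g = np.array([-1.0, -0.5])   # made-up descent direction for demonstration
step, val = best_step_size(hinge_loss, w, g, np.linspace(0.0, 2.0, 21))
```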
- Adaptive $k$-nearest neighbor classifier based on the local estimation of the shape operator [49.87315310656657]
We introduce a new adaptive $k$-nearest neighbours ($kK$-NN) algorithm that explores the local curvature at a sample to adaptively define the neighborhood size.
Results on many real-world datasets indicate that the new $kK$-NN algorithm yields superior balanced accuracy compared to the established $k$-NN method.
arXiv Detail & Related papers (2024-09-08T13:08:45Z)
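A generic sketch of per-query neighborhood-size adaptation, loosely in the spirit of the entry above; the paper derives the size from a local estimate of the shape operator, which is not reproduced here, and the spread heuristic and all names below are purely assumptions.

```python
import numpy as np
from collections import Counter

def adaptive_knn_predict(x_train, y_train, x_query, k_min=3, k_max=15):
    """Toy adaptive k-NN: pick k per query from a local spread heuristic,
    then take a majority vote among the k nearest training points."""
    dist = np.linalg.norm(x_train - x_query, axis=1)
    order = np.argsort(dist)
    # Local spread: mean distance to the k_max nearest points, relative to
    # the global scale; tighter neighborhoods get a larger k.
    local = dist[order[:k_max]].mean()
    global_scale = dist.mean() + 1e-12
    frac = np.clip(1.0 - local / global_scale, 0.0, 1.0)
    k = int(round(k_min + frac * (k_max - k_min)))
    votes = Counter(y_train[order[:k]].tolist())
    return votes.most_common(1)[0][0]
```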
- Neural Gradient Learning and Optimization for Oriented Point Normal Estimation [53.611206368815125]
We propose a deep learning approach to learn gradient vectors with consistent orientation from 3D point clouds for normal estimation.
We learn an angular distance field based on local plane geometry to refine the coarse gradient vectors.
Our method efficiently conducts global gradient approximation while achieving better accuracy and generalization ability for local feature description.
arXiv Detail & Related papers (2023-09-17T08:35:11Z)
- A Log-linear Gradient Descent Algorithm for Unbalanced Binary Classification using the All Pairs Squared Hinge Loss [0.0]
We propose a new functional representation of the square loss and squared hinge loss, which results in algorithms that compute the gradient in either linear or log-linear time.
Our new algorithm can achieve higher test AUC values on imbalanced data sets than previous algorithms, and make use of larger batch sizes than were previously feasible.
arXiv Detail & Related papers (2023-02-21T23:35:00Z)
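For reference, the all-pairs squared hinge loss discussed above can be written directly as a double sum over (positive, negative) score pairs; the naive quadratic-time version below is only the baseline that the paper's sort-based functional representation accelerates.

```python
import numpy as np

def all_pairs_squared_hinge(scores, labels, margin=1.0):
    """Naive O(n_pos * n_neg) all-pairs squared hinge loss: penalize every
    (positive, negative) pair whose scores are separated by less than
    `margin`."""
    s = np.asarray(scores, dtype=float)
    y = np.asarray(labels, dtype=int)
    diff = s[y == 1][:, None] - s[y == 0][None, :]   # pairwise score gaps
    slack = np.maximum(0.0, margin - diff)
    return float(np.square(slack).sum())
```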
- Asymptotically Unbiased Instance-wise Regularized Partial AUC Optimization: Theory and Algorithm [101.44676036551537]
One-way Partial AUC (OPAUC) and Two-way Partial AUC (TPAUC) measure the average performance of a binary classifier.
Most of the existing methods could only optimize PAUC approximately, leading to inevitable biases that are not controllable.
We present a simpler reformulation of the PAUC problem via distributionally robust optimization.
arXiv Detail & Related papers (2022-10-08T08:26:22Z)
- Large-scale Optimization of Partial AUC in a Range of False Positive Rates [51.12047280149546]
The area under the ROC curve (AUC) is one of the most widely used performance measures for classification models in machine learning.
We develop an efficient approximated gradient descent method based on a recent practical envelope smoothing technique.
Our proposed algorithm can also be used to minimize the sum of some ranked range loss, which also lacks efficient solvers.
arXiv Detail & Related papers (2022-03-03T03:46:18Z)
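The partial AUC objective in the two entries above can be evaluated empirically by restricting the ROC curve to a false positive rate interval; a minimal numpy sketch follows, with the interpolation grid and normalization as assumptions (ties in the FPR values are not handled specially).

```python
import numpy as np

def partial_auc(scores, labels, fpr_low=0.0, fpr_high=0.1):
    """Empirical partial AUC over the FPR range [fpr_low, fpr_high]:
    build the ROC curve by sorting scores, then trapezoid-integrate the
    TPR between the two FPR bounds and normalize by the range width."""
    s = np.asarray(scores, dtype=float)
    y = np.asarray(labels, dtype=int)
    order = np.argsort(-s)                 # descending scores
    y = y[order]
    tps = np.cumsum(y)                     # true positives at each cut
    fps = np.cumsum(1 - y)                 # false positives at each cut
    tpr = np.concatenate(([0.0], tps / max(tps[-1], 1)))
    fpr = np.concatenate(([0.0], fps / max(fps[-1], 1)))
    # Evaluate TPR on a fine grid inside the requested FPR range.
    grid = np.linspace(fpr_low, fpr_high, 200)
    tpr_on_grid = np.interp(grid, fpr, tpr)
    width = np.diff(grid)
    avg_height = 0.5 * (tpr_on_grid[1:] + tpr_on_grid[:-1])
    return float(np.sum(width * avg_height) / (fpr_high - fpr_low))
```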
- Differentiable Annealed Importance Sampling and the Perils of Gradient Noise [68.44523807580438]
Annealed importance sampling (AIS) and related algorithms are highly effective tools for marginal likelihood estimation.
Differentiability is a desirable property as it would admit the possibility of optimizing marginal likelihood as an objective.
We propose a differentiable algorithm by abandoning Metropolis-Hastings steps, which further unlocks mini-batch computation.
arXiv Detail & Related papers (2021-07-21T17:10:14Z)
- Resolving learning rates adaptively by locating Stochastic Non-Negative Associated Gradient Projection Points using line searches [0.0]
Learning rates in neural network training are currently determined a priori to training using expensive manual or automated tuning.
This study proposes gradient-only line searches to resolve the learning rate for neural network training algorithms.
arXiv Detail & Related papers (2020-01-15T03:08:07Z)
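The gradient-only idea in the entry above can be sketched deterministically: instead of comparing loss values, the step size is resolved by locating where the directional derivative changes from negative to non-negative along the search direction. The bisection below is a simplified stand-in for the paper's stochastic procedure; `grad_fn` and the bracketing constants are assumptions.

```python
import numpy as np

def gradient_only_line_search(grad_fn, w, direction, step_max=1.0, tol=1e-6):
    """Find a step size where the directional derivative along `direction`
    changes sign from negative (descent) to non-negative, via bisection.
    Comparing derivative signs instead of loss values is the core of
    gradient-only line searches."""
    def dir_deriv(step):
        return float(np.dot(grad_fn(w + step * direction), direction))

    lo, hi = 0.0, step_max
    # Grow the bracket until the directional derivative is non-negative.
    while dir_deriv(hi) < 0 and hi < 1e6:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if dir_deriv(mid) < 0:
            lo = mid
        else:
            hi = mid
    return hi

# Example with f(w) = ||w||^2 / 2, whose gradient is w; the exact step is 1.
w0 = np.array([3.0, -4.0])
step = gradient_only_line_search(lambda w: w, w0, -w0)   # ~1.0
```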
- Towards Better Understanding of Adaptive Gradient Algorithms in Generative Adversarial Nets [71.05306664267832]
Adaptive algorithms perform gradient updates using the history of gradients and are ubiquitous in training deep neural networks.
In this paper, we analyze a variant of the OptimisticOA algorithm for non-concave min-max problems.
Our experiments empirically compare adaptive and non-adaptive gradient algorithms for GAN training.
arXiv Detail & Related papers (2019-12-26T22:10:10Z)