Asymptotically Unbiased Instance-wise Regularized Partial AUC
Optimization: Theory and Algorithm
- URL: http://arxiv.org/abs/2210.03967v2
- Date: Tue, 11 Oct 2022 03:19:57 GMT
- Title: Asymptotically Unbiased Instance-wise Regularized Partial AUC
Optimization: Theory and Algorithm
- Authors: Huiyang Shao, Qianqian Xu, Zhiyong Yang, Shilong Bao, Qingming Huang
- Abstract summary: One-way Partial AUC (OPAUC) and Two-way Partial AUC (TPAUC) measures the average performance of a binary classifier.
Most of the existing methods could only optimize PAUC approximately, leading to inevitable biases that are not controllable.
We present a simpler reformulation of the PAUC problem via distributional robust optimization AUC.
- Score: 101.44676036551537
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Partial Area Under the ROC Curve (PAUC), typically including One-way
Partial AUC (OPAUC) and Two-way Partial AUC (TPAUC), measures the average
performance of a binary classifier within a specific false positive rate and/or
true positive rate interval, which is a widely adopted measure when decision
constraints must be considered. Consequently, PAUC optimization has naturally
attracted increasing attention in the machine learning community within the
last few years. Nonetheless, most of the existing methods can only optimize
PAUC approximately, leading to inevitable biases that are not controllable.
Fortunately, a recent work presents an unbiased formulation of the PAUC
optimization problem via distributionally robust optimization. However, it is
based on the pair-wise formulation of AUC, which suffers from limited
scalability w.r.t. the sample size and a slow convergence rate, especially for
TPAUC. To address this issue, we present a simpler reformulation of the problem
in an asymptotically unbiased and instance-wise manner. For both OPAUC and
TPAUC, we arrive at a nonconvex strongly concave regularized minimax problem over
instance-wise functions. On top of this, we employ an efficient solver that enjoys a
linear per-iteration computational complexity w.r.t. the sample size and a
time complexity of $O(\epsilon^{-1/3})$ to reach an $\epsilon$-stationary point.
Furthermore, we find that the minimax reformulation also facilitates the
theoretical analysis of generalization error as a byproduct. Compared with the
existing results, we present new error bounds that are much easier to prove and
can deal with hypotheses with real-valued outputs. Finally, extensive
experiments on several benchmark datasets demonstrate the effectiveness of our
method.
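For background (these definitions are standard and not taken from the abstract above; normalization constants vary across papers), OPAUC restricts the ROC area to a false positive rate range, while TPAUC additionally restricts the true positive rate: $\mathrm{OPAUC}(\beta)=\int_0^{\beta}\mathrm{TPR}(\mathrm{FPR}^{-1}(t))\,dt$ and $\mathrm{TPAUC}(\alpha,\beta)=\int_0^{\beta}\big[\mathrm{TPR}(\mathrm{FPR}^{-1}(t))-\alpha\big]_{+}\,dt$.
The abstract's algorithmic core is a nonconvex strongly concave minimax problem over instance-wise functions, solved with a per-iteration cost linear in the number of sampled instances. The sketch below is a generic stochastic gradient descent-ascent loop on a toy objective of that shape; the squared-hinge surrogate, the scalar dual variable, and all hyperparameters are illustrative assumptions, not the paper's actual PAUC formulation.
```python
import numpy as np

# Minimal sketch of stochastic gradient descent-ascent (SGDA) for a
# min-max problem over instance-wise functions,
#     min_w max_lam  E_i[ lam * loss(w; x_i, y_i) ] - (rho / 2) * lam ** 2,
# where the -(rho/2)*lam^2 term makes the inner maximization strongly concave.
# The surrogate loss, the scalar dual variable `lam`, and all hyperparameters
# below are illustrative assumptions, not the paper's actual PAUC objective.

rng = np.random.default_rng(0)

# Toy binary data with a linear scoring model h_w(x) = <w, x>.
n, d = 1000, 10
X = rng.normal(size=(n, d))
w_star = rng.normal(size=d)
y = (X @ w_star + 0.5 * rng.normal(size=n) > 0).astype(float)

rho, lr_w, lr_lam = 1.0, 0.05, 0.05
w, lam = np.zeros(d), 0.0

def batch_grads(w, lam, Xb, yb):
    """Gradients of the toy instance-wise objective on one mini-batch."""
    margin = (2.0 * yb - 1.0) * (Xb @ w)          # signed margin per instance
    hinge = np.maximum(0.0, 1.0 - margin)         # squared-hinge surrogate
    grad_w = lam * np.mean(
        Xb * (-2.0 * hinge * (2.0 * yb - 1.0))[:, None], axis=0
    )                                             # d/dw of lam * mean(hinge^2)
    grad_lam = np.mean(hinge ** 2) - rho * lam    # ascent direction for lam
    return grad_w, grad_lam

for _ in range(2000):
    idx = rng.choice(n, size=64, replace=False)   # mini-batch => linear cost
    grad_w, grad_lam = batch_grads(w, lam, X[idx], y[idx])
    w -= lr_w * grad_w                            # descent on primal variables
    lam += lr_lam * grad_lam                      # ascent on the dual variable

print("learned dual variable:", round(lam, 3))
```
Because each step touches only a mini-batch and the dual update is a scalar, the per-iteration cost grows linearly with the mini-batch size times the feature dimension, which mirrors the linear per-iteration complexity highlighted in the abstract.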
Related papers
- Versatile Single-Loop Method for Gradient Estimator: First and Second
Order Optimality, and its Application to Federated Learning [45.78238792836363]
We present a single-loop algorithm named SLEDGE (Single-Loop-E Gradient Estimator) for periodic convergence.
Unlike existing methods, SLEDGE has the advantage of versatility: (i) second-order optimality, (ii) exponential convergence in the PL region, and (iii) smaller complexity under low heterogeneity of data.
arXiv Detail & Related papers (2022-09-01T11:05:26Z) - Balanced Self-Paced Learning for AUC Maximization [88.53174245457268]
Existing self-paced methods are limited to pointwise AUC.
Our algorithm converges to a stationary point on the basis of closed-form solutions.
arXiv Detail & Related papers (2022-07-08T02:09:32Z) - Large-scale Optimization of Partial AUC in a Range of False Positive
Rates [51.12047280149546]
The area under the ROC curve (AUC) is one of the most widely used performance measures for classification models in machine learning.
We develop an efficient approximated gradient descent method based on a recent practical envelope smoothing technique.
Our proposed algorithm can also be used to minimize the sum of ranked range losses, which also lacks efficient solvers.
arXiv Detail & Related papers (2022-03-03T03:46:18Z) - Accelerated and instance-optimal policy evaluation with linear function
approximation [17.995515643150657]
Existing algorithms fail to match at least one of these lower bounds.
We develop an accelerated, variance-reduced fast temporal difference algorithm that simultaneously matches both lower bounds and attains a strong notion of instance-optimality.
arXiv Detail & Related papers (2021-12-24T17:21:04Z) - Learning with Multiclass AUC: Theory and Algorithms [141.63211412386283]
Area under the ROC curve (AUC) is a well-known ranking metric for problems such as imbalanced learning and recommender systems.
In this paper, we start an early trial to consider the problem of learning multiclass scoring functions via optimizing multiclass AUC metrics.
arXiv Detail & Related papers (2021-07-28T05:18:10Z) - Distributionally Robust Federated Averaging [19.875176871167966]
We present communication-efficient distributed algorithms for distributionally robust learning with periodic averaging and adaptive sampling.
We give corroborating experimental evidence for our theoretical results in federated learning settings.
arXiv Detail & Related papers (2021-02-25T03:32:09Z) - Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality [131.45028999325797]
We develop a doubly robust off-policy AC (DR-Off-PAC) for discounted MDP.
DR-Off-PAC adopts a single timescale structure, in which both actor and critics are updated simultaneously with constant stepsize.
We study the finite-time convergence rate and characterize the sample complexity for DR-Off-PAC to attain an $\epsilon$-accurate optimal policy.
arXiv Detail & Related papers (2021-02-23T18:56:13Z) - Approximation Algorithms for Sparse Principal Component Analysis [57.5357874512594]
Principal component analysis (PCA) is a widely used dimension reduction technique in machine learning and statistics.
Various approaches to obtain sparse principal direction loadings have been proposed, which are termed Sparse Principal Component Analysis.
We present thresholding as a provably accurate, polynomial-time approximation algorithm for the SPCA problem.
arXiv Detail & Related papers (2020-06-23T04:25:36Z) - Unified Analysis of Stochastic Gradient Methods for Composite Convex and
Smooth Optimization [15.82816385434718]
We present a unified theorem for the convergence analysis of gradient algorithms for minimizing a smooth and convex loss plus a convex regularizer.
We do this by extending the unified analysis of Gorbunov, Hanzely & Richtárik (2020) and dropping the requirement that the loss function be strongly convex.
Our unified analysis applies to a host of existing algorithms such as proximal SGD (see the illustrative sketch after this list), variance-reduced methods, quantization, and some coordinate descent type methods.
arXiv Detail & Related papers (2020-06-20T13:40:27Z)
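As a concrete instance of the composite setting in the last entry above (a smooth loss plus a convex regularizer), the following is an illustrative proximal SGD sketch with a least-squares loss and an $\ell_1$ regularizer; the data, step size, and regularization weight are arbitrary assumptions rather than choices tied to any of the papers listed here.
```python
import numpy as np

# Illustrative proximal SGD for the composite problem
#     min_w  f(w) + g(w),   f smooth (here: least squares),  g convex
#     (here: lam_reg * ||w||_1, whose prox is soft-thresholding).
rng = np.random.default_rng(1)
n, d, lam_reg, step = 500, 20, 0.1, 0.01
A = rng.normal(size=(n, d))
b = A @ (rng.normal(size=d) * (rng.random(d) < 0.3)) + 0.1 * rng.normal(size=n)

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (closed form)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

w = np.zeros(d)
for _ in range(5000):
    i = rng.integers(n)                                  # sample one data point
    grad = (A[i] @ w - b[i]) * A[i]                      # stochastic gradient of f
    w = soft_threshold(w - step * grad, step * lam_reg)  # prox step handles g

print("nonzeros in w:", int(np.count_nonzero(np.abs(w) > 1e-8)))
```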