Linear Classifiers Under Infinite Imbalance
- URL: http://arxiv.org/abs/2106.05797v2
- Date: Fri, 12 May 2023 17:06:27 GMT
- Title: Linear Classifiers Under Infinite Imbalance
- Authors: Paul Glasserman, Mike Li
- Abstract summary: We study the behavior of linear discriminant functions for binary classification in the infinite-imbalance limit.
We show that for a broad class of weight functions, the intercept diverges but the rest of the coefficient vector has a finite almost sure limit under infinite imbalance.
- Score: 1.370633147306388
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the behavior of linear discriminant functions for binary
classification in the infinite-imbalance limit, where the sample size of one
class grows without bound while the sample size of the other remains fixed. The
coefficients of the classifier minimize an empirical loss specified through a
weight function. We show that for a broad class of weight functions, the
intercept diverges but the rest of the coefficient vector has a finite almost
sure limit under infinite imbalance, extending prior work on logistic
regression. The limit depends on the left-tail growth rate of the weight
function, for which we distinguish two cases: subexponential and exponential.
The limiting coefficient vectors reflect robustness or conservatism properties
in the sense that they optimize against certain worst-case alternatives. In the
subexponential case, the limit is equivalent to an implicit choice of
upsampling distribution for the minority class. We apply these ideas in a
credit risk setting, with particular emphasis on performance in the
high-sensitivity and high-specificity regions.
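The intercept divergence and slope convergence described above are easy to observe numerically. The sketch below is a minimal illustration, not the paper's experimental setup: it assumes plain (unweighted) logistic regression as the loss, Gaussian class distributions, and scikit-learn's `LogisticRegression` with a large `C` to approximate the unpenalized fit.

```python
# Minimal sketch: grow the majority class while holding the minority sample
# fixed; the fitted intercept drifts off to -infinity while the slope settles.
# Gaussian classes and sample sizes are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
x_minority = rng.normal(loc=2.0, size=(50, 1))   # fixed minority sample

for n_major in [10**3, 10**4, 10**5, 10**6]:
    x_majority = rng.normal(size=(n_major, 1))   # standard-normal majority class
    X = np.vstack([x_majority, x_minority])
    y = np.r_[np.zeros(n_major), np.ones(50)]
    clf = LogisticRegression(C=1e6, max_iter=1000).fit(X, y)  # near-unpenalized MLE
    print(f"n_major={n_major:>7d}  intercept={clf.intercept_[0]:8.3f}  "
          f"slope={clf.coef_[0, 0]:6.3f}")
```

Consistent with the logistic-regression case the paper extends, the intercept should fall roughly logarithmically in the majority sample size while the slope stabilizes near a finite limit.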
Related papers
- The Implicit Bias of Gradient Descent on Separable Multiclass Data [38.05903703331163]
We employ the framework of Permutation Equivariant and Relative Margin-based (PERM) losses to introduce a multiclass extension of the exponential tail property.
Our proof techniques closely mirror those of the binary case, thus illustrating the power of the PERM framework for bridging the binary-multiclass gap.
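For intuition on the binary phenomenon being extended here, the following generic sketch (not the paper's PERM construction) runs gradient descent with the exponential-tailed logistic loss on linearly separable data; the iterate's norm grows without bound, but its direction stabilizes.

```python
# Generic illustration of implicit bias: gradient descent with logistic loss on
# separable data; w/||w|| converges in direction while ||w|| keeps growing.
# The 2-D Gaussian data and step size are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(40, 2))
y = np.where(X[:, 0] + 0.5 * X[:, 1] > 0, 1.0, -1.0)  # separable labels

w = np.zeros(2)
for t in range(1, 20001):
    margins = np.clip(y * (X @ w), -50, 50)
    p = 1.0 / (1.0 + np.exp(margins))          # -dloss/dmargin for logistic loss
    w += 0.5 * (X.T @ (y * p)) / len(y)        # gradient-descent step
    if t % 5000 == 0:
        print(t, np.round(w / np.linalg.norm(w), 4))   # direction stabilizes
```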
arXiv Detail & Related papers (2024-11-02T19:39:21Z)
- Refined Risk Bounds for Unbounded Losses via Transductive Priors [58.967816314671296]
We revisit the sequential variants of linear regression with the squared loss, classification problems with hinge loss, and logistic regression.
Our key tools are based on the exponential weights algorithm with carefully chosen transductive priors.
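At its core the exponential weights algorithm is a one-line update; the sketch below is the textbook version with a uniform prior and absolute loss, both illustrative stand-ins for the carefully chosen transductive priors and loss classes studied in the paper.

```python
# Textbook exponential weights over K experts (uniform prior, absolute loss).
import numpy as np

def exponential_weights(expert_preds, outcomes, eta=0.5):
    """expert_preds: (T, K) expert predictions; outcomes: (T,) realized targets."""
    T, K = expert_preds.shape
    log_w = np.zeros(K)                    # log-weights; uniform prior
    aggregated = np.empty(T)
    for t in range(T):
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        aggregated[t] = w @ expert_preds[t]                    # weighted prediction
        log_w -= eta * np.abs(expert_preds[t] - outcomes[t])   # exponential reweighting
    return aggregated

preds = np.tile([0.0, 1.0], (10, 1))       # expert 0 always wrong, expert 1 right
print(exponential_weights(preds, np.ones(10))[-1])   # mass shifts to expert 1
```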
arXiv Detail & Related papers (2024-10-29T00:01:04Z)
- Kernel-based off-policy estimation without overlap: Instance optimality beyond semiparametric efficiency [53.90687548731265]
We study optimal procedures for estimating a linear functional based on observational data.
For any convex and symmetric function class $\mathcal{F}$, we derive a non-asymptotic local minimax bound on the mean-squared error.
arXiv Detail & Related papers (2023-01-16T02:57:37Z)
- Off-policy estimation of linear functionals: Non-asymptotic theory for semi-parametric efficiency [59.48096489854697]
The problem of estimating a linear functional based on observational data is canonical in both the causal inference and bandit literatures.
We prove non-asymptotic upper bounds on the mean-squared error of such procedures.
We establish its instance-dependent optimality in finite samples via matching non-asymptotic local minimax lower bounds.
arXiv Detail & Related papers (2022-09-26T23:50:55Z)
- Data-Driven Influence Functions for Optimization-Based Causal Inference [105.5385525290466]
We study a constructive algorithm that approximates Gateaux derivatives for statistical functionals by finite differencing.
We study the case where probability distributions are not known a priori but need to be estimated from data.
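The finite-differencing idea is concrete: mix the empirical distribution with a point mass and difference the functional values. Below is a minimal sketch using the variance functional as an illustrative example; the functional and step size are assumptions, not the paper's algorithm.

```python
# Finite-difference approximation of a Gateaux derivative (influence function).
import numpy as np

def variance_functional(points, weights):
    mu = weights @ points
    return weights @ (points - mu) ** 2

def gateaux_fd(points, x, eps=1e-4):
    """Approximate d/de T((1 - e) P_n + e * delta_x) at e = 0."""
    n = len(points)
    t0 = variance_functional(points, np.full(n, 1.0 / n))
    t1 = variance_functional(np.append(points, x),
                             np.append(np.full(n, (1.0 - eps) / n), eps))
    return (t1 - t0) / eps

data = np.random.default_rng(2).normal(size=100_000)
x = 1.5
print(gateaux_fd(data, x))                  # finite-difference estimate
print((x - data.mean())**2 - data.var())    # exact influence function of the variance
```

The two printed numbers should agree to several decimals, since the influence function of the variance at x is (x - mu)^2 - sigma^2.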
arXiv Detail & Related papers (2022-08-29T16:16:22Z)
- Inference on Strongly Identified Functionals of Weakly Identified Functions [71.42652863687117]
We study a novel condition under which the functional is strongly identified even when the nuisance function is not uniquely identified.
We propose penalized minimax estimators for both the primary and debiasing nuisance functions.
arXiv Detail & Related papers (2022-08-17T13:38:31Z)
- High-dimensional limit theorems for SGD: Effective dynamics and critical scaling [6.950316788263433]
We prove limit theorems for the trajectories of summary statistics of stochastic gradient descent (SGD).
We show a critical scaling regime for the step-size, below which the effective ballistic dynamics matches gradient flow for the population loss.
Near the fixed points of this effective dynamics, the corresponding diffusive limits can be quite complex and even degenerate.
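As a toy illustration of the ballistic regime, the sketch below runs SGD on the population loss 0.5 * ||theta||^2 with a small step size and compares a summary statistic of the trajectory to its gradient-flow prediction; the quadratic loss and additive-noise gradients are illustrative assumptions.

```python
# Small-step-size SGD on 0.5 * ||theta||^2 with noisy gradients: the summary
# statistic ||theta_T||^2 / d tracks the gradient-flow ODE solution e^{-2T}.
import numpy as np

rng = np.random.default_rng(3)
d, delta, steps = 500, 1e-3, 1000
theta = np.ones(d)
for _ in range(steps):
    grad = theta + rng.normal(size=d)    # unbiased stochastic gradient
    theta -= delta * grad

T = delta * steps                        # effective continuous time
print("SGD summary statistic:   ", theta @ theta / d)
print("gradient-flow prediction:", np.exp(-2 * T))
```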
arXiv Detail & Related papers (2022-06-08T17:42:18Z)
- Asymptotic Inference for Infinitely Imbalanced Logistic Regression [4.981260380070016]
We show that the variance of the limiting slope depends exponentially on the z-score of the average of the minority class's points with respect to the majority class's distribution.
We confirm our results by Monte Carlo simulations.
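A hedged sketch of this kind of Monte Carlo check: re-fit an imbalanced logistic regression many times and report how the spread of the fitted slope varies as the minority mean (and hence its z-score under the standard-normal majority) moves out. The sample sizes and distributions are illustrative, not the paper's design.

```python
# Monte Carlo: sampling variability of the logistic-regression slope under
# heavy imbalance, as the minority mean (z-score) increases.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
for mu in [0.0, 1.0, 2.0]:
    slopes = []
    for _ in range(100):
        x0 = rng.normal(size=(10_000, 1))             # majority: standard normal
        x1 = rng.normal(loc=mu, size=(10, 1))         # small minority class
        X, y = np.vstack([x0, x1]), np.r_[np.zeros(10_000), np.ones(10)]
        slopes.append(LogisticRegression(C=1e6, max_iter=1000).fit(X, y).coef_[0, 0])
    print(f"z-score {mu:.1f}: slope variance {np.var(slopes):.4f}")
```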
arXiv Detail & Related papers (2022-04-27T23:52:42Z)
- Origins of Low-dimensional Adversarial Perturbations [17.17170592140042]
We study the phenomenon of low-dimensional adversarial perturbations in classification.
The goal is to fool the classifier into flipping its decision on a nonzero fraction of inputs from a designated class.
We compute lower bounds for the fooling rate of any subspace.
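As a generic illustration of the setup (a linear classifier standing in for the models studied in the paper), a single shared direction already defines a one-dimensional adversarial subspace whose fooling rate can be read off directly:

```python
# A shared rank-one perturbation fooling a linear classifier on part of a
# designated class. Gaussian data and the linear model are illustrative.
import numpy as np

rng = np.random.default_rng(5)
d = 100
w = rng.normal(size=d)
w /= np.linalg.norm(w)                       # unit-norm classifier weights
X = rng.normal(size=(5000, d)) + 0.5 * w     # designated class, mostly w @ x > 0
in_class = X[X @ w > 0]                      # inputs currently classified positive

eps = 1.0
flipped = (in_class - eps * w) @ w <= 0      # same perturbation -eps*w for all inputs
print(f"fooling rate at eps={eps}: {flipped.mean():.2%}")
```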
arXiv Detail & Related papers (2022-03-25T17:02:49Z) - Optimal policy evaluation using kernel-based temporal difference methods [78.83926562536791]
We use reproducing kernel Hilbert spaces to estimate the value function of an infinite-horizon discounted Markov reward process.
We derive a non-asymptotic upper bound on the error with explicit dependence on the eigenvalues of the associated kernel operator.
We prove minimax lower bounds over sub-classes of MRPs.
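A minimal kernel temporal-difference sketch makes the object concrete: represent the value function in the span of kernels at observed states and solve the empirical TD fixed point by regularized least squares. The RBF kernel, ridge term, and toy Markov reward process below are illustrative assumptions, not the paper's estimator.

```python
# Kernel TD sketch for a discounted MRP: V(.) = sum_j alpha_j k(s_j, .),
# fitted so that V(s_i) ~ r_i + gamma * V(s'_i) on observed transitions.
import numpy as np

def rbf(A, B, bw=0.5):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * bw**2))

rng = np.random.default_rng(6)
gamma, lam, n = 0.9, 1e-3, 300
s = rng.uniform(-1, 1, size=(n, 1))                       # sampled states
s_next = np.clip(0.8 * s + 0.1 * rng.normal(size=(n, 1)), -1, 1)
r = s[:, 0] ** 2                                          # toy reward

A = rbf(s, s) - gamma * rbf(s_next, s)                    # empirical TD operator
alpha = np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ r)  # ridge-regularized fit
value = lambda q: rbf(np.atleast_2d(q), s) @ alpha
print(value([[0.0]]), value([[0.9]]))                     # fitted values at two states
```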
arXiv Detail & Related papers (2021-09-24T14:48:20Z)
- Lipschitz Bounded Equilibrium Networks [3.2872586139884623]
This paper introduces new parameterizations of equilibrium neural networks, i.e., networks defined by implicit equations.
The new parameterization admits a Lipschitz bound during training via unconstrained optimization.
In image classification experiments we show that the Lipschitz bounds are very accurate and improve robustness to adversarial attacks.
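A generic sketch of the underlying mechanism: for an implicit layer z = tanh(W z + U x + b), forcing the spectral norm of W below 1 makes the fixed-point map a contraction, so the equilibrium exists and the input-output map is Lipschitz with constant ||U|| / (1 - ||W||). The direct rescaling of W below is an illustrative stand-in for the paper's parameterization, which obtains such bounds through unconstrained optimization.

```python
# Equilibrium network z = tanh(W z + U x + b): with ||W||_2 < 1 and tanh
# 1-Lipschitz, fixed-point iteration converges and x -> z has Lipschitz
# constant at most ||U||_2 / (1 - ||W||_2).
import numpy as np

rng = np.random.default_rng(7)
n_z, n_x = 8, 3
W = rng.normal(size=(n_z, n_z))
W *= 0.9 / np.linalg.norm(W, 2)              # enforce ||W||_2 = 0.9 < 1
U = rng.normal(size=(n_z, n_x))
b = rng.normal(size=n_z)

def equilibrium(x, iters=200):
    z = np.zeros(n_z)
    for _ in range(iters):                   # contraction => geometric convergence
        z = np.tanh(W @ z + U @ x + b)
    return z

x1, x2 = rng.normal(size=n_x), rng.normal(size=n_x)
bound = np.linalg.norm(U, 2) / (1 - 0.9)
ratio = np.linalg.norm(equilibrium(x1) - equilibrium(x2)) / np.linalg.norm(x1 - x2)
print(f"observed ratio {ratio:.3f} <= certified bound {bound:.3f}")
```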
arXiv Detail & Related papers (2020-10-05T01:00:40Z)