An Analysis of Loss Functions for Binary Classification and Regression
- URL: http://arxiv.org/abs/2301.07638v1
- Date: Wed, 18 Jan 2023 16:26:57 GMT
- Title: An Analysis of Loss Functions for Binary Classification and Regression
- Authors: Jeffrey Buzas
- Abstract summary: This paper explores connections between margin-based loss functions and consistency in binary classification and regression applications.
A simple characterization for conformable (consistent) loss functions is given, which allows for straightforward comparison of different losses.
A relation between the margin and standardized logistic regression residuals is derived, demonstrating that all margin-based losses can be viewed as loss functions of squared standardized logistic regression residuals.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper explores connections between margin-based loss functions and
consistency in binary classification and regression applications. It is shown
that a large class of margin-based loss functions for binary
classification/regression results in estimating scores equivalent to
log-likelihood scores weighted by an even function. A simple characterization
for conformable (consistent) loss functions is given, which allows for
straightforward comparison of different losses, including exponential loss,
logistic loss, and others. The characterization is used to construct a new
Huber-type loss function for the logistic model. A simple relation between the
margin and standardized logistic regression residuals is derived, demonstrating
that all margin-based losses can be viewed as loss functions of squared
standardized logistic regression residuals. The relation provides new,
straightforward interpretations for exponential and logistic loss, and aids in
understanding why exponential loss is sensitive to outliers. In particular, it
is shown that minimizing empirical exponential loss is equivalent to minimizing
the sum of squared standardized logistic regression residuals. The relation
also provides new insight into the AdaBoost algorithm.
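As a quick illustration of the stated equivalence, the sketch below checks numerically that the exponential loss of the margin equals the squared standardized (Pearson) logistic regression residual. The label coding in {-1, +1}, the use of the full log-odds as the score f(x), and the Pearson form of the residual are assumptions made for this illustration and may differ from the paper's exact conventions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
f = rng.normal(size=1_000)                # linear predictor (log-odds) f(x)
y = rng.choice([-1.0, 1.0], size=1_000)   # labels coded in {-1, +1}
y01 = (y + 1.0) / 2.0                     # same labels coded in {0, 1}

p = sigmoid(f)                            # fitted probability P(Y = 1 | x)
margin = y * f                            # classification margin

exp_loss = np.exp(-margin)                              # exponential (AdaBoost) loss
std_resid_sq = (y01 - p) ** 2 / (p * (1.0 - p))         # squared standardized residual

# The two quantities agree elementwise, so minimizing empirical exponential
# loss is the same as minimizing the sum of squared standardized residuals.
assert np.allclose(exp_loss, std_resid_sq)
```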
Related papers
- On the Dynamics Under the Unhinged Loss and Beyond [104.49565602940699]
We introduce the unhinged loss, a concise loss function that offers more mathematical opportunities to analyze closed-form dynamics.
The unhinged loss allows for considering more practical techniques, such as time-varying learning rates and feature normalization.
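For reference, the unhinged loss is usually defined as linear in the margin, l(m) = 1 - m with m = y f(x); the sketch below assumes that standard definition rather than any variant specific to the paper.

```python
import numpy as np

def unhinged_loss(y, f):
    # Linear in the margin m = y * f, so its gradient with respect to f is
    # just -y, which is what makes the training dynamics analyzable in
    # closed form.  The loss is unbounded below (it can be negative).
    return 1.0 - np.asarray(y) * np.asarray(f)
```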
arXiv Detail & Related papers (2023-12-13T02:11:07Z)
- Implicit Bias of Gradient Descent for Logistic Regression at the Edge of Stability [69.01076284478151]
In machine learning optimization, gradient descent (GD) often operates at the edge of stability (EoS).
This paper studies the convergence and implicit bias of constant-stepsize GD for logistic regression on linearly separable data in the EoS regime.
arXiv Detail & Related papers (2023-05-19T16:24:47Z)
- Cross-Entropy Loss Functions: Theoretical Analysis and Applications [27.3569897539488]
We present a theoretical analysis of a broad family of loss functions that includes cross-entropy (or logistic loss), generalized cross-entropy, the mean absolute error, and other cross-entropy-like loss functions.
We show that these loss functions are beneficial in the adversarial setting by proving that they admit $H$-consistency bounds.
This leads to new adversarial robustness algorithms that consist of minimizing a regularized smooth adversarial comp-sum loss.
arXiv Detail & Related papers (2023-04-14T17:58:23Z)
- Soft-SVM Regression For Binary Classification [0.0]
We introduce a new exponential family based on a convex relaxation of the hinge loss function using softness and class-separation parameters.
This new family, denoted Soft-SVM, allows us to prescribe a generalized linear model that effectively bridges between logistic regression and SVM classification.
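A generic softplus-smoothed hinge is sketched below, only to illustrate the idea of a loss that interpolates between hinge (SVM) and logistic-type behavior; the function and its softness parameter tau are illustrative and are not the Soft-SVM family defined in that paper.

```python
import numpy as np

def smoothed_hinge(margin, tau=1.0):
    # tau -> 0 recovers the hinge loss max(0, 1 - m); tau = 1 gives a
    # smoother, logistic-like penalty.  logaddexp keeps the computation
    # numerically stable for large negative margins.
    return tau * np.logaddexp(0.0, (1.0 - np.asarray(margin)) / tau)
```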
arXiv Detail & Related papers (2022-05-24T03:01:35Z)
- Nonconvex Extension of Generalized Huber Loss for Robust Learning and Pseudo-Mode Statistics [0.0]
We show that using the log-exp transform together with the logistic function, we can create a combined loss with desirable robustness properties.
We show a robust generalization that can be minimized with exponential convergence.
arXiv Detail & Related papers (2022-02-22T19:32:02Z)
- Leveraging Unlabeled Data for Entity-Relation Extraction through Probabilistic Constraint Satisfaction [54.06292969184476]
We study the problem of entity-relation extraction in the presence of symbolic domain knowledge.
Our approach employs a semantic loss, which captures the precise meaning of a logical sentence.
With a focus on low-data regimes, we show that semantic loss outperforms the baselines by a wide margin.
arXiv Detail & Related papers (2021-03-20T00:16:29Z)
- A Symmetric Loss Perspective of Reliable Machine Learning [87.68601212686086]
We review how a symmetric loss can yield robust classification from corrupted labels in balanced error rate (BER) minimization.
We demonstrate how the robust AUC method can benefit natural language processing in problems where we want to learn only from relevant keywords.
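A symmetric loss is one satisfying l(m) + l(-m) = constant for every margin m; the sigmoid loss is a standard example. The check below is a generic illustration of that property, not code from the paper.

```python
import numpy as np

def sigmoid_loss(margin):
    # Bounded, symmetric surrogate: sigmoid_loss(m) + sigmoid_loss(-m) == 1.
    return 1.0 / (1.0 + np.exp(np.asarray(margin)))

m = np.linspace(-5.0, 5.0, 101)
assert np.allclose(sigmoid_loss(m) + sigmoid_loss(-m), 1.0)
```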
arXiv Detail & Related papers (2021-01-05T06:25:47Z)
- A Framework of Learning Through Empirical Gain Maximization [8.834480010537229]
We develop a framework of empirical gain maximization (EGM) to address the robust regression problem.
Tukey's biweight loss, among other robust loss functions, can be derived within this framework.
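For reference, a standard implementation of Tukey's biweight (bisquare) loss is sketched below; the tuning constant c = 4.685 is the conventional default for 95% efficiency under Gaussian errors and is not taken from the paper.

```python
import numpy as np

def tukey_biweight(residuals, c=4.685):
    # Quadratic-like near zero, constant (c**2 / 6) for |r| > c, so gross
    # outliers contribute nothing to the gradient.
    r = np.asarray(residuals, dtype=float)
    loss = np.full_like(r, c ** 2 / 6.0)
    inside = np.abs(r) <= c
    loss[inside] = (c ** 2 / 6.0) * (1.0 - (1.0 - (r[inside] / c) ** 2) ** 3)
    return loss
```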
arXiv Detail & Related papers (2020-09-29T18:36:26Z)
- An Equivalence between Loss Functions and Non-Uniform Sampling in Experience Replay [72.23433407017558]
We show that any loss function evaluated with non-uniformly sampled data can be transformed into another uniformly sampled loss function.
Surprisingly, we find that in some environments PER can be replaced entirely by this new loss function without impact on empirical performance.
arXiv Detail & Related papers (2020-07-12T17:45:24Z)
- Approximation Schemes for ReLU Regression [80.33702497406632]
We consider the fundamental problem of ReLU regression.
The goal is to output the best-fitting ReLU with respect to square loss, given draws from some unknown distribution.
arXiv Detail & Related papers (2020-05-26T16:26:17Z)
- The Implicit Bias of Gradient Descent on Separable Data [44.98410310356165]
We show the predictor converges to the direction of the max-margin (hard margin SVM) solution.
This can help explain the benefit of continuing to optimize the logistic or cross-entropy loss even after the training error is zero.
arXiv Detail & Related papers (2017-10-27T21:47:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.