An Analysis of Loss Functions for Binary Classification and Regression
- URL: http://arxiv.org/abs/2301.07638v1
- Date: Wed, 18 Jan 2023 16:26:57 GMT
- Title: An Analysis of Loss Functions for Binary Classification and Regression
- Authors: Jeffrey Buzas
- Abstract summary: This paper explores connections between margin-based loss functions and consistency in binary classification and regression applications.
A simple characterization for conformable (consistent) loss functions is given, which allows for straightforward comparison of different losses.
A relation between the margin and standardized logistic regression residuals is derived, demonstrating that all margin-based losses can be viewed as loss functions of squared standardized logistic regression residuals.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper explores connections between margin-based loss functions and
consistency in binary classification and regression applications. It is shown
that a large class of margin-based loss functions for binary
classification/regression results in estimating scores equivalent to
log-likelihood scores weighted by an even function. A simple characterization
for conformable (consistent) loss functions is given, which allows for
straightforward comparison of different losses, including exponential loss,
logistic loss, and others. The characterization is used to construct a new
Huber-type loss function for the logistic model. A simple relation between the
margin and standardized logistic regression residuals is derived, demonstrating
that all margin-based losses can be viewed as loss functions of squared
standardized logistic regression residuals. The relation provides new,
straightforward interpretations for exponential and logistic loss, and aids in
understanding why exponential loss is sensitive to outliers. In particular, it
is shown that minimizing empirical exponential loss is equivalent to minimizing
the sum of squared standardized logistic regression residuals. The relation
also provides new insight into the AdaBoost algorithm.
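As a quick illustration of the stated equivalence, the sketch below checks numerically that the exponential loss of the margin equals the squared standardized (Pearson) logistic regression residual. The label coding in {-1, +1}, the use of the full log-odds as the score f(x), and the Pearson form of the residual are assumptions made for this illustration and may differ from the paper's exact conventions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
f = rng.normal(size=1_000)                # linear predictor (log-odds) f(x)
y = rng.choice([-1.0, 1.0], size=1_000)   # labels coded in {-1, +1}
y01 = (y + 1.0) / 2.0                     # same labels coded in {0, 1}

p = sigmoid(f)                            # fitted probability P(Y = 1 | x)
margin = y * f                            # classification margin

exp_loss = np.exp(-margin)                              # exponential (AdaBoost) loss
std_resid_sq = (y01 - p) ** 2 / (p * (1.0 - p))         # squared standardized residual

# The two quantities agree elementwise, so minimizing empirical exponential
# loss is the same as minimizing the sum of squared standardized residuals.
assert np.allclose(exp_loss, std_resid_sq)
```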
Related papers
- On the Dynamics Under the Unhinged Loss and Beyond [104.49565602940699]
We introduce the unhinged loss, a concise loss function that offers more mathematical opportunities to analyze closed-form dynamics.
The unhinged loss allows for considering more practical techniques, such as time-varying learning rates and feature normalization.
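For reference, the unhinged loss is usually defined as linear in the margin, l(m) = 1 - m with m = y f(x); the sketch below assumes that standard definition rather than any variant specific to the paper.

```python
import numpy as np

def unhinged_loss(y, f):
    # Linear in the margin m = y * f, so its gradient with respect to f is
    # just -y, which is what makes the training dynamics analyzable in
    # closed form.  The loss is unbounded below (it can be negative).
    return 1.0 - np.asarray(y) * np.asarray(f)
```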
arXiv Detail & Related papers (2023-12-13T02:11:07Z)
- Implicit Bias of Gradient Descent for Logistic Regression at the Edge of Stability [69.01076284478151]
In machine learning optimization, gradient descent (GD) often operates at the edge of stability (EoS).
This paper studies the convergence and implicit bias of constant-stepsize GD for logistic regression on linearly separable data in the EoS regime.
arXiv Detail & Related papers (2023-05-19T16:24:47Z)
- Cross-Entropy Loss Functions: Theoretical Analysis and Applications [27.3569897539488]
We present a theoretical analysis of a broad family of loss functions that includes cross-entropy (or logistic loss), generalized cross-entropy, the mean absolute error, and other cross-entropy-like loss functions.
We show that these loss functions are beneficial in the adversarial setting by proving that they admit $H$-consistency bounds.
This leads to new adversarial robustness algorithms that consist of minimizing a regularized smooth adversarial comp-sum loss.
arXiv Detail & Related papers (2023-04-14T17:58:23Z)
- Soft-SVM Regression For Binary Classification [0.0]
We introduce a new exponential family based on a convex relaxation of the hinge loss function using softness and class-separation parameters.
This new family, denoted Soft-SVM, allows us to prescribe a generalized linear model that effectively bridges between logistic regression and SVM classification.
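A generic softplus-smoothed hinge is sketched below, only to illustrate the idea of a loss that interpolates between hinge (SVM) and logistic-type behavior; the function and its softness parameter tau are illustrative and are not the Soft-SVM family defined in that paper.

```python
import numpy as np

def smoothed_hinge(margin, tau=1.0):
    # tau -> 0 recovers the hinge loss max(0, 1 - m); tau = 1 gives a
    # smoother, logistic-like penalty.  logaddexp keeps the computation
    # numerically stable for large negative margins.
    return tau * np.logaddexp(0.0, (1.0 - np.asarray(margin)) / tau)
```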
arXiv Detail & Related papers (2022-05-24T03:01:35Z)
- Nonconvex Extension of Generalized Huber Loss for Robust Learning and Pseudo-Mode Statistics [0.0]
We show that using the log-exp transform together with the logistic function, we can create a combined loss with desirable robustness properties.
We show a robust generalization that can be minimized with exponential convergence.
arXiv Detail & Related papers (2022-02-22T19:32:02Z)
- Leveraging Unlabeled Data for Entity-Relation Extraction through Probabilistic Constraint Satisfaction [54.06292969184476]
We study the problem of entity-relation extraction in the presence of symbolic domain knowledge.
Our approach employs a semantic loss, which captures the precise meaning of a logical sentence.
With a focus on low-data regimes, we show that semantic loss outperforms the baselines by a wide margin.
arXiv Detail & Related papers (2021-03-20T00:16:29Z)
- A Symmetric Loss Perspective of Reliable Machine Learning [87.68601212686086]
We review how a symmetric loss can yield robust classification from corrupted labels in balanced error rate (BER) minimization.
We demonstrate how the robust AUC method can benefit natural language processing in problems where we want to learn only from relevant keywords.
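A symmetric loss is one satisfying l(m) + l(-m) = constant for every margin m; the sigmoid loss is a standard example. The check below is a generic illustration of that property, not code from the paper.

```python
import numpy as np

def sigmoid_loss(margin):
    # Bounded, symmetric surrogate: sigmoid_loss(m) + sigmoid_loss(-m) == 1.
    return 1.0 / (1.0 + np.exp(np.asarray(margin)))

m = np.linspace(-5.0, 5.0, 101)
assert np.allclose(sigmoid_loss(m) + sigmoid_loss(-m), 1.0)
```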
arXiv Detail & Related papers (2021-01-05T06:25:47Z)
- A Framework of Learning Through Empirical Gain Maximization [8.834480010537229]
We develop a framework of empirical gain maximization (EGM) to address the robust regression problem.
Tukey's biweight loss, among other robust loss functions, can be derived within this framework.
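For reference, a standard implementation of Tukey's biweight (bisquare) loss is sketched below; the tuning constant c = 4.685 is the conventional default for 95% efficiency under Gaussian errors and is not taken from the paper.

```python
import numpy as np

def tukey_biweight(residuals, c=4.685):
    # Quadratic-like near zero, constant (c**2 / 6) for |r| > c, so gross
    # outliers contribute nothing to the gradient.
    r = np.asarray(residuals, dtype=float)
    loss = np.full_like(r, c ** 2 / 6.0)
    inside = np.abs(r) <= c
    loss[inside] = (c ** 2 / 6.0) * (1.0 - (1.0 - (r[inside] / c) ** 2) ** 3)
    return loss
```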
arXiv Detail & Related papers (2020-09-29T18:36:26Z)
- An Equivalence between Loss Functions and Non-Uniform Sampling in Experience Replay [72.23433407017558]
We show that any loss function evaluated with non-uniformly sampled data can be transformed into another uniformly sampled loss function.
Surprisingly, we find that in some environments PER can be replaced entirely by this new loss function without impact on empirical performance.
arXiv Detail & Related papers (2020-07-12T17:45:24Z)
- Approximation Schemes for ReLU Regression [80.33702497406632]
We consider the fundamental problem of ReLU regression.
The goal is to output the best-fitting ReLU with respect to square loss, given draws from some unknown distribution.
arXiv Detail & Related papers (2020-05-26T16:26:17Z)
- The Implicit Bias of Gradient Descent on Separable Data [44.98410310356165]
We show the predictor converges to the direction of the max-margin (hard margin SVM) solution.
This can help explain the benefit of continuing to optimize the logistic or cross-entropy loss even after the training error is zero.
arXiv Detail & Related papers (2017-10-27T21:47:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.