A surrogate loss function for optimization of $F_\beta$ score in binary
classification with imbalanced data
- URL: http://arxiv.org/abs/2104.01459v1
- Date: Sat, 3 Apr 2021 18:36:23 GMT
- Title: A surrogate loss function for optimization of $F_\beta$ score in binary
classification with imbalanced data
- Authors: Namgil Lee, Heejung Yang, Hojin Yoo
- Abstract summary: The gradient paths of the proposed surrogate $F_\beta$ loss function approximate the gradient paths of the large sample limit of the $F_\beta$ score.
It is demonstrated that the proposed surrogate $F_\beta$ loss function is effective for optimizing $F_\beta$ scores under class imbalances.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The $F_\beta$ score is a commonly used measure of classification performance,
which plays crucial roles in classification tasks with imbalanced data sets.
However, the $F_\beta$ score cannot be used as a loss function by
gradient-based learning algorithms for optimizing neural network parameters due
to its non-differentiability. On the other hand, commonly used loss functions
such as the binary cross-entropy (BCE) loss are not directly related to
performance measures such as the $F_\beta$ score, so that neural networks
optimized by using the loss functions may not yield optimal performance
measures. In this study, we investigate a relationship between classification
performance measures and loss functions in terms of the gradients with respect
to the model parameters. Then, we propose a differentiable surrogate loss
function for the optimization of the $F_\beta$ score. We show that the gradient
paths of the proposed surrogate $F_\beta$ loss function approximate the
gradient paths of the large sample limit of the $F_\beta$ score. Through
numerical experiments using ResNets and benchmark image data sets, it is
demonstrated that the proposed surrogate $F_\beta$ loss function is effective
for optimizing $F_\beta$ scores under class imbalances in binary classification
tasks compared with other loss functions.
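The abstract does not reproduce the surrogate itself, but the general construction it points to can be illustrated with a "soft" $F_\beta$ loss: replace the hard true-positive, false-positive, and false-negative counts in the $F_\beta$ formula with sums of predicted probabilities, so the score becomes differentiable and $1 - F_\beta^{\text{soft}}$ can be minimized by gradient descent. The sketch below shows this in PyTorch; it is a minimal, generic construction, not necessarily the exact surrogate analyzed in the paper, and the class name SoftFBetaLoss and the smoothing constant eps are illustrative assumptions.

```python
import torch
import torch.nn as nn


class SoftFBetaLoss(nn.Module):
    """Differentiable 'soft' F_beta surrogate (illustrative sketch).

    F_beta = (1 + beta^2) * TP / ((1 + beta^2) * TP + beta^2 * FN + FP)

    The hard counts TP, FP, FN are replaced by sums of predicted
    probabilities, which makes the expression differentiable with
    respect to the model parameters. This is a generic construction,
    not necessarily the exact surrogate proposed in the paper.
    """

    def __init__(self, beta: float = 1.0, eps: float = 1e-7):
        super().__init__()
        self.beta2 = beta ** 2
        self.eps = eps  # guards against division by zero in batches with no positives

    def forward(self, logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
        probs = torch.sigmoid(logits)          # predicted P(y = 1)
        targets = targets.float()
        tp = (probs * targets).sum()           # soft true positives
        fp = (probs * (1.0 - targets)).sum()   # soft false positives
        fn = ((1.0 - probs) * targets).sum()   # soft false negatives
        soft_fbeta = ((1.0 + self.beta2) * tp + self.eps) / (
            (1.0 + self.beta2) * tp + self.beta2 * fn + fp + self.eps
        )
        # Minimizing 1 - soft F_beta drives the (soft) F_beta score toward 1.
        return 1.0 - soft_fbeta
```

In a training loop this drops in where a BCE-style loss would normally be used, e.g. loss = SoftFBetaLoss(beta=2.0)(logits, labels) followed by loss.backward().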
Related papers
- Newton Losses: Using Curvature Information for Learning with Differentiable Algorithms [80.37846867546517]
We show how to train eight different neural networks with custom objectives.
We exploit their second-order information via their empirical Fisher and Hessian matrices.
We apply Newton Losses to achieve significant improvements for less differentiable algorithms.
arXiv Detail & Related papers (2024-10-24T18:02:11Z) - $α$-Divergence Loss Function for Neural Density Ratio Estimation [0.0]
Density ratio estimation (DRE) is a fundamental machine learning technique for capturing relationships between two probability distributions.
Existing methods face optimization challenges, such as overfitting due to lower-unbounded loss functions, biased mini-batch gradients, vanishing training loss gradients, and high sample requirements for Kullback-Leibler (KL) divergence loss functions.
We propose a novel loss function for DRE, the $\alpha$-divergence loss function ($\alpha$-Div), which is concise but offers stable and effective optimization for DRE.
arXiv Detail & Related papers (2024-02-03T05:33:01Z) - Universal Online Learning with Gradient Variations: A Multi-layer Online Ensemble Approach [57.92727189589498]
We propose an online convex optimization approach with two different levels of adaptivity.
We obtain $\mathcal{O}(\log V_T)$, $\mathcal{O}(d \log V_T)$ and $\hat{\mathcal{O}}(\sqrt{V_T})$ regret bounds for strongly convex, exp-concave and convex loss functions.
arXiv Detail & Related papers (2023-07-17T09:55:35Z) - Alternate Loss Functions for Classification and Robust Regression Can Improve the Accuracy of Artificial Neural Networks [6.452225158891343]
This paper shows that the training speed and final accuracy of neural networks can significantly depend on the loss function used for training.
Two new classification loss functions that significantly improve performance on a wide variety of benchmark tasks are proposed.
arXiv Detail & Related papers (2023-03-17T12:52:06Z) - Xtreme Margin: A Tunable Loss Function for Binary Classification
Problems [0.0]
We provide an overview of a novel loss function, the Xtreme Margin loss function.
Unlike the binary cross-entropy and the hinge loss functions, this loss function provides researchers and practitioners flexibility with their training process.
arXiv Detail & Related papers (2022-10-31T22:39:32Z) - Reformulating van Rijsbergen's $F_{\beta}$ metric for weighted binary
cross-entropy [0.0]
This paper investigates incorporating a performance metric alongside differentiable loss functions to inform training outcomes.
The focus is on van Rijsbergen's $F_\beta$ metric -- a popular choice for gauging classification performance.
arXiv Detail & Related papers (2022-10-29T01:21:42Z) - Gradient-Free Methods for Deterministic and Stochastic Nonsmooth
Nonconvex Optimization [94.19177623349947]
Nonsmooth nonconvex optimization problems emerge in machine learning and business decision making.
Two core challenges impede the development of efficient methods with finite-time convergence guarantees.
Two-phase versions of GFM and SGFM are also proposed and proven to achieve improved large-deviation results.
arXiv Detail & Related papers (2022-09-12T06:53:24Z) - Neural Greedy Pursuit for Feature Selection [72.4121881681861]
We propose a greedy algorithm to select $N$ important features among $P$ input features for a non-linear prediction problem.
We use neural networks as predictors in the algorithm to compute the loss.
arXiv Detail & Related papers (2022-07-19T16:39:16Z) - Binarizing by Classification: Is soft function really necessary? [4.329951775163721]
We propose to tackle network binarization as a binary classification problem.
We also take binarization as a lightweighting approach for pose estimation models.
The proposed method enables binary networks to achieve a mAP of up to $60.6$ for the first time.
arXiv Detail & Related papers (2022-05-16T02:47:41Z) - Do Lessons from Metric Learning Generalize to Image-Caption Retrieval? [67.45267657995748]
The triplet loss with semi-hard negatives has become the de facto choice for image-caption retrieval (ICR) methods that are optimized from scratch.
Recent progress in metric learning has given rise to new loss functions that outperform the triplet loss on tasks such as image retrieval and representation learning.
We ask whether these findings generalize to the setting of ICR by comparing three loss functions on two ICR methods.
arXiv Detail & Related papers (2022-02-14T15:18:00Z) - Auto Seg-Loss: Searching Metric Surrogates for Semantic Segmentation [56.343646789922545]
We propose to automate the design of metric-specific loss functions by searching differentiable surrogate losses for each metric.
Experiments on PASCAL VOC and Cityscapes demonstrate that the searched surrogate losses outperform the manually designed loss functions consistently.
arXiv Detail & Related papers (2020-10-15T17:59:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.