The Real-World-Weight Cross-Entropy Loss Function: Modeling the Costs of
Mislabeling
- URL: http://arxiv.org/abs/2001.00570v1
- Date: Fri, 3 Jan 2020 08:54:42 GMT
- Title: The Real-World-Weight Cross-Entropy Loss Function: Modeling the Costs of
Mislabeling
- Authors: Yaoshiang Ho, Samuel Wookey
- Abstract summary: We introduce the Real-World-Weight Crossentropy loss function, in both binary and single-label classification variants.
Both variants allow direct input of real world costs as weights.
For single-label, multicategory classification, our loss function also allows direct penalization of probabilistic false positives, weighted by label, during the training of a machine learning model.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a new metric to measure goodness-of-fit for
classifiers, the Real World Cost function. This metric factors in information
about a real world problem, such as financial impact, that other measures like
accuracy or F1 do not. This metric is also more directly interpretable for
users. To optimize for this metric, we introduce the Real-World-Weight
Crossentropy loss function, in both binary and single-label classification
variants. Both variants allow direct input of real world costs as weights. For
single-label, multicategory classification, our loss function also allows
direct penalization of probabilistic false positives, weighted by label, during
the training of a machine learning model. We compare the design of our loss
function to the binary crossentropy and categorical crossentropy functions, as
well as their weighted variants, to discuss the potential for improvement in
handling a variety of known shortcomings of machine learning, ranging from
imbalanced classes to medical diagnostic error to reinforcement of social bias.
We create scenarios that emulate those issues using the MNIST data set and
demonstrate empirical results of our new loss function. Finally, we sketch a
proof of this function based on Maximum Likelihood Estimation and discuss
future directions.
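To make the cost weighting concrete, the following is a minimal NumPy sketch of how real-world costs could enter both the evaluation metric and the loss, following the weighted cross-entropy pattern described in the abstract. The function names, the 10:1 cost ratio, and the exact placement of the per-class cost vectors are illustrative assumptions, not the paper's definitive formulas.

```python
import numpy as np

# Hypothetical costs for illustration only: a missed positive (false negative)
# is assumed to cost ten times as much as a false alarm (false positive).
FN_COST, FP_COST = 10.0, 1.0

def real_world_cost(y_true, y_pred_label, fn_cost=FN_COST, fp_cost=FP_COST):
    """Evaluation-metric sketch: total cost of hard predictions, charging
    fn_cost per false negative and fp_cost per false positive."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred_label = np.asarray(y_pred_label, dtype=float)
    false_negatives = np.sum(y_true * (1.0 - y_pred_label))
    false_positives = np.sum((1.0 - y_true) * y_pred_label)
    return fn_cost * false_negatives + fp_cost * false_positives

def rww_binary_cross_entropy(y_true, y_prob, fn_cost=FN_COST, fp_cost=FP_COST,
                             eps=1e-12):
    """Binary-variant sketch: weighted binary cross-entropy where the
    positive-label term carries the false-negative cost and the
    negative-label term carries the false-positive cost."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0 - eps)
    fn_term = fn_cost * y_true * np.log(y_prob)
    fp_term = fp_cost * (1.0 - y_true) * np.log(1.0 - y_prob)
    return -np.mean(fn_term + fp_term)

def rww_categorical_cross_entropy(y_true, y_prob, fn_costs, fp_costs, eps=1e-12):
    """Single-label, multicategory sketch: per-class false-negative costs weight
    the usual log-likelihood term, while per-class false-positive costs penalize
    probability mass placed on wrong labels (probabilistic false positives).
    Shapes: y_true and y_prob are (m, k), y_true rows one-hot;
    fn_costs and fp_costs are (k,)."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0 - eps)
    fn_term = np.asarray(fn_costs) * y_true * np.log(y_prob)
    fp_term = np.asarray(fp_costs) * (1.0 - y_true) * np.log(1.0 - y_prob)
    return -np.mean(np.sum(fn_term + fp_term, axis=1))

# Example: train against the weighted loss, report the real-world cost.
y_true = np.array([1, 0, 1, 0])
y_prob = np.array([0.7, 0.2, 0.4, 0.6])
print(rww_binary_cross_entropy(y_true, y_prob))                # training objective
print(real_world_cost(y_true, (y_prob >= 0.5).astype(float)))  # evaluation metric
```

Because the multicategory sketch keeps a separate log(1 - p) term for every wrong class, probability mass assigned to a costly incorrect label is penalized directly during training, which is the behavior the abstract describes as direct penalization of probabilistic false positives.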
Related papers
- AnyLoss: Transforming Classification Metrics into Loss Functions [21.34290540936501]
Evaluation metrics can be used to assess the performance of models in binary classification tasks.
Most metrics are derived from a confusion matrix in a non-differentiable form, making it difficult to generate a differentiable loss function that could directly optimize them.
We propose a general-purpose approach that transforms any confusion-matrix-based metric into a loss function, AnyLoss, that is available in optimization processes (a generic sketch of this idea appears after this list).
arXiv Detail & Related papers (2024-05-23T16:14:16Z) - Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple
Logits Retargeting Approach [102.0769560460338]
We develop a simple logits retargeting approach (LORT) that requires no prior knowledge of the number of samples per class.
Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z) - Dual Compensation Residual Networks for Class Imbalanced Learning [98.35401757647749]
We propose Dual Compensation Residual Networks to better fit both tail and head classes.
An important factor causing overfitting is severe feature drift between training and test data on tail classes.
We also propose a Residual Balanced Multi-Proxies classifier to alleviate the under-fitting issue.
arXiv Detail & Related papers (2023-08-25T04:06:30Z) - Contrastive losses as generalized models of global epistasis [0.5461938536945721]
Fitness functions map large spaces of biological sequences to properties of interest.
Global epistasis models assume that a sparse latent function is transformed by a monotonic nonlinearity to emit measurable fitness.
We show that contrastive losses are able to accurately estimate a ranking function from limited data even in regimes where MSE is ineffective.
arXiv Detail & Related papers (2023-05-04T20:33:05Z) - Boosting Differentiable Causal Discovery via Adaptive Sample Reweighting [62.23057729112182]
Differentiable score-based causal discovery methods learn a directed acyclic graph from observational data.
We propose a model-agnostic framework to boost causal discovery performance by dynamically learning the adaptive weights for the Reweighted Score function, ReScore.
arXiv Detail & Related papers (2023-03-06T14:49:59Z) - Xtreme Margin: A Tunable Loss Function for Binary Classification
Problems [0.0]
We provide an overview of a novel loss function, the Xtreme Margin loss function.
Unlike the binary cross-entropy and hinge loss functions, this loss function gives researchers and practitioners flexibility in their training process.
arXiv Detail & Related papers (2022-10-31T22:39:32Z) - Reformulating van Rijsbergen's $F_{\beta}$ metric for weighted binary
cross-entropy [0.0]
This paper investigates incorporating a performance metric alongside differentiable loss functions to inform training outcomes.
The focus is on van Rijsbergen's $F_{\beta}$ metric, a popular choice for gauging classification performance.
arXiv Detail & Related papers (2022-10-29T01:21:42Z) - FairIF: Boosting Fairness in Deep Learning via Influence Functions with
Validation Set Sensitive Attributes [51.02407217197623]
We propose a two-stage training algorithm named FAIRIF.
It minimizes the loss over a reweighted data set, where the sample weights are computed via influence functions using the sensitive attributes of a validation set.
We show that FAIRIF yields models with better fairness-utility trade-offs against various types of bias.
arXiv Detail & Related papers (2022-01-15T05:14:48Z) - Adaptive Weighted Discriminator for Training Generative Adversarial
Networks [11.68198403603969]
We introduce a new family of discriminator loss functions that adopts a weighted sum of real and fake parts.
Our method can be potentially applied to any discriminator model with a loss that is a sum of the real and fake parts.
arXiv Detail & Related papers (2020-12-05T23:55:42Z) - Deep F-measure Maximization for End-to-End Speech Understanding [52.36496114728355]
We propose a differentiable approximation to the F-measure and train the network with this objective using standard backpropagation.
We perform experiments on two standard fairness datasets, Adult and Communities and Crime, as well as on speech-to-intent detection on the ATIS dataset and speech-to-image concept classification on the Speech-COCO dataset.
In all four tasks, the F-measure objective improves micro-F1 scores by up to 8% absolute compared to models trained with the cross-entropy loss function.
arXiv Detail & Related papers (2020-08-08T03:02:27Z) - Piecewise Linear Regression via a Difference of Convex Functions [50.89452535187813]
We present a new piecewise linear regression methodology that utilizes fitting a difference of convex functions (DC functions) to the data.
We empirically validate the method, showing it to be practically implementable, and to have comparable performance to existing regression/classification methods on real-world datasets.
arXiv Detail & Related papers (2020-07-05T18:58:47Z)
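Several of the related entries above, notably AnyLoss and Deep F-measure Maximization, share the idea of replacing a non-differentiable confusion-matrix metric with a soft surrogate built from predicted probabilities. The sketch below illustrates that shared idea with a soft $F_{\beta}$ loss in NumPy; it is a generic illustration under that assumption, not the exact construction from either paper.

```python
import numpy as np

def soft_confusion_counts(y_true, y_prob):
    """Probabilistic confusion-matrix entries computed from soft predictions,
    so any metric built on them remains differentiable."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    tp = np.sum(y_true * y_prob)
    fp = np.sum((1.0 - y_true) * y_prob)
    fn = np.sum(y_true * (1.0 - y_prob))
    tn = np.sum((1.0 - y_true) * (1.0 - y_prob))
    return tp, fp, fn, tn

def soft_f_beta_loss(y_true, y_prob, beta=1.0, eps=1e-12):
    """1 - soft F_beta: a differentiable surrogate that can be minimized with
    standard backpropagation in place of cross-entropy."""
    tp, fp, fn, _ = soft_confusion_counts(y_true, y_prob)
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    b2 = beta * beta
    f_beta = (1.0 + b2) * precision * recall / (b2 * precision + recall + eps)
    return 1.0 - f_beta
```

Minimizing such a surrogate targets the reported metric directly, which is the same motivation that drives the cost-weighted cross-entropy variants discussed above.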
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences arising from its use.