The Real-World-Weight Cross-Entropy Loss Function: Modeling the Costs of
Mislabeling
- URL: http://arxiv.org/abs/2001.00570v1
- Date: Fri, 3 Jan 2020 08:54:42 GMT
- Title: The Real-World-Weight Cross-Entropy Loss Function: Modeling the Costs of
Mislabeling
- Authors: Yaoshiang Ho, Samuel Wookey
- Abstract summary: We introduce the Real-World-Weight Crossentropy loss function, in both binary and single-label classification variants.
Both variants allow direct input of real world costs as weights.
For single-label, multicategory classification, our loss function also allows direct penalization of probabilistic false positives, weighted by label, during the training of a machine learning model.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a new metric to measure goodness-of-fit for
classifiers, the Real World Cost function. This metric factors in information
about a real world problem, such as financial impact, that other measures like
accuracy or F1 do not. This metric is also more directly interpretable for
users. To optimize for this metric, we introduce the Real-World-Weight
Crossentropy loss function, in both binary and single-label classification
variants. Both variants allow direct input of real world costs as weights. For
single-label, multicategory classification, our loss function also allows
direct penalization of probabilistic false positives, weighted by label, during
the training of a machine learning model. We compare the design of our loss
function to the binary crossentropy and categorical crossentropy functions, as
well as their weighted variants, to discuss the potential for improvement in
handling a variety of known shortcomings of machine learning, ranging from
imbalanced classes to medical diagnostic error to reinforcement of social bias.
We create scenarios that emulate those issues using the MNIST data set and
demonstrate empirical results of our new loss function. Finally, we sketch a
proof of this function based on Maximum Likelihood Estimation and discuss
future directions.
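To make the cost weighting concrete, the following is a minimal NumPy sketch of how real-world costs could enter both the evaluation metric and the loss, following the weighted cross-entropy pattern described in the abstract. The function names, the 10:1 cost ratio, and the exact placement of the per-class cost vectors are illustrative assumptions, not the paper's definitive formulas.

```python
import numpy as np

# Hypothetical costs for illustration only: a missed positive (false negative)
# is assumed to cost ten times as much as a false alarm (false positive).
FN_COST, FP_COST = 10.0, 1.0

def real_world_cost(y_true, y_pred_label, fn_cost=FN_COST, fp_cost=FP_COST):
    """Evaluation-metric sketch: total cost of hard predictions, charging
    fn_cost per false negative and fp_cost per false positive."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred_label = np.asarray(y_pred_label, dtype=float)
    false_negatives = np.sum(y_true * (1.0 - y_pred_label))
    false_positives = np.sum((1.0 - y_true) * y_pred_label)
    return fn_cost * false_negatives + fp_cost * false_positives

def rww_binary_cross_entropy(y_true, y_prob, fn_cost=FN_COST, fp_cost=FP_COST,
                             eps=1e-12):
    """Binary-variant sketch: weighted binary cross-entropy where the
    positive-label term carries the false-negative cost and the
    negative-label term carries the false-positive cost."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0 - eps)
    fn_term = fn_cost * y_true * np.log(y_prob)
    fp_term = fp_cost * (1.0 - y_true) * np.log(1.0 - y_prob)
    return -np.mean(fn_term + fp_term)

def rww_categorical_cross_entropy(y_true, y_prob, fn_costs, fp_costs, eps=1e-12):
    """Single-label, multicategory sketch: per-class false-negative costs weight
    the usual log-likelihood term, while per-class false-positive costs penalize
    probability mass placed on wrong labels (probabilistic false positives).
    Shapes: y_true and y_prob are (m, k), y_true rows one-hot;
    fn_costs and fp_costs are (k,)."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0 - eps)
    fn_term = np.asarray(fn_costs) * y_true * np.log(y_prob)
    fp_term = np.asarray(fp_costs) * (1.0 - y_true) * np.log(1.0 - y_prob)
    return -np.mean(np.sum(fn_term + fp_term, axis=1))

# Example: train against the weighted loss, report the real-world cost.
y_true = np.array([1, 0, 1, 0])
y_prob = np.array([0.7, 0.2, 0.4, 0.6])
print(rww_binary_cross_entropy(y_true, y_prob))                # training objective
print(real_world_cost(y_true, (y_prob >= 0.5).astype(float)))  # evaluation metric
```

Because the multicategory sketch keeps a separate log(1 - p) term for every wrong class, probability mass assigned to a costly incorrect label is penalized directly during training, which is the behavior the abstract describes as direct penalization of probabilistic false positives.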
Related papers
- AnyLoss: Transforming Classification Metrics into Loss Functions [21.34290540936501]
Evaluation metrics can be used to assess the performance of models in binary classification tasks.
Most metrics are derived from a confusion matrix in a non-differentiable form, making it difficult to generate a differentiable loss function that could directly optimize them.
We propose a general-purpose approach that transforms any confusion-matrix-based metric into a loss function, AnyLoss, that is available in optimization processes (a generic sketch of this idea appears after this list).
arXiv Detail & Related papers (2024-05-23T16:14:16Z) - Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple
Logits Retargeting Approach [102.0769560460338]
We develop a simple logits retargeting approach (LORT) that requires no prior knowledge of the number of samples per class.
Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z) - Dual Compensation Residual Networks for Class Imbalanced Learning [98.35401757647749]
We propose Dual Compensation Residual Networks to better fit both tail and head classes.
An important factor causing overfitting is severe feature drift between training and test data on tail classes.
We also propose a Residual Balanced Multi-Proxies classifier to alleviate the under-fitting issue.
arXiv Detail & Related papers (2023-08-25T04:06:30Z) - Contrastive losses as generalized models of global epistasis [0.5461938536945721]
Fitness functions map large spaces of biological sequences to properties of interest.
Global epistasis models assume that a sparse latent function is transformed by a monotonic nonlinearity to emit measurable fitness.
We show that contrastive losses are able to accurately estimate a ranking function from limited data even in regimes where MSE is ineffective.
arXiv Detail & Related papers (2023-05-04T20:33:05Z) - Boosting Differentiable Causal Discovery via Adaptive Sample Reweighting [62.23057729112182]
Differentiable score-based causal discovery methods learn a directed acyclic graph from observational data.
We propose a model-agnostic framework to boost causal discovery performance by dynamically learning the adaptive weights for the Reweighted Score function, ReScore.
arXiv Detail & Related papers (2023-03-06T14:49:59Z) - Xtreme Margin: A Tunable Loss Function for Binary Classification
Problems [0.0]
We provide an overview of a novel loss function, the Xtreme Margin loss function.
Unlike the binary cross-entropy and hinge loss functions, this loss function gives researchers and practitioners flexibility in their training process.
arXiv Detail & Related papers (2022-10-31T22:39:32Z) - Reformulating van Rijsbergen's $F_{\beta}$ metric for weighted binary
cross-entropy [0.0]
This paper investigates incorporating a performance metric alongside differentiable loss functions to inform training outcomes.
The focus is on van Rijsbergen's $F_{\beta}$ metric, a popular choice for gauging classification performance.
arXiv Detail & Related papers (2022-10-29T01:21:42Z) - FairIF: Boosting Fairness in Deep Learning via Influence Functions with
Validation Set Sensitive Attributes [51.02407217197623]
We propose a two-stage training algorithm named FAIRIF.
It minimizes the loss over a reweighted data set, where the sample weights are computed via influence functions using the sensitive attributes of a validation set.
We show that FAIRIF yields models with better fairness-utility trade-offs against various types of bias.
arXiv Detail & Related papers (2022-01-15T05:14:48Z) - Adaptive Weighted Discriminator for Training Generative Adversarial
Networks [11.68198403603969]
We introduce a new family of discriminator loss functions that adopts a weighted sum of real and fake parts.
Our method can be potentially applied to any discriminator model with a loss that is a sum of the real and fake parts.
arXiv Detail & Related papers (2020-12-05T23:55:42Z) - Deep F-measure Maximization for End-to-End Speech Understanding [52.36496114728355]
We propose a differentiable approximation to the F-measure and train the network with this objective using standard backpropagation.
We perform experiments on two standard fairness datasets, Adult and Communities and Crime, as well as on speech-to-intent detection on the ATIS dataset and speech-to-image concept classification on the Speech-COCO dataset.
In all four tasks, the F-measure objective improves micro-F1 scores by up to 8% absolute compared to models trained with the cross-entropy loss function.
arXiv Detail & Related papers (2020-08-08T03:02:27Z) - Piecewise Linear Regression via a Difference of Convex Functions [50.89452535187813]
We present a new piecewise linear regression methodology that utilizes fitting a difference of convex functions (DC functions) to the data.
We empirically validate the method, showing it to be practically implementable, and to have comparable performance to existing regression/classification methods on real-world datasets.
arXiv Detail & Related papers (2020-07-05T18:58:47Z)
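Several of the related entries above, notably AnyLoss and Deep F-measure Maximization, share the idea of replacing a non-differentiable confusion-matrix metric with a soft surrogate built from predicted probabilities. The sketch below illustrates that shared idea with a soft $F_{\beta}$ loss in NumPy; it is a generic illustration under that assumption, not the exact construction from either paper.

```python
import numpy as np

def soft_confusion_counts(y_true, y_prob):
    """Probabilistic confusion-matrix entries computed from soft predictions,
    so any metric built on them remains differentiable."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    tp = np.sum(y_true * y_prob)
    fp = np.sum((1.0 - y_true) * y_prob)
    fn = np.sum(y_true * (1.0 - y_prob))
    tn = np.sum((1.0 - y_true) * (1.0 - y_prob))
    return tp, fp, fn, tn

def soft_f_beta_loss(y_true, y_prob, beta=1.0, eps=1e-12):
    """1 - soft F_beta: a differentiable surrogate that can be minimized with
    standard backpropagation in place of cross-entropy."""
    tp, fp, fn, _ = soft_confusion_counts(y_true, y_prob)
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    b2 = beta * beta
    f_beta = (1.0 + b2) * precision * recall / (b2 * precision + recall + eps)
    return 1.0 - f_beta
```

Minimizing such a surrogate targets the reported metric directly, which is the same motivation that drives the cost-weighted cross-entropy variants discussed above.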
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences arising from its use.