AnyLoss: Transforming Classification Metrics into Loss Functions
- URL: http://arxiv.org/abs/2405.14745v1
- Date: Thu, 23 May 2024 16:14:16 GMT
- Title: AnyLoss: Transforming Classification Metrics into Loss Functions
- Authors: Doheon Han, Nuno Moniz, Nitesh V Chawla
- Abstract summary: Many evaluation metrics can be used to assess the performance of models in binary classification tasks.
Most metrics are derived from a confusion matrix in a non-differentiable form, making it difficult to generate a differentiable loss function that could directly optimize them.
We propose a general-purpose approach that transforms any confusion matrix-based metric into a loss function, \textit{AnyLoss}, that can be used directly in optimization processes.
- Score: 21.34290540936501
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Many evaluation metrics can be used to assess the performance of models in binary classification tasks. However, most of them are derived from a confusion matrix in a non-differentiable form, making it very difficult to generate a differentiable loss function that could directly optimize them. The lack of solutions to bridge this gap not only hinders our ability to solve difficult tasks, such as imbalanced learning, but also requires the deployment of computationally expensive hyperparameter search processes in model selection. In this paper, we propose a general-purpose approach that transforms any confusion matrix-based metric into a loss function, \textit{AnyLoss}, that can be used directly in optimization processes. To this end, we use an approximation function to represent the confusion matrix in a differentiable form, which enables any confusion matrix-based metric to serve directly as a loss function. The mechanism of the approximation function is described to ensure its operability, and the differentiability of our loss functions is established by deriving their gradients. We conduct extensive experiments with diverse neural networks on many datasets and demonstrate that our approach can target any confusion matrix-based metric. Our method is especially effective on imbalanced datasets, and its competitive learning speed, compared to multiple baseline models, underscores its efficiency.
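As a rough illustration of the mechanism described in the abstract (a minimal sketch, not the authors' released implementation), the example below builds a differentiable confusion matrix by pushing predicted probabilities toward 0/1 with a sigmoid-shaped amplification function and then turns an F1-style metric into a loss. The function names, the exact amplification form, and the scale value are assumptions for illustration.

```python
import torch

def soft_confusion_matrix(p, y, scale=20.0):
    """Differentiable confusion-matrix entries from predicted probabilities.

    p: predicted probabilities in (0, 1), shape (N,)
    y: ground-truth labels in {0, 1} as floats, shape (N,)
    scale: slope of the amplification step (illustrative value, not
           necessarily what the paper uses).
    """
    # Amplify probabilities toward 0/1 while remaining differentiable.
    a = torch.sigmoid(scale * (p - 0.5))
    tp = (y * a).sum()
    fn = (y * (1.0 - a)).sum()
    fp = ((1.0 - y) * a).sum()
    tn = ((1.0 - y) * (1.0 - a)).sum()
    return tp, fn, fp, tn

def f1_loss(p, y, eps=1e-7):
    """AnyLoss-style loss: one minus a differentiable F1 score."""
    tp, fn, fp, _ = soft_confusion_matrix(p, y)
    f1 = 2 * tp / (2 * tp + fp + fn + eps)
    return 1.0 - f1

if __name__ == "__main__":
    logits = torch.randn(8, requires_grad=True)
    labels = torch.tensor([1., 0., 1., 1., 0., 0., 1., 0.])
    loss = f1_loss(torch.sigmoid(logits), labels)
    loss.backward()  # gradients flow through the soft confusion matrix
    print(float(loss))
```

Any other confusion matrix-based metric (accuracy, balanced accuracy, F-beta, G-mean, and so on) can be expressed from the same four soft entries and plugged in as `1 - metric` in the same way.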
Related papers
- Relational Surrogate Loss Learning [41.61184221367546]
This paper revisits surrogate loss learning, where a deep neural network is employed to approximate the evaluation metrics.
We show that directly maintaining the relation between surrogate losses and metrics across models suffices.
Our method is much easier to optimize and enjoys significant efficiency and performance gains.
arXiv Detail & Related papers (2022-02-26T17:32:57Z)
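To make the relation-preserving idea in the entry above slightly more concrete, here is a hedged sketch under simplifying assumptions: a small learned surrogate is trained so that its outputs are anti-correlated with the evaluation metric across candidate batches, with Pearson correlation standing in for the ranking correlation the paper works with. All names, shapes, and design choices here are illustrative, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

def pearson_corr(a, b, eps=1e-8):
    """Differentiable Pearson correlation between two 1-D tensors."""
    a = a - a.mean()
    b = b - b.mean()
    return (a * b).sum() / (a.norm() * b.norm() + eps)

class SurrogateLoss(nn.Module):
    """Tiny network mapping per-batch prediction statistics to a scalar loss."""
    def __init__(self, in_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, stats):          # stats: (N, in_dim)
        return self.net(stats).mean()  # scalar surrogate loss for the batch

def relation_step(surrogate, optimizer, batch_stats, metric_values):
    """One training step: push surrogate losses to be anti-correlated with the
    (non-differentiable) metric measured on the same batches."""
    losses = torch.stack([surrogate(s) for s in batch_stats])
    objective = pearson_corr(losses, metric_values)  # minimizing this => anti-correlation
    optimizer.zero_grad()
    objective.backward()
    optimizer.step()
    return objective.item()
```

In practice one would evaluate the true metric on each candidate batch to obtain `metric_values`, repeat this step, and then use the frozen surrogate as a training loss; the paper's actual input featurization and correlation measure differ.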
- Efficient Multidimensional Functional Data Analysis Using Marginal Product Basis Systems [2.4554686192257424]
We propose a framework for learning continuous representations from a sample of multidimensional functional data.
We show that the resulting estimation problem can be solved efficiently by the tensor decomposition.
We conclude with a real data application in neuroimaging.
arXiv Detail & Related papers (2021-07-30T16:02:15Z)
- Solving weakly supervised regression problem using low-rank manifold regularization [77.34726150561087]
We solve a weakly supervised regression problem.
Under "weakly" we understand that for some training points the labels are known, for some unknown, and for others uncertain due to the presence of random noise or other reasons such as lack of resources.
In the numerical section, we applied the suggested method to artificial and real datasets using Monte-Carlo modeling.
arXiv Detail & Related papers (2021-04-13T23:21:01Z)
- Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method by combining reconstruction error with $l_{2,p}$-norm regularization.
We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
arXiv Detail & Related papers (2020-12-29T04:08:38Z)
- Auto Seg-Loss: Searching Metric Surrogates for Semantic Segmentation [56.343646789922545]
We propose to automate the design of metric-specific loss functions by searching differentiable surrogate losses for each metric.
Experiments on PASCAL VOC and Cityscapes demonstrate that the searched surrogate losses outperform the manually designed loss functions consistently.
arXiv Detail & Related papers (2020-10-15T17:59:08Z)
- Bridging the Gap: Unifying the Training and Evaluation of Neural Network Binary Classifiers [0.4893345190925178]
We propose a unifying approach to training neural network binary classifiers that combines a differentiable approximation of the Heaviside function with a probabilistic view of the typical confusion matrix values using soft sets.
Our theoretical analysis shows the benefit of using our method to optimize for a given evaluation metric, such as the $F_1$-Score, with soft sets.
arXiv Detail & Related papers (2020-09-02T22:13:26Z)
- A Flexible Optimization Framework for Regularized Matrix-Tensor Factorizations with Linear Couplings [5.079136838868448]
We propose a flexible algorithmic framework for coupled matrix and tensor factorizations.
The framework facilitates the use of a variety of constraints, loss functions and couplings with linear transformations.
arXiv Detail & Related papers (2020-07-19T06:49:59Z)
- Eigendecomposition-Free Training of Deep Networks for Linear Least-Square Problems [107.3868459697569]
We introduce an eigendecomposition-free approach to training a deep network.
We show that our approach is much more robust than explicit differentiation of the eigendecomposition.
Our method has better convergence properties and yields state-of-the-art results.
arXiv Detail & Related papers (2020-04-15T04:29:34Z)
- Implicit differentiation of Lasso-type models for hyperparameter optimization [82.73138686390514]
We introduce an efficient implicit differentiation algorithm, without matrix inversion, tailored for Lasso-type problems.
Our approach scales to high-dimensional data by leveraging the sparsity of the solutions.
arXiv Detail & Related papers (2020-02-20T18:43:42Z)
- Learning with Differentiable Perturbed Optimizers [54.351317101356614]
We propose a systematic method to transform optimizers into operations that are differentiable and never locally constant.
Our approach relies on stochastically perturbed optimizers and can be used readily together with existing solvers.
We show how this framework can be connected to a family of losses developed in structured prediction, and give theoretical guarantees for their use in learning tasks.
arXiv Detail & Related papers (2020-02-20T11:11:32Z)
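As a rough illustration of the perturbation idea in the last entry above (a sketch under simplifying assumptions, not the paper's full framework with its structured-prediction losses and gradient estimators), the snippet below smooths a piecewise-constant argmax by averaging it over randomly perturbed inputs; with Gumbel noise the smoothed output coincides with a softmax, so it varies smoothly with the input.

```python
import numpy as np

def perturbed_argmax(theta, sigma=1.0, n_samples=2000, seed=0):
    """Monte Carlo estimate of E[one_hot(argmax(theta + sigma * Z))].

    The plain argmax is piecewise constant (zero gradient almost everywhere);
    its average over random perturbations Z varies smoothly with theta.
    For Gumbel-distributed Z, the exact expectation is softmax(theta / sigma).
    """
    rng = np.random.default_rng(seed)
    d = theta.shape[0]
    z = rng.gumbel(size=(n_samples, d))
    winners = np.argmax(theta + sigma * z, axis=1)
    return np.bincount(winners, minlength=d) / n_samples

if __name__ == "__main__":
    theta = np.array([1.0, 2.0, 0.5])
    print(perturbed_argmax(theta))              # smooth, non-degenerate distribution
    print(np.exp(theta) / np.exp(theta).sum())  # softmax(theta): exact expectation for sigma=1
```

The paper generalizes this averaging device beyond argmax to other discrete solvers and connects it to a family of structured-prediction losses, as noted in the summary above.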