Benchmarking Deep AUROC Optimization: Loss Functions and Algorithmic
Choices
- URL: http://arxiv.org/abs/2203.14177v2
- Date: Tue, 29 Mar 2022 03:39:26 GMT
- Title: Benchmarking Deep AUROC Optimization: Loss Functions and Algorithmic
Choices
- Authors: Dixian Zhu, Xiaodong Wu, Tianbao Yang
- Abstract summary: We benchmark a variety of loss functions with different algorithmic choices for the deep AUROC optimization problem.
We highlight essential choices such as positive sampling rate, regularization, normalization/activation, and optimizers.
Our findings show that although the Adam-type method is more competitive from the training perspective, it does not outperform the others from the testing perspective.
- Score: 37.559461866831754
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The area under the ROC curve (AUROC) has been widely applied to
imbalanced classification and is increasingly combined with deep learning techniques.
However, no existing work provides sound guidance for practitioners to choose
appropriate deep AUROC maximization techniques. In this work, we fill this gap
from three aspects. (i) We benchmark a variety of loss functions with different
algorithmic choices for the deep AUROC optimization problem. We study the loss
functions in two categories, pairwise loss and composite loss, which together
include a total of 10 loss functions. Interestingly, we find that composite loss,
a newer loss function class, shows more competitive performance than pairwise
loss from both the training convergence and testing generalization perspectives.
Nevertheless, data with more corrupted labels favors a pairwise symmetric loss.
(ii) Moreover, we benchmark and highlight the essential algorithmic choices such
as positive sampling rate, regularization, normalization/activation, and
optimizers. Key findings include: a higher positive sampling rate is likely to
be beneficial for deep AUROC maximization; different datasets favor different
regularization weights; and appropriate normalization techniques, such as
sigmoid and $\ell_2$ score normalization, can improve model performance.
(iii) On the optimization aspect, we benchmark SGD-type, Momentum-type, and
Adam-type optimizers for both pairwise and composite losses. Our findings show
that although the Adam-type method is more competitive from the training
perspective, it does not outperform the others from the testing perspective.
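As a rough illustration of the two loss families compared in (i), the sketch below contrasts a pairwise squared-hinge AUROC surrogate with a composite, min-max margin-style surrogate. This is a minimal PyTorch sketch under my own assumptions; the exact loss definitions, names, and hyperparameters benchmarked in the paper may differ.

```python
# Minimal sketch (PyTorch) of the two loss families: a pairwise surrogate that
# enumerates positive-negative score pairs, and a composite (min-max margin)
# surrogate that only needs per-batch class means. Illustrative only, not the
# authors' reference implementation.
import torch


def pairwise_squared_hinge_auc_loss(pos_scores, neg_scores, margin=1.0):
    """Pairwise surrogate: penalize every (positive, negative) pair whose
    score gap falls below the margin. Cost grows with n_pos * n_neg."""
    diff = pos_scores.view(-1, 1) - neg_scores.view(1, -1)  # all pairs
    return torch.clamp(margin - diff, min=0.0).pow(2).mean()


def composite_auc_margin_loss(pos_scores, neg_scores, a, b, alpha, margin=1.0):
    """Composite surrogate in the spirit of min-max AUC-margin losses:
    per-class squared deviations from learnable centers (a, b) plus a
    cross-term weighted by a dual variable alpha (maximized in practice,
    hence the -alpha^2 term). No pair enumeration is needed."""
    loss = (pos_scores - a).pow(2).mean() + (neg_scores - b).pow(2).mean()
    loss = loss + 2.0 * alpha * (margin - pos_scores.mean() + neg_scores.mean())
    return loss - alpha.pow(2)


# Toy usage: scores for 3 positives and 5 negatives from some model head.
pos = torch.randn(3, requires_grad=True)
neg = torch.randn(5, requires_grad=True)
a = torch.zeros(1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)
alpha = torch.zeros(1, requires_grad=True)  # updated by gradient *ascent*
print(pairwise_squared_hinge_auc_loss(pos, neg).item())
print(composite_auc_margin_loss(pos, neg, a, b, alpha).item())
```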
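Two of the algorithmic choices in (ii) can be sketched the same way: controlling the positive sampling rate of a mini-batch and normalizing the model's scores (sigmoid squashing, or scoring with $\ell_2$-normalized features and weights). Function and parameter names below are illustrative assumptions, not taken from the paper's code.

```python
# Sketch of two benchmarked algorithmic choices: (i) the positive sampling
# rate used to form mini-batches and (ii) sigmoid / l2 score normalization.
import numpy as np
import torch
import torch.nn.functional as F


def sample_batch_indices(labels, batch_size, pos_rate=0.5, rng=None):
    """Draw a mini-batch with roughly `pos_rate` positives, regardless of the
    (typically much lower) positive rate in the full dataset."""
    rng = rng or np.random.default_rng()
    labels = np.asarray(labels)
    pos_idx, neg_idx = np.where(labels == 1)[0], np.where(labels == 0)[0]
    n_pos = int(round(batch_size * pos_rate))
    pos = rng.choice(pos_idx, size=n_pos, replace=True)
    neg = rng.choice(neg_idx, size=batch_size - n_pos, replace=True)
    return np.concatenate([pos, neg])


def sigmoid_normalized_score(raw_score):
    """Squash raw scores into (0, 1) so pairwise margins stay bounded."""
    return torch.sigmoid(raw_score)


def l2_normalized_score(features, weight):
    """Score from l2-normalized features and weights (bounded in [-1, 1])."""
    return F.normalize(features, dim=-1) @ F.normalize(weight, dim=-1)
```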
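For (iii), the three optimizer families map directly onto standard PyTorch optimizers; a minimal setup might look like the following (the learning rates are placeholders, not the paper's tuned values).

```python
import torch

model = torch.nn.Linear(16, 1)  # stand-in for the deep scoring model

sgd = torch.optim.SGD(model.parameters(), lr=0.1)                    # SGD-type
momentum = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)  # Momentum-type
adam = torch.optim.Adam(model.parameters(), lr=1e-3)                  # Adam-type
```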
Related papers
- Stochastic Optimal Control Matching [53.156277491861985]
Our work introduces Stochastic Optimal Control Matching (SOCM), a novel Iterative Diffusion Optimization (IDO) technique for optimal control.
The control is learned via a least squares problem by trying to fit a matching vector field.
Experimentally, our algorithm achieves lower error than all the existing IDO techniques for optimal control.
arXiv Detail & Related papers (2023-12-04T16:49:43Z) - Class Anchor Margin Loss for Content-Based Image Retrieval [97.81742911657497]
We propose a novel repeller-attractor loss that falls within the metric learning paradigm, yet directly optimizes the L2 metric without the need to generate pairs.
We evaluate the proposed objective in the context of few-shot and full-set training on the CBIR task, by using both convolutional and transformer architectures.
arXiv Detail & Related papers (2023-06-01T12:53:10Z) - Support Vector Machines with the Hard-Margin Loss: Optimal Training via
Combinatorial Benders' Cuts [8.281391209717105]
We show how to train the hard-margin SVM model to global optimality.
We introduce an iterative sampling and decomposition algorithm that solves the problem.
arXiv Detail & Related papers (2022-07-15T18:21:51Z) - Label Distributionally Robust Losses for Multi-class Classification:
Consistency, Robustness and Adaptivity [55.29408396918968]
We study a family of loss functions named label-distributionally robust (LDR) losses for multi-class classification.
Our contributions address both consistency and robustness by establishing the top-$k$ consistency of LDR losses for multi-class classification.
We propose a new adaptive LDR loss that automatically adapts an individualized temperature parameter to the label-noise level of each instance.
arXiv Detail & Related papers (2021-12-30T00:27:30Z) - Ensemble of Loss Functions to Improve Generalizability of Deep Metric
Learning methods [0.609170287691728]
We propose novel approaches to combine different losses built on top of a shared deep feature extractor.
We evaluate our methods on some popular datasets from the machine vision domain in conventional Zero-Shot-Learning (ZSL) settings.
arXiv Detail & Related papers (2021-07-02T15:19:46Z) - Rethinking and Reweighting the Univariate Losses for Multi-Label
Ranking: Consistency and Generalization [44.73295800450414]
(Partial) ranking loss is a commonly used evaluation measure for multi-label classification.
There is a gap between existing theory and practice -- some pairwise losses can lead to promising performance but lack consistency.
arXiv Detail & Related papers (2021-05-10T09:23:27Z) - Loss Function Discovery for Object Detection via Convergence-Simulation
Driven Search [101.73248560009124]
We propose an effective convergence-simulation driven evolutionary search algorithm, CSE-Autoloss, for speeding up the search process.
We conduct extensive evaluations of loss function search on popular detectors and validate the good generalization capability of searched losses.
Our experiments show that the best-discovered loss function combinations outperform default combinations by 1.1% and 0.8% in terms of mAP for two-stage and one-stage detectors.
arXiv Detail & Related papers (2021-02-09T08:34:52Z) - Adversarially Robust Learning via Entropic Regularization [31.6158163883893]
We propose a new family of algorithms, ATENT, for training adversarially robust deep neural networks.
Our approach achieves competitive (or better) performance in terms of robust classification accuracy.
arXiv Detail & Related papers (2020-08-27T18:54:43Z) - Beyond Triplet Loss: Meta Prototypical N-tuple Loss for Person
Re-identification [118.72423376789062]
We introduce a multi-class classification loss, i.e., N-tuple loss, to jointly consider multiple (N) instances for per-query optimization.
With the multi-class classification incorporated, our model achieves the state-of-the-art performance on the benchmark person ReID datasets.
arXiv Detail & Related papers (2020-06-08T23:34:08Z) - Circle Loss: A Unified Perspective of Pair Similarity Optimization [42.33948436767691]
We show that a majority of loss functions, including the triplet loss and the softmax plus cross-entropy loss, can be unified under a pair similarity optimization perspective.
We show that the Circle loss offers a more flexible optimization approach towards a more definite convergence target.
arXiv Detail & Related papers (2020-02-25T13:56:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.