Benchmarking Deep AUROC Optimization: Loss Functions and Algorithmic
Choices
- URL: http://arxiv.org/abs/2203.14177v2
- Date: Tue, 29 Mar 2022 03:39:26 GMT
- Title: Benchmarking Deep AUROC Optimization: Loss Functions and Algorithmic
Choices
- Authors: Dixian Zhu, Xiaodong Wu, Tianbao Yang
- Abstract summary: We benchmark a variety of loss functions with different algorithmic choices for the deep AUROC optimization problem.
We highlight essential choices such as positive sampling rate, regularization, normalization/activation, and optimizers.
Our findings show that although the Adam-type method is more competitive from the training perspective, it does not outperform the others from the testing perspective.
- Score: 37.559461866831754
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The area under the ROC curve (AUROC) has been widely applied to
imbalanced classification and is increasingly combined with deep learning techniques.
However, no existing work provides sound guidance for practitioners to choose
appropriate deep AUROC maximization techniques. In this work, we fill this gap
from three aspects. (i) We benchmark a variety of loss functions with different
algorithmic choices for the deep AUROC optimization problem. We study the loss
functions in two categories, pairwise loss and composite loss, which together
include a total of 10 loss functions. Interestingly, we find that composite loss,
a newer loss function class, shows more competitive performance than pairwise
loss from both the training convergence and testing generalization perspectives.
Nevertheless, data with more corrupted labels favors a pairwise symmetric loss.
(ii) Moreover, we benchmark and highlight the essential algorithmic choices such
as positive sampling rate, regularization, normalization/activation, and
optimizers. Key findings include: a higher positive sampling rate is likely to
be beneficial for deep AUROC maximization; different datasets favor different
regularization weights; and appropriate normalization techniques, such as
sigmoid and $\ell_2$ score normalization, can improve model performance.
(iii) On the optimization aspect, we benchmark SGD-type, Momentum-type, and
Adam-type optimizers for both pairwise and composite losses. Our findings show
that although the Adam-type method is more competitive from the training
perspective, it does not outperform the others from the testing perspective.
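As a rough illustration of the two loss families compared in (i), the sketch below contrasts a pairwise squared-hinge AUROC surrogate with a composite, min-max margin-style surrogate. This is a minimal PyTorch sketch under my own assumptions; the exact loss definitions, names, and hyperparameters benchmarked in the paper may differ.

```python
# Minimal sketch (PyTorch) of the two loss families: a pairwise surrogate that
# enumerates positive-negative score pairs, and a composite (min-max margin)
# surrogate that only needs per-batch class means. Illustrative only, not the
# authors' reference implementation.
import torch


def pairwise_squared_hinge_auc_loss(pos_scores, neg_scores, margin=1.0):
    """Pairwise surrogate: penalize every (positive, negative) pair whose
    score gap falls below the margin. Cost grows with n_pos * n_neg."""
    diff = pos_scores.view(-1, 1) - neg_scores.view(1, -1)  # all pairs
    return torch.clamp(margin - diff, min=0.0).pow(2).mean()


def composite_auc_margin_loss(pos_scores, neg_scores, a, b, alpha, margin=1.0):
    """Composite surrogate in the spirit of min-max AUC-margin losses:
    per-class squared deviations from learnable centers (a, b) plus a
    cross-term weighted by a dual variable alpha (maximized in practice,
    hence the -alpha^2 term). No pair enumeration is needed."""
    loss = (pos_scores - a).pow(2).mean() + (neg_scores - b).pow(2).mean()
    loss = loss + 2.0 * alpha * (margin - pos_scores.mean() + neg_scores.mean())
    return loss - alpha.pow(2)


# Toy usage: scores for 3 positives and 5 negatives from some model head.
pos = torch.randn(3, requires_grad=True)
neg = torch.randn(5, requires_grad=True)
a = torch.zeros(1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)
alpha = torch.zeros(1, requires_grad=True)  # updated by gradient *ascent*
print(pairwise_squared_hinge_auc_loss(pos, neg).item())
print(composite_auc_margin_loss(pos, neg, a, b, alpha).item())
```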
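Two of the algorithmic choices in (ii) can be sketched the same way: controlling the positive sampling rate of a mini-batch and normalizing the model's scores (sigmoid squashing, or scoring with $\ell_2$-normalized features and weights). Function and parameter names below are illustrative assumptions, not taken from the paper's code.

```python
# Sketch of two benchmarked algorithmic choices: (i) the positive sampling
# rate used to form mini-batches and (ii) sigmoid / l2 score normalization.
import numpy as np
import torch
import torch.nn.functional as F


def sample_batch_indices(labels, batch_size, pos_rate=0.5, rng=None):
    """Draw a mini-batch with roughly `pos_rate` positives, regardless of the
    (typically much lower) positive rate in the full dataset."""
    rng = rng or np.random.default_rng()
    labels = np.asarray(labels)
    pos_idx, neg_idx = np.where(labels == 1)[0], np.where(labels == 0)[0]
    n_pos = int(round(batch_size * pos_rate))
    pos = rng.choice(pos_idx, size=n_pos, replace=True)
    neg = rng.choice(neg_idx, size=batch_size - n_pos, replace=True)
    return np.concatenate([pos, neg])


def sigmoid_normalized_score(raw_score):
    """Squash raw scores into (0, 1) so pairwise margins stay bounded."""
    return torch.sigmoid(raw_score)


def l2_normalized_score(features, weight):
    """Score from l2-normalized features and weights (bounded in [-1, 1])."""
    return F.normalize(features, dim=-1) @ F.normalize(weight, dim=-1)
```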
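For (iii), the three optimizer families map directly onto standard PyTorch optimizers; a minimal setup might look like the following (the learning rates are placeholders, not the paper's tuned values).

```python
import torch

model = torch.nn.Linear(16, 1)  # stand-in for the deep scoring model

sgd = torch.optim.SGD(model.parameters(), lr=0.1)                    # SGD-type
momentum = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)  # Momentum-type
adam = torch.optim.Adam(model.parameters(), lr=1e-3)                  # Adam-type
```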
Related papers
- Stochastic Optimal Control Matching [53.156277491861985]
Our work introduces Stochastic Optimal Control Matching (SOCM), a novel Iterative Diffusion Optimization (IDO) technique for optimal control.
The control is learned via a least squares problem by trying to fit a matching vector field.
Experimentally, our algorithm achieves lower error than all the existing IDO techniques for optimal control.
arXiv Detail & Related papers (2023-12-04T16:49:43Z) - Class Anchor Margin Loss for Content-Based Image Retrieval [97.81742911657497]
We propose a novel repeller-attractor loss that falls within the metric learning paradigm, yet directly optimizes the L2 metric without the need to generate pairs.
We evaluate the proposed objective in the context of few-shot and full-set training on the CBIR task, by using both convolutional and transformer architectures.
arXiv Detail & Related papers (2023-06-01T12:53:10Z) - Support Vector Machines with the Hard-Margin Loss: Optimal Training via
Combinatorial Benders' Cuts [8.281391209717105]
We show how to train the hard-margin SVM model to global optimality.
We introduce an iterative sampling and decomposition algorithm that solves the problem.
arXiv Detail & Related papers (2022-07-15T18:21:51Z) - Label Distributionally Robust Losses for Multi-class Classification:
Consistency, Robustness and Adaptivity [55.29408396918968]
We study a family of loss functions named label-distributionally robust (LDR) losses for multi-class classification.
Our contributions address both consistency and robustness by establishing the top-$k$ consistency of LDR losses for multi-class classification.
We propose a new adaptive LDR loss that automatically adapts an individualized temperature parameter to the label-noise level of each instance.
arXiv Detail & Related papers (2021-12-30T00:27:30Z) - Ensemble of Loss Functions to Improve Generalizability of Deep Metric
Learning methods [0.609170287691728]
We propose novel approaches to combine different losses built on top of a shared deep feature extractor.
We evaluate our methods on some popular datasets from the machine vision domain in conventional Zero-Shot-Learning (ZSL) settings.
arXiv Detail & Related papers (2021-07-02T15:19:46Z) - Rethinking and Reweighting the Univariate Losses for Multi-Label
Ranking: Consistency and Generalization [44.73295800450414]
(Partial) ranking loss is a commonly used evaluation measure for multi-label classification.
There is a gap between existing theory and practice -- some pairwise losses can lead to promising performance but lack consistency.
arXiv Detail & Related papers (2021-05-10T09:23:27Z) - Loss Function Discovery for Object Detection via Convergence-Simulation
Driven Search [101.73248560009124]
We propose an effective convergence-simulation driven evolutionary search algorithm, CSE-Autoloss, for speeding up the search process.
We conduct extensive evaluations of loss function search on popular detectors and validate the good generalization capability of searched losses.
Our experiments show that the best-discovered loss function combinations outperform default combinations by 1.1% and 0.8% in terms of mAP for two-stage and one-stage detectors.
arXiv Detail & Related papers (2021-02-09T08:34:52Z) - Adversarially Robust Learning via Entropic Regularization [31.6158163883893]
We propose a new family of algorithms, ATENT, for training adversarially robust deep neural networks.
Our approach achieves competitive (or better) performance in terms of robust classification accuracy.
arXiv Detail & Related papers (2020-08-27T18:54:43Z) - Beyond Triplet Loss: Meta Prototypical N-tuple Loss for Person
Re-identification [118.72423376789062]
We introduce a multi-class classification loss, i.e., N-tuple loss, to jointly consider multiple (N) instances for per-query optimization.
With the multi-class classification incorporated, our model achieves the state-of-the-art performance on the benchmark person ReID datasets.
arXiv Detail & Related papers (2020-06-08T23:34:08Z) - Circle Loss: A Unified Perspective of Pair Similarity Optimization [42.33948436767691]
We show that a majority of loss functions, including the triplet loss and the softmax plus cross-entropy loss, can be unified under a pair similarity optimization perspective.
We show that the Circle loss offers a more flexible optimization approach towards a more definite convergence target.
arXiv Detail & Related papers (2020-02-25T13:56:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.