Relational Surrogate Loss Learning
- URL: http://arxiv.org/abs/2202.13197v1
- Date: Sat, 26 Feb 2022 17:32:57 GMT
- Title: Relational Surrogate Loss Learning
- Authors: Tao Huang, Zekang Li, Hua Lu, Yong Shan, Shusheng Yang, Yang Feng, Fei
Wang, Shan You, Chang Xu
- Abstract summary: This paper revisits surrogate loss learning, where a deep neural network is employed to approximate the evaluation metrics.
In this paper, we show that it suffices to directly maintain the relation between the surrogate losses and metrics of different models.
Our method is much easier to optimize and enjoys significant efficiency and performance gains.
- Score: 41.61184221367546
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Evaluation metrics in machine learning can often hardly be used as loss
functions, as they may be non-differentiable and non-decomposable, e.g.,
average precision and F1 score. This paper aims to address this problem by
revisiting surrogate loss learning, where a deep neural network is employed
to approximate the evaluation metrics. Instead of pursuing an exact recovery of
the evaluation metric through a deep neural network, we recall the purpose of
these evaluation metrics, which is to distinguish whether one model is better
or worse than another. In this paper, we show that it suffices to directly
maintain the relation between the surrogate losses and metrics of different
models, and propose a rank correlation-based optimization method to maximize
this relation and learn the surrogate losses. Compared to previous works, our
method is much easier to optimize and enjoys significant efficiency and
performance gains. Extensive experiments show that our method achieves
improvements on various tasks including image classification and neural machine
translation, and even outperforms state-of-the-art methods on human pose
estimation and machine reading comprehension tasks. Code is available at:
https://github.com/hunto/ReLoss.
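As a rough, hedged illustration of the idea in the abstract, the sketch below trains a small network-based surrogate loss so that its values across model checkpoints preserve the ordering given by the evaluation metric. The surrogate architecture, the input features, and the pairwise ranking proxy for rank correlation are assumptions for illustration; the official implementation in the linked repository may differ.

```python
# Minimal sketch of relation-preserving surrogate loss learning.
# Assumptions (not taken from the paper): the surrogate is a small MLP over
# per-sample prediction statistics, and rank correlation is maximized via a
# pairwise ranking proxy.
import torch
import torch.nn as nn


class SurrogateLoss(nn.Module):
    """Maps (logits, labels) to a scalar surrogate loss value."""

    def __init__(self, num_classes: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_classes + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, logits, labels):
        # labels: LongTensor of shape (B,)
        probs = logits.softmax(dim=-1)
        tgt = probs.gather(1, labels.unsqueeze(1))   # probability of the true class
        feats = torch.cat([probs, tgt], dim=1)       # per-sample features
        return self.net(feats).mean()                # scalar surrogate loss


def relation_loss(surrogate_vals, metric_vals, margin: float = 0.0):
    """Pairwise proxy for rank correlation: if model i has a higher metric
    than model j, its surrogate loss should be lower."""
    s_i, s_j = surrogate_vals.unsqueeze(1), surrogate_vals.unsqueeze(0)
    m_i, m_j = metric_vals.unsqueeze(1), metric_vals.unsqueeze(0)
    sign = torch.sign(m_i - m_j)                     # +1 where model i is better
    # hinge on (s_i - s_j): we want s_i < s_j whenever sign > 0
    return torch.relu(margin + sign * (s_i - s_j)).mean()


# Usage idea: evaluate the surrogate on snapshots from several checkpoints,
# collect their true metric values, and minimize relation_loss to train it.
```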
Related papers
- Newton Losses: Using Curvature Information for Learning with Differentiable Algorithms [80.37846867546517]
We show how to train eight different neural networks with custom objectives.
We exploit their second-order information via their empirical Fisher and Hessian matrices.
We apply Newton Losses to these differentiable algorithms, achieving significant improvements for the hard-to-optimize ones.
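A minimal, hedged sketch of the general idea (use second-order information of the loss with respect to the network outputs) is given below; the diagonal empirical-Fisher-style curvature estimate, the damping term, and the regression-to-target formulation are illustrative assumptions, not the paper's exact method.

```python
# Hedged sketch: take a Newton-style step on a hard-to-optimize loss with
# respect to the network *outputs* (not the weights), using a crude diagonal
# curvature estimate in the spirit of the empirical Fisher, then regress the
# network outputs onto that target.
import torch


def newton_style_loss(outputs, hard_loss_fn, damping: float = 1.0):
    # Detach a copy of the outputs so the second-order step is computed
    # only on the loss part.
    y = outputs.detach().requires_grad_(True)
    loss = hard_loss_fn(y)                     # hard_loss_fn must return a scalar
    (grad,) = torch.autograd.grad(loss, y)
    curvature = grad.pow(2) + damping          # diagonal curvature estimate + damping
    target = (y - grad / curvature).detach()   # Newton-style step on the outputs
    # Train the network by pulling its outputs toward the improved target.
    return 0.5 * ((outputs - target) ** 2).sum()
```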
arXiv Detail & Related papers (2024-10-24T18:02:11Z) - AnyLoss: Transforming Classification Metrics into Loss Functions [21.34290540936501]
Evaluation metrics can be used to assess the performance of models in binary classification tasks.
Most metrics are derived from a confusion matrix in a non-differentiable form, making it difficult to generate a differentiable loss function that could directly optimize them.
We propose a general-purpose approach that transforms any confusion-matrix-based metric into a loss function, AnyLoss, that can be used directly in optimization processes.
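To make the idea concrete, here is a hedged sketch of turning a confusion-matrix metric into a differentiable loss by building soft confusion-matrix counts from predicted probabilities, with binary F1 as the example; the paper's exact construction is not reproduced here.

```python
# Soft confusion-matrix counts from probabilities make F1 differentiable.
import torch


def soft_f1_loss(probs, targets, eps: float = 1e-8):
    """probs: predicted probabilities in [0, 1]; targets: {0, 1} labels."""
    tp = (probs * targets).sum()          # soft true positives
    fp = (probs * (1 - targets)).sum()    # soft false positives
    fn = ((1 - probs) * targets).sum()    # soft false negatives
    f1 = 2 * tp / (2 * tp + fp + fn + eps)
    return 1.0 - f1                       # minimize 1 - F1
```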
arXiv Detail & Related papers (2024-05-23T16:14:16Z) - Reduced Jeffries-Matusita distance: A Novel Loss Function to Improve
Generalization Performance of Deep Classification Models [0.0]
We introduce a distance called Reduced Jeffries-Matusita as a loss function for training deep classification models to reduce the over-fitting issue.
The results show that the new distance measure stabilizes the training process significantly, enhances the generalization ability, and improves the performance of the models in terms of accuracy and F1-score.
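For orientation only, the sketch below uses the classical (squared) Jeffries-Matusita/Matusita distance between the predicted class distribution and the one-hot target as a training loss; the paper's reduced variant is not given here and is not reproduced.

```python
# Classical squared Matusita distance between distributions as a loss.
import torch
import torch.nn.functional as F


def jm_distance_loss(logits, labels, eps: float = 1e-8):
    p = logits.softmax(dim=-1)                                   # predicted distribution
    q = F.one_hot(labels, num_classes=logits.size(-1)).float()   # one-hot target
    # sum_i (sqrt(p_i) - sqrt(q_i))^2 = 2 - 2 * Bhattacharyya coefficient
    d2 = ((p + eps).sqrt() - q.sqrt()).pow(2).sum(dim=-1)
    return d2.mean()
```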
arXiv Detail & Related papers (2024-03-13T10:51:38Z) - Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple
Logits Retargeting Approach [102.0769560460338]
We develop a simple logits retargeting approach (LORT) that does not require prior knowledge of the number of samples per class.
Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z) - SuSana Distancia is all you need: Enforcing class separability in metric
learning via two novel distance-based loss functions for few-shot image
classification [0.9236074230806579]
We propose two loss functions which consider the importance of the embedding vectors by looking at the intra-class and inter-class distance between the few data.
Our results show a significant improvement in accuracy on the miniImageNet benchmark compared to other metric-based few-shot learning methods, by a margin of 2%.
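As a generic illustration of distance-based class separability (not the paper's two specific loss functions), the sketch below contrasts intra-class distances to class prototypes with inter-class distances between prototypes; the prototype formulation and margin are assumptions.

```python
# Generic intra-/inter-class distance loss over embeddings.
import torch


def intra_inter_loss(embeddings, labels, margin: float = 1.0):
    # Assumes the episode contains at least two classes.
    classes = labels.unique()
    protos = torch.stack([embeddings[labels == c].mean(dim=0) for c in classes])
    # intra-class: mean distance of each sample to its own class prototype
    d_intra = torch.stack([
        torch.cdist(embeddings[labels == c], protos[i:i + 1]).mean()
        for i, c in enumerate(classes)
    ]).mean()
    # inter-class: mean pairwise distance between prototypes
    d_inter = torch.pdist(protos).mean()
    # tighten clusters while pushing prototypes apart
    return torch.relu(margin + d_intra - d_inter)
```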
arXiv Detail & Related papers (2023-05-15T23:12:09Z) - Recall@k Surrogate Loss with Large Batches and Similarity Mixup [62.67458021725227]
Direct optimization, by gradient descent, of an evaluation metric is not possible when it is non-differentiable.
In this work, a differentiable surrogate loss for the recall is proposed.
The proposed method achieves state-of-the-art results in several image retrieval benchmarks.
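A hedged sketch of a sigmoid-smoothed recall@k surrogate for a single query is shown below; the temperatures, the soft top-k indicator, and the omission of Similarity Mixup are illustrative simplifications rather than the paper's formulation.

```python
# Smooth the hard rank of the positive item with sigmoids so recall@k
# becomes differentiable.
import torch


def smooth_recall_at_k(query_sims, positive_idx, k: int = 1,
                       temp_rank: float = 0.01, temp_k: float = 1.0):
    """query_sims: (N,) similarities of one query to N gallery items."""
    pos_sim = query_sims[positive_idx]
    mask = torch.ones_like(query_sims, dtype=torch.bool)
    mask[positive_idx] = False
    # soft count of gallery items scored above the positive (1-based rank)
    soft_rank = 1.0 + torch.sigmoid((query_sims[mask] - pos_sim) / temp_rank).sum()
    # soft indicator that the positive falls within the top-k
    recall = torch.sigmoid((k + 0.5 - soft_rank) / temp_k)
    return 1.0 - recall   # minimize
```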
arXiv Detail & Related papers (2021-08-25T11:09:11Z) - Learning with Multiclass AUC: Theory and Algorithms [141.63211412386283]
Area under the ROC curve (AUC) is a well-known ranking metric for problems such as imbalanced learning and recommender systems.
In this paper, we make an early attempt at the problem of learning multiclass scoring functions by optimizing multiclass AUC metrics.
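As a concrete but generic example of the kind of relaxation involved, the sketch below replaces the 0/1 pairwise indicator in a one-vs-rest multiclass AUC with a logistic loss over positive-negative score differences; this is a standard relaxation, not the specific algorithms studied in the paper.

```python
# Pairwise logistic surrogate for one-vs-rest multiclass AUC.
import torch
import torch.nn.functional as F


def ovr_auc_surrogate(scores, labels):
    """scores: (N, C) class scores; labels: (N,) integer class labels."""
    losses = []
    for c in range(scores.size(1)):
        pos = scores[labels == c, c]
        neg = scores[labels != c, c]
        if len(pos) == 0 or len(neg) == 0:
            continue
        margins = pos.unsqueeze(1) - neg.unsqueeze(0)   # all positive-negative pairs
        losses.append(F.softplus(-margins).mean())      # log(1 + exp(-margin))
    return torch.stack(losses).mean()
```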
arXiv Detail & Related papers (2021-07-28T05:18:10Z) - MetricOpt: Learning to Optimize Black-Box Evaluation Metrics [21.608384691401238]
We study the problem of optimizing arbitrary non-differentiable task evaluation metrics such as misclassification rate and recall.
Our method, named MetricOpt, operates in a black-box setting where the computational details of the target metric are unknown.
We achieve this by learning a differentiable value function, which maps compact task-specific model parameters to metric observations.
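A minimal sketch of the black-box idea described above: fit a small differentiable value function from compact task-specific parameters to observed metric values, then use its gradients in place of the non-differentiable metric. The parameterization, network size, and fitting loop below are assumptions, not MetricOpt's implementation.

```python
# Learn a differentiable value function over compact model parameters,
# then follow its gradients as a proxy for the black-box metric.
import torch
import torch.nn as nn


class ValueFunction(nn.Module):
    def __init__(self, param_dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(param_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, compact_params):
        return self.net(compact_params).squeeze(-1)   # predicted metric value


def fit_value_function(vf, params_buffer, metrics_buffer,
                       steps: int = 200, lr: float = 1e-3):
    """params_buffer: (M, param_dim) compact parameters; metrics_buffer: (M,) observed metrics."""
    opt = torch.optim.Adam(vf.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = ((vf(params_buffer) - metrics_buffer) ** 2).mean()
        loss.backward()
        opt.step()
    return vf

# Once fitted, gradients of the value function w.r.t. the compact parameters
# give a descent (or ascent) direction for the otherwise non-differentiable metric.
```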
arXiv Detail & Related papers (2021-04-21T16:50:01Z) - A Unified Framework of Surrogate Loss by Refactoring and Interpolation [65.60014616444623]
We introduce UniLoss, a unified framework to generate surrogate losses for training deep networks with gradient descent.
We validate the effectiveness of UniLoss on three tasks and four datasets.
arXiv Detail & Related papers (2020-07-27T21:16:51Z) - On the Outsized Importance of Learning Rates in Local Update Methods [2.094022863940315]
We study a family of algorithms, which we refer to as local update methods, that generalize many federated learning and meta-learning algorithms.
We prove that for quadratic objectives, local update methods perform gradient descent on a surrogate loss function which we exactly characterize.
We show that the choice of client learning rate controls the condition number of that surrogate loss, as well as the distance between the minimizers of the surrogate and true loss functions.
arXiv Detail & Related papers (2020-07-02T04:45:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.