Ensemble of Loss Functions to Improve Generalizability of Deep Metric Learning methods
- URL: http://arxiv.org/abs/2107.01130v1
- Date: Fri, 2 Jul 2021 15:19:46 GMT
- Title: Ensemble of Loss Functions to Improve Generalizability of Deep Metric Learning methods
- Authors: Davood Zabihzadeh
- Abstract summary: We propose novel approaches to combine different losses built on top of a shared deep feature extractor.
We evaluate our methods on some popular datasets from the machine vision domain in conventional Zero-Shot-Learning (ZSL) settings.
- Score: 0.609170287691728
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Metric Learning (DML) learns a non-linear semantic embedding of input
data that brings similar pairs together while keeping dissimilar data away from
each other. To this end, many different methods have been proposed over the last
decade, with promising results in various applications. The success of a DML
algorithm greatly depends on its loss function. However, no loss function is
perfect, and each addresses only some aspects of an optimal similarity embedding.
Moreover, the generalizability of a DML model to categories unseen during the
test stage is an important issue that existing loss functions do not consider. To
address these challenges, we propose novel approaches that combine different
losses built on top of a shared deep feature extractor. The proposed ensemble of
losses forces the deep model to extract features that are consistent with all
losses. Since the selected losses are diverse and each emphasizes different
aspects of an optimal semantic embedding, our combining methods yield a
considerable improvement over any individual loss and generalize well to unseen
categories. There is no limitation on the choice of loss functions, and our
methods can work with any set of existing ones. Furthermore, they can optimize
each loss function as well as its weight in an end-to-end paradigm, with no need
to adjust any hyper-parameters. We evaluate our methods on several popular
machine-vision datasets in the conventional Zero-Shot Learning (ZSL) setting. The
results are very encouraging and show that our methods outperform all baseline
losses by a large margin on all datasets.
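To make the combination scheme concrete, here is a minimal sketch of the general idea described in the abstract: several off-the-shelf DML losses share one feature extractor, and their mixing weights are trained end-to-end together with the network. The class name `WeightedLossEnsemble`, the softmax weighting, and the choice of losses are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightedLossEnsemble(nn.Module):
    """Illustrative sketch: several metric-learning losses share one embedder,
    and their mixing weights are learned end-to-end (assumed formulation)."""

    def __init__(self, embedder: nn.Module, losses: list):
        super().__init__()
        self.embedder = embedder                # shared deep feature extractor
        self.losses = nn.ModuleList(losses)     # any set of existing DML losses
        # one free parameter per loss; softmax keeps the weights positive and normalized
        self.logits = nn.Parameter(torch.zeros(len(losses)))

    def forward(self, x: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        z = F.normalize(self.embedder(x), dim=1)    # L2-normalized embeddings
        w = torch.softmax(self.logits, dim=0)       # learnable mixing weights
        # weighted sum of all losses, computed on the same embeddings
        return sum(w_i * loss_fn(z, labels) for w_i, loss_fn in zip(w, self.losses))
```

In practice, `embedder` could be any backbone with an embedding head, and `losses` could hold, for example, a triplet loss and a proxy-based loss from an off-the-shelf library; a single optimizer then updates the backbone parameters and the weight logits jointly.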
Related papers
- EnsLoss: Stochastic Calibrated Loss Ensembles for Preventing Overfitting in Classification [1.3778851745408134]
We propose a novel ensemble method, namely EnsLoss, to combine loss functions within the Empirical risk minimization framework.
We first transform the CC conditions of losses into loss-derivatives, thereby bypassing the need for explicit loss functions.
We theoretically establish the statistical consistency of our approach and provide insights into its benefits.
arXiv Detail & Related papers (2024-09-02T02:40:42Z)
- Noise-Robust Loss Functions: Enhancing Bounded Losses for Large-Scale Noisy Data Learning [0.0]
Large annotated datasets inevitably contain noisy labels, which poses a major challenge for training deep neural networks as they easily memorize the labels.
Noise-robust loss functions have emerged as a notable strategy to counteract this issue, but it remains challenging to create a robust loss function which is not susceptible to underfitting.
We propose a novel method denoted as logit bias, which adds a real number $\epsilon$ to the logit at the position of the correct class (see the sketch after this list).
arXiv Detail & Related papers (2023-06-08T18:38:55Z)
- Benchmarking Deep AUROC Optimization: Loss Functions and Algorithmic Choices [37.559461866831754]
We benchmark a variety of loss functions with different algorithmic choices for the deep AUROC optimization problem.
We highlight the essential choices such as positive sampling rate, regularization, normalization/activation, and weights.
Our findings show that although the Adam-type method is more competitive from the training perspective, it does not outperform the others from the testing perspective.
arXiv Detail & Related papers (2022-03-27T00:47:00Z)
- A Decidability-Based Loss Function [2.5919311269669003]
Biometric problems often use deep learning models to extract features, also known as embeddings, from images.
In this work, a loss function based on the decidability index is proposed to improve the quality of embeddings for the verification routine.
The proposed approach is compared against the Softmax (cross-entropy), Triplets Soft-Hard, and the Multi Similarity losses in four different benchmarks.
arXiv Detail & Related papers (2021-09-12T14:26:27Z)
- Universal Online Convex Optimization Meets Second-order Bounds [74.0120666722487]
We propose a simple strategy for universal online convex optimization.
The key idea is to construct a set of experts to process the original online functions, and deploy a meta-algorithm over the linearized losses.
In this way, we can plug in off-the-shelf online solvers as black-box experts to deliver problem-dependent regret bounds.
arXiv Detail & Related papers (2021-05-08T11:43:49Z)
- Loss Function Discovery for Object Detection via Convergence-Simulation Driven Search [101.73248560009124]
We propose an effective convergence-simulation driven evolutionary search algorithm, CSE-Autoloss, for speeding up the search progress.
We conduct extensive evaluations of loss function search on popular detectors and validate the good generalization capability of searched losses.
Our experiments show that the best-discovered loss function combinations outperform default combinations by 1.1% and 0.8% in terms of mAP for two-stage and one-stage detectors.
arXiv Detail & Related papers (2021-02-09T08:34:52Z)
- Auto Seg-Loss: Searching Metric Surrogates for Semantic Segmentation [56.343646789922545]
We propose to automate the design of metric-specific loss functions by searching differentiable surrogate losses for each metric.
Experiments on PASCAL VOC and Cityscapes demonstrate that the searched surrogate losses outperform the manually designed loss functions consistently.
arXiv Detail & Related papers (2020-10-15T17:59:08Z)
- Learning by Minimizing the Sum of Ranked Range [58.24935359348289]
We introduce the sum of ranked range (SoRR) as a general approach to form learning objectives.
A ranked range is a consecutive sequence of sorted values of a set of real numbers.
We explore two applications in machine learning of the minimization of the SoRR framework, namely the AoRR aggregate loss for binary classification and the TKML individual loss for multi-label/multi-class classification.
arXiv Detail & Related papers (2020-10-05T01:58:32Z)
- Rethinking preventing class-collapsing in metric learning with margin-based losses [81.22825616879936]
Metric learning seeks embeddings where visually similar instances are close and dissimilar instances are apart.
However, margin-based losses tend to project all samples of a class onto a single point in the embedding space.
We propose a simple modification to the embedding losses such that each sample selects its nearest same-class counterpart in a batch.
arXiv Detail & Related papers (2020-06-09T09:59:25Z)
- All your loss are belong to Bayes [28.393499629583786]
Loss functions are a cornerstone of machine learning and the starting point of most algorithms.
We introduce a trick on squared Gaussian Processes to obtain a random process whose paths are compliant source functions.
Experimental results demonstrate substantial improvements over the state of the art.
arXiv Detail & Related papers (2020-06-08T14:31:21Z)
- Learning Adaptive Loss for Robust Learning with Noisy Labels [59.06189240645958]
Robust loss functions are an important strategy for handling the problem of learning with noisy labels.
We propose a meta-learning method capable of robust hyperparameter tuning.
Four kinds of SOTA robust loss functions are integrated with our method, and experiments substantiate its general availability and effectiveness.
arXiv Detail & Related papers (2020-02-16T00:53:37Z)
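As a concrete illustration of one mechanism listed above, the logit-bias idea from the Noise-Robust Loss Functions entry adds a constant to the logit of the correct class before the loss is applied. The sketch below pairs it with a plain cross-entropy loss for simplicity; the function name and that pairing are assumptions, not that paper's exact recipe.

```python
import torch
import torch.nn.functional as F

def logit_bias_cross_entropy(logits: torch.Tensor,
                             targets: torch.Tensor,
                             epsilon: float = 1.0) -> torch.Tensor:
    """Sketch of the logit-bias idea: add a constant epsilon to the logit of the
    correct class before applying the loss (illustrative, not the exact method)."""
    biased = logits.clone()
    biased[torch.arange(logits.size(0)), targets] += epsilon   # boost correct-class logit
    return F.cross_entropy(biased, targets)
```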
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.