Full Kullback-Leibler-Divergence Loss for Hyperparameter-free Label
Distribution Learning
- URL: http://arxiv.org/abs/2209.02055v1
- Date: Mon, 5 Sep 2022 17:01:46 GMT
- Title: Full Kullback-Leibler-Divergence Loss for Hyperparameter-free Label
Distribution Learning
- Authors: Maurice Günder, Nico Piatkowski, Christian Bauckhage
- Abstract summary: Label Distribution Learning (LDL) is a technique to stabilize classification and regression problems.
The main idea is the joint regression of the label distribution and its expectation value.
We introduce a loss function for DLDL whose components are completely defined by Kullback-Leibler divergences.
- Score: 3.0745536448480326
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Label Distribution Learning (LDL) is a technique to stabilize
classification and regression problems with ambiguous and/or imbalanced labels.
A prototypical use-case of LDL is human age estimation based on profile images.
Regarding this regression problem, a so-called Deep Label Distribution Learning
(DLDL) method has been developed. The main idea is the joint regression of the
label distribution and its expectation value. However, the original DLDL method
uses loss components with different mathematical motivations and, thus,
different scales, which makes a balancing hyperparameter necessary.
In this work, we introduce a loss function for DLDL whose components are
defined entirely by Kullback-Leibler (KL) divergences and are therefore
directly comparable to each other without the need for additional
hyperparameters. The approach generalizes DLDL to further use-cases, in
particular multi-dimensional and multi-scale distribution learning tasks.
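The core idea can be sketched in a few lines: both loss components are KL divergences, so they live on the same scale and can simply be summed. The sketch below is illustrative only and makes assumptions not taken from the paper — in particular, the expectation term is expressed here as a KL divergence between two-point (linear-interpolation) encodings of the true and predicted expectation values; the paper's exact construction may differ.

```python
import numpy as np

def kl_div(p, q, eps=1e-12):
    """Discrete KL divergence D_KL(p || q) between probability vectors."""
    p = np.clip(p, eps, None)
    q = np.clip(q, eps, None)
    return float(np.sum(p * np.log(p / q)))

def two_point_encoding(value, bins):
    """Encode a scalar as a distribution over its two neighbouring bins,
    with weights from linear interpolation (an illustrative choice)."""
    dist = np.zeros(len(bins))
    idx = int(np.clip(np.searchsorted(bins, value), 1, len(bins) - 1))
    lo, hi = bins[idx - 1], bins[idx]
    w = (value - lo) / (hi - lo)
    dist[idx - 1] = 1.0 - w
    dist[idx] = w
    return dist

def full_kl_dldl_loss(pred_dist, target_dist, bins):
    """Both components are KL divergences, hence directly comparable:
    no trade-off hyperparameter is needed to sum them."""
    # 1) Match the full label distribution.
    l_dist = kl_div(target_dist, pred_dist)
    # 2) Match the expectation value, also via a KL divergence
    #    between encodings of the true and predicted expectations.
    y_true = float(np.dot(bins, target_dist))
    y_pred = float(np.dot(bins, pred_dist))
    l_exp = kl_div(two_point_encoding(y_true, bins),
                   two_point_encoding(y_pred, bins))
    return l_dist + l_exp
```

Because both terms vanish exactly when the predicted distribution matches the target, their unweighted sum is a well-behaved training objective.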
Related papers
- Multi-Granularity Semantic Revision for Large Language Model Distillation [66.03746866578274]
We propose a multi-granularity semantic revision method for LLM distillation.
At the sequence level, we propose a sequence correction and re-generation strategy.
At the token level, we design a distribution adaptive clipping Kullback-Leibler loss as the distillation objective function.
At the span level, we leverage the span priors of a sequence to compute the probability correlations within spans, and constrain the teacher and student's probability correlations to be consistent.
arXiv Detail & Related papers (2024-07-14T03:51:49Z) - Inaccurate Label Distribution Learning with Dependency Noise [52.08553913094809]
We introduce the Dependent Noise-based Inaccurate Label Distribution Learning (DN-ILDL) framework to tackle the challenges posed by noise in label distribution learning.
We show that DN-ILDL effectively addresses the ILDL problem and outperforms existing LDL methods.
arXiv Detail & Related papers (2024-05-26T07:58:07Z) - Label Distribution Learning from Logical Label [19.632157794117553]
Label distribution learning (LDL) is an effective method to predict the label description degree (a.k.a. label distribution) of a sample.
But annotating label distribution for training samples is extremely costly.
We propose a novel method to learn an LDL model directly from the logical label, which unifies LE and LDL into a joint model.
arXiv Detail & Related papers (2023-03-13T04:31:35Z) - Inaccurate Label Distribution Learning [56.89970970094207]
Label distribution learning (LDL) trains a model to predict the relevance of a set of labels (called label distribution (LD)) to an instance.
This paper investigates the problem of inaccurate LDL, i.e., developing an LDL model with noisy LDs.
arXiv Detail & Related papers (2023-02-25T06:23:45Z) - Unimodal-Concentrated Loss: Fully Adaptive Label Distribution Learning
for Ordinal Regression [32.35098925000738]
We argue that existing ALDL algorithms do not fully exploit the intrinsic properties of ordinal regression.
We propose a novel loss function for fully adaptive label distribution learning, namely unimodal-concentrated loss.
arXiv Detail & Related papers (2022-04-01T09:40:11Z) - Instance-Dependent Partial Label Learning [69.49681837908511]
Partial label learning is a typical weakly supervised learning problem.
Most existing approaches assume that the incorrect labels in each training example are randomly picked as the candidate labels.
In this paper, we consider the instance-dependent case and assume that each example is associated with a latent label distribution constituted by the real number of each label.
arXiv Detail & Related papers (2021-10-25T12:50:26Z) - Disturbing Target Values for Neural Network Regularization [1.5574423250822542]
Directional DisturbLabel (DDL) is a novel regularization technique that makes use of the class probabilities to infer the confident labels.
DDL uses the model behavior during training to regularize it in a more directed manner.
In this paper, 6 and 8 datasets are used to validate the robustness of our methods on classification and regression tasks, respectively.
arXiv Detail & Related papers (2021-10-11T05:14:02Z) - PLM: Partial Label Masking for Imbalanced Multi-label Classification [59.68444804243782]
Neural networks trained on real-world datasets with long-tailed label distributions are biased towards frequent classes and perform poorly on infrequent classes.
We propose a method, Partial Label Masking (PLM), which utilizes this ratio during training.
Our method achieves strong performance when compared to existing methods on both multi-label (MultiMNIST and MSCOCO) and single-label (imbalanced CIFAR-10 and CIFAR-100) image classification datasets.
arXiv Detail & Related papers (2021-05-22T18:07:56Z) - Attentional-Biased Stochastic Gradient Descent [74.49926199036481]
We present a provable method (named ABSGD) for addressing the data imbalance or label noise problem in deep learning.
Our method is a simple modification to momentum SGD where we assign an individual importance weight to each sample in the mini-batch.
ABSGD is flexible enough to combine with other robust losses without any additional cost.
arXiv Detail & Related papers (2020-12-13T03:41:52Z) - Bidirectional Loss Function for Label Enhancement and Distribution
Learning [23.61708127340584]
Two challenges exist in LDL: how to address the dimensional gap problem during the learning process and how to recover label distributions from logical labels.
This study considers bidirectional projections function which can be applied in LE and LDL problems simultaneously.
Experiments on several real-world datasets are carried out to demonstrate the superiority of the proposed method for both LE and LDL.
arXiv Detail & Related papers (2020-07-07T03:02:54Z)
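Several of the papers above (notably ABSGD and PLM) revolve around re-weighting samples or labels during training. As a rough, hedged illustration of the per-sample weighting idea behind ABSGD, one can scale each sample's contribution by a softmax over the mini-batch losses; the temperature `tau` and the plain weighted-loss setting below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def absgd_style_weights(losses, tau=1.0):
    """Softmax over per-sample losses: harder (higher-loss) samples
    receive larger weights. `tau` controls how peaked the weighting is."""
    z = np.asarray(losses, dtype=float) / tau
    z -= z.max()                      # subtract max for numerical stability
    w = np.exp(z)
    return w / w.sum()

def weighted_batch_loss(losses, tau=1.0):
    """Importance-weighted mini-batch loss; approaches the plain mean
    as tau grows large (weights become uniform)."""
    w = absgd_style_weights(losses, tau)
    return float(np.dot(w, losses))
```

With a small `tau`, the weighting concentrates on the hardest samples, which is the intended effect for imbalanced or noisy-label settings.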
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.