Data Augmentation For Label Enhancement
- URL: http://arxiv.org/abs/2303.11698v1
- Date: Tue, 21 Mar 2023 09:36:58 GMT
- Title: Data Augmentation For Label Enhancement
- Authors: Zhiqiang Kou, Yuheng Jia, Jing Wang, Boyu Shi, Xin Geng
- Abstract summary: Label enhancement (LE) has emerged to recover Label Distribution (LD) from logical labels.
We propose a novel supervised LE dimensionality reduction approach, which projects the original data into a lower-dimensional feature space.
The results show that our method consistently outperforms the five compared approaches.
- Score: 45.3351754830424
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Label distribution (LD) uses the description degree to describe instances,
which provides more fine-grained supervision information when learning with
label ambiguity. Nevertheless, LD is unavailable in many real-world
applications. To obtain LD, label enhancement (LE) has emerged to recover LD
from logical labels. Existing LE approaches have the following problems: (i)
they use logical labels to train mappings to LD, but this supervision
information is too loose, which can lead to inaccurate model predictions;
(ii) they ignore feature redundancy and use the collected features directly.
To solve (i), we use the topology of the feature space to generate more
accurate label confidences. To solve (ii), we propose a novel supervised LE
dimensionality reduction approach, which projects the original data into a
lower-dimensional feature space. Combining the two, we obtain the augmented
data for LE. Further, we propose a novel nonlinear LE model based on the
label confidences and reduced features. Extensive experiments on 12
real-world datasets show that our method consistently outperforms the five
compared approaches.
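The abstract does not spell out the algorithm, so the following is a minimal sketch of the two augmentation steps it describes, assuming a standard label-propagation reading of "topology of the feature space" and using LDA as a stand-in for the paper's supervised reduction. All names and parameters here (knn_label_confidence, supervised_reduce, k, alpha) are hypothetical, not the authors' code.

```python
# Hypothetical sketch of the two augmentation steps, not the paper's algorithm.
import numpy as np
from sklearn.neighbors import kneighbors_graph
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def knn_label_confidence(X, Y_logical, k=10, alpha=0.5, n_iter=20):
    """Propagate 0/1 logical labels over a kNN graph of the feature space
    to obtain soft label confidences (one reading of using the feature-space
    topology). X: (n, d) features; Y_logical: (n, c) logical labels."""
    W = kneighbors_graph(X, n_neighbors=k, mode="connectivity").toarray()
    W = (W + W.T) / 2.0                                      # symmetrize the graph
    P = W / np.maximum(W.sum(axis=1, keepdims=True), 1e-12)  # row-stochastic
    F = Y_logical.astype(float)
    for _ in range(n_iter):                                  # iterative propagation
        F = alpha * (P @ F) + (1.0 - alpha) * Y_logical
    return F / np.maximum(F.sum(axis=1, keepdims=True), 1e-12)

def supervised_reduce(X, F, n_components=20):
    """Stand-in supervised reduction: LDA fit on each row's dominant label.
    The paper proposes its own LE-specific projection; LDA only illustrates
    the idea of a label-aware lower-dimensional space."""
    y_hard = F.argmax(axis=1)
    n_comp = min(n_components, len(np.unique(y_hard)) - 1, X.shape[1])
    return LinearDiscriminantAnalysis(n_components=n_comp).fit_transform(X, y_hard)

# Toy usage: the augmented data is (reduced features, label confidences).
X = np.random.randn(200, 40)
Y = (np.random.rand(200, 5) < 0.3).astype(int)
Y[Y.sum(axis=1) == 0, 0] = 1          # ensure every instance has a label
F = knn_label_confidence(X, Y)
Z = supervised_reduce(X, F, n_components=4)
```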
Related papers
- Towards Better Performance in Incomplete LDL: Addressing Data Imbalance [48.54894491724677]
We propose Incomplete and Imbalance Label Distribution Learning (I²LDL), a framework that simultaneously handles incomplete labels and imbalanced label distributions.
Our method decomposes the label distribution matrix into a low-rank component for frequent labels and a sparse component for rare labels, effectively capturing the structure of both head and tail labels (a toy decomposition sketch appears after this list).
arXiv Detail & Related papers (2024-10-17T14:12:57Z)
- CELDA: Leveraging Black-box Language Model as Enhanced Classifier without Labels [14.285609493077965]
We propose Clustering-enhanced Linear Discriminative Analysis (CELDA), a novel approach that improves text classification accuracy with a very weak supervision signal.
Our framework draws a precise decision boundary without accessing weights or gradients of the LM model or data labels.
arXiv Detail & Related papers (2023-06-05T08:35:31Z)
- Label Distribution Learning from Logical Label [19.632157794117553]
Label distribution learning (LDL) is an effective method to predict the label description degree (a.k.a. label distribution) of a sample.
But annotating label distribution for training samples is extremely costly.
We propose a novel method to learn an LDL model directly from the logical label, which unifies LE and LDL into a joint model.
arXiv Detail & Related papers (2023-03-13T04:31:35Z)
- Inaccurate Label Distribution Learning [56.89970970094207]
Label distribution learning (LDL) trains a model to predict the relevance of a set of labels (called label distribution (LD)) to an instance.
This paper investigates the problem of inaccurate LDL, i.e., developing an LDL model with noisy LDs.
arXiv Detail & Related papers (2023-02-25T06:23:45Z)
- Losses over Labels: Weakly Supervised Learning via Direct Loss Construction [71.11337906077483]
Programmable weak supervision is a growing paradigm within machine learning.
We propose Losses over Labels (LoL), which creates losses directly from the labeling functions without going through the intermediate step of a label (a toy version of this loss is sketched after this list).
We show that LoL improves upon existing weak supervision methods on several benchmark text and image classification tasks.
arXiv Detail & Related papers (2022-12-13T22:29:14Z)
- Unsupervised Domain Adaptive Salient Object Detection Through Uncertainty-Aware Pseudo-Label Learning [104.00026716576546]
We propose to learn saliency from synthetic but clean labels, which naturally has higher pixel-labeling quality without the effort of manual annotations.
We show that our proposed method outperforms the existing state-of-the-art deep unsupervised SOD methods on several benchmark datasets.
arXiv Detail & Related papers (2022-02-26T16:03:55Z)
- Gradient Imitation Reinforcement Learning for Low Resource Relation Extraction [52.63803634033647]
Low-resource Relation Extraction (LRE) aims to extract relation facts from limited labeled corpora when human annotation is scarce.
We develop a Gradient Imitation Reinforcement Learning method to encourage pseudo label data to imitate the gradient descent direction on labeled data.
We also propose a framework called GradLRE, which handles two major scenarios in low-resource relation extraction.
arXiv Detail & Related papers (2021-09-14T03:51:15Z)
- Bidirectional Loss Function for Label Enhancement and Distribution Learning [23.61708127340584]
Two challenges exist in LDL: how to address the dimensional gap problem during the learning process and how to recover label distributions from logical labels.
This study considers a bidirectional projection function that can be applied to LE and LDL problems simultaneously.
Experiments on several real-world datasets are carried out to demonstrate the superiority of the proposed method for both LE and LDL.
arXiv Detail & Related papers (2020-07-07T03:02:54Z)
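For the I²LDL entry above, the low-rank-plus-sparse split of the label distribution matrix can be illustrated with naive alternating proximal steps: singular-value thresholding for the low-rank (frequent-label) part and entrywise soft-thresholding for the sparse (rare-label) part. This is a generic robust-PCA-style toy with hypothetical weights lam and mu, not the paper's optimizer.

```python
import numpy as np

def lowrank_sparse_split(D, lam=0.1, mu=1.0, n_iter=100):
    """Split a label distribution matrix D (n x c) as D ~ L + S, with L
    low-rank (head/frequent labels) and S sparse (tail/rare labels)."""
    L = np.zeros_like(D, dtype=float)
    S = np.zeros_like(D, dtype=float)
    for _ in range(n_iter):
        # Low-rank step: singular-value thresholding of the residual D - S.
        U, sig, Vt = np.linalg.svd(D - S, full_matrices=False)
        L = (U * np.maximum(sig - 1.0 / mu, 0.0)) @ Vt
        # Sparse step: entrywise soft-thresholding of the residual D - L.
        R = D - L
        S = np.sign(R) * np.maximum(np.abs(R) - lam / mu, 0.0)
    return L, S
```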
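For the Losses over Labels entry above, the core idea of building a loss directly from labeling functions, rather than first aggregating their votes into a pseudo-label, can be sketched as below. The vote encoding (-1 for abstain) and the function name lol_loss are assumptions for illustration, not the paper's API.

```python
import numpy as np

def lol_loss(logits, lf_votes):
    """Toy Losses-over-Labels objective: accumulate a cross-entropy term
    toward each labeling function's vote wherever it fires, skipping the
    intermediate pseudo-label. logits: (n, c) model outputs; lf_votes:
    (n, m) with entries in {-1 (abstain), 0..c-1}."""
    z = logits - logits.max(axis=1, keepdims=True)          # stable softmax
    p = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    loss, count = 0.0, 0
    for j in range(lf_votes.shape[1]):
        fired = lf_votes[:, j] >= 0                          # where LF j voted
        if fired.any():
            loss -= np.log(p[fired, lf_votes[fired, j]] + 1e-12).sum()
            count += fired.sum()
    return loss / max(count, 1)
```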