Weakly Supervised Label Smoothing
- URL: http://arxiv.org/abs/2012.08575v1
- Date: Tue, 15 Dec 2020 19:36:52 GMT
- Title: Weakly Supervised Label Smoothing
- Authors: Gustavo Penha and Claudia Hauff
- Abstract summary: We study Label Smoothing (LS), a widely used regularization technique, in the context of neural learning to rank (L2R) models.
Inspired by our investigation of LS in the context of neural L2R models, we propose a novel technique called Weakly Supervised Label Smoothing (WSLS).
- Score: 15.05158252504978
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study Label Smoothing (LS), a widely used regularization technique, in the
context of neural learning to rank (L2R) models. LS combines the ground-truth
labels with a uniform distribution, encouraging the model to be less confident
in its predictions. We analyze the relationship between the non-relevant
documents (specifically, how they are sampled) and the effectiveness of LS,
discussing how LS may be capturing "hidden similarity knowledge" between the
relevant and non-relevant document classes. We further analyze LS by testing
whether a curriculum-learning approach, i.e., starting with LS and, after a
number of iterations, using only ground-truth labels, is beneficial. Inspired by our
investigation of LS in the context of neural L2R models, we propose a novel
technique called Weakly Supervised Label Smoothing (WSLS) that takes advantage
of the retrieval scores of the negative sampled documents as a weak supervision
signal in the process of modifying the ground-truth labels. WSLS is simple to
implement, requiring no modification to the neural ranker architecture. Our
experiments across three retrieval tasks (passage retrieval, similar question
retrieval, and conversation response ranking) show that WSLS for pointwise
BERT-based rankers leads to consistent effectiveness gains. The source code is
available at
https://anonymous.4open.science/r/dac85d48-6f71-4261-a7d8-040da6021c52/.
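As a rough illustration of the two labelling strategies described in the abstract, the sketch below contrasts standard LS for pointwise (binary) relevance labels with a WSLS-style variant that scales the smoothing mass of each sampled negative by its normalized retrieval score (e.g., its BM25 score). This is a minimal reading of the idea, not the authors' released implementation; the function names, the min-max normalization, and the smoothing factor epsilon are illustrative assumptions.

```python
import torch

def smooth_labels_uniform(labels: torch.Tensor, epsilon: float = 0.1) -> torch.Tensor:
    """Standard LS for binary (pointwise) relevance labels: mix the hard
    label with a uniform distribution over the two classes."""
    return labels * (1.0 - epsilon) + epsilon / 2.0

def smooth_labels_weakly_supervised(labels: torch.Tensor,
                                    retrieval_scores: torch.Tensor,
                                    epsilon: float = 0.1) -> torch.Tensor:
    """WSLS-style smoothing (sketch): the smoothing mass of each sampled
    negative is scaled by its normalized retrieval score, so negatives that
    the first-stage retriever ranked highly receive a slightly positive
    target, while clearly irrelevant ones stay close to 0."""
    # Min-max normalize the retrieval scores to [0, 1] (illustrative choice).
    norm = (retrieval_scores - retrieval_scores.min()) / (
        retrieval_scores.max() - retrieval_scores.min() + 1e-8)
    smoothed_neg = epsilon * norm                                # weak-supervision target for label 0
    smoothed_pos = torch.full_like(labels, 1.0 - epsilon / 2.0)  # same as uniform LS for label 1
    return torch.where(labels > 0.5, smoothed_pos, smoothed_neg)

# Toy batch: the first document is relevant; the three negatives were sampled
# by a first-stage retriever with the (hypothetical) scores below.
labels = torch.tensor([1.0, 0.0, 0.0, 0.0])
bm25 = torch.tensor([14.2, 9.7, 3.1, 0.4])

print(smooth_labels_uniform(labels))                  # [0.95, 0.05, 0.05, 0.05]
print(smooth_labels_weakly_supervised(labels, bm25))  # negatives weighted by retrieval score
```

The smoothed targets would then replace the hard labels in whatever pointwise loss the BERT-based ranker uses (e.g., binary cross-entropy), which is consistent with the claim above that WSLS requires no modification to the neural ranker architecture.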
Related papers
- Co-training for Low Resource Scientific Natural Language Inference [65.37685198688538]
We propose a novel co-training method that assigns weights to the distantly supervised labels based on the training dynamics of the classifiers.
By assigning importance weights instead of filtering out examples based on an arbitrary threshold on the predicted confidence, we maximize the usage of automatically labeled data.
The proposed method obtains an improvement of 1.5% in Macro F1 over the distant supervision baseline, and substantial improvements over several other strong SSL baselines.
arXiv Detail & Related papers (2024-06-20T18:35:47Z) - Linguistic Steganalysis via LLMs: Two Modes for Efficient Detection of Strongly Concealed Stego [6.99735992267331]
We design a novel LS method with two modes, called LSGC.
In the generation mode, we create an LS-task "description".
In the classification mode, LSGC deletes the LS-task "description" and uses the "causalLM" LLMs to extract steganographic features.
arXiv Detail & Related papers (2024-06-06T16:18:02Z) - Latent space configuration for improved generalization in supervised autoencoder neural networks [0.0]
We propose two methods for obtaining an LS with a desired topology, called LS configuration.
Knowing the LS configuration allows one to define a similarity measure in the LS to predict labels or estimate similarity for multiple inputs.
We show that SAE trained for clothes texture classification using the proposed method generalizes well to unseen data from LIP, Market1501, and WildTrack datasets without fine-tuning.
arXiv Detail & Related papers (2024-02-13T13:25:51Z) - OpenLDN: Learning to Discover Novel Classes for Open-World Semi-Supervised Learning [110.40285771431687]
Semi-supervised learning (SSL) is one of the dominant approaches to address the annotation bottleneck of supervised learning.
Recent SSL methods can effectively leverage a large repository of unlabeled data to improve performance while relying on a small set of labeled data.
This work introduces OpenLDN that utilizes a pairwise similarity loss to discover novel classes.
arXiv Detail & Related papers (2022-07-05T18:51:05Z) - ALASCA: Rethinking Label Smoothing for Deep Learning Under Label Noise [10.441880303257468]
We propose our framework, coined Adaptive LAbel smoothing on Sub-ClAssifier (ALASCA).
We derive that label smoothing (LS) incurs implicit Lipschitz regularization (LR).
Based on these derivations, we apply adaptive LS (ALS) on sub-classifier architectures for the practical application of adaptive LR on intermediate layers.
arXiv Detail & Related papers (2022-06-15T03:37:51Z) - L2B: Learning to Bootstrap Robust Models for Combating Label Noise [52.02335367411447]
This paper introduces a simple and effective method, named Learning to Bootstrap (L2B).
It enables models to bootstrap themselves using their own predictions without being adversely affected by erroneous pseudo-labels.
It achieves this by dynamically adjusting the importance weight between real observed and generated labels, as well as between different samples through meta-learning.
arXiv Detail & Related papers (2022-02-09T05:57:08Z) - Self-supervised Learning is More Robust to Dataset Imbalance [65.84339596595383]
We investigate self-supervised learning under dataset imbalance.
Off-the-shelf self-supervised representations are already more robust to class imbalance than supervised representations.
We devise a re-weighted regularization technique that consistently improves the SSL representation quality on imbalanced datasets.
arXiv Detail & Related papers (2021-10-11T06:29:56Z) - Trash to Treasure: Harvesting OOD Data with Cross-Modal Matching for Open-Set Semi-Supervised Learning [101.28281124670647]
Open-set semi-supervised learning (open-set SSL) investigates a challenging but practical scenario where out-of-distribution (OOD) samples are contained in the unlabeled data.
We propose a novel training mechanism that could effectively exploit the presence of OOD data for enhanced feature learning.
Our approach substantially lifts the performance on open-set SSL and outperforms the state-of-the-art by a large margin.
arXiv Detail & Related papers (2021-08-12T09:14:44Z) - Understanding (Generalized) Label Smoothing when Learning with Noisy Labels [57.37057235894054]
Label smoothing (LS) is an emerging learning paradigm that uses a positively weighted average of the hard training labels and uniformly distributed soft labels.
We provide an understanding of the properties of generalized label smoothing (GLS) when learning with noisy labels.
arXiv Detail & Related papers (2021-06-08T07:32:29Z) - Regularization via Adaptive Pairwise Label Smoothing [19.252319300590653]
This paper introduces a novel label smoothing technique called Pairwise Label Smoothing (PLS).
Unlike current LS methods, which typically require finding a global smoothing distribution mass through a cross-validation search, PLS automatically learns the distribution mass for each input pair during training.
We empirically show that PLS significantly outperforms LS and the baseline models, achieving up to 30% relative reduction in classification error.
arXiv Detail & Related papers (2020-12-02T22:08:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.