UNFUSED: UNsupervised Finetuning Using SElf supervised Distillation
- URL: http://arxiv.org/abs/2303.05668v2
- Date: Thu, 18 May 2023 01:28:01 GMT
- Title: UNFUSED: UNsupervised Finetuning Using SElf supervised Distillation
- Authors: Ashish Seth and Sreyan Ghosh and S. Umesh and Dinesh Manocha
- Abstract summary: We introduce UnFuSeD, a novel approach to leverage self-supervised learning for audio classification.
We use the encoder to generate pseudo-labels for unsupervised fine-tuning before the actual fine-tuning step.
UnFuSeD achieves state-of-the-art results on the LAPE Benchmark, significantly outperforming all our baselines.
- Score: 53.06337011259031
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we introduce UnFuSeD, a novel approach to leverage
self-supervised learning and reduce the need for large amounts of labeled data
for audio classification. Unlike prior works, which directly fine-tune a
self-supervised pre-trained encoder on a target dataset, we use the encoder to
generate pseudo-labels for unsupervised fine-tuning before the actual
fine-tuning step. We first train an encoder using a novel self-supervised
learning (SSL) algorithm on an unlabeled audio dataset. Then, we use that
encoder to generate pseudo-labels on our target task dataset via clustering the
extracted representations. These pseudo-labels are then used to guide
self-distillation on a randomly initialized model, which we call unsupervised
fine-tuning. Finally, the resultant encoder is fine-tuned on our target task
dataset. Through UnFuSeD, we propose the first system that moves away from
generic SSL paradigms in the literature, which pre-train and fine-tune the same
encoder, and present a novel self-distillation-based system to leverage SSL
pre-training for low-resource audio classification. In practice, UnFuSeD
achieves state-of-the-art results on the LAPE Benchmark, significantly
outperforming all our baselines. Additionally, UnFuSeD allows us to achieve
this with a 40% reduction in the number of parameters over the previous
state-of-the-art system. We make all our code publicly available.
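The abstract describes a four-step pipeline. A minimal sketch of that recipe follows, in PyTorch with scikit-learn; the architectures, cluster count, optimizer settings, and the hard-label form of the distillation step are illustrative assumptions, not the authors' released code.
```python
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

@torch.no_grad()
def cluster_pseudo_labels(ssl_encoder: nn.Module, feature_loader, n_clusters: int = 50):
    """Steps 1-2: embed target-task audio with the SSL-pre-trained encoder,
    cluster the embeddings, and use the cluster ids as pseudo-labels.
    Assumes feature_loader yields feature batches in a fixed order."""
    ssl_encoder.eval()
    embeddings = torch.cat([ssl_encoder(x) for x in feature_loader])
    ids = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(embeddings.cpu().numpy())
    return torch.as_tensor(ids, dtype=torch.long)

def train_classifier(encoder: nn.Module, head: nn.Module, loader, epochs: int = 5, lr: float = 1e-3):
    """One loop serves twice: step 3 trains a randomly initialized student on
    (audio, pseudo-label) pairs ('unsupervised fine-tuning'); step 4 reuses it
    on the real (audio, label) pairs with a fresh classification head."""
    opt = torch.optim.Adam([*encoder.parameters(), *head.parameters()], lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    encoder.train()
    head.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(head(encoder(x)), y).backward()
            opt.step()
```
Note that the student in step 3 starts from random weights rather than the pre-trained checkpoint, which is the abstract's key departure from the usual pre-train-then-fine-tune recipe; the paper's self-distillation may also use soft teacher targets rather than the hard cluster ids assumed here.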
Related papers
- Co-training for Low Resource Scientific Natural Language Inference [65.37685198688538]
We propose a novel co-training method that assigns weights to the distantly supervised labels based on the training dynamics of the classifiers.
By assigning importance weights instead of filtering out examples based on an arbitrary confidence threshold, we maximize the usage of automatically labeled data.
The proposed method obtains an improvement of 1.5% in Macro F1 over the distant supervision baseline, and substantial improvements over several other strong SSL baselines; a minimal sketch of the weighting idea follows below.
arXiv Detail & Related papers (2024-06-20T18:35:47Z) - Improved Out-of-Scope Intent Classification with Dual Encoding and Threshold-based Re-Classification [6.975902383951604]
- Improved Out-of-Scope Intent Classification with Dual Encoding and Threshold-based Re-Classification [6.975902383951604]
Current methodologies face difficulties with the unpredictable distribution of outliers.
We present the Dual Encoder for Threshold-Based Re-Classification (DETER) framework to address these challenges.
Our model outperforms previous benchmarks, improving the F1 score by up to 13% for known intents and 5% for unknown intents; a generic sketch of the threshold step follows below.
arXiv Detail & Related papers (2024-05-30T11:46:42Z) - Self-Taught Recognizer: Toward Unsupervised Adaptation for Speech Foundation Models [84.8919069953397]
- Self-Taught Recognizer: Toward Unsupervised Adaptation for Speech Foundation Models [84.8919069953397]
Self-TAught Recognizer (STAR) is an unsupervised adaptation framework for speech recognition systems.
We show that STAR achieves an average of 13.5% relative reduction in word error rate across 14 target domains.
STAR is highly data-efficient, requiring less than one hour of unlabeled data.
arXiv Detail & Related papers (2024-05-23T04:27:11Z) - Neural Networks Against (and For) Self-Training: Classification with
Small Labeled and Large Unlabeled Sets [11.385682758047775]
One of the weaknesses of self-training is the semantic drift problem.
We reshape the role of pseudo-labels and create a hierarchical order of information.
A crucial step in self-training is to use the confidence prediction to select the best candidate pseudo-labels; a generic sketch of this selection step follows below.
arXiv Detail & Related papers (2023-12-31T19:25:34Z) - Revisit Few-shot Intent Classification with PLMs: Direct Fine-tuning vs. Continual Pre-training [20.98770732015944]
- Revisit Few-shot Intent Classification with PLMs: Direct Fine-tuning vs. Continual Pre-training [20.98770732015944]
Few-shot intent detection involves training a deep learning model to classify utterances based on their underlying intents using only a small amount of labeled data.
We show that continual pre-training may not be essential, since the overfitting problem of PLMs on this task may not be as serious as expected.
To maximize the utilization of the limited available data, we propose a context augmentation method and leverage sequential self-distillation to boost performance; a sketch of the distillation rounds follows below.
arXiv Detail & Related papers (2023-06-08T15:26:52Z) - ProtoCon: Pseudo-label Refinement via Online Clustering and Prototypical
- ProtoCon: Pseudo-label Refinement via Online Clustering and Prototypical Consistency for Efficient Semi-supervised Learning [60.57998388590556]
ProtoCon is a novel method for confidence-based pseudo-labeling.
The online nature of ProtoCon allows it to utilise the label history of the entire dataset in one training cycle.
It delivers significant gains and faster convergence over the state of the art across datasets; a sketch of the prototype refinement follows below.
arXiv Detail & Related papers (2023-03-22T23:51:54Z) - SLICER: Learning universal audio representations using low-resource
- SLICER: Learning universal audio representations using low-resource self-supervised pre-training [53.06337011259031]
We present a new self-supervised learning approach to pre-train encoders on unlabeled audio data.
Our primary aim is to learn audio representations that can generalize across a large variety of speech and non-speech tasks.
arXiv Detail & Related papers (2022-11-02T23:45:33Z) - Uncertainty-aware Self-training for Text Classification with Few Labels [54.13279574908808]
We study self-training as one of the earliest semi-supervised learning approaches to reduce the annotation bottleneck.
We propose an approach to improve self-training by incorporating uncertainty estimates of the underlying neural network.
We show that our methods, leveraging only 20-30 labeled samples per class per task for training and validation, perform within 3% of fully supervised pre-trained language models; a sketch of one common uncertainty estimate follows below.
arXiv Detail & Related papers (2020-06-27T08:13:58Z)
This list was automatically generated from the titles and abstracts of the papers on this site.