Federated Self-Training for Semi-Supervised Audio Recognition
- URL: http://arxiv.org/abs/2107.06877v1
- Date: Wed, 14 Jul 2021 17:40:10 GMT
- Title: Federated Self-Training for Semi-Supervised Audio Recognition
- Authors: Vasileios Tsouvalas, Aaqib Saeed, Tanir Ozcelebi
- Abstract summary: In this work, we study the problem of semi-supervised learning of audio models via self-training.
We propose FedSTAR to exploit large-scale on-device unlabeled data to improve the generalization of audio recognition models.
- Score: 0.23633885460047763
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated Learning is a distributed machine learning paradigm dealing with
decentralized and personal datasets. Since data reside on devices like
smartphones and virtual assistants, labeling is entrusted to the clients, or
labels are extracted in an automated way. Specifically, in the case of audio
data, acquiring semantic annotations can be prohibitively expensive and
time-consuming. As a result, an abundance of audio data remains unlabeled and
unexploited on users' devices. Most existing federated learning approaches
focus on supervised learning without harnessing the unlabeled data. In this
work, we study the problem of semi-supervised learning of audio models via
self-training in conjunction with federated learning. We propose FedSTAR to
exploit large-scale on-device unlabeled data to improve the generalization of
audio recognition models. We further demonstrate that self-supervised
pre-trained models can accelerate the training of on-device models, enabling
convergence in significantly fewer training rounds. We conduct
experiments on diverse public audio classification datasets and investigate the
performance of our models under varying percentages of labeled and unlabeled
data. Notably, we show that with as little as 3% labeled data available,
FedSTAR on average can improve the recognition rate by 13.28% compared to the
fully supervised federated model.
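The abstract describes the approach only at a high level; as a rough illustration (not the authors' implementation), the Python sketch below shows how a single client might combine a supervised loss on its few labeled clips with a confidence-thresholded pseudo-label (self-training) loss on its unlabeled audio during one local federated round. The function name, the fixed 0.9 threshold, and the PyTorch setup are all illustrative assumptions.

```python
# Minimal sketch of one client's local update in federated self-training.
# Assumptions (not taken from the paper): a PyTorch audio classifier, a fixed
# confidence threshold for accepting pseudo-labels, and FedAvg-style
# aggregation of the returned weights on the server side.
import torch
import torch.nn.functional as F

def local_self_training_step(model, optimizer, labeled_batch, unlabeled_batch,
                             confidence_threshold=0.9):
    """One local step: supervised loss + pseudo-label loss on unlabeled audio."""
    x_l, y_l = labeled_batch   # the client's few labeled examples
    x_u = unlabeled_batch      # the client's abundant unlabeled audio

    # Supervised loss on the small labeled set.
    sup_loss = F.cross_entropy(model(x_l), y_l)

    # Self-training: pseudo-label only confident predictions on unlabeled data.
    with torch.no_grad():
        probs = F.softmax(model(x_u), dim=1)
        conf, pseudo_y = probs.max(dim=1)
        mask = conf >= confidence_threshold
    if mask.any():
        unsup_loss = F.cross_entropy(model(x_u[mask]), pseudo_y[mask])
    else:
        unsup_loss = torch.zeros((), device=x_u.device)

    loss = sup_loss + unsup_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

A server would then aggregate the returned client models (e.g., with Federated Averaging); per the abstract, initializing clients from a self-supervised pre-trained encoder further reduces the number of rounds needed to converge.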
Related papers
- Incremental Self-training for Semi-supervised Learning [56.57057576885672]
IST is simple yet effective and fits existing self-training-based semi-supervised learning methods.
We verify the proposed IST on five datasets and two types of backbone, effectively improving the recognition accuracy and learning speed.
arXiv Detail & Related papers (2024-04-14T05:02:00Z)
- Federated Representation Learning for Automatic Speech Recognition [20.641076546330986]
Federated Learning (FL) is a privacy-preserving paradigm, allowing edge devices to learn collaboratively without sharing data.
We bring Self-supervised Learning (SSL) and FL together to learn representations for Automatic Speech Recognition respecting data privacy constraints.
We show that the ASR encoder pre-trained in FL performs as well as a centrally pre-trained model and yields a 12-15% improvement in word error rate (WER) compared to no pre-training.
arXiv Detail & Related papers (2023-08-03T20:08:23Z)
- On-the-fly Denoising for Data Augmentation in Natural Language Understanding [101.46848743193358]
We propose an on-the-fly denoising technique for data augmentation that learns from soft augmented labels provided by an organic teacher model trained on the cleaner original data.
Our method can be applied to general augmentation techniques and consistently improve the performance on both text classification and question-answering tasks.
arXiv Detail & Related papers (2022-12-20T18:58:33Z)
- Federated Self-Supervised Learning in Heterogeneous Settings: Limits of a Baseline Approach on HAR [0.5039813366558306]
We show that a standard lightweight autoencoder combined with standard Federated Averaging fails to learn a robust representation for Human Activity Recognition.
These findings advocate for a more intensive research effort in Federated Self-Supervised Learning.
arXiv Detail & Related papers (2022-07-17T14:15:45Z)
- Label-Efficient Self-Supervised Speaker Verification With Information Maximization and Contrastive Learning [0.0]
We explore self-supervised learning for speaker verification by learning representations directly from raw audio.
Our approach is based on recent information maximization learning frameworks and an intensive data pre-processing step.
arXiv Detail & Related papers (2022-07-12T13:01:55Z)
- FedNST: Federated Noisy Student Training for Automatic Speech Recognition [8.277567852741242]
Federated Learning (FL) enables training state-of-the-art Automatic Speech Recognition (ASR) models on user devices (clients) in distributed systems.
A key challenge facing the practical adoption of FL for ASR is obtaining ground-truth labels on the clients.
A promising alternative is using semi-/self-supervised learning approaches to leverage unlabelled user data.
arXiv Detail & Related papers (2022-06-06T16:18:45Z)
- Boosting Facial Expression Recognition by A Semi-Supervised Progressive Teacher [54.50747989860957]
We propose a semi-supervised learning algorithm named Progressive Teacher (PT) to utilize reliable FER datasets as well as large-scale unlabeled expression images for effective training.
Experiments on widely-used databases RAF-DB and FERPlus validate the effectiveness of our method, which achieves state-of-the-art performance with accuracy of 89.57% on RAF-DB.
arXiv Detail & Related papers (2022-05-28T07:47:53Z)
- Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training [66.80558875393565]
We study the problem of training named entity recognition (NER) models using only distantly-labeled data.
We propose a noise-robust learning scheme comprised of a new loss function and a noisy label removal step.
Our method achieves superior performance, outperforming existing distantly-supervised NER models by significant margins.
arXiv Detail & Related papers (2021-09-10T17:19:56Z)
- Federated Self-Supervised Learning of Multi-Sensor Representations for Embedded Intelligence [8.110949636804772]
Smartphones, wearables, and Internet of Things (IoT) devices produce a wealth of data that cannot be accumulated in a centralized repository for learning supervised models.
We propose a self-supervised approach termed scalogram-signal correspondence learning based on the wavelet transform to learn useful representations from unlabeled sensor inputs.
We extensively assess the quality of learned features with our multi-view strategy on diverse public datasets, achieving strong performance in all domains.
arXiv Detail & Related papers (2020-07-25T21:59:17Z)
- Uncertainty-aware Self-training for Text Classification with Few Labels [54.13279574908808]
We study self-training as one of the earliest semi-supervised learning approaches to reduce the annotation bottleneck.
We propose an approach to improve self-training by incorporating uncertainty estimates of the underlying neural network.
We show that our methods, using only 20-30 labeled samples per class per task for training and validation, perform within 3% of fully supervised pre-trained language models.
arXiv Detail & Related papers (2020-06-27T08:13:58Z)
- Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
We experimentally verify that a dataset constructed in this way can significantly improve the ability of the learned FER model.
To keep training on the enlarged dataset affordable, we apply a dataset distillation strategy that compresses it into several informative class-wise images.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.