Pretext Tasks selection for multitask self-supervised speech representation learning
- URL: http://arxiv.org/abs/2107.00594v1
- Date: Thu, 1 Jul 2021 16:36:29 GMT
- Title: Pretext Tasks selection for multitask self-supervised speech representation learning
- Authors: Salah Zaiem, Titouan Parcollet and Slim Essid
- Abstract summary: This paper introduces a method to select a group of pretext tasks among a set of candidates.
Experiments conducted on speaker recognition and automatic speech recognition validate our approach.
- Score: 23.39079406674442
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Through solving pretext tasks, self-supervised learning leverages unlabeled
data to extract useful latent representations replacing traditional input
features in the downstream task. In various application domains, including
computer vision, natural language processing and audio/speech signal
processing, a wide range of features were engineered through decades of
research efforts. As it turns out, learning to predict such features has proven
to be a particularly relevant pretext task leading to building useful
self-supervised representations that prove to be effective for downstream
tasks. However, methods and common practices for combining such pretext
tasks, where each task targets a different group of features to improve
performance on the downstream task, have not been properly explored and
understood. In fact, the process relies almost exclusively on a
computationally heavy experimental procedure, which becomes intractable as
the number of pretext tasks grows. This paper introduces a method to select
a group of pretext tasks among
a set of candidates. The method we propose estimates properly calibrated
weights for the partial losses corresponding to the considered pretext tasks
during the self-supervised training process. The experiments conducted on
speaker recognition and automatic speech recognition validate our approach, as
the groups selected and weighted with our method perform better than classic
baselines, thus facilitating the selection and combination of relevant
pseudo-labels for self-supervised representation learning.
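As a rough illustration of the setup described in the abstract, the sketch below assumes a PyTorch-style shared speech encoder with one regression head per pretext pseudo-label and a weighted sum of the partial losses. The pseudo-label names, dimensions, and the fixed `task_weights` values are placeholders standing in for the calibrated weights the paper estimates; this is not the authors' code.

```python
import torch
import torch.nn as nn

class MultiPretextModel(nn.Module):
    """Shared encoder with one small regression head per pretext pseudo-label."""
    def __init__(self, feat_dim, hidden_dim, pretext_dims):
        super().__init__()
        self.encoder = nn.GRU(feat_dim, hidden_dim, batch_first=True)
        self.heads = nn.ModuleDict(
            {name: nn.Linear(hidden_dim, dim) for name, dim in pretext_dims.items()}
        )

    def forward(self, x):
        latent, _ = self.encoder(x)  # (batch, time, hidden_dim)
        return latent, {name: head(latent) for name, head in self.heads.items()}

# Hypothetical pretext pseudo-labels and output dimensions (placeholders).
pretext_dims = {"f0": 1, "mfcc": 13, "voicing": 1}
# Weights for the partial losses; the paper estimates calibrated values,
# here they are fixed placeholders.
task_weights = {"f0": 0.5, "mfcc": 1.0, "voicing": 0.2}

model = MultiPretextModel(feat_dim=80, hidden_dim=256, pretext_dims=pretext_dims)
criterion = nn.L1Loss()

def training_step(features, pseudo_labels):
    """features: (batch, time, 80); pseudo_labels: dict of (batch, time, dim) targets."""
    _, predictions = model(features)
    # Total self-supervised loss = weighted sum of the per-task (partial) losses.
    return sum(
        task_weights[name] * criterion(predictions[name], pseudo_labels[name])
        for name in predictions
    )

# Example usage with random tensors standing in for real features and pseudo-labels.
feats = torch.randn(4, 100, 80)
targets = {name: torch.randn(4, 100, dim) for name, dim in pretext_dims.items()}
loss = training_step(feats, targets)
loss.backward()
```

After self-supervised training, the encoder's latent representation would replace handcrafted input features in the downstream speaker recognition or ASR model.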
Related papers
- Semantic Prompting with Image-Token for Continual Learning [7.5140668729696145]
I-Prompt is a task-agnostic approach to eliminate task prediction.
Our method achieves competitive performance on four benchmarks.
We demonstrate the superiority of our method across various scenarios through extensive experiments.
arXiv Detail & Related papers (2024-03-18T07:43:14Z)
- Self-Supervised Speech Representation Learning: A Review [105.1545308184483]
Self-supervised representation learning methods promise a single universal model that would benefit a wide variety of tasks and domains.
Speech representation learning is experiencing similar progress in three main categories: generative, contrastive, and predictive methods.
This review presents approaches for self-supervised speech representation learning and their connection to other research areas.
arXiv Detail & Related papers (2022-05-21T16:52:57Z)
- Active Multi-Task Representation Learning [50.13453053304159]
We give the first formal study on resource task sampling by leveraging the techniques from active learning.
We propose an algorithm that iteratively estimates the relevance of each source task to the target task and samples from each source task based on the estimated relevance.
arXiv Detail & Related papers (2022-02-02T08:23:24Z)
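A hedged sketch of the relevance-based sampling loop summarized in the entry above, assuming batches are allocated in proportion to estimated relevance; the task names, initial scores, and the noisy update at the end are illustrative placeholders, not the paper's estimator.

```python
import random

def allocate_samples(relevance, budget):
    """Split a per-step sample budget across source tasks proportionally to relevance."""
    total = sum(relevance.values())
    return {task: max(1, round(budget * score / total)) for task, score in relevance.items()}

# Illustrative values only: three hypothetical source tasks, uniform initial relevance.
relevance = {"task_a": 1.0, "task_b": 1.0, "task_c": 1.0}

for step in range(5):
    allocation = allocate_samples(relevance, budget=32)
    # ... train the shared representation on `allocation[task]` samples per task ...
    # Placeholder relevance update: a real estimator would measure how much each
    # source task helps the target task; here we only add illustrative noise.
    relevance = {task: max(0.1, score + random.uniform(-0.1, 0.1))
                 for task, score in relevance.items()}

print(allocate_samples(relevance, budget=32))
```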
- Transfer Learning in Conversational Analysis through Reusing Preprocessing Data as Supervisors [52.37504333689262]
Using noisy labels in single-task learning increases the risk of over-fitting.
Auxiliary tasks could improve the performance of the primary task when learned during the same training.
arXiv Detail & Related papers (2021-12-02T08:40:42Z)
- Weighted Training for Cross-Task Learning [71.94908559469475]
We introduce Target-Aware Weighted Training (TAWT), a weighted training algorithm for cross-task learning.
We show that TAWT is easy to implement, is computationally efficient, requires little hyperparameter tuning, and enjoys non-asymptotic learning-theoretic guarantees.
As a byproduct, the proposed representation-based task distance allows one to reason in a theoretically principled way about several critical aspects of cross-task learning.
arXiv Detail & Related papers (2021-05-28T20:27:02Z)
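A rough illustration of the weighted-training idea summarized in the entry above, not the TAWT algorithm itself: source-task losses are weighted by a softmax over a hypothetical representation-based distance to the target task. The task names and distance values are invented for illustration.

```python
import math

def weights_from_task_distance(distances, temperature=1.0):
    """Map representation-based task distances to training weights via softmax(-d/T).

    The distances and the softmax mapping below are illustrative assumptions;
    TAWT's actual distance and weighting rule are defined in the paper.
    """
    scores = {task: math.exp(-d / temperature) for task, d in distances.items()}
    total = sum(scores.values())
    return {task: s / total for task, s in scores.items()}

# Hypothetical distances from each source task's representation to the target task's.
distances = {"source_task_1": 0.4, "source_task_2": 0.9, "source_task_3": 0.2}
weights = weights_from_task_distance(distances)

# The weighted cross-task objective would then be a weighted sum of per-task losses:
# total_loss = sum(weights[t] * loss[t] for t in source_tasks)
print(weights)
```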
- Reciprocal Feature Learning via Explicit and Implicit Tasks in Scene Text Recognition [60.36540008537054]
In this work, we excavate the implicit task of character counting within traditional text recognition, without additional annotation cost.
We design a two-branch reciprocal feature learning framework in order to adequately utilize the features from both tasks.
Experiments on 7 benchmarks show the advantages of the proposed methods in both text recognition and the newly built character counting task.
arXiv Detail & Related papers (2021-05-13T12:27:35Z)
- Conditional independence for pretext task selection in Self-supervised speech representation learning [23.39079406674442]
Self-supervised learning (SSL) leverages unlabeled data to extract useful latent representations replacing traditional input features in the downstream task.
A common pretext task consists in pretraining an SSL model on pseudo-labels derived from the original signal.
This paper introduces a practical and theoretical framework to select relevant pseudo-labels with respect to a given downstream task.
arXiv Detail & Related papers (2021-04-15T11:32:59Z)
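A toy sketch of the selection idea summarized in the entry above: score each candidate pseudo-label against the downstream labels and keep the highest-scoring ones. The mutual-information proxy, pseudo-label names, and toy data below are illustrative stand-ins, not the conditional-independence estimator used in the paper.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

def rank_pseudo_labels(pseudo_labels, downstream_labels, top_k=2):
    """Rank candidate pseudo-labels by a mutual-information proxy with the downstream labels.

    pseudo_labels: dict of name -> (n_utterances,) utterance-level pseudo-label values
    downstream_labels: (n_utterances,) downstream class labels (e.g. speaker ids)
    """
    scores = {}
    for name, values in pseudo_labels.items():
        mi = mutual_info_classif(values.reshape(-1, 1), downstream_labels, random_state=0)
        scores[name] = float(mi[0])
    return sorted(scores, key=scores.get, reverse=True)[:top_k], scores

# Toy data: 200 utterances, 3 candidate pseudo-labels, 4 downstream classes.
rng = np.random.default_rng(0)
y = rng.integers(0, 4, size=200)
pseudo_labels = {
    "f0_mean":  y + rng.normal(0, 0.5, size=200),   # informative about y
    "loudness": rng.normal(0, 1.0, size=200),       # uninformative
    "rate":     0.5 * y + rng.normal(0, 1.0, size=200),
}
selected, scores = rank_pseudo_labels(pseudo_labels, y)
print(selected, scores)
```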
- Adaptive Task Sampling for Meta-Learning [79.61146834134459]
The key idea of meta-learning for few-shot classification is to mimic the few-shot situations faced at test time.
We propose an adaptive task sampling method to improve the generalization performance.
arXiv Detail & Related papers (2020-07-17T03:15:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.