HuBERTopic: Enhancing Semantic Representation of HuBERT through
Self-supervision Utilizing Topic Model
- URL: http://arxiv.org/abs/2310.03975v1
- Date: Fri, 6 Oct 2023 02:19:09 GMT
- Title: HuBERTopic: Enhancing Semantic Representation of HuBERT through
Self-supervision Utilizing Topic Model
- Authors: Takashi Maekaku, Jiatong Shi, Xuankai Chang, Yuya Fujita, Shinji
Watanabe
- Abstract summary: We propose a new approach to enrich the semantic representation of HuBERT.
An auxiliary topic classification task is added to HuBERT by using topic labels as teachers.
Experimental results demonstrate that our method achieves comparable or better performance than the baseline in most tasks.
- Score: 62.995175485416
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, the usefulness of self-supervised representation learning (SSRL)
methods has been confirmed in various downstream tasks. Many of these models,
as exemplified by HuBERT and WavLM, use pseudo-labels generated from spectral
features or the model's own representation features. From previous studies, it
is known that the pseudo-labels contain semantic information. However, the
masked prediction task, the learning criterion of HuBERT, focuses on local
contextual information and may not make effective use of global semantic
information such as speaker, theme of speech, and so on. In this paper, we
propose a new approach to enrich the semantic representation of HuBERT. We
apply a topic model to the pseudo-labels to generate a topic label for each
utterance. An auxiliary topic classification task is added to HuBERT by using
topic labels as teachers. This allows additional global semantic information to
be incorporated in an unsupervised manner. Experimental results demonstrate
that our method achieves comparable or better performance than the baseline in
most tasks, including automatic speech recognition and five out of the eight
SUPERB tasks. Moreover, we find that topic labels include various information
about utterance, such as gender, speaker, and its theme. This highlights the
effectiveness of our approach in capturing multifaceted semantic nuances.
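The topic-labeling step described above can be sketched as follows. This is a hypothetical illustration, not the authors' code: each utterance's discrete pseudo-label sequence (e.g. k-means cluster IDs) is treated as a "document" whose "words" are cluster IDs, an LDA topic model is fit over these documents, and the argmax topic becomes the utterance-level label used as the teacher for the auxiliary classification task. The cluster IDs, corpus, and hyperparameters here are made up for the example.

```python
# Hypothetical sketch: derive utterance-level topic labels from discrete
# pseudo-labels with LDA, then use them as auxiliary classification targets.
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Each utterance is a sequence of pseudo-label cluster IDs (illustrative data;
# in HuBERT these would come from k-means over MFCC or model features).
utterances = [
    [3, 3, 17, 42, 42, 42, 8],
    [8, 8, 8, 17, 3],
    [42, 42, 42, 42, 17],
]

# Treat each utterance as a bag-of-words document over cluster-ID "tokens".
docs = [" ".join(f"u{c}" for c in seq) for seq in utterances]
counts = CountVectorizer(token_pattern=r"u\d+").fit_transform(docs)

# Fit LDA; the per-document topic distribution gives one label per utterance.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
topic_dist = lda.fit_transform(counts)    # shape: (n_utterances, n_topics)
topic_labels = topic_dist.argmax(axis=1)  # teacher labels for the aux. task

print(topic_labels.shape)  # (3,)
```

During pre-training, these labels would supervise an added classification head with a cross-entropy term alongside HuBERT's masked-prediction loss; since both the pseudo-labels and the topic model are unsupervised, no manual annotation is introduced.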
Related papers
- Data-free Multi-label Image Recognition via LLM-powered Prompt Tuning [23.671999163027284]
This paper proposes a novel framework for multi-label image recognition without any training data.
It uses knowledge from a pre-trained Large Language Model to learn prompts that adapt a pre-trained Vision-Language Model, such as CLIP, to multi-label classification.
Our framework presents a new way to explore the synergies between multiple pre-trained models for novel category recognition.
arXiv Detail & Related papers (2024-03-02T13:43:32Z)
- KMF: Knowledge-Aware Multi-Faceted Representation Learning for Zero-Shot Node Classification [75.95647590619929]
Zero-Shot Node Classification (ZNC) has been an emerging and crucial task in graph data analysis.
We propose a Knowledge-Aware Multi-Faceted framework (KMF) that enhances the richness of label semantics.
A novel geometric constraint is developed to alleviate the problem of prototype drift caused by node information aggregation.
arXiv Detail & Related papers (2023-08-15T02:38:08Z)
- Description-Enhanced Label Embedding Contrastive Learning for Text Classification [65.01077813330559]
It introduces Self-Supervised Learning (SSL) into the model learning process and designs a novel self-supervised Relation of Relation (R2) classification task.
It proposes a Relation of Relation Learning Network (R2-Net) for text classification, in which text classification and R2 classification are treated as joint optimization targets.
External knowledge from WordNet is used to obtain multi-aspect descriptions for label semantic learning.
arXiv Detail & Related papers (2023-06-15T02:19:34Z)
- Label Aware Speech Representation Learning For Language Identification [49.197215416945596]
We propose a novel framework of combining self-supervised representation learning with the language label information for the pre-training task.
This framework, termed Label Aware Speech Representation (LASR) learning, uses a triplet-based objective function to incorporate language labels alongside the self-supervised loss function.
arXiv Detail & Related papers (2023-06-07T12:14:16Z)
- Multi-layered Semantic Representation Network for Multi-label Image Classification [8.17894017454724]
Multi-label image classification (MLIC) is a fundamental and practical task, which aims to assign multiple possible labels to an image.
In recent years, many deep convolutional neural network (CNN) based approaches have been proposed which model label correlations.
This paper advances this research direction by improving the modeling of label correlations and the learning of semantic representations.
arXiv Detail & Related papers (2021-06-22T08:04:22Z)
- LTIatCMU at SemEval-2020 Task 11: Incorporating Multi-Level Features for Multi-Granular Propaganda Span Identification [70.1903083747775]
This paper describes our submission for the task of Propaganda Span Identification in news articles.
We introduce a BERT-BiLSTM based span-level propaganda classification model that identifies which token spans within the sentence are indicative of propaganda.
arXiv Detail & Related papers (2020-08-11T16:14:47Z)
- Hierarchical Image Classification using Entailment Cone Embeddings [68.82490011036263]
We first inject label-hierarchy knowledge into an arbitrary CNN-based classifier.
We empirically show that availability of such external semantic information in conjunction with the visual semantics from images boosts overall performance.
arXiv Detail & Related papers (2020-04-02T10:22:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.