Speaker Information Can Guide Models to Better Inductive Biases: A Case Study On Predicting Code-Switching
- URL: http://arxiv.org/abs/2203.08979v1
- Date: Wed, 16 Mar 2022 22:56:58 GMT
- Title: Speaker Information Can Guide Models to Better Inductive Biases: A Case Study On Predicting Code-Switching
- Authors: Alissa Ostapenko, Shuly Wintner, Melinda Fricke, Yulia Tsvetkov
- Abstract summary: We show that adding sociolinguistically-grounded speaker features as prepended prompts significantly improves accuracy.
To our knowledge, we are the first to incorporate speaker characteristics in a neural model for code-switching.
- Score: 27.68274308680201
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Natural language processing (NLP) models trained on people-generated data can
be unreliable because, without any constraints, they can learn from spurious
correlations that are not relevant to the task. We hypothesize that enriching
models with speaker information in a controlled, educated way can guide them to
pick up on relevant inductive biases. For the speaker-driven task of predicting
code-switching points in English--Spanish bilingual dialogues, we show that
adding sociolinguistically-grounded speaker features as prepended prompts
significantly improves accuracy. We find that by adding influential phrases to
the input, speaker-informed models learn useful and explainable linguistic
information. To our knowledge, we are the first to incorporate speaker
characteristics in a neural model for code-switching, and more generally, take
a step towards developing transparent, personalized models that use speaker
information in a controlled way.
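The mechanism described in the abstract, serializing speaker attributes into a natural-language prompt prepended to the dialogue input, can be illustrated briefly. Below is a minimal sketch assuming a Hugging Face sequence classifier with a binary switch/no-switch label set; the feature names, base model, and prompt wording are illustrative assumptions, not the authors' exact setup.

```python
# Minimal sketch: prepend a speaker-feature prompt to a bilingual
# utterance before classification. Feature names, base model, and the
# binary switch/no-switch label set are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=2
)

def build_input(utterance: str, speaker: dict) -> str:
    # Serialize hypothetical sociolinguistic features into a textual prompt.
    return (
        f"speaker: L1 {speaker['l1']}, dominant language "
        f"{speaker['dominant']}, {speaker['origin']}. {utterance}"
    )

speaker = {"l1": "Spanish", "dominant": "English", "origin": "born in the US"}
text = build_input("I told her que no puedo ir", speaker)

inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
print("P(switch) =", torch.softmax(logits, dim=-1)[0, 1].item())
```

Encoding the features as text, rather than as extra embedding inputs, keeps them in the modality the pretrained encoder already handles, which is plausibly why prompting is an effective injection point.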
Related papers
- Learning Phonotactics from Linguistic Informants [54.086544221761486]
Our model iteratively selects or synthesizes a data-point according to one of a range of information-theoretic policies.
We find that the information-theoretic policies that our model uses to select items to query the informant achieve sample efficiency comparable to, or greater than, fully supervised approaches.
arXiv Detail & Related papers (2024-05-08T00:18:56Z)
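As a generic illustration of the information-theoretic policies mentioned in the entry above, the sketch below scores candidate query items by expected information gain over a toy set of phonotactic hypotheses; the grammars and candidate forms are invented for illustration and are not the paper's actual model.

```python
# Toy expected-information-gain query selection over a discrete set of
# phonotactic hypotheses; grammars and candidate forms are invented.
import math

def entropy(posterior):
    return -sum(p * math.log2(p) for p in posterior.values() if p > 0)

def expected_info_gain(candidate, hypotheses, posterior):
    # Each hypothesis deterministically accepts or rejects a form, so the
    # informant's answer is a function of which hypothesis is true.
    gain = 0.0
    for answer in (True, False):
        p_answer = sum(p for h, p in posterior.items()
                       if hypotheses[h](candidate) == answer)
        if p_answer == 0:
            continue
        updated = {h: (p / p_answer if hypotheses[h](candidate) == answer else 0.0)
                   for h, p in posterior.items()}
        gain += p_answer * (entropy(posterior) - entropy(updated))
    return gain

# Which word-initial clusters are licit? Three toy grammars.
hypotheses = {
    "no_clusters":     lambda w: not w.startswith(("bl", "br", "tl")),
    "liquid_clusters": lambda w: not w.startswith("tl"),
    "all_clusters":    lambda w: True,
}
posterior = {h: 1 / 3 for h in hypotheses}
candidates = ["blik", "tlik", "mika"]
best = max(candidates, key=lambda c: expected_info_gain(c, hypotheses, posterior))
print("query the informant about:", best)  # an uninformative item scores 0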
- Multilingual self-supervised speech representations improve the speech recognition of low-resource African languages with codeswitching [65.74653592668743]
Finetuning self-supervised multilingual representations reduces absolute word error rates by up to 20%.
In circumstances with limited training data, finetuning self-supervised representations is a better-performing and viable solution.
arXiv Detail & Related papers (2023-11-25T17:05:21Z)
- Improving Speaker Diarization using Semantic Information: Joint Pairwise Constraints Propagation [53.01238689626378]
We propose a novel approach to leverage semantic information in speaker diarization systems.
We introduce spoken language understanding modules to extract speaker-related semantic information.
We present a novel framework to integrate these constraints into the speaker diarization pipeline.
arXiv Detail & Related papers (2023-09-19T09:13:30Z)
- Can Language Models Learn to Listen? [96.01685069483025]
We present a framework for generating appropriate facial responses from a listener in dyadic social interactions based on the speaker's words.
Our approach autoregressively predicts a listener's response: a sequence of listener facial gestures, quantized using a VQ-VAE.
We show that our generated listener motion is fluent and reflective of language semantics through quantitative metrics and a qualitative user study.
arXiv Detail & Related papers (2023-08-21T17:59:02Z)
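The quantization step named in the preceding entry can be illustrated generically: continuous gesture features are snapped to their nearest codebook entries, yielding discrete tokens that an autoregressive model can predict. A minimal sketch, with codebook size and feature dimensions as assumptions.

```python
# Minimal sketch of VQ-VAE-style quantization: continuous gesture frames
# are mapped to nearest-codebook indices, giving discrete tokens that an
# autoregressive model can predict. Sizes are illustrative assumptions.
import torch

codebook = torch.randn(512, 64)  # 512 learned codes, 64-dim each

def quantize(frames: torch.Tensor) -> torch.Tensor:
    # frames: (T, 64) continuous features -> (T,) nearest-code indices.
    return torch.cdist(frames, codebook).argmin(dim=-1)

frames = torch.randn(30, 64)  # e.g., one second of facial-gesture features
codes = quantize(frames)      # the token sequence a language model would predict
print(codes.shape, codes[:5])
```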
- Speaking the Language of Your Listener: Audience-Aware Adaptation via Plug-and-Play Theory of Mind [4.052000839878213]
We model a visually grounded referential game between a knowledgeable speaker and a listener with more limited visual and linguistic experience.
We endow our speaker with the ability to adapt its referring expressions via a simulation module that monitors the effectiveness of planned utterances from the listener's perspective.
arXiv Detail & Related papers (2023-05-31T15:17:28Z)
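The simulation module described in the preceding entry can be sketched as reranking: the speaker scores candidate referring expressions by how reliably a simulated listener maps them to the intended target. The `listener_prob` proxy below is a hypothetical stand-in for a trained listener model, and the game items are invented.

```python
# Sketch of plug-and-play listener simulation: rerank candidate referring
# expressions by how reliably a simulated listener resolves them to the
# target. `listener_prob` is a hypothetical stand-in for a trained model.
def listener_prob(utterance: str, description: str) -> float:
    # Toy proxy: Jaccard word overlap between utterance and description.
    u, d = set(utterance.split()), set(description.split())
    return len(u & d) / max(len(u | d), 1)

def adapt_utterance(candidates, target, distractors):
    # Prefer the utterance the simulated listener maps to the target
    # rather than to any distractor.
    def margin(utt):
        return listener_prob(utt, target) - max(
            listener_prob(utt, d) for d in distractors
        )
    return max(candidates, key=margin)

target = "small red mug"
distractors = ["large red mug", "small blue bowl"]
candidates = ["the red one", "the small red one", "the mug"]
print(adapt_utterance(candidates, target, distractors))  # "the small red one"
```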
- Joining the Conversation: Towards Language Acquisition for Ad Hoc Team Play [1.370633147306388]
We propose and consider the problem of cooperative language acquisition as a particular form of the ad hoc team play problem.
We present a probabilistic model for inferring a speaker's intentions and a listener's semantics from observing communications between a team of language-users.
arXiv Detail & Related papers (2023-05-20T16:59:27Z)
- Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clustering [78.2927924732142]
We propose speaker-invariant clustering (Spin) as a novel self-supervised learning method.
Spin disentangles speaker information and preserves content representations with just 45 minutes of fine-tuning on a single GPU.
arXiv Detail & Related papers (2023-05-18T15:59:36Z)
- Data-augmented cross-lingual synthesis in a teacher-student framework [3.2548794659022398]
Cross-lingual synthesis is the task of letting a speaker generate fluent synthetic speech in another language.
Previous research shows that many models appear to have insufficient generalization capabilities.
We propose to apply the teacher-student paradigm to cross-lingual synthesis.
arXiv Detail & Related papers (2022-03-31T20:01:32Z)
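The teacher-student paradigm mentioned in the preceding entry can be illustrated with a generic distillation loss: the student matches the teacher's predictions while also fitting the ground-truth target. The tensor shapes, mel-spectrogram framing, and loss weighting below are assumptions, not the paper's setup.

```python
# Generic teacher-student (distillation) loss sketch; tensor shapes and
# the mel-spectrogram framing are assumptions, not the paper's setup.
import torch
import torch.nn.functional as F

def distillation_loss(student_out, teacher_out, target, alpha=0.5):
    # Match the teacher's predictions while also fitting the ground truth.
    mimic = F.mse_loss(student_out, teacher_out.detach())
    fit = F.mse_loss(student_out, target)
    return alpha * mimic + (1 - alpha) * fit

student_out = torch.randn(8, 80, requires_grad=True)  # e.g., mel frames
teacher_out = torch.randn(8, 80)
target = torch.randn(8, 80)
loss = distillation_loss(student_out, teacher_out, target)
loss.backward()
print(loss.item())
```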
- Wav-BERT: Cooperative Acoustic and Linguistic Representation Learning for Low-Resource Speech Recognition [159.9312272042253]
Wav-BERT is a cooperative acoustic and linguistic representation learning method.
We unify a pre-trained acoustic model (wav2vec 2.0) and a language model (BERT) into an end-to-end trainable framework.
arXiv Detail & Related papers (2021-09-19T16:39:22Z)
- Improving on-device speaker verification using federated learning with privacy [5.321241042620525]
Information on speaker characteristics can be useful as side information in improving speaker recognition accuracy.
This paper investigates how privacy-preserving learning can improve a speaker verification system.
arXiv Detail & Related papers (2020-08-06T13:37:14Z)
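The federated setting in the preceding entry can be illustrated with a minimal FedAvg-style sketch: each device trains locally and only parameter updates are aggregated, so raw speaker audio never leaves the device. The linear verification head and random client data below are toy assumptions, not the paper's system.

```python
# Minimal FedAvg-style sketch: clients train locally and only parameters
# are aggregated, so raw speaker audio stays on-device. The linear
# verification head and random client data are toy assumptions.
import torch

def fed_avg(client_states):
    # Parameter-wise mean across client models.
    return {k: torch.stack([s[k] for s in client_states]).mean(dim=0)
            for k in client_states[0]}

def local_update(global_state, features, labels, lr=0.1):
    model = torch.nn.Linear(40, 2)  # toy speaker-verification head
    model.load_state_dict(global_state)
    loss = torch.nn.functional.cross_entropy(model(features), labels)
    loss.backward()
    with torch.no_grad():
        for p in model.parameters():
            p -= lr * p.grad  # one local SGD step
    return model.state_dict()

global_model = torch.nn.Linear(40, 2)
clients = [(torch.randn(16, 40), torch.randint(0, 2, (16,)))
           for _ in range(3)]
states = [local_update(global_model.state_dict(), x, y) for x, y in clients]
global_model.load_state_dict(fed_avg(states))
```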