Related papers: Investigating the Representation of Backchannels and Fillers in Fine-tuned Language Models

URL: http://arxiv.org/abs/2509.20237v1
Date: Wed, 24 Sep 2025 15:27:44 GMT
Title: Investigating the Representation of Backchannels and Fillers in Fine-tuned Language Models
Authors: Yu Wang, Leyi Lao, Langchu Huang, Gabriel Skantze, Yang Xu, Hendrik Buschmeier
Abstract summary: Backchannels and fillers are important linguistic expressions in dialogue, but are under-represented in transformer-based language models (LMs). Our work studies the representation of them in language models using three fine-tuning strategies. We first apply clustering analysis to the learnt representation of backchannels and fillers, and have found increased silhouette scores in representations from fine-tuned models. We also use natural language generation metrics to confirm that the utterances generated by fine-tuned language models resemble human-produced utterances more closely.
Score: 11.06013049764257
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Backchannels and fillers are important linguistic expressions in dialogue, but are under-represented in modern transformer-based language models (LMs). Our work studies the representation of them in language models using three fine-tuning strategies. The models are trained on three dialogue corpora in English and Japanese, where backchannels and fillers are preserved and annotated, to investigate how fine-tuning can help LMs learn their representations. We first apply clustering analysis to the learnt representation of backchannels and fillers, and have found increased silhouette scores in representations from fine-tuned models, which suggests that fine-tuning enables LMs to distinguish the nuanced semantic variation in different backchannel and filler use. We also use natural language generation (NLG) metrics to confirm that the utterances generated by fine-tuned language models resemble human-produced utterances more closely. Our findings suggest the potentials of transforming general LMs into conversational LMs that are more capable of producing human-like languages adequately.
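The abstract reports higher silhouette scores for fine-tuned representations but does not say how the scores were computed. As a rough illustration only, the sketch below implements the standard mean silhouette coefficient in plain Python and compares two sets of hypothetical 2-D embeddings. The toy points, the two cluster labels, and the names `base_points` and `tuned_points` are assumptions made for this example and are not taken from the paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (standard definition)."""
    # Group point indices by cluster label.
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # Common convention: a point alone in its cluster scores 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other points in the same cluster.
        a = sum(euclidean(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the points of any other cluster.
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical embeddings: label 0 = one kind of use, label 1 = another.
labels = [0, 0, 1, 1]
base_points = [(0.0, 0.0), (1.0, 0.0), (0.5, 0.5), (1.5, 0.5)]   # overlapping
tuned_points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]  # well separated

base_score = silhouette_score(base_points, labels)
tuned_score = silhouette_score(tuned_points, labels)
print(f"base:  {base_score:.3f}")
print(f"tuned: {tuned_score:.3f}")
```

A score near 1 means points sit much closer to their own cluster than to any other, while a score near or below 0 means the clusters overlap, so an increase after fine-tuning would indicate better-separated groups under this definition.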

Related papers

Modelling the Morphology of Verbal Paradigms: A Case Study in the Tokenization of Turkish and Hebrew [1.0857263744676489]
We investigate how transformer models represent complex verb paradigms in Turkish and Modern Hebrew. We show that for Turkish, both monolingual and multilingual models succeed, either when tokenization is atomic or when it breaks words into small subword units. For Hebrew, instead, monolingual and multilingual models diverge.
arXiv Detail & Related papers (2026-02-05T13:31:21Z)
Beyond the Rosetta Stone: Unification Forces in Generalization Dynamics [56.145578792496714]
Large language models (LLMs) struggle with cross-lingual knowledge transfer. We study the causes and dynamics of this phenomenon by training small Transformer models from scratch on synthetic multilingual datasets.
arXiv Detail & Related papers (2025-08-14T18:44:13Z)
How a Bilingual LM Becomes Bilingual: Tracing Internal Representations with Sparse Autoencoders [47.52390427719507]
We employ sparse autoencoders to analyze internal representations of bilingual language models. Our analysis shows that language models first learn languages separately, and then gradually form bilingual alignments.
arXiv Detail & Related papers (2025-03-09T02:13:44Z)
Self-Supervised Models of Speech Infer Universal Articulatory Kinematics [44.27187669492598]
We show "inference of articulatory kinematics" as a fundamental property of SSL models. We also show that this abstraction is largely overlapping across the language of the data used to train the model. We show that with simple affine transformations, Acoustic-to-Articulatory inversion (AAI) is transferable across speakers, even across genders, languages, and dialects.
arXiv Detail & Related papers (2023-10-16T19:50:01Z)
Let Models Speak Ciphers: Multiagent Debate through Embeddings [84.20336971784495]
We introduce CIPHER (Communicative Inter-Model Protocol Through Embedding Representation) to address this issue. By deviating from natural language, CIPHER offers an advantage of encoding a broader spectrum of information without any modification to the model weights. This showcases the superiority and robustness of embeddings as an alternative "language" for communication among LLMs.
arXiv Detail & Related papers (2023-10-10T03:06:38Z)
Counteracts: Testing Stereotypical Representation in Pre-trained Language Models [4.211128681972148]
We use counterexamples to examine the internal stereotypical knowledge in pre-trained language models (PLMs). We evaluate 7 PLMs on 9 types of cloze-style prompt with different information and base knowledge.
arXiv Detail & Related papers (2023-01-11T07:52:59Z)
Deanthropomorphising NLP: Can a Language Model Be Conscious? [7.41244589428771]
We take the position that such a large language model cannot be sentient, or conscious, and that LaMDA in particular exhibits no advances over other similar models that would qualify it. We see the claims of sentience as part of a wider tendency to use anthropomorphic language in NLP reporting.
arXiv Detail & Related papers (2022-11-21T14:18:25Z)
Towards Language Modelling in the Speech Domain Using Sub-word Linguistic Units [56.52704348773307]
We propose a novel LSTM-based generative speech LM based on linguistic units including syllables and phonemes. With a limited dataset, orders of magnitude smaller than that required by contemporary generative models, our model closely approximates babbling speech. We show the effect of training with auxiliary text LMs, multitask learning objectives, and auxiliary articulatory features.
arXiv Detail & Related papers (2021-10-31T22:48:30Z)
SLM: Learning a Discourse Language Representation with Sentence Unshuffling [53.42814722621715]
We introduce Sentence-level Language Modeling, a new pre-training objective for learning a discourse language representation. We show that this feature of our model improves the performance of the original BERT by large margins.
arXiv Detail & Related papers (2020-10-30T13:33:41Z)
How Phonotactics Affect Multilingual and Zero-shot ASR Performance [74.70048598292583]
A Transformer encoder-decoder model has been shown to leverage multilingual data well in IPA transcriptions of languages presented during training. We replace the encoder-decoder with a hybrid ASR system consisting of a separate AM and LM. We show that the gain from modeling crosslingual phonotactics is limited, and imposing a too strong model can hurt the zero-shot transfer.
arXiv Detail & Related papers (2020-10-22T23:07:24Z)
Cross-lingual Spoken Language Understanding with Regularized Representation Alignment [71.53159402053392]
We propose a regularization approach to align word-level and sentence-level representations across languages without any external resource. Experiments on the cross-lingual spoken language understanding task show that our model outperforms current state-of-the-art methods in both few-shot and zero-shot scenarios.
arXiv Detail & Related papers (2020-09-30T08:56:53Z)

This list is automatically generated from the titles and abstracts of the papers on this site.