Related papers: Adapting Language Balance in Code-Switching Speech

URL: http://arxiv.org/abs/2510.18724v1
Date: Tue, 21 Oct 2025 15:23:55 GMT
Title: Adapting Language Balance in Code-Switching Speech
Authors: Enes Yavuz Ugan, Ngoc-Quan Pham, Alexander Waibel
Abstract summary: Large foundational models still struggle against code-switching test cases. We use differentiable surrogates to mitigate context bias during generation. Experiments with Arabic and Chinese-English showed that the models are able to predict the switching places more correctly.
Score: 60.296574524609575
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Despite achieving impressive results on standard benchmarks, large foundational models still struggle against code-switching test cases. When data scarcity cannot be used as the usual justification for poor performance, the reason may lie in the infrequent occurrence of code-switched moments, where the embedding of the second language appears subtly. Instead of expecting the models to learn this infrequency on their own, it might be beneficial to provide the training process with labels. Evaluating model performance on code-switching data requires careful localization of code-switching points where recognition errors are most consequential, so that the analysis emphasizes mistakes occurring at those moments. Building on this observation, we leverage the difference between the embedded and the main language to highlight those code-switching points and thereby emphasize learning at those locations. This simple yet effective differentiable surrogate mitigates context bias during generation -- the central challenge in code-switching -- thereby improving the model's robustness. Our experiments with Arabic and Chinese-English showed that the models are able to predict the switching places more correctly, reflected by the reduced substitution error.

Related papers

Corrective Diffusion Language Models [12.724100711773593]
We study corrective behavior in diffusion language models, defined as the ability to assign lower confidence to incorrect tokens and iteratively refine them while preserving correct content. We propose a correction-oriented post-training principle that explicitly supervises visible incorrect tokens, enabling error-aware confidence and targeted refinement.
arXiv Detail & Related papers (2025-12-17T17:04:38Z)
Language Confusion Gate: Language-Aware Decoding Through Model Self-Distillation [50.93756215410832]
This paper introduces the Language Confusion Gate (LCG), a lightweight, plug-in solution that filters tokens during decoding. The LCG is trained using norm-adjusted self-distillation to predict appropriate language families and apply masking only when needed.
arXiv Detail & Related papers (2025-10-20T14:02:37Z)
Evaluating Line-level Localization Ability of Learning-based Code Vulnerability Detection Models [9.543689542888599]
We propose an explainability-based evaluation procedure for vulnerability detectors. Our approach, defined as Detection Alignment (DA), quantifies the agreement between the input source code lines. We show how the predictions of such models are consistently biased by non-vulnerable lines.
arXiv Detail & Related papers (2025-10-13T09:34:40Z)
LayerCake: Token-Aware Contrastive Decoding within Large Language Model Layers [53.43862310647276]
Large language models (LLMs) excel at natural language understanding and generation but remain vulnerable to factual errors. We introduce a token-aware, layer-localized contrastive decoding method that aligns specific token types with their most influential transformer layers to improve factual generation. Our method requires no additional training or model modification, and experiments demonstrate that our method consistently improves factuality across multiple LLMs and various benchmarks.
arXiv Detail & Related papers (2025-07-06T14:35:43Z)
Investigating and Scaling up Code-Switching for Multilingual Language Model Pre-Training [58.696660064190475]
We find that the existence of code-switching, alternating between different languages within a context, is key to multilingual capabilities. To better explore the power of code-switching for language alignment during pre-training, we investigate the strategy of synthetic code-switching.
arXiv Detail & Related papers (2025-04-02T15:09:58Z)
PIER: A Novel Metric for Evaluating What Matters in Code-Switching [15.370845263369347]
Code-switching is a significant challenge for Automatic Speech Recognition. General metrics such as Word-Error-Rate (WER) are commonly used to measure performance. We propose Point-of-Interest Error Rate (PIER), a variant of WER that focuses only on specific words of interest.
arXiv Detail & Related papers (2025-01-16T12:57:33Z)
Uncertainty Awareness of Large Language Models Under Code Distribution Shifts: A Benchmark Study [14.507068647009602]
Large Language Models (LLMs) have been widely employed in programming language analysis to enhance human productivity. Their reliability can be compromised by various code distribution shifts, leading to inconsistent outputs. Probability methods are known to mitigate such impact through uncertainty calibration and estimation.
arXiv Detail & Related papers (2024-01-12T00:00:32Z)
Comparison of Interactive Knowledge Base Spelling Correction Models for Low-Resource Languages [81.90356787324481]
Spelling normalization for low-resource languages is a challenging task because the patterns are hard to predict. This work shows a comparison of a neural model and character language models with varying amounts of target language data. Our usage scenario is interactive correction with nearly zero training examples, improving models as more data is collected.
arXiv Detail & Related papers (2020-10-20T17:31:07Z)
On the Robustness of Language Encoders against Grammatical Errors [66.05648604987479]
We collect real grammatical errors from non-native speakers and conduct adversarial attacks to simulate these errors on clean text data. Results confirm that the performance of all tested models is affected but the degree of impact varies.
arXiv Detail & Related papers (2020-05-12T11:01:44Z)

This list is automatically generated from the titles and abstracts of the papers on this site.