Task-Agnostic Low-Rank Adapters for Unseen English Dialects
- URL: http://arxiv.org/abs/2311.00915v1
- Date: Thu, 2 Nov 2023 01:17:29 GMT
- Title: Task-Agnostic Low-Rank Adapters for Unseen English Dialects
- Authors: Zedian Xiao, William Held, Yanchen Liu, and Diyi Yang
- Abstract summary: Large Language Models (LLMs) are trained on corpora disproportionately weighted in favor of Standard American English.
By disentangling dialect-specific and cross-dialectal information, HyperLoRA improves generalization to unseen dialects in a task-agnostic fashion.
- Score: 52.88554155235167
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) are trained on corpora disproportionately
weighted in favor of Standard American English. As a result, speakers of other
dialects experience significantly more failures when interacting with these
technologies. In practice, these speakers often accommodate their speech to be
better understood. Our work shares the belief that language technologies should
be designed to accommodate the diversity in English dialects and not the other
way around. However, prior work on dialects struggles to generalize to
evolving and emerging dialects in a scalable manner. To fill this gap, our
method, HyperLoRA, leverages expert linguistic knowledge to enable
resource-efficient adaptation via hypernetworks. By disentangling
dialect-specific and cross-dialectal information, HyperLoRA improves
generalization to unseen dialects in a task-agnostic fashion. Not only is
HyperLoRA more scalable in the number of parameters, but it also achieves the
best or most competitive performance across 5 dialects in a zero-shot setting.
In this way, our approach facilitates access to language technology for
billions of English dialect speakers who are traditionally underrepresented.
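To make the hypernetwork mechanism concrete, here is a minimal PyTorch sketch of the general idea: a small network maps a vector of dialect features (e.g., typological features in the style of eWAVE) to the low-rank factors of a LoRA adapter, so an unseen dialect can be served from its feature description alone. All names, dimensions, and details below are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class HyperLoRALinear(nn.Module):
    """Hedged sketch of hypernetwork-generated LoRA: a hypernetwork emits
    the low-rank factors A and B for a frozen linear layer, conditioned on
    a dialect feature vector. Illustrative only, not the paper's code."""

    def __init__(self, base: nn.Linear, feat_dim: int, rank: int = 8, hidden: int = 64):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # pretrained weights stay frozen
        self.d_in, self.d_out, self.rank = base.in_features, base.out_features, rank
        # Hypernetwork: dialect features -> flattened LoRA factors A and B.
        self.hyper = nn.Sequential(
            nn.Linear(feat_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, rank * (self.d_in + self.d_out)),
        )

    def forward(self, x: torch.Tensor, dialect_feats: torch.Tensor) -> torch.Tensor:
        params = self.hyper(dialect_feats)  # shape: (rank * (d_in + d_out),)
        A = params[: self.rank * self.d_in].view(self.rank, self.d_in)
        B = params[self.rank * self.d_in :].view(self.d_out, self.rank)
        # Standard LoRA update: base(x) + B(Ax); the usual scaling is omitted.
        return self.base(x) + x @ A.t() @ B.t()

# Zero-shot use: an unseen dialect needs only its feature vector.
layer = HyperLoRALinear(nn.Linear(768, 768), feat_dim=235)
out = layer(torch.randn(2, 10, 768), torch.rand(235))  # hypothetical feature vector
```

Because the trainable parameters live in the shared hypernetwork rather than in a per-dialect adapter, the parameter count does not grow with the number of dialects, which matches the scalability claim in the abstract.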
Related papers
- One Language, Many Gaps: Evaluating Dialect Fairness and Robustness of Large Language Models in Reasoning Tasks [55.35278531907263]
We present the first study of Large Language Models' fairness and robustness to dialects in canonical reasoning tasks.
We hire AAVE speakers to rewrite seven popular benchmarks, including HumanEval and GSM8K.
We find that, compared to Standardized English, almost all of these widely used models show significant brittleness and unfairness to queries in AAVE.
arXiv Detail & Related papers (2024-10-14T18:44:23Z)
- Lens: Rethinking Multilingual Enhancement for Large Language Models [70.85065197789639]
Lens is a novel approach to enhancing the multilingual capabilities of large language models (LLMs).
It operates by manipulating the hidden representations within the language-agnostic and language-specific subspaces from top layers of LLMs.
It achieves superior results with much fewer computational resources compared to existing post-training approaches.
arXiv Detail & Related papers (2024-10-06T08:51:30Z)
- Disentangling Dialect from Social Bias via Multitask Learning to Improve Fairness [16.746758715820324]
We present a multitask learning approach that models dialect as an auxiliary task to incorporate syntactic and lexical variation.
In our experiments with African-American English dialect, we provide empirical evidence that complementing common learning approaches with dialect modeling improves their fairness.
Results suggest that multitask learning achieves state-of-the-art performance and helps to detect properties of biased language more reliably.
arXiv Detail & Related papers (2024-06-14T12:39:39Z)
- What Do Dialect Speakers Want? A Survey of Attitudes Towards Language Technology for German Dialects [60.8361859783634]
We survey speakers of dialects and regional languages related to German.
We find that respondents are especially in favor of potential NLP tools that work with dialectal input.
arXiv Detail & Related papers (2024-02-19T09:15:28Z)
- DADA: Dialect Adaptation via Dynamic Aggregation of Linguistic Rules [64.93179829965072]
DADA is a modular approach to imbue models trained on Standard American English (SAE) with multi-dialectal robustness.
We show that DADA is effective for both single-task and instruction-finetuned language models.
arXiv Detail & Related papers (2023-05-22T18:43:31Z)
- Multi-VALUE: A Framework for Cross-Dialectal English NLP [49.55176102659081]
Multi-VALUE is a controllable rule-based translation system spanning 50 English dialects (a toy illustration of one such rule appears after this list).
Stress tests reveal significant performance disparities for leading models on non-standard dialects.
We partner with native speakers of Chicano and Indian English to release new gold-standard variants of the popular CoQA task.
arXiv Detail & Related papers (2022-12-15T18:17:01Z)
- Learning to Recognize Dialect Features [21.277962038423123]
We introduce the task of dialect feature detection, and present two multitask learning approaches.
We train our models on a small number of minimal pairs, building on how linguists typically define dialect features.
arXiv Detail & Related papers (2020-10-23T23:25:00Z)
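As the toy illustration promised above, the sketch below applies a single well-attested dialect feature, copula/auxiliary deletion, in the spirit of Multi-VALUE's rule-based translation. The real system uses linguistically validated transformations across 50 dialects; the function name and the regex are simplifications of my own.

```python
import re

def zero_copula(sentence: str) -> str:
    """Toy Standard-American-English -> dialect rewrite implementing copula
    deletion ("she is nice" -> "she nice"), a feature attested in dialects
    such as AAVE. A deliberate oversimplification of Multi-VALUE's rules."""
    return re.sub(r"\b(?:is|are)\s+|(?:'s|'re)\s+", "", sentence)

print(zero_copula("She is going to the store."))  # -> She going to the store.
print(zero_copula("They're at work right now."))  # -> They at work right now.
```

A production rule set would condition on syntactic context (e.g., not deleting clause-final or first-person copulas), which is exactly the kind of expert linguistic knowledge such frameworks encode.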