Combining Language Models For Specialized Domains: A Colorful Approach
- URL: http://arxiv.org/abs/2310.19708v3
- Date: Wed, 1 Nov 2023 07:55:28 GMT
- Title: Combining Language Models For Specialized Domains: A Colorful Approach
- Authors: Daniel Eitan, Menachem Pirchi, Neta Glazer, Shai Meital, Gil Ayach,
Gidon Krendel, Aviv Shamsian, Aviv Navon, Gil Hetz, Joseph Keshet
- Abstract summary: We introduce a novel approach that integrates a domain-specific or secondary LM into a general-purpose LM.
This strategy involves labeling, or "coloring", each word to indicate its association with either the general or the domain-specific LM.
We develop an optimized algorithm that enhances the beam search algorithm to effectively handle inferences involving colored words.
- Score: 14.124988885323585
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: General-purpose language models (LMs) encounter difficulties when processing
domain-specific jargon and terminology, which are frequently utilized in
specialized fields such as medicine or industrial settings. Moreover, they
often find it challenging to interpret mixed speech that blends general
language with specialized jargon. This poses a challenge for automatic speech
recognition systems operating within these specific domains. In this work, we
introduce a novel approach that integrates a domain-specific or secondary LM into
a general-purpose LM. This strategy involves labeling, or "coloring", each word
to indicate its association with either the general or the domain-specific LM.
We develop an optimized algorithm that enhances the beam search algorithm to
effectively handle inferences involving colored words. Our evaluations indicate
that this approach is highly effective in integrating jargon into language
tasks. Notably, our method substantially lowers the error rate for
domain-specific words without compromising performance in the general domain.
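The word-coloring idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes two toy unigram LMs stored as dictionaries of log-probabilities, a hypothetical `switch_penalty` charged whenever the color changes between adjacent words, and a plain beam search over color assignments. All names (`color_words`, `word_logprob`, `switch_penalty`) and all probabilities are illustrative assumptions.

```python
from typing import Dict, List, Tuple

GENERAL, DOMAIN = "general", "domain"


def word_logprob(lm: Dict[str, float], word: str, floor: float = -20.0) -> float:
    """Toy unigram log-probability; unseen words get a flat floor (assumption)."""
    return lm.get(word, floor)


def color_words(
    words: List[str],
    general_lm: Dict[str, float],
    domain_lm: Dict[str, float],
    switch_penalty: float = 1.0,  # hypothetical cost for changing color
    beam_size: int = 4,
) -> Tuple[List[Tuple[str, str]], float]:
    """Assign a color to each word with a small beam search over colorings."""
    lms = {GENERAL: general_lm, DOMAIN: domain_lm}
    beam: List[Tuple[float, List[str]]] = [(0.0, [])]  # (score, colors so far)
    for word in words:
        candidates = []
        for score, colors in beam:
            for color, lm in lms.items():
                new_score = score + word_logprob(lm, word)
                if colors and colors[-1] != color:
                    new_score -= switch_penalty
                candidates.append((new_score, colors + [color]))
        # Keep only the highest-scoring partial colorings.
        candidates.sort(key=lambda h: h[0], reverse=True)
        beam = candidates[:beam_size]
    best_score, best_colors = beam[0]
    return list(zip(words, best_colors)), best_score


# Made-up log-probabilities, for illustration only.
toy_general = {"the": -1.0, "patient": -3.0, "has": -2.0, "tachycardia": -15.0}
toy_domain = {"patient": -2.5, "tachycardia": -2.0}
colored, total = color_words(["the", "patient", "has", "tachycardia"], toy_general, toy_domain)
print(colored, total)
```

In this toy setup the jargon word is assigned to the domain LM while the surrounding words stay with the general LM, because a single switch costs less than scoring the rare word under the general LM.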
Related papers
- LexGen: Domain-aware Multilingual Lexicon Generation [40.97738267067852]
We propose a new model to generate dictionary words for 6 Indian languages in the multi-domain setting.
Our model consists of domain-specific and domain-generic layers that encode information.
We release a new benchmark dataset across 6 Indian languages that span 8 diverse domains.
arXiv Detail & Related papers (2024-05-18T07:02:43Z)
- BLADE: Enhancing Black-box Large Language Models with Small Domain-Specific Models [56.89958793648104]
Large Language Models (LLMs) are versatile and capable of addressing a diverse range of tasks.
Previous approaches either conduct continuous pre-training with domain-specific data or employ retrieval augmentation to support general LLMs.
We present a novel framework named BLADE, which enhances Black-box LArge language models with small Domain-spEcific models.
arXiv Detail & Related papers (2024-03-27T08:57:21Z)
- Tag-LLM: Repurposing General-Purpose LLMs for Specialized Domains [9.600277231719874]
Large Language Models (LLMs) have demonstrated remarkable proficiency in understanding and generating natural language.
This work explores how to repurpose general LLMs into effective task solvers for specialized domains.
arXiv Detail & Related papers (2024-02-06T20:11:54Z)
- Knowledge Plugins: Enhancing Large Language Models for Domain-Specific Recommendations [50.81844184210381]
We propose a general paradigm that augments large language models with DOmain-specific KnowledgE to enhance their performance on practical applications, namely DOKE.
This paradigm relies on a domain knowledge extractor, working in three steps: 1) preparing effective knowledge for the task; 2) selecting the knowledge for each specific sample; and 3) expressing the knowledge in an LLM-understandable way.
arXiv Detail & Related papers (2023-11-16T07:09:38Z)
- Domain-Controlled Prompt Learning [49.45309818782329]
Existing prompt learning methods often lack domain-awareness or domain-transfer mechanisms.
We propose Domain-Controlled Prompt Learning for specific domains.
Our method achieves state-of-the-art performance in specific domain image recognition datasets.
arXiv Detail & Related papers (2023-09-30T02:59:49Z)
- Domain Specialization as the Key to Make Large Language Models Disruptive: A Comprehensive Survey [100.24095818099522]
Large language models (LLMs) have significantly advanced the field of natural language processing (NLP).
They provide a highly useful, task-agnostic foundation for a wide range of applications.
However, directly applying LLMs to solve sophisticated problems in specific domains meets many hurdles.
arXiv Detail & Related papers (2023-05-30T03:00:30Z)
- Compound Domain Generalization via Meta-Knowledge Encoding [55.22920476224671]
We introduce Style-induced Domain-specific Normalization (SDNorm) to re-normalize the multi-modal underlying distributions.
We harness the prototype representations, the centroids of classes, to perform relational modeling in the embedding space.
Experiments on four standard Domain Generalization benchmarks reveal that COMEN exceeds the state-of-the-art performance without the need of domain supervision.
arXiv Detail & Related papers (2022-03-24T11:54:59Z)
- DS-TOD: Efficient Domain Specialization for Task Oriented Dialog [12.395323315744625]
Self-supervised dialog-specific pretraining on large conversational datasets yields substantial gains over traditional language modeling (LM) pretraining in downstream task-oriented dialog (TOD).
We investigate the effects of domain specialization of pretrained language models (PLMs) for task-oriented dialog.
We propose a resource-efficient and modular domain specialization by means of domain adapters.
arXiv Detail & Related papers (2021-10-15T22:25:51Z)
- Seed Words Based Data Selection for Language Model Adaptation [11.59717828860318]
We present an approach for automatically selecting sentences, from a text corpus, that match, both semantically and morphologically, a glossary of terms furnished by the user.
The vocabulary of the baseline model is expanded and tailored, reducing the resulting OOV rate.
Results using different metrics (OOV rate, WER, precision and recall) show the effectiveness of the proposed techniques.
arXiv Detail & Related papers (2021-07-20T12:08:27Z)
- Structured Latent Embeddings for Recognizing Unseen Classes in Unseen Domains [108.11746235308046]
We propose a novel approach that learns domain-agnostic structured latent embeddings by projecting images from different domains.
Our experiments on the challenging DomainNet and DomainNet-LS benchmarks show the superiority of our approach over existing methods.
arXiv Detail & Related papers (2021-07-12T17:57:46Z)