Combining Language Models For Specialized Domains: A Colorful Approach
- URL: http://arxiv.org/abs/2310.19708v3
- Date: Wed, 1 Nov 2023 07:55:28 GMT
- Title: Combining Language Models For Specialized Domains: A Colorful Approach
- Authors: Daniel Eitan, Menachem Pirchi, Neta Glazer, Shai Meital, Gil Ayach,
Gidon Krendel, Aviv Shamsian, Aviv Navon, Gil Hetz, Joseph Keshet
- Abstract summary: We introduce a novel approach that integrates a domain-specific or secondary LM into a general-purpose LM.
This strategy involves labeling, or "coloring", each word to indicate its association with either the general or the domain-specific LM.
We develop an optimized algorithm that enhances the beam search algorithm to effectively handle inferences involving colored words.
- Score: 14.124988885323585
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: General purpose language models (LMs) encounter difficulties when processing
domain-specific jargon and terminology, which are frequently utilized in
specialized fields such as medicine or industrial settings. Moreover, they
often find it challenging to interpret mixed speech that blends general
language with specialized jargon. This poses a challenge for automatic speech
recognition systems operating within these specific domains. In this work, we
introduce a novel approach that integrates a domain-specific or secondary LM into a
general-purpose LM. This strategy involves labeling, or "coloring", each word
to indicate its association with either the general or the domain-specific LM.
We develop an optimized algorithm that enhances the beam search algorithm to
effectively handle inferences involving colored words. Our evaluations indicate
that this approach is highly effective in integrating jargon into language
tasks. Notably, our method substantially lowers the error rate for
domain-specific words without compromising performance in the general domain.
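To make the coloring idea concrete, below is a minimal sketch of a word-colored beam-search step, assuming a word-level LM interface with a score(word, history) method that returns a log-probability. The Hypothesis class, the extend_beam function, and the simple pick-the-higher-scoring-LM coloring rule are illustrative assumptions, not the paper's optimized algorithm.

```python
from dataclasses import dataclass, field

@dataclass
class Hypothesis:
    words: list = field(default_factory=list)
    colors: list = field(default_factory=list)  # "general" or "domain", per word
    logp: float = 0.0

def extend_beam(hyps, candidates, general_lm, domain_lm, beam_size=8):
    """One beam-search step: score each candidate word under both LMs,
    color it by whichever LM assigns the higher log-probability, and
    prune the expanded hypothesis set back to beam_size."""
    expanded = []
    for hyp in hyps:
        for word in candidates:
            g = general_lm.score(word, hyp.words)  # log P under the general LM
            d = domain_lm.score(word, hyp.words)   # log P under the domain LM
            color, logp = ("domain", d) if d > g else ("general", g)
            expanded.append(Hypothesis(hyp.words + [word],
                                       hyp.colors + [color],
                                       hyp.logp + logp))
    expanded.sort(key=lambda h: h.logp, reverse=True)
    return expanded[:beam_size]
```

In an ASR decoder these LM scores would be combined with acoustic scores; the sketch isolates only the per-word coloring bookkeeping that lets domain terms be scored by the secondary LM without affecting general-domain words.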
Related papers
- Efficient Terminology Integration for LLM-based Translation in Specialized Domains [0.0]
In specialized fields such as patent, finance, or biomedical domains, terminology is crucial for translation.
We introduce a methodology that efficiently trains models with a smaller amount of data while preserving the accuracy of terminology translation.
This methodology enhances the model's ability to handle specialized terminology and ensures high-quality translations.
arXiv Detail & Related papers (2024-10-21T07:01:25Z)
- EAGLE: Towards Efficient Arbitrary Referring Visual Prompts Comprehension for Multimodal Large Language Models [80.00303150568696]
We propose a novel Multimodal Large Language Model (MLLM) that enables comprehension of arbitrary referring visual prompts with less training effort than existing approaches.
Our approach embeds referring visual prompts as spatial concepts conveying specific spatial areas comprehensible to the MLLM.
We also propose a Geometry-Agnostic Learning paradigm (GAL) to further disentangle the MLLM's region-level comprehension from the specific formats of referring visual prompts.
arXiv Detail & Related papers (2024-09-25T08:22:00Z)
- LexGen: Domain-aware Multilingual Lexicon Generation [40.97738267067852]
We propose a new model to generate dictionary words for 6 Indian languages in the multi-domain setting.
Our model consists of domain-specific and domain-generic layers that encode information.
We release a new benchmark dataset across 6 Indian languages that span 8 diverse domains.
arXiv Detail & Related papers (2024-05-18T07:02:43Z)
- Tag-LLM: Repurposing General-Purpose LLMs for Specialized Domains [9.600277231719874]
Large Language Models (LLMs) have demonstrated remarkable proficiency in understanding and generating natural language.
This work explores how to repurpose general LLMs into effective task solvers for specialized domains.
arXiv Detail & Related papers (2024-02-06T20:11:54Z)
- Knowledge Plugins: Enhancing Large Language Models for Domain-Specific Recommendations [50.81844184210381]
We propose DOKE, a general paradigm that augments large language models with DOmain-specific KnowledgE to enhance their performance on practical applications.
This paradigm relies on a domain knowledge extractor, working in three steps: 1) preparing effective knowledge for the task; 2) selecting the knowledge for each specific sample; and 3) expressing the knowledge in an LLM-understandable way. (A minimal sketch of this three-step pipeline appears after this list.)
arXiv Detail & Related papers (2023-11-16T07:09:38Z)
- Domain-Controlled Prompt Learning [49.45309818782329]
Existing prompt learning methods often lack domain-awareness or domain-transfer mechanisms.
We propose Domain-Controlled Prompt Learning for specific domains.
Our method achieves state-of-the-art performance in specific domain image recognition datasets.
arXiv Detail & Related papers (2023-09-30T02:59:49Z)
- Domain Specialization as the Key to Make Large Language Models Disruptive: A Comprehensive Survey [100.24095818099522]
Large language models (LLMs) have significantly advanced the field of natural language processing (NLP).
They provide a highly useful, task-agnostic foundation for a wide range of applications.
However, directly applying LLMs to solve sophisticated problems in specific domains meets many hurdles.
arXiv Detail & Related papers (2023-05-30T03:00:30Z)
- Compound Domain Generalization via Meta-Knowledge Encoding [55.22920476224671]
We introduce Style-induced Domain-specific Normalization (SDNorm) to re-normalize the multi-modal underlying distributions.
We harness the prototype representations, the centroids of classes, to perform relational modeling in the embedding space.
Experiments on four standard Domain Generalization benchmarks reveal that COMEN exceeds state-of-the-art performance without the need for domain supervision.
arXiv Detail & Related papers (2022-03-24T11:54:59Z)
- Seed Words Based Data Selection for Language Model Adaptation [11.59717828860318]
We present an approach for automatically selecting sentences from a text corpus that match, both semantically and morphologically, a glossary of terms furnished by the user. (A toy selection sketch appears after this list.)
The vocabulary of the baseline model is expanded and tailored, reducing the resulting OOV rate.
Results using different metrics (OOV rate, WER, precision and recall) show the effectiveness of the proposed techniques.
arXiv Detail & Related papers (2021-07-20T12:08:27Z)
- Structured Latent Embeddings for Recognizing Unseen Classes in Unseen Domains [108.11746235308046]
We propose a novel approach that learns domain-agnostic structured latent embeddings by projecting images from different domains.
Our experiments on the challenging DomainNet and DomainNet-LS benchmarks show the superiority of our approach over existing methods.
arXiv Detail & Related papers (2021-07-12T17:57:46Z)
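As a rough illustration of the three-step DOKE paradigm summarized in the Knowledge Plugins entry above, here is a minimal sketch assuming plain-text facts and a naive keyword-overlap retriever. All function names and the prompt format are illustrative assumptions, not the paper's implementation.

```python
def prepare_knowledge(task_corpus):
    """Step 1: build a pool of task-relevant domain facts (assumed plain strings)."""
    return [fact.strip() for fact in task_corpus if fact.strip()]

def select_knowledge(sample, knowledge_pool, top_k=3):
    """Step 2: pick the facts most relevant to one input sample.
    Naive keyword overlap stands in for a real retriever."""
    scored = [(sum(w in fact for w in sample.split()), fact)
              for fact in knowledge_pool]
    scored.sort(reverse=True)
    return [fact for score, fact in scored[:top_k] if score > 0]

def express_knowledge(sample, facts):
    """Step 3: phrase the selected knowledge so an LLM can consume it."""
    context = "\n".join(f"- {fact}" for fact in facts)
    return f"Known domain facts:\n{context}\n\nQuestion: {sample}"
```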
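Similarly, the seed-words data selection entry above can be caricatured as glossary-driven sentence filtering. This toy version assumes exact token matches only and omits the semantic and morphological matching the paper actually performs.

```python
def select_sentences(corpus_sentences, glossary, min_hits=1):
    """Keep sentences containing at least min_hits glossary terms (exact match)."""
    terms = {t.lower() for t in glossary}
    return [sent for sent in corpus_sentences
            if len(terms & set(sent.lower().split())) >= min_hits]
```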
This list is automatically generated from the titles and abstracts of the papers on this site.