Related papers: Injecting Domain-Specific Knowledge into Large Language Models: A Comprehensive Survey

Injecting Domain-Specific Knowledge into Large Language Models: A Comprehensive Survey

URL: http://arxiv.org/abs/2502.10708v1
Date: Sat, 15 Feb 2025 07:43:43 GMT
Title: Injecting Domain-Specific Knowledge into Large Language Models: A Comprehensive Survey
Authors: Zirui Song, Bin Yan, Yuhan Liu, Miao Fang, Mingzhe Li, Rui Yan, Xiuying Chen,
Abstract summary: Large Language Models (LLMs) have demonstrated remarkable success in various tasks such as natural language understanding, text summarization, and machine translation.<n>Their general-purpose nature often limits their effectiveness in domain-specific applications that require specialized knowledge, such as healthcare, chemistry, or legal analysis.<n>To address this, researchers have explored diverse methods to enhance LLMs by integrating domain-specific knowledge.
Score: 39.82566660592583
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large Language Models (LLMs) have demonstrated remarkable success in various tasks such as natural language understanding, text summarization, and machine translation. However, their general-purpose nature often limits their effectiveness in domain-specific applications that require specialized knowledge, such as healthcare, chemistry, or legal analysis. To address this, researchers have explored diverse methods to enhance LLMs by integrating domain-specific knowledge. In this survey, we provide a comprehensive overview of these methods, which we categorize into four key approaches: dynamic knowledge injection, static knowledge embedding, modular adapters, and prompt optimization. Each approach offers unique mechanisms to equip LLMs with domain expertise, balancing trade-offs between flexibility, scalability, and efficiency. We discuss how these methods enable LLMs to tackle specialized tasks, compare their advantages and disadvantages, evaluate domain-specific LLMs against general LLMs, and highlight the challenges and opportunities in this emerging field. For those interested in delving deeper into this area, we also summarize the commonly used datasets and benchmarks. To keep researchers updated on the latest studies, we maintain an open-source at: https://github.com/abilliyb/Knowledge_Injection_Survey_Papers, dedicated to documenting research in the field of specialized LLM.

Related papers

Domain Specific Benchmarks for Evaluating Multimodal Large Language Models [3.1546387965618337]
Large language models (LLMs) are increasingly being deployed across disciplines due to their advanced reasoning and problem solving capabilities.<n>This paper introduces a taxonomy of seven key disciplines, encompassing various domains and application areas where LLMs are extensively utilized.<n>We compile and categorize these benchmarks by domain to create an accessible resource for researchers.
arXiv Detail & Related papers (2025-06-15T20:42:45Z)
Evaluating LLM Reasoning in the Operations Research Domain with ORQA [19.72699080797411]
We introduce and apply Operations Research Question Answering (ORQA), a new benchmark designed to assess the generalization capabilities of Large Language Models (LLMs) The dataset features real-world optimization problems that demand multistep reasoning to construct their mathematical models. Our evaluations of various open source LLMs, such as LLaMA 3.1, DeepSeek, and Mixtral, reveal their modest performance, highlighting a gap in their ability to generalize to specialized technical domains.
arXiv Detail & Related papers (2024-12-22T09:10:34Z)
BLADE: Enhancing Black-box Large Language Models with Small Domain-Specific Models [56.89958793648104]
Large Language Models (LLMs) are versatile and capable of addressing a diverse range of tasks. Previous approaches either conduct continuous pre-training with domain-specific data or employ retrieval augmentation to support general LLMs. We present a novel framework named BLADE, which enhances Black-box LArge language models with small Domain-spEcific models.
arXiv Detail & Related papers (2024-03-27T08:57:21Z)
LLM Inference Unveiled: Survey and Roofline Model Insights [62.92811060490876]
Large Language Model (LLM) inference is rapidly evolving, presenting a unique blend of opportunities and challenges. Our survey stands out from traditional literature reviews by not only summarizing the current state of research but also by introducing a framework based on roofline model. This framework identifies the bottlenecks when deploying LLMs on hardware devices and provides a clear understanding of practical problems.
arXiv Detail & Related papers (2024-02-26T07:33:05Z)
PANDA: Preference Adaptation for Enhancing Domain-Specific Abilities of LLMs [49.32067576992511]
Large language models often fall short of the performance achieved by domain-specific state-of-the-art models. One potential approach to enhance domain-specific capabilities of LLMs involves fine-tuning them using corresponding datasets. We propose Preference Adaptation for Enhancing Domain-specific Abilities of LLMs (PANDA) Our experimental results reveal that PANDA significantly enhances the domain-specific ability of LLMs on text classification and interactive decision tasks.
arXiv Detail & Related papers (2024-02-20T09:02:55Z)
Large Language Models for Generative Information Extraction: A Survey [89.71273968283616]
Large Language Models (LLMs) have demonstrated remarkable capabilities in text understanding and generation. We present an extensive overview by categorizing these works in terms of various IE subtasks and techniques. We empirically analyze the most advanced methods and discover the emerging trend of IE tasks with LLMs.
arXiv Detail & Related papers (2023-12-29T14:25:22Z)
Knowledge Plugins: Enhancing Large Language Models for Domain-Specific Recommendations [50.81844184210381]
We propose a general paradigm that augments large language models with DOmain-specific KnowledgE to enhance their performance on practical applications, namely DOKE. This paradigm relies on a domain knowledge extractor, working in three steps: 1) preparing effective knowledge for the task; 2) selecting the knowledge for each specific sample; and 3) expressing the knowledge in an LLM-understandable way.
arXiv Detail & Related papers (2023-11-16T07:09:38Z)
MechGPT, a language-based strategy for mechanics and materials modeling that connects knowledge across scales, disciplines and modalities [0.0]
We use a Large Language Model (LLM) to distill question-answer pairs from raw sources followed by fine-tuning. The resulting MechGPT LLM foundation model is used in a series of computational experiments to explore its capacity for knowledge retrieval, various language tasks, hypothesis generation, and connecting knowledge across disparate areas.
arXiv Detail & Related papers (2023-10-16T14:29:35Z)
Domain Specialization as the Key to Make Large Language Models Disruptive: A Comprehensive Survey [100.24095818099522]
Large language models (LLMs) have significantly advanced the field of natural language processing (NLP) They provide a highly useful, task-agnostic foundation for a wide range of applications. However, directly applying LLMs to solve sophisticated problems in specific domains meets many hurdles.
arXiv Detail & Related papers (2023-05-30T03:00:30Z)

This list is automatically generated from the titles and abstracts of the papers in this site.