A Self-enhancement Approach for Domain-specific Chatbot Training via
Knowledge Mining and Digest
- URL: http://arxiv.org/abs/2311.10614v1
- Date: Fri, 17 Nov 2023 16:09:10 GMT
- Title: A Self-enhancement Approach for Domain-specific Chatbot Training via
Knowledge Mining and Digest
- Authors: Ruohong Zhang, Luyu Gao, Chen Zheng, Zhen Fan, Guokun Lai, Zheng
Zhang, Fangzhou Ai, Yiming Yang, Hongxia Yang
- Abstract summary: Large Language Models (LLMs) often encounter challenges when dealing with intricate and knowledge-demanding queries in specific domains.
This paper introduces a novel approach to enhance LLMs by effectively extracting the relevant knowledge from domain-specific textual sources.
We train a knowledge miner, namely LLMiner, which autonomously extracts Question-Answer pairs from relevant documents.
- Score: 62.63606958140248
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs), despite their great power in language
generation, often encounter challenges when dealing with intricate and
knowledge-demanding queries in specific domains. This paper introduces a novel
approach to enhance LLMs by effectively extracting the relevant knowledge from
domain-specific textual sources, and the adaptive training of a chatbot with
domain-specific inquiries. Our two-step approach starts from training a
knowledge miner, namely LLMiner, which autonomously extracts Question-Answer
pairs from relevant documents through a chain-of-thought reasoning process.
Subsequently, we blend the mined QA pairs with a conversational dataset to
fine-tune the LLM as a chatbot, thereby enriching its domain-specific expertise
and conversational capabilities. We also developed a new evaluation benchmark
which comprises four domain-specific text corpora and associated human-crafted
QA pairs for testing. Our model shows remarkable performance improvement over
generally aligned LLMs and surpasses domain-adapted models directly fine-tuned
on the domain corpus. In particular, LLMiner achieves this with minimal human
intervention, requiring only 600 seed instances, thereby providing a pathway
towards self-improvement of LLMs through model-synthesized training data.
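
The second step described in the abstract, blending mined QA pairs with a conversational dataset before fine-tuning, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state a mixing ratio or sampling scheme, so `blend_datasets`, `Example`, and the `qa_fraction` parameter are hypothetical names chosen for illustration only.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    prompt: str
    response: str
    source: str  # "mined_qa" or "conversation"


def blend_datasets(mined_qa, conversations, qa_fraction=0.5, seed=0):
    """Mix mined QA pairs with conversational examples into one training set.

    `qa_fraction` is a hypothetical knob (not taken from the paper): the
    target share of mined QA examples in the blended set.
    """
    if not 0.0 < qa_fraction < 1.0:
        raise ValueError("qa_fraction must be strictly between 0 and 1")

    rng = random.Random(seed)
    qa = [Example(q, a, "mined_qa") for q, a in mined_qa]
    conv = [Example(p, r, "conversation") for p, r in conversations]

    # Largest blend size that honours the ratio without reusing any example.
    total = min(len(qa) / qa_fraction, len(conv) / (1.0 - qa_fraction))
    n_qa = min(len(qa), round(total * qa_fraction))
    n_conv = min(len(conv), round(total * (1.0 - qa_fraction)))

    # Sample without replacement from each pool, then shuffle the mix.
    blended = rng.sample(qa, n_qa) + rng.sample(conv, n_conv)
    rng.shuffle(blended)
    return blended


if __name__ == "__main__":
    mined = [(f"question {i}", f"answer {i}") for i in range(4)]
    chats = [(f"user turn {i}", f"assistant turn {i}") for i in range(10)]
    for example in blend_datasets(mined, chats, qa_fraction=0.5, seed=1):
        print(example.source, "|", example.prompt)
```

Tagging each example with its source keeps the mix auditable, which may help when comparing different blending ratios; the fine-tuning step itself is outside the scope of this sketch.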
Related papers
- Injecting Domain-Specific Knowledge into Large Language Models: A Comprehensive Survey [39.82566660592583]
Large Language Models (LLMs) have demonstrated remarkable success in various tasks such as natural language understanding, text summarization, and machine translation.
Their general-purpose nature often limits their effectiveness in domain-specific applications that require specialized knowledge, such as healthcare, chemistry, or legal analysis.
To address this, researchers have explored diverse methods to enhance LLMs by integrating domain-specific knowledge.
(2025-02-15T07:43:43Z)
- CPRM: A LLM-based Continual Pre-training Framework for Relevance Modeling in Commercial Search [34.08551439233784]
CPRM is a framework designed for the continual pre-training of large language models (LLMs).
Our framework includes three modules: 1) employing both queries and multi-field item to jointly pre-train for enhancing domain knowledge, 2) applying in-context pre-training, and 3) conducting reading comprehension on items to produce associated domain knowledge and background information.
(2024-12-02T08:35:54Z)
- On Domain-Specific Post-Training for Multimodal Large Language Models [72.67107077850939]
We develop a visual instruction synthesizer that generates diverse visual instruction tasks from domain-specific image-caption pairs.
We apply a single-stage training pipeline to enhance task diversity for domain-specific post-training.
We conduct experiments in two domains, biomedicine and food, by post-training MLLMs of different sources and scales.
(2024-11-29T18:42:28Z)
- Exploring Language Model Generalization in Low-Resource Extractive QA [57.14068405860034]
We investigate Extractive Question Answering (EQA) with Large Language Models (LLMs) under domain drift.
We devise a series of experiments to explain the performance gap empirically.
(2024-09-27T05:06:43Z)
- Peering into the Mind of Language Models: An Approach for Attribution in Contextual Question Answering [9.86691461253151]
We introduce a novel method for attribution in contextual question answering, leveraging the hidden state representations of large language models (LLMs).
Our approach bypasses the need for extensive model retraining and retrieval model overhead, offering granular attributions and preserving the quality of generated answers.
We present Verifiability-granular, an attribution dataset which has token level annotations for LLM generations in the contextual question answering setup.
(2024-05-28T09:12:44Z)
- Pretraining and Updates of Domain-Specific LLM: A Case Study in the Japanese Business Domain [4.133477882188227]
This paper presents our findings from training and evaluating a Japanese business domain-specific LLM.
Our pretrained model and business domain benchmark are publicly available to support further studies.
(2024-04-12T06:21:48Z)
- Knowledge Plugins: Enhancing Large Language Models for Domain-Specific Recommendations [50.81844184210381]
We propose a general paradigm that augments large language models with DOmain-specific KnowledgE to enhance their performance on practical applications, namely DOKE.
This paradigm relies on a domain knowledge extractor, working in three steps: 1) preparing effective knowledge for the task; 2) selecting the knowledge for each specific sample; and 3) expressing the knowledge in an LLM-understandable way.
(2023-11-16T07:09:38Z)
- Knowledgeable Preference Alignment for LLMs in Domain-specific Question Answering [35.2883028685345]
Large language models (LLMs) are deployed to real scenarios for domain-specific question answering (QA).
This paper introduces Knowledgeable Preference AlignmenT (KnowPAT), which constructs two kinds of preference sets to tackle the two issues.
Besides, we design a new alignment objective to align the LLM preference with different human preferences uniformly.
(2023-11-11T07:56:40Z)
- DIVKNOWQA: Assessing the Reasoning Ability of LLMs via Open-Domain Question Answering over Knowledge Base and Text [73.68051228972024]
Large Language Models (LLMs) have exhibited impressive generation capabilities, but they suffer from hallucinations when relying on their internal knowledge.
Retrieval-augmented LLMs have emerged as a potential solution to ground LLMs in external knowledge.
(2023-10-31T04:37:57Z)
- Self-prompted Chain-of-Thought on Large Language Models for Open-domain Multi-hop Reasoning [70.74928578278957]
In open-domain question-answering (ODQA), most existing questions require single-hop reasoning on commonsense.
Large language models (LLMs) have found significant utility in facilitating ODQA without external corpus.
We propose Self-prompted Chain-of-Thought (SP-CoT), an automated framework to mass-produce high quality CoTs.
(2023-10-20T14:51:10Z)