KITLM: Domain-Specific Knowledge InTegration into Language Models for
Question Answering
- URL: http://arxiv.org/abs/2308.03638v1
- Date: Mon, 7 Aug 2023 14:42:49 GMT
- Title: KITLM: Domain-Specific Knowledge InTegration into Language Models for
Question Answering
- Authors: Ankush Agarwal, Sakharam Gawade, Amar Prakash Azad and Pushpak
Bhattacharyya
- Abstract summary: Large language models (LLMs) have demonstrated remarkable performance in a wide range of natural language tasks.
We propose KITLM, a novel approach that integrates a knowledge base into a language model through relevant information infusion.
Our proposed knowledge-infused model surpasses both GPT-3.5-turbo and the state-of-the-art knowledge infusion method, SKILL, achieving over 1.5 times improvement in exact match scores on the MetaQA benchmark.
- Score: 30.129418454426844
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) have demonstrated remarkable performance in a
wide range of natural language tasks. However, as these models continue to grow
in size, they face significant challenges in terms of computational costs.
Additionally, LLMs often lack efficient domain-specific understanding, which is
particularly crucial in specialized fields such as aviation and healthcare. To
boost domain-specific understanding, we propose KITLM, a novel approach that
integrates a knowledge base into a language model through relevant information
infusion. Integrating pertinent knowledge not only greatly enhances the
performance of the language model but also significantly reduces the model size
needed to achieve comparable performance. Our proposed knowledge-infused model
surpasses both GPT-3.5-turbo and the state-of-the-art knowledge infusion
method, SKILL, achieving over 1.5 times improvement in exact match scores on
the MetaQA benchmark. KITLM showed a similar
performance boost in the aviation domain with AeroQA. The drastic performance
improvement of KITLM over the existing methods can be attributed to the
infusion of relevant knowledge while mitigating noise. In addition, we release
two curated datasets to accelerate knowledge infusion research in specialized
fields: a) AeroQA, a new benchmark dataset designed for multi-hop
question-answering within the aviation domain, and b) Aviation Corpus, a
dataset constructed from unstructured text extracted from the National
Transportation Safety Board reports. Our research contributes to advancing the
field of domain-specific language understanding and showcases the potential of
knowledge infusion techniques in improving the performance of language models
on question-answering.
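For intuition, the following is a minimal, illustrative sketch of the general knowledge-infusion idea described in the abstract: relevant facts are retrieved from a knowledge base and supplied to a sequence-to-sequence language model together with the question. The toy triple store, lexical-overlap retrieval, prompt format, and choice of t5-small are assumptions made for illustration; they do not reproduce KITLM's actual retrieval or training procedure.

```python
# Illustrative sketch of knowledge-infused QA (not the authors' code).
# Assumptions: a toy triple store, lexical-overlap retrieval, and a T5 model;
# KITLM's actual relevance selection and infusion details may differ.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Toy knowledge base of (subject, relation, object) triples.
KB = [
    ("Inception", "directed_by", "Christopher Nolan"),
    ("Christopher Nolan", "directed", "Interstellar"),
    ("Interstellar", "released_in", "2014"),
]

def retrieve_triples(question: str, kb, k: int = 2):
    """Rank triples by word overlap with the question and keep the top-k
    (a simple stand-in for relevance-based knowledge selection)."""
    q_tokens = set(question.lower().split())
    def score(triple):
        return sum(word in q_tokens for word in " ".join(triple).lower().split())
    return sorted(kb, key=score, reverse=True)[:k]

def infuse_and_answer(question: str, kb, model_name: str = "t5-small") -> str:
    """Prepend the retrieved facts to the question and decode an answer."""
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    facts = " ".join(f"{s} {r} {o}." for s, r, o in retrieve_triples(question, kb))
    prompt = f"question: {question} context: {facts}"  # assumed prompt format
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=16)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(infuse_and_answer("Who directed the film Interstellar?", KB))
```

The point of restricting the context to the most relevant facts, as the abstract notes, is to infuse pertinent knowledge while mitigating the noise that unrestricted context injection would introduce.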
Related papers
- RA-BLIP: Multimodal Adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training [55.54020926284334]
Multimodal Large Language Models (MLLMs) have recently received substantial interest, showing emerging potential as general-purpose models for various vision-language tasks.
Retrieval augmentation techniques have proven to be effective plugins for both LLMs and MLLMs.
In this study, we propose multimodal adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training (RA-BLIP), a novel retrieval-augmented framework for various MLLMs.
arXiv Detail & Related papers (2024-10-18T03:45:19Z) - Enhancing SLM via ChatGPT and Dataset Augmentation [0.3844771221441211]
We employ knowledge distillation-based techniques and synthetic dataset augmentation to bridge the performance gap between large language models (LLMs) and small language models (SLMs).
Our methods involve two forms of rationale generation, information extraction and informed reasoning, to enrich the ANLI dataset.
Our findings reveal that the incorporation of synthetic rationales significantly improves the model's ability to comprehend natural language, leading to 1.3% and 2.3% higher classification accuracy, respectively, on the ANLI dataset.
arXiv Detail & Related papers (2024-09-19T09:24:36Z) - Scalable Language Model with Generalized Continual Learning [58.700439919096155]
Joint Adaptive Re-Parameterization (JARe) is integrated with Dynamic Task-related Knowledge Retrieval (DTKR) to enable adaptive adjustment of language models based on specific downstream tasks.
Our method demonstrates state-of-the-art performance on diverse backbones and benchmarks, achieving effective continual learning in both full-set and few-shot scenarios with minimal forgetting.
arXiv Detail & Related papers (2024-04-11T04:22:15Z) - LLM-DA: Data Augmentation via Large Language Models for Few-Shot Named
Entity Recognition [67.96794382040547]
LLM-DA is a novel data augmentation technique based on large language models (LLMs) for the few-shot NER task.
Our approach involves employing 14 contextual rewriting strategies, designing entity replacements of the same type, and incorporating noise injection to enhance robustness.
arXiv Detail & Related papers (2024-02-22T14:19:56Z) - Augmenting LLMs with Knowledge: A survey on hallucination prevention [0.0]
This survey delves into the realm of language models (LMs) augmented with the ability to tap into external knowledge sources.
While adhering to the standard objective of predicting missing tokens, these augmented LMs leverage diverse, possibly non-parametric external modules.
arXiv Detail & Related papers (2023-09-28T14:09:58Z) - Improving Open Information Extraction with Large Language Models: A
Study on Demonstration Uncertainty [52.72790059506241]
The Open Information Extraction (OIE) task aims to extract structured facts from unstructured text.
Despite the potential of large language models (LLMs) like ChatGPT as a general task solver, they lag behind state-of-the-art (supervised) methods in OIE tasks.
arXiv Detail & Related papers (2023-09-07T01:35:24Z) - Knowledge-Augmented Reasoning Distillation for Small Language Models in
Knowledge-Intensive Tasks [90.11273439036455]
Large Language Models (LLMs) have shown promising performance in knowledge-intensive reasoning tasks.
We propose Knowledge-Augmented Reasoning Distillation (KARD), a novel method that fine-tunes small LMs to generate rationales from LLMs with augmented knowledge retrieved from an external knowledge base.
We empirically show that KARD significantly improves the performance of small T5 and GPT models on the challenging knowledge-intensive reasoning datasets.
arXiv Detail & Related papers (2023-05-28T13:00:00Z) - A Cohesive Distillation Architecture for Neural Language Models [0.0]
A recent trend in Natural Language Processing is the exponential growth in Language Model (LM) size.
This study investigates methods for Knowledge Distillation (KD) to provide efficient alternatives to large-scale models.
arXiv Detail & Related papers (2023-01-12T08:01:53Z) - KAER: A Knowledge Augmented Pre-Trained Language Model for Entity
Resolution [0.6284767263654553]
We propose KAER, a novel framework for augmenting pre-trained language models with external knowledge for entity resolution.
Our model improves on Ditto, the existing state-of-the-art entity resolution method.
arXiv Detail & Related papers (2023-01-12T00:15:40Z) - Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP).
What further sets these models apart is the massive amounts of world knowledge they internalize during pretraining.
How the model's world knowledge interacts with the factual information presented in the context remains underexplored.
arXiv Detail & Related papers (2022-11-09T18:58:29Z) - LM-CORE: Language Models with Contextually Relevant External Knowledge [13.451001884972033]
We argue that storing large amounts of knowledge in the model parameters is sub-optimal given the ever-growing amounts of knowledge and resource requirements.
We present LM-CORE, a general framework to achieve this, which allows decoupling of the language model training from the external knowledge source.
Experimental results show that LM-CORE, having access to external knowledge, significantly and robustly outperforms state-of-the-art knowledge-enhanced language models on knowledge probing tasks.
arXiv Detail & Related papers (2022-08-12T18:59:37Z)