ActKnow: Active External Knowledge Infusion Learning for Question
Answering in Low Data Regime
- URL: http://arxiv.org/abs/2112.09423v1
- Date: Fri, 17 Dec 2021 10:39:41 GMT
- Title: ActKnow: Active External Knowledge Infusion Learning for Question
Answering in Low Data Regime
- Authors: K. M. Annervaz, Pritam Kumar Nath, Ambedkar Dukkipati
- Abstract summary: We propose a technique that actively infuses knowledge from Knowledge Graphs (KG) "on-demand" into learning for Question Answering (QA).
We show significant improvements on the ARC Challenge-set benchmark over purely text-based transformer models like RoBERTa in the low data regime.
- Score: 7.562843347215286
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Deep learning models have set benchmark results in various Natural Language
Processing tasks. However, these models require an enormous amount of training
data, which is infeasible in many practical problems. While various techniques
such as domain adaptation and few-shot learning address this problem, we
introduce a new technique of actively infusing external knowledge into learning
to solve problems in the low data regime. We propose a technique called ActKnow
that actively infuses knowledge from Knowledge Graphs (KG) "on-demand" into
learning for Question Answering (QA). By infusing world knowledge from
ConceptNet, we show significant improvements on the ARC Challenge-set
benchmark over purely text-based transformer models like RoBERTa in the low
data regime. For example, using only 20% of the training examples, we
demonstrate a 4% improvement in accuracy on both ARC-Challenge and OpenBookQA.
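A minimal sketch of this "on-demand" infusion idea is given below, assuming a retrieve-then-concatenate setup: retrieve_conceptnet_triples() is a hypothetical placeholder for the ConceptNet lookup, and fusing retrieved triples into the input text of a RoBERTa multiple-choice head is an assumption made for illustration, not the authors' ActKnow architecture.

```python
# Illustrative sketch only, not the ActKnow implementation: fetch a few
# ConceptNet triples "on demand" for a question and score the answer
# choices with a RoBERTa multiple-choice head.
import torch
from transformers import AutoTokenizer, AutoModelForMultipleChoice

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMultipleChoice.from_pretrained("roberta-base")
model.eval()

def retrieve_conceptnet_triples(question, max_triples=5):
    """Hypothetical placeholder: return (head, relation, tail) triples for
    the concepts mentioned in the question, e.g. from a ConceptNet dump."""
    return []  # e.g. [("plant", "CapableOf", "absorb carbon dioxide"), ...]

def score_choices(question, choices):
    # Verbalize the retrieved triples and prepend them to the question text.
    triples = retrieve_conceptnet_triples(question)
    context = " ".join(f"{h} {r} {t}." for h, r, t in triples)
    firsts = [f"{context} {question}".strip()] * len(choices)
    enc = tokenizer(firsts, choices, padding=True, truncation=True,
                    return_tensors="pt")
    # The multiple-choice head expects (batch, num_choices, seq_len).
    enc = {k: v.unsqueeze(0) for k, v in enc.items()}
    with torch.no_grad():
        logits = model(**enc).logits          # shape: (1, num_choices)
    return logits.softmax(dim=-1).squeeze(0)

probs = score_choices(
    "Which gas do plants absorb during photosynthesis?",
    ["oxygen", "carbon dioxide", "nitrogen", "helium"],
)
print(probs)
```

With a fine-tuned checkpoint and a real retriever, score_choices() would return a meaningful probability per answer choice; with the untrained head above it only demonstrates the data flow.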
Related papers
- KBAlign: Efficient Self Adaptation on Specific Knowledge Bases [73.34893326181046]
Large language models (LLMs) usually rely on retrieval-augmented generation to exploit knowledge materials on the fly.
We propose KBAlign, an approach designed for efficient adaptation to downstream tasks involving knowledge bases.
Our method utilizes iterative training with self-annotated data such as Q&A pairs and revision suggestions, enabling the model to grasp the knowledge content efficiently.
arXiv Detail & Related papers (2024-11-22T08:21:03Z)
- Mind the Interference: Retaining Pre-trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models [79.28821338925947]
Domain-Class Incremental Learning is a realistic but challenging continual learning scenario.
To handle these diverse tasks, pre-trained Vision-Language Models (VLMs) are introduced for their strong generalizability.
This incurs a new problem: the knowledge encoded in the pre-trained VLMs may be disturbed when adapting to new tasks, compromising their inherent zero-shot ability.
Existing methods tackle it by tuning VLMs with knowledge distillation on extra datasets, which demands heavy overhead.
We propose the Distribution-aware Interference-free Knowledge Integration (DIKI) framework, retaining the pre-trained knowledge of the VLMs.
arXiv Detail & Related papers (2024-07-07T12:19:37Z)
- Injecting New Knowledge into Large Language Models via Supervised Fine-Tuning [13.371405067535814]
This paper investigates the effectiveness of Supervised Fine-Tuning (SFT) as a method for knowledge injection in Large Language Models (LLMs).
We compare different dataset generation strategies -- token-based and fact-based scaling -- to create training data that helps the model learn new information.
Our results show considerable performance improvements in Q&A tasks related to out-of-domain knowledge.
arXiv Detail & Related papers (2024-03-30T01:56:07Z)
- A Closer Look at the Limitations of Instruction Tuning [52.587607091917214]
We show that Instruction Tuning (IT) fails to enhance knowledge or skills in large language models (LLMs).
We also show that popular methods to improve IT do not lead to performance improvements over a simple LoRA fine-tuned model.
Our findings reveal that responses generated solely from pre-trained knowledge consistently outperform responses by models that learn any form of new knowledge from IT on open-source datasets.
arXiv Detail & Related papers (2024-02-03T04:45:25Z)
- Learning Objective-Specific Active Learning Strategies with Attentive Neural Processes [72.75421975804132]
Learning Active Learning (LAL) proposes learning the active learning strategy itself, allowing it to adapt to the given setting.
We propose a novel LAL method for classification that exploits symmetry and independence properties of the active learning problem.
Our approach is based on learning from a myopic oracle, which gives our model the ability to adapt to non-standard objectives.
arXiv Detail & Related papers (2023-09-11T14:16:37Z)
- Improving Question Answering Performance Using Knowledge Distillation and Active Learning [6.380750645368325]
We propose a novel knowledge distillation (KD) approach to reduce the parameter and model complexity of a pre-trained BERT system.
We demonstrate that our model achieves the performance of a 6-layer TinyBERT and DistilBERT, whilst using only 2% of their total parameters.
arXiv Detail & Related papers (2021-09-26T17:49:54Z)
- Rectification-based Knowledge Retention for Continual Learning [49.1447478254131]
Deep learning models suffer from catastrophic forgetting when trained in an incremental learning setting.
We propose a novel approach to address the task incremental learning problem, which involves training a model on new tasks that arrive in an incremental manner.
Our approach can be used in both the zero-shot and non-zero-shot task incremental learning settings.
arXiv Detail & Related papers (2021-03-30T18:11:30Z)
- Bayesian active learning for production, a systematic study and a reusable library [85.32971950095742]
In this paper, we analyse the main drawbacks of current active learning techniques.
We conduct a systematic study of the effects of the most common issues of real-world datasets on the deep active learning process.
We derive two techniques that can speed up the active learning loop: partial uncertainty sampling and a larger query size (a minimal sketch follows this list).
arXiv Detail & Related papers (2020-06-17T14:51:11Z)
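The last related paper above mentions partial uncertainty sampling and a larger query size as ways to speed up the active learning loop. The sketch below shows one plausible reading of those two ideas, assuming predictive entropy as the uncertainty score and a random subsample of the unlabeled pool; none of the function names, parameters, or defaults are taken from the paper's library.

```python
# Assumed illustration, not the paper's library API: score uncertainty on
# only a random fraction of the unlabeled pool ("partial uncertainty
# sampling") and request a large batch of labels per iteration
# ("larger query size") to shorten each active-learning round.
import numpy as np

def partial_uncertainty_query(probs, pool_indices, subset_frac=0.1,
                              query_size=512, rng=None):
    """probs: (n_pool, n_classes) predicted class probabilities, indexed by
    the values in pool_indices. Returns the indices to send for labeling."""
    rng = np.random.default_rng(0) if rng is None else rng
    # 1. Look at only a random fraction of the pool this iteration.
    subset_size = min(len(pool_indices),
                      max(query_size, int(len(pool_indices) * subset_frac)))
    subset = rng.choice(pool_indices, size=subset_size, replace=False)
    # 2. Rank the subset by predictive entropy (higher = more uncertain).
    p = probs[subset]
    entropy = -(p * np.log(p + 1e-12)).sum(axis=1)
    ranked = subset[np.argsort(-entropy)]
    # 3. Query a large batch at once instead of a few points at a time.
    return ranked[:query_size]

# Example: a pool of 10,000 items with 4-class probability outputs.
pool = np.arange(10_000)
probs = np.random.dirichlet(np.ones(4), size=10_000)
print(partial_uncertainty_query(probs, pool).shape)  # (512,)
```

Scoring only a fraction of the pool and labeling hundreds of points per round trades some selection quality for far fewer model evaluations per iteration, which is the kind of speed-up the summary refers to.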
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it hosts and is not responsible for any consequences of its use.