Leveraging Knowledge and Reinforcement Learning for Enhanced Reliability
of Language Models
- URL: http://arxiv.org/abs/2308.13467v1
- Date: Fri, 25 Aug 2023 16:11:08 GMT
- Title: Leveraging Knowledge and Reinforcement Learning for Enhanced Reliability
of Language Models
- Authors: Nancy Tyagi, Surjodeep Sarkar, Manas Gaur
- Abstract summary: We explore a knowledge-guided LM ensembling approach that leverages reinforcement learning to integrate knowledge from ConceptNet and Wikipedia as knowledge graph embeddings.
This approach mimics human annotators resorting to external knowledge to compensate for information deficits in the datasets.
Across nine GLUE datasets, our research shows that ensembling strengthens reliability and accuracy scores, outperforming the state of the art.
- Score: 10.10140327060947
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Natural Language Processing (NLP) community has been using
crowdsourcing techniques to create benchmark datasets such as the General
Language Understanding Evaluation (GLUE) benchmark for training modern Language
Models such as BERT. GLUE tasks measure reliability using inter-annotator
agreement metrics such as Cohen's Kappa. However, the reliability aspect of LMs
has often been overlooked. To counter this problem, we explore a
knowledge-guided LM ensembling approach that leverages reinforcement learning
to integrate knowledge from ConceptNet and Wikipedia as knowledge graph
embeddings. This approach mimics human annotators who resort to external
knowledge to compensate for information deficits in the datasets. Across nine
GLUE datasets, our research shows that ensembling strengthens reliability and
accuracy scores, outperforming the state of the art.
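For readers who want a concrete picture of the ensembling idea, the following is a minimal sketch, not the authors' implementation: several base LMs produce class probabilities for a GLUE-style task, a per-example knowledge-graph embedding (standing in for the ConceptNet/Wikipedia embeddings) conditions a policy over the base models, and the policy is tuned with a REINFORCE-style update that rewards agreement with the gold label. The toy data, variable names, and the specific RL update are illustrative assumptions; the paper's actual reward, policy, and ensembling scheme may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

# Toy setup: 3 base LMs, 200 examples, 2 classes, 16-dim KG embedding per example.
n_models, n_examples, n_classes, kg_dim = 3, 200, 2, 16
lm_probs = softmax(rng.normal(size=(n_models, n_examples, n_classes)))
kg_embeddings = rng.normal(size=(n_examples, kg_dim))
labels = lm_probs[0].argmax(axis=1)   # pretend base LM 0 happens to be the reliable one

# Policy: example-dependent weights over base LMs, conditioned on the KG embedding
# (plus a bias feature so a global preference can be learned).
kg_feats = np.hstack([kg_embeddings, np.ones((n_examples, 1))])
W = np.zeros((kg_dim + 1, n_models))
lr = 0.5

for _ in range(300):
    weights = softmax(kg_feats @ W)                        # (examples, models)
    # Sample which LM answers each example and observe a 0/1 reward.
    choices = np.array([rng.choice(n_models, p=w) for w in weights])
    preds = lm_probs[choices, np.arange(n_examples)].argmax(axis=1)
    reward = (preds == labels).astype(float)
    advantage = reward - reward.mean()                     # simple mean baseline
    # REINFORCE: d log pi(choice) / d logits = onehot(choice) - weights
    grad = (np.eye(n_models)[choices] - weights) * advantage[:, None]
    W += lr * kg_feats.T @ grad / n_examples

# Final knowledge-weighted ensemble: mix the base LMs' probabilities per example.
weights = softmax(kg_feats @ W)
mixed = np.einsum("em,men->en", weights, lm_probs)
print(f"ensemble accuracy on toy data: {(mixed.argmax(axis=1) == labels).mean():.3f}")
```

For reference, the inter-annotator reliability metric the abstract mentions is Cohen's Kappa, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement between annotators and p_e is the agreement expected by chance.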
Related papers
- KRAIL: A Knowledge-Driven Framework for Base Human Reliability Analysis Integrating IDHEAS and Large Language Models [2.7378790256389047]
Inspired by the success of large language models (LLMs) in natural language processing, this paper introduces KRAIL, a novel two-stage framework for knowledge-driven reliability analysis that integrates IDHEAS and LLMs.
arXiv Detail & Related papers (2024-12-20T06:21:34Z) - Large Language Models are Limited in Out-of-Context Knowledge Reasoning [65.72847298578071]
Large Language Models (LLMs) possess extensive knowledge and strong capabilities in performing in-context reasoning.
This paper focuses on a significant aspect of out-of-context reasoning: Out-of-Context Knowledge Reasoning (OCKR), which combines multiple pieces of knowledge to infer new knowledge.
arXiv Detail & Related papers (2024-06-11T15:58:59Z) - Knowledge Graphs as Context Sources for LLM-Based Explanations of
Learning Recommendations [0.0]
Large language models (LLMs) and generative AI have recently opened new doors for generating human-like explanations.
This paper proposes an approach to utilize knowledge graphs (KG) as a source of factual context.
We utilize the semantic relations in the knowledge graph to offer curated knowledge about learning recommendations.
arXiv Detail & Related papers (2024-03-05T14:41:12Z) - InfuserKI: Enhancing Large Language Models with Knowledge Graphs via Infuser-Guided Knowledge Integration [58.61492157691623]
Methods for integrating knowledge have been developed, which augment LLMs with domain-specific knowledge graphs through external modules.
Our research focuses on a novel problem: efficiently integrating unknown knowledge into LLMs without unnecessary overlap of known knowledge.
A risk of introducing new knowledge is the potential forgetting of existing knowledge.
arXiv Detail & Related papers (2024-02-18T03:36:26Z) - Online Continual Knowledge Learning for Language Models [3.654507524092343]
Large Language Models (LLMs) serve as repositories of extensive world knowledge, enabling them to perform tasks such as question-answering and fact-checking.
Online Continual Knowledge Learning (OCKL) aims to manage the dynamic nature of world knowledge in LMs under real-time constraints.
arXiv Detail & Related papers (2023-11-16T07:31:03Z) - Beyond Factuality: A Comprehensive Evaluation of Large Language Models
as Knowledge Generators [78.63553017938911]
Large language models (LLMs) outperform information retrieval techniques for downstream knowledge-intensive tasks.
However, community concerns abound regarding the factuality and potential implications of using this uncensored knowledge.
We introduce CONNER, designed to evaluate generated knowledge from six important perspectives.
arXiv Detail & Related papers (2023-10-11T08:22:37Z) - Thrust: Adaptively Propels Large Language Models with External Knowledge [58.72867916604562]
Large-scale pre-trained language models (PTLMs) are shown to encode rich knowledge in their model parameters.
The inherent knowledge in PTLMs can be opaque or static, making external knowledge necessary.
We propose the instance-level adaptive propulsion of external knowledge (IAPEK), which conducts retrieval only when necessary (a minimal gating sketch appears after this list).
arXiv Detail & Related papers (2023-07-19T20:16:46Z) - Commonsense Knowledge Transfer for Pre-trained Language Models [83.01121484432801]
We introduce commonsense knowledge transfer, a framework to transfer the commonsense knowledge stored in a neural commonsense knowledge model to a general-purpose pre-trained language model.
It first exploits general texts to form queries for extracting commonsense knowledge from the neural commonsense knowledge model.
It then refines the language model with two self-supervised objectives: commonsense mask infilling and commonsense relation prediction.
arXiv Detail & Related papers (2023-06-04T15:44:51Z) - Knowledge Rumination for Pre-trained Language Models [77.55888291165462]
We propose a new paradigm dubbed Knowledge Rumination to help the pre-trained language model utilize related latent knowledge without retrieving it from the external corpus.
We apply the proposed knowledge rumination to various language models, including RoBERTa, DeBERTa, and GPT-3.
arXiv Detail & Related papers (2023-05-15T15:47:09Z) - LM-CORE: Language Models with Contextually Relevant External Knowledge [13.451001884972033]
We argue that storing large amounts of knowledge in the model parameters is sub-optimal given the ever-growing amounts of knowledge and resource requirements.
We present LM-CORE, a general framework that decouples language model training from the external knowledge source.
Experimental results show that LM-CORE, having access to external knowledge, achieves significant and robust outperformance over state-of-the-art knowledge-enhanced language models on knowledge probing tasks.
arXiv Detail & Related papers (2022-08-12T18:59:37Z)
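As a companion to the Thrust/IAPEK entry above, here is a minimal sketch of instance-level adaptive retrieval: the model answers closed-book when it is confident and consults external knowledge only otherwise. The predictive-entropy gate, the threshold value, and the `retrieve_passages` / `classify_with_context` callables are illustrative stand-ins, not the paper's actual Thrust scoring function.

```python
import math
from typing import Callable, List, Sequence, Tuple

def predictive_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy of a class-probability vector (higher = less confident)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def answer_with_adaptive_retrieval(
    query: str,
    classify: Callable[[str], Sequence[float]],               # closed-book class probs
    retrieve_passages: Callable[[str], List[str]],            # hypothetical retriever
    classify_with_context: Callable[[str, List[str]], Sequence[float]],
    entropy_threshold: float = 0.5,
) -> Tuple[int, str]:
    """Answer closed-book when confident; otherwise retrieve evidence and re-classify."""
    probs = classify(query)
    if predictive_entropy(probs) <= entropy_threshold:
        return int(max(range(len(probs)), key=lambda i: probs[i])), "closed-book"
    passages = retrieve_passages(query)
    probs = classify_with_context(query, passages)
    return int(max(range(len(probs)), key=lambda i: probs[i])), "retrieval-augmented"

# Toy usage with stub components standing in for a real LM and retriever.
label, mode = answer_with_adaptive_retrieval(
    "Is a sparrow a bird?",
    classify=lambda q: [0.55, 0.45],                          # uncertain closed-book guess
    retrieve_passages=lambda q: ["A sparrow is a small passerine bird."],
    classify_with_context=lambda q, passages: [0.95, 0.05],   # confident with evidence
)
print(label, mode)   # 0 retrieval-augmented
```

The same gate could be swapped for any instance-level confidence estimate; entropy is used here only because it requires nothing beyond the model's output distribution.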