Language Models as a Knowledge Source for Cognitive Agents
- URL: http://arxiv.org/abs/2109.08270v2
- Date: Mon, 20 Sep 2021 15:07:31 GMT
- Title: Language Models as a Knowledge Source for Cognitive Agents
- Authors: Robert E. Wray, III and James R. Kirk and John E. Laird
- Abstract summary: Language models (LMs) are sentence-completion engines trained on massive corpora.
This paper outlines the challenges and opportunities for using language models as a new knowledge source for cognitive systems.
It also identifies possible ways to improve knowledge extraction from language models using the capabilities provided by cognitive systems.
- Score: 9.061356032792954
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Language models (LMs) are sentence-completion engines trained on massive
corpora. LMs have emerged as a significant breakthrough in natural-language
processing, providing capabilities that go far beyond sentence completion,
including question answering, summarization, and natural-language inference.
While many of these capabilities have potential application to cognitive
systems, exploiting language models as a source of task knowledge, especially
for task learning, offers significant, near-term benefits. We introduce
language models and the various tasks to which they have been applied and then
review methods of knowledge extraction from language models. The resulting
analysis outlines both the challenges and opportunities for using language
models as a new knowledge source for cognitive systems. It also identifies
possible ways to improve knowledge extraction from language models using the
capabilities provided by cognitive systems. Central to success will be the
ability of a cognitive agent to itself learn an abstract model of the knowledge
implicit in the LM as well as methods to extract high-quality knowledge
effectively and efficiently. To illustrate, we introduce a hypothetical robot
agent and describe how language models could extend its task knowledge and
improve its performance, as well as the kinds of knowledge and methods the
agent can use to exploit the knowledge within a language model.
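To make the idea of knowledge extraction concrete, the following is a minimal sketch, not taken from the paper, of how such an agent might query a language model for a task decomposition and keep only the steps it can ground in actions it already knows. The `query_lm` callable, the prompt wording, and the action vocabulary are illustrative assumptions.

```python
from typing import Callable, List


def extract_task_steps(
    task: str,
    query_lm: Callable[[str], str],
    known_actions: List[str],
) -> List[str]:
    """Ask a language model for candidate steps for a task, then keep
    only the steps the agent can ground in actions it already knows."""
    prompt = f"List, one step per line, the steps a robot should take to {task}."
    completion = query_lm(prompt)

    grounded_steps = []
    for line in completion.splitlines():
        # Strip leading list markers such as "1.", "-", or "*".
        step = line.strip().lstrip("-*0123456789. ").strip()
        if not step:
            continue
        # Keep a candidate only if it mentions an action the agent can
        # already perform; unmatched steps would require further learning.
        if any(action in step.lower() for action in known_actions):
            grounded_steps.append(step)
    return grounded_steps


if __name__ == "__main__":
    # Stand-in completion function; a real agent would call an actual LM.
    def fake_lm(prompt: str) -> str:
        return "1. pick up the mug\n2. move to the sink\n3. place the mug in the sink"

    print(extract_task_steps("clear the table", fake_lm,
                             ["pick up", "move", "place"]))
```

The filtering step is only a stand-in for the paper's broader point that the agent itself must judge which extracted knowledge is usable; any real agent would apply richer grounding and verification than simple string matching.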
Related papers
- Large Language Models are Limited in Out-of-Context Knowledge Reasoning [65.72847298578071]
Large Language Models (LLMs) possess extensive knowledge and strong capabilities in performing in-context reasoning.
This paper focuses on a significant aspect of out-of-context reasoning: Out-of-Context Knowledge Reasoning (OCKR), which combines multiple pieces of knowledge to infer new knowledge.
arXiv Detail & Related papers (2024-06-11T15:58:59Z) - Bootstrapping Cognitive Agents with a Large Language Model [0.9971537447334835]
Large language models contain noisy general knowledge of the world, yet are hard to train or fine-tune.
In this work, we combine the best of both worlds: bootstrapping a cognitive-based model with the noisy knowledge encoded in large language models.
arXiv Detail & Related papers (2024-02-25T01:40:30Z) - Exploiting Language Models as a Source of Knowledge for Cognitive Agents [4.557963624437782]
Large language models (LLMs) provide capabilities far beyond sentence completion, including question answering, summarization, and natural-language inference.
While many of these capabilities have potential application to cognitive systems, our research is exploiting language models as a source of task knowledge for cognitive agents, that is, agents realized via a cognitive architecture.
arXiv Detail & Related papers (2023-09-05T15:18:04Z) - Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning [56.03057119008865]
We show that scaling diffusion language models can effectively make them strong language learners.
We build competent diffusion language models at scale by first acquiring knowledge from massive data.
Experiments show that scaling diffusion language models consistently improves performance across downstream language tasks.
arXiv Detail & Related papers (2023-08-23T16:01:12Z) - Commonsense Knowledge Transfer for Pre-trained Language Models [83.01121484432801]
We introduce commonsense knowledge transfer, a framework to transfer the commonsense knowledge stored in a neural commonsense knowledge model to a general-purpose pre-trained language model.
It first exploits general texts to form queries for extracting commonsense knowledge from the neural commonsense knowledge model.
It then refines the language model with two self-supervised objectives: commonsense mask infilling and commonsense relation prediction.
arXiv Detail & Related papers (2023-06-04T15:44:51Z) - Knowledge Rumination for Pre-trained Language Models [77.55888291165462]
We propose a new paradigm dubbed Knowledge Rumination to help the pre-trained language model utilize related latent knowledge without retrieving it from the external corpus.
We apply the proposed knowledge rumination to various language models, including RoBERTa, DeBERTa, and GPT-3.
arXiv Detail & Related papers (2023-05-15T15:47:09Z) - LM-CORE: Language Models with Contextually Relevant External Knowledge [13.451001884972033]
We argue that storing large amounts of knowledge in the model parameters is sub-optimal given the ever-growing amounts of knowledge and resource requirements.
We present LM-CORE, a general framework that allows decoupling of the language model training from the external knowledge source.
Experimental results show that LM-CORE, having access to external knowledge, achieves significant and robust outperformance over state-of-the-art knowledge-enhanced language models on knowledge probing tasks.
arXiv Detail & Related papers (2022-08-12T18:59:37Z) - A Survey of Knowledge-Intensive NLP with Pre-Trained Language Models [185.08295787309544]
We aim to summarize the current progress of pre-trained language model-based knowledge-enhanced models (PLMKEs).
We present the challenges of PLMKEs based on the discussion regarding the three elements and attempt to provide NLP practitioners with potential directions for further research.
arXiv Detail & Related papers (2022-02-17T17:17:43Z) - Knowledge Engineering in the Long Game of Artificial Intelligence: The Case of Speech Acts [0.6445605125467572]
This paper describes principles and practices of knowledge engineering that enable the development of holistic language-endowed intelligent agents.
We focus on dialog act modeling, a task that has been widely pursued in linguistics, cognitive modeling, and statistical natural language processing.
arXiv Detail & Related papers (2022-02-02T14:05:12Z) - Knowledge Based Multilingual Language Model [44.70205282863062]
We present a novel framework to pretrain knowledge-based multilingual language models (KMLMs).
We generate a large amount of code-switched synthetic sentences and reasoning-based multilingual training data using the Wikidata knowledge graphs.
Based on the intra- and inter-sentence structures of the generated data, we design pretraining tasks to facilitate knowledge learning.
arXiv Detail & Related papers (2021-11-22T02:56:04Z) - Generated Knowledge Prompting for Commonsense Reasoning [53.88983683513114]
We propose generating knowledge statements directly from a language model with a generic prompt format.
This approach improves performance of both off-the-shelf and finetuned language models on four commonsense reasoning tasks.
Notably, we find that a model's predictions can improve when using its own generated knowledge.
arXiv Detail & Related papers (2021-10-15T21:58:03Z)
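As a rough sketch of the generated knowledge prompting recipe described in the last entry above, the outline below first prompts a model for question-relevant statements and then scores each answer choice conditioned on the question plus one statement. The prompt template and the `generate`/`score` callables are placeholders for whatever LM interface is available, not the paper's exact prompts.

```python
from typing import Callable, List


def answer_with_generated_knowledge(
    question: str,
    choices: List[str],
    generate: Callable[[str], str],
    score: Callable[[str, str], float],
    num_statements: int = 3,
) -> str:
    """Elicit knowledge statements from a language model, then pick the
    answer choice that scores highest when conditioned on the question
    plus one of the generated statements."""
    # Step 1: prompt the model for short, question-relevant facts.
    statements = [
        generate(f"Generate a fact relevant to the question.\nQuestion: {question}\nFact:")
        for _ in range(num_statements)
    ]
    # Step 2: score every (statement, choice) pair and keep the best choice.
    best_choice, best_score = choices[0], float("-inf")
    for statement in statements:
        context = f"{statement.strip()} {question}"
        for choice in choices:
            s = score(context, choice)
            if s > best_score:
                best_choice, best_score = choice, s
    return best_choice
```

Here `score` would typically be the model's log-likelihood of the answer given the context, which is what allows the model to benefit from its own generated knowledge.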
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.