Rule or Story, Which is a Better Commonsense Expression for Talking with Large Language Models?
- URL: http://arxiv.org/abs/2402.14355v2
- Date: Tue, 4 Jun 2024 08:05:51 GMT
- Title: Rule or Story, Which is a Better Commonsense Expression for Talking with Large Language Models?
- Authors: Ning Bian, Xianpei Han, Hongyu Lin, Yaojie Lu, Ben He, Le Sun
- Abstract summary: Humans convey and pass down commonsense implicitly through stories.
This paper investigates the inherent commonsense ability of large language models (LLMs) expressed through stories.
- Score: 49.83570853386928
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Building machines with commonsense has been a longstanding challenge in NLP due to the reporting bias of commonsense rules and the exposure bias of rule-based commonsense reasoning. In contrast, humans convey and pass down commonsense implicitly through stories. This paper investigates the inherent commonsense ability of large language models (LLMs) expressed through storytelling. We systematically investigate and compare stories and rules for retrieving and leveraging commonsense in LLMs. Experimental results on 28 commonsense QA datasets show that stories outperform rules as the expression for retrieving commonsense from LLMs, exhibiting higher generation confidence and commonsense accuracy. Moreover, stories are the more effective commonsense expression for answering questions regarding daily events, while rules are more effective for scientific questions. This aligns with the reporting bias of commonsense in text corpora. We further show that the correctness and relevance of commonsense stories can be further improved via iterative self-supervised fine-tuning. These findings emphasize the importance of using appropriate language to express, retrieve, and leverage commonsense for LLMs, highlighting a promising direction for better exploiting their commonsense abilities.
Related papers
- What Really is Commonsense Knowledge? [58.5342212738895]
We survey existing definitions of commonsense knowledge, ground them in three frameworks for defining concepts, and consolidate them into a unified definition of commonsense knowledge.
We then use the consolidated definition for annotations and experiments on the CommonsenseQA and CommonsenseQA 2.0 datasets.
Our study shows that a large portion of instances in the two datasets are not commonsense knowledge, and that there is a large performance gap between the two subsets.
arXiv Detail & Related papers (2024-11-06T14:54:19Z)
- Narrative Analysis of True Crime Podcasts With Knowledge Graph-Augmented Large Language Models [8.78598447041169]
Large language models (LLMs) still struggle with complex narrative arcs as well as narratives containing conflicting information.
Recent work indicates LLMs augmented with external knowledge bases can improve the accuracy and interpretability of the resulting models.
In this work, we analyze the effectiveness of applying knowledge graphs (KGs) in understanding true-crime podcast data.
arXiv Detail & Related papers (2024-11-01T21:49:00Z)
- Complex Reasoning over Logical Queries on Commonsense Knowledge Graphs [61.796960984541464]
We present COM2 (COMplex COMmonsense), a new dataset created by sampling logical queries.
We verbalize them using handcrafted rules and large language models into multiple-choice and text generation questions.
Experiments show that language models trained on COM2 exhibit significant improvements in complex reasoning ability.
arXiv Detail & Related papers (2024-03-12T08:13:52Z)
- MORE: Multi-mOdal REtrieval Augmented Generative Commonsense Reasoning [66.06254418551737]
We propose a novel Multi-mOdal REtrieval framework to leverage both text and images to enhance the commonsense ability of language models.
Experiments on the Common-Gen task have demonstrated the efficacy of MORE based on the pre-trained models of both single and multiple modalities.
arXiv Detail & Related papers (2024-02-21T08:54:47Z)
- CORECODE: A Common Sense Annotated Dialogue Dataset with Benchmark Tasks for Chinese Large Language Models [42.5532503036805]
CORECODE is a dataset that contains abundant commonsense knowledge manually annotated on dyadic dialogues.
We categorize commonsense knowledge in everyday conversations into three dimensions: entity, event, and social interaction.
We collect 76,787 commonsense knowledge annotations from 19,700 dialogues through crowdsourcing.
arXiv Detail & Related papers (2023-12-20T09:06:18Z)
- Commonsense Knowledge Transfer for Pre-trained Language Models [83.01121484432801]
We introduce commonsense knowledge transfer, a framework to transfer the commonsense knowledge stored in a neural commonsense knowledge model to a general-purpose pre-trained language model.
It first exploits general texts to form queries for extracting commonsense knowledge from the neural commonsense knowledge model.
It then refines the language model with two self-supervised objectives: commonsense mask infilling and commonsense relation prediction.
arXiv Detail & Related papers (2023-06-04T15:44:51Z)
- Adversarial Transformer Language Models for Contextual Commonsense Inference [14.12019824666882]
Contextualized, or discourse-aware, commonsense inference is the task of generating coherent commonsense assertions.
The task has two main problems: lack of controllability over the topics of the inferred facts, and lack of commonsense knowledge during training.
We develop techniques to address the aforementioned problems in the task.
arXiv Detail & Related papers (2023-02-10T18:21:13Z)
- ComFact: A Benchmark for Linking Contextual Commonsense Knowledge [31.19689856957576]
We propose the new task of commonsense fact linking, where models are given contexts and trained to identify situationally relevant commonsense knowledge from KGs.
Our novel benchmark, ComFact, contains 293k in-context relevance annotations for commonsense across four stylistically diverse datasets.
arXiv Detail & Related papers (2022-10-23T09:30:39Z)
- Do Children Texts Hold The Key To Commonsense Knowledge? [14.678465723838599]
This paper explores whether children's texts hold the key to commonsense knowledge compilation.
An analysis of several corpora shows that children's texts indeed contain far more, and more typical, commonsense assertions.
Experiments show that this advantage can be leveraged in popular language-model-based commonsense knowledge extraction settings.
arXiv Detail & Related papers (2022-10-10T09:56:08Z)
- Evaluate Confidence Instead of Perplexity for Zero-shot Commonsense Reasoning [85.1541170468617]
This paper reconsiders the nature of commonsense reasoning and proposes a novel commonsense reasoning metric, Non-Replacement Confidence (NRC).
The proposed method boosts zero-shot performance on two commonsense reasoning benchmark datasets and a further seven commonsense question-answering datasets.
arXiv Detail & Related papers (2022-08-23T14:42:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.