elBERto: Self-supervised Commonsense Learning for Question Answering
- URL: http://arxiv.org/abs/2203.09424v1
- Date: Thu, 17 Mar 2022 16:23:45 GMT
- Title: elBERto: Self-supervised Commonsense Learning for Question Answering
- Authors: Xunlin Zhan, Yuan Li, Xiao Dong, Xiaodan Liang, Zhiting Hu, and Lawrence Carin
- Abstract summary: We propose a Self-supervised Bidirectional Encoder Representation Learning of Commonsense (elBERto) framework, which is compatible with off-the-shelf QA model architectures.
The framework comprises five self-supervised tasks to force the model to fully exploit the additional training signals from contexts containing rich commonsense.
elBERto achieves substantial improvements on out-of-paragraph and no-effect questions where simple lexical similarity comparison does not help.
- Score: 131.51059870970616
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Commonsense question answering requires reasoning about everyday situations
and causes and effects implicit in context. Typically, existing approaches
first retrieve external evidence and then perform commonsense reasoning using
this evidence. In this paper, we propose a Self-supervised Bidirectional
Encoder Representation Learning of Commonsense (elBERto) framework, which is
compatible with off-the-shelf QA model architectures. The framework comprises
five self-supervised tasks to force the model to fully exploit the additional
training signals from contexts containing rich commonsense. The tasks include a
novel Contrastive Relation Learning task to encourage the model to distinguish
between logically contrastive contexts, a new Jigsaw Puzzle task that requires
the model to infer logical chains in long contexts, and three classic SSL tasks
to maintain the pre-trained model's language encoding ability. On the representative
WIQA, CosmosQA, and ReClor datasets, elBERto outperforms all other methods,
including those utilizing explicit graph reasoning and external knowledge
retrieval. Moreover, elBERto achieves substantial improvements on
out-of-paragraph and no-effect questions where simple lexical similarity
comparison does not help, indicating that it successfully learns commonsense
and is able to leverage it when given dynamic context.
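The abstract does not spell out how the Jigsaw Puzzle task builds its training examples. As a rough illustration only, one common way to set up a sentence-ordering objective is to shuffle the sentences of a context and ask the model to recover their original positions. The sketch below assumes that formulation; the function names `make_jigsaw_example` and `restore_order` are hypothetical and are not taken from the paper.

```python
import random
from typing import List, Tuple


def make_jigsaw_example(
    sentences: List[str], rng: random.Random
) -> Tuple[List[str], List[int]]:
    """Shuffle context sentences and return (shuffled, labels).

    labels[i] is the original position of the sentence now sitting in
    slot i, so a model could be trained to predict these positions.
    This is an assumed formulation, not the paper's exact procedure.
    """
    order = list(range(len(sentences)))
    rng.shuffle(order)
    shuffled = [sentences[i] for i in order]
    return shuffled, order


def restore_order(shuffled: List[str], labels: List[int]) -> List[str]:
    """Invert the shuffle using the position labels."""
    restored = [""] * len(shuffled)
    for slot, original_pos in enumerate(labels):
        restored[original_pos] = shuffled[slot]
    return restored
```

Given a context split into sentences, `make_jigsaw_example(sentences, random.Random(0))` yields a shuffled context plus position labels, and `restore_order` recovers the original sequence from a correct prediction.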
Related papers
- SINKT: A Structure-Aware Inductive Knowledge Tracing Model with Large Language Model [64.92472567841105]
Knowledge Tracing (KT) aims to determine whether students will respond correctly to the next question.
We propose a Structure-aware Inductive Knowledge Tracing model with a large language model (dubbed SINKT).
SINKT predicts the student's response to the target question by interacting with the student's knowledge state and the question representation.
arXiv Detail & Related papers (2024-07-01T12:44:52Z)
- Multi-hop Commonsense Knowledge Injection Framework for Zero-Shot Commonsense Question Answering [6.086719709100659]
We propose a novel multi-hop commonsense knowledge injection framework.
Our framework achieves state-of-the-art performance on five commonsense question answering benchmarks.
arXiv Detail & Related papers (2023-05-10T07:13:47Z)
- Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP).
What further sets these models apart is the massive amounts of world knowledge they internalize during pretraining.
How the model's world knowledge interacts with the factual information presented in the context remains underexplored.
arXiv Detail & Related papers (2022-11-09T18:58:29Z)
- ReAct: Synergizing Reasoning and Acting in Language Models [44.746116256516046]
We show that large language models (LLMs) can generate both reasoning traces and task-specific actions in an interleaved manner.
We apply our approach, named ReAct, to a diverse set of language and decision making tasks.
ReAct overcomes issues of hallucination and error propagation prevalent in chain-of-thought reasoning by interacting with a simple Wikipedia API.
arXiv Detail & Related papers (2022-10-06T01:00:32Z)
- Utterance Rewriting with Contrastive Learning in Multi-turn Dialogue [22.103162555263143]
We introduce contrastive learning and multi-task learning to jointly model the problem.
Our proposed model achieves state-of-the-art performance on several public datasets.
arXiv Detail & Related papers (2022-03-22T10:13:27Z)
- GreaseLM: Graph REASoning Enhanced Language Models for Question Answering [159.9645181522436]
GreaseLM is a new model that fuses encoded representations from pretrained LMs and graph neural networks over multiple layers of modality interaction operations.
We show that GreaseLM can more reliably answer questions that require reasoning over both situational constraints and structured knowledge, even outperforming models 8x larger.
arXiv Detail & Related papers (2022-01-21T19:00:05Z)
- Zero-shot Commonsense Question Answering with Cloze Translation and Consistency Optimization [20.14487209460865]
We investigate four translation methods that can translate natural questions into cloze-style sentences.
We show that our methods are complementary to a knowledge-base-improved model, and that combining them can lead to state-of-the-art zero-shot performance.
arXiv Detail & Related papers (2022-01-01T07:12:49Z)
- Question Answering over Knowledge Bases by Leveraging Semantic Parsing and Neuro-Symbolic Reasoning [73.00049753292316]
We propose a semantic parsing and reasoning-based Neuro-Symbolic Question Answering (NSQA) system.
NSQA achieves state-of-the-art performance on QALD-9 and LC-QuAD 1.0.
arXiv Detail & Related papers (2020-12-03T05:17:55Z)
- Knowledge-driven Data Construction for Zero-shot Evaluation in Commonsense Question Answering [80.60605604261416]
We propose a novel neuro-symbolic framework for zero-shot question answering across commonsense tasks.
We vary the set of language models, training regimes, knowledge sources, and data generation strategies, and measure their impact across tasks.
We show that, while an individual knowledge graph is better suited for specific tasks, a global knowledge graph brings consistent gains across different tasks.
arXiv Detail & Related papers (2020-11-07T22:52:21Z)