ReasonBERT: Pre-trained to Reason with Distant Supervision
- URL: http://arxiv.org/abs/2109.04912v1
- Date: Fri, 10 Sep 2021 14:49:44 GMT
- Title: ReasonBERT: Pre-trained to Reason with Distant Supervision
- Authors: Xiang Deng, Yu Su, Alyssa Lees, You Wu, Cong Yu, Huan Sun
- Abstract summary: We present ReasonBert, a pre-training method that augments language models with the ability to reason over long-range relations and multiple, possibly hybrid contexts.
Different types of reasoning are simulated, including intersecting multiple pieces of evidence, bridging from one piece of evidence to another, and detecting unanswerable cases.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present ReasonBert, a pre-training method that augments language models
with the ability to reason over long-range relations and multiple, possibly
hybrid contexts. Unlike existing pre-training methods that only harvest
learning signals from local contexts of naturally occurring texts, we propose a
generalized notion of distant supervision to automatically connect multiple
pieces of text and tables to create pre-training examples that require
long-range reasoning. Different types of reasoning are simulated, including
intersecting multiple pieces of evidence, bridging from one piece of evidence
to another, and detecting unanswerable cases. We conduct a comprehensive
evaluation on a variety of extractive question answering datasets, ranging from
single-hop to multi-hop and from text-only to table-only to hybrid, that require
various reasoning capabilities, and show that ReasonBert achieves remarkable
improvement over an array of strong baselines. Few-shot experiments further
demonstrate that our pre-training method substantially improves sample
efficiency.
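As a rough illustration of the distant-supervision idea in the abstract, the sketch below pairs two contexts that mention the same entity, masks the shared span in one to form a query, and keeps the other as evidence; an unanswerable variant pairs the query with unrelated evidence. This is a minimal sketch of the general recipe, not the authors' pipeline; all names (`PretrainExample`, `make_bridging_example`, the `[QUESTION]` token) are hypothetical.

```python
# Hypothetical sketch of distant-supervision pre-training example
# construction, loosely following the ReasonBERT abstract (not the
# authors' code). Intersection-style examples would analogously
# combine several evidence pieces for a single query.
from dataclasses import dataclass
from typing import List, Optional

MASK = "[QUESTION]"  # placeholder token standing in for the masked span

@dataclass
class PretrainExample:
    query: str            # sentence with the shared span masked out
    evidence: List[str]   # other contexts (sentences or linearized tables)
    answer: Optional[str] # span to recover; None marks an unanswerable case

def make_bridging_example(sent_a: str, sent_b: str, entity: str) -> PretrainExample:
    """Bridge two contexts that both mention `entity`: mask it in one
    sentence to form the query and keep the other as evidence."""
    assert entity in sent_a and entity in sent_b
    return PretrainExample(sent_a.replace(entity, MASK, 1), [sent_b], entity)

def make_unanswerable_example(sent_a: str, unrelated: str, entity: str) -> PretrainExample:
    """Pair the masked query with evidence that lacks the answer, so the
    model can learn to detect unanswerable cases."""
    return PretrainExample(sent_a.replace(entity, MASK, 1), [unrelated], None)

if __name__ == "__main__":
    ex = make_bridging_example(
        "Marie Curie won the Nobel Prize in Physics in 1903.",
        "Marie Curie was born in Warsaw.",
        "Marie Curie",
    )
    print(ex.query)     # [QUESTION] won the Nobel Prize in Physics in 1903.
    print(ex.evidence)  # ['Marie Curie was born in Warsaw.']
```

Constructed this way, the answer span never appears in the query itself, so recovering it forces the model to consult the paired evidence rather than local context alone.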
Related papers
- Pre-training Multi-party Dialogue Models with Latent Discourse Inference [85.9683181507206]
We pre-train a model that understands the discourse structure of multi-party dialogues, namely, to whom each utterance is replying.
To fully utilize the unlabeled data, we propose to treat the discourse structures as latent variables, then jointly infer them and pre-train the discourse-aware model.
arXiv Detail & Related papers (2023-05-24T14:06:27Z)
- HOP, UNION, GENERATE: Explainable Multi-hop Reasoning without Rationale Supervision [118.0818807474809]
This work proposes a principled, probabilistic approach for training explainable multi-hop QA systems without rationale supervision.
Our approach performs multi-hop reasoning by explicitly modeling rationales as sets, enabling the model to capture interactions between documents and sentences within a document.
arXiv Detail & Related papers (2023-05-23T16:53:49Z)
- Consistent Multi-Granular Rationale Extraction for Explainable Multi-hop Fact Verification [13.72453491358488]
This paper explores the viability of multi-granular rationale extraction with consistency and faithfulness for explainable multi-hop fact verification.
In particular, given a pretrained veracity prediction model, both the token-level explainer and sentence-level explainer are trained simultaneously to obtain multi-granular rationales.
Experimental results on three multi-hop fact verification datasets show that the proposed approach outperforms some state-of-the-art baselines.
arXiv Detail & Related papers (2023-05-16T12:31:53Z)
- Stabilized In-Context Learning with Pre-trained Language Models for Few Shot Dialogue State Tracking [57.92608483099916]
Large pre-trained language models (PLMs) have shown impressive unaided performance across many NLP tasks.
For more complex tasks such as dialogue state tracking (DST), designing prompts that reliably convey the desired intent is nontrivial.
We introduce a saliency model to limit dialogue text length, allowing us to include more exemplars per query.
arXiv Detail & Related papers (2023-02-12T15:05:10Z)
- Momentum Contrastive Pre-training for Question Answering [54.57078061878619]
MCROSS introduces a momentum contrastive learning framework to align the answer probability between cloze-like and natural query-passage sample pairs.
Our method achieves noticeable improvement compared with all baselines in both supervised and zero-shot scenarios.
arXiv Detail & Related papers (2022-12-12T08:28:22Z)
- Reasoning Circuits: Few-shot Multihop Question Generation with Structured Rationales [11.068901022944015]
Chain-of-thought rationale generation has been shown to improve performance on multi-step reasoning tasks.
We introduce a new framework for applying chain-of-thought inspired structured rationale generation to multi-hop question generation under a very low supervision regime.
arXiv Detail & Related papers (2022-11-15T19:36:06Z)
- Interlock-Free Multi-Aspect Rationalization for Text Classification [33.33452117387646]
We show how to address the interlocking problem in the multi-aspect rationalization setting.
We propose a multi-stage training method incorporating an additional self-supervised contrastive loss.
Empirical results on the beer review dataset show that our method significantly improves rationalization performance.
arXiv Detail & Related papers (2022-05-13T16:38:38Z)
- Towards Robust Online Dialogue Response Generation [62.99904593650087]
We argue that poor robustness in online dialogue response generation can be caused by a discrepancy between training and real-world testing.
We propose a hierarchical sampling-based method consisting of both utterance-level sampling and semi-utterance-level sampling.
arXiv Detail & Related papers (2022-03-07T06:51:41Z)
- Self-training with Few-shot Rationalization: Teacher Explanations Aid Student in Few-shot NLU [88.8401599172922]
We develop a framework based on self-training language models with limited task-specific labels and rationales.
We show that the neural model performance can be significantly improved by making it aware of its rationalized predictions.
arXiv Detail & Related papers (2021-09-17T00:36:46Z)
- Case-Based Abductive Natural Language Inference [4.726777092009554]
Case-Based Abductive Natural Language Inference (CB-ANLI)
arXiv Detail & Related papers (2020-09-30T09:50:39Z)