Teaching Machine Comprehension with Compositional Explanations
- URL: http://arxiv.org/abs/2005.00806v3
- Date: Tue, 13 Oct 2020 19:28:50 GMT
- Title: Teaching Machine Comprehension with Compositional Explanations
- Authors: Qinyuan Ye, Xiao Huang, Elizabeth Boschee, Xiang Ren
- Abstract summary: We focus on "teaching" machines reading comprehension, using a small number of semi-structured explanations.
We use learnable neural modules and soft logic to handle linguistic variation and overcome sparse coverage.
On the SQuAD dataset, our proposed method achieves 70.14% F1 score with supervision from 26 explanations, comparable to plain supervised learning.
- Score: 32.82449839424392
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Advances in machine reading comprehension (MRC) rely heavily on the
collection of large scale human-annotated examples in the form of (question,
paragraph, answer) triples. In contrast, humans are typically able to
generalize with only a few examples, relying on deeper underlying world
knowledge, linguistic sophistication, and/or simply superior deductive powers.
In this paper, we focus on "teaching" machines reading comprehension, using a
small number of semi-structured explanations that explicitly inform machines
why answer spans are correct. We extract structured variables and rules from
explanations and compose neural module teachers that annotate instances for
training downstream MRC models. We use learnable neural modules and soft logic
to handle linguistic variation and overcome sparse coverage; the modules are
jointly optimized with the MRC model to improve final performance. On the SQuAD
dataset, our proposed method achieves 70.14% F1 score with supervision from 26
explanations, comparable to plain supervised learning using 1,100 labeled
instances, yielding a 12x speed up.
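As a rough illustration of the pipeline the abstract describes, the sketch below shows how a semi-structured explanation, parsed into variables and rules, could be compiled into a "neural module teacher" that soft-matches unlabeled (question, paragraph) pairs and emits noisy training triples. The `Rule` fields, the module interface, the product t-norm for the soft conjunction, and the confidence threshold are illustrative assumptions, not the authors' actual implementation.

```python
from dataclasses import dataclass

@dataclass
class Rule:
    """One clause parsed from a semi-structured explanation, e.g.
    'X is within 4 words after "composed" and X is a person'."""
    anchor: str       # keyword taken from the explanation, e.g. "composed"
    max_dist: int     # distance constraint, in tokens
    answer_type: str  # expected entity type of the answer span

def soft_and(scores):
    """Soft conjunction (product t-norm): partial matches lower the
    score instead of failing the rule outright."""
    out = 1.0
    for s in scores:
        out *= s
    return out

def neural_module_teacher(rule, question, paragraph, modules):
    """Score how well (question, paragraph) matches `rule`; return a
    candidate answer span and a confidence in [0, 1]. The `modules`
    dict holds learnable components (hypothetical interface) that
    replace hard string matching, so the rule still fires on
    paraphrases ('composed' vs. 'wrote' vs. 'authored')."""
    s_fill = modules["fill"](rule.anchor, question)   # anchor matched in question?
    span, s_find = modules["find"](rule, paragraph)   # candidate span + match score
    s_type = modules["type"](span, rule.answer_type)  # span has the expected type?
    return span, soft_and([s_fill, s_find, s_type])

def annotate(rules, unlabeled, modules, threshold=0.7):
    """Pseudo-label unlabeled pairs with the module teachers; the
    resulting noisy triples then train the downstream MRC model."""
    triples = []
    for question, paragraph in unlabeled:
        for rule in rules:
            span, conf = neural_module_teacher(rule, question, paragraph, modules)
            if conf >= threshold:
                triples.append((question, paragraph, span))
                break  # one confident rule is enough for this pair
    return triples
```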
Related papers
- Compositional Program Generation for Few-Shot Systematic Generalization [59.57656559816271]
This study presents a neuro-symbolic architecture called the Compositional Program Generator (CPG).
CPG has three key features: modularity, composition, and abstraction, in the form of grammar rules.
It achieves perfect generalization on both the SCAN and COGS benchmarks using just 14 examples for SCAN and 22 for COGS.
arXiv Detail & Related papers (2023-09-28T14:33:20Z)
- Language models are weak learners [71.33837923104808]
We show that prompt-based large language models can operate effectively as weak learners.
We incorporate these models into a boosting approach, which can leverage the knowledge within the model to outperform traditional tree-based boosting.
Results illustrate the potential for prompt-based LLMs to function not just as few-shot learners themselves, but as components of larger machine learning pipelines.
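A generic sketch of how such a weak learner could slot into boosting: a standard AdaBoost loop in which the weak hypothesis at each round comes from prompting an LLM. The `llm_weak_learner` callable is a hypothetical placeholder; the paper's actual prompting procedure differs.

```python
import math

def boost_with_llm(examples, labels, llm_weak_learner, rounds=10):
    """AdaBoost-style loop with a prompted LLM as the weak learner.
    `llm_weak_learner(examples, weights)` is assumed to build a prompt
    emphasizing high-weight (currently hard) examples and return a
    hypothesis h: example -> {-1, +1}."""
    n = len(examples)
    weights = [1.0 / n] * n
    ensemble = []  # list of (alpha, hypothesis)
    for _ in range(rounds):
        h = llm_weak_learner(examples, weights)
        err = sum(w for w, x, y in zip(weights, examples, labels) if h(x) != y)
        if err >= 0.5:  # no better than chance on the weighted data: stop
            break
        alpha = 0.5 * math.log((1 - err) / max(err, 1e-12))
        ensemble.append((alpha, h))
        # Up-weight mistakes so the next round's prompt focuses on them.
        weights = [w * math.exp(-alpha * y * h(x))
                   for w, x, y in zip(weights, examples, labels)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return lambda x: 1 if sum(a * h(x) for a, h in ensemble) >= 0 else -1
```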
arXiv Detail & Related papers (2023-06-25T02:39:19Z)
- MaNtLE: Model-agnostic Natural Language Explainer [9.43206883360088]
We introduce MaNtLE, a model-agnostic natural language explainer that analyzes multiple classifier predictions.
MaNtLE uses multi-task training on thousands of synthetic classification tasks to generate faithful explanations.
Simulated user studies indicate that, on average, MaNtLE-generated explanations are at least 11% more faithful compared to LIME and Anchors explanations.
arXiv Detail & Related papers (2023-05-22T12:58:06Z)
- Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering [124.16250115608604]
We present Science Question Answering (SQA), a new benchmark that consists of 21k multimodal multiple choice questions with a diverse set of science topics and annotations of their answers with corresponding lectures and explanations.
We show that training with these chained lectures and explanations improves the question answering performance by 1.20% in few-shot GPT-3 and 3.99% in fine-tuned UnifiedQA.
Our analysis further shows that language models, similar to humans, benefit from explanations to learn from fewer data and achieve the same performance with just 40% of the data.
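A hedged sketch of the kind of explanation-augmented few-shot prompt this setup suggests: demonstrations carry the answer together with its lecture and explanation, so the model learns to justify what it answers. The template below is an assumption for illustration, not the paper's exact prompt format.

```python
def format_example(q, choices, lecture=None, explanation=None, answer=None):
    """Render one example. For demonstrations (answer given), the answer
    is followed by the lecture and explanation that justify it; for the
    test instance, the model completes from 'Answer:'. The exact field
    names and ordering are assumptions."""
    lines = [f"Question: {q}"]
    lines.append("Options: " + " ".join(
        f"({chr(ord('A') + i)}) {c}" for i, c in enumerate(choices)))
    if answer is not None:
        lines.append(f"Answer: The answer is {answer}.")
        if lecture:
            lines.append(f"Lecture: {lecture}")
        if explanation:
            lines.append(f"Explanation: {explanation}")
    else:
        lines.append("Answer:")
    return "\n".join(lines)

def build_prompt(demonstrations, test_q, test_choices):
    """Few-shot prompt: k solved examples with explanations, then the
    unsolved test question."""
    parts = [format_example(**d) for d in demonstrations]
    parts.append(format_example(test_q, test_choices))
    return "\n\n".join(parts)
```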
arXiv Detail & Related papers (2022-09-20T07:04:24Z)
- An Understanding-Oriented Robust Machine Reading Comprehension Model [12.870425062204035]
We propose an understanding-oriented machine reading comprehension model to address three kinds of robustness issues.
Specifically, we first use a natural language inference module to help the model understand the precise semantic meaning of input questions.
We also propose a multi-language learning mechanism to address the issue of generalization.
arXiv Detail & Related papers (2022-07-01T03:32:02Z)
- CLUES: A Benchmark for Learning Classifiers using Natural Language Explanations [12.278877764015725]
Supervised learning has traditionally focused on inductive learning by observing labeled examples of a task.
In contrast, humans have the ability to learn new concepts from language.
We introduce CLUES, a benchmark for learning classifiers using natural language explanations.
CLUES consists of 36 real-world and 144 synthetic classification tasks.
arXiv Detail & Related papers (2022-04-14T17:54:46Z)
- Exploring Unsupervised Pretraining Objectives for Machine Translation [99.5441395624651]
Unsupervised cross-lingual pretraining has achieved strong results in neural machine translation (NMT).
Most approaches adapt masked-language modeling (MLM) to sequence-to-sequence architectures, by masking parts of the input and reconstructing them in the decoder.
We compare masking with alternative objectives that produce inputs resembling real (full) sentences, by reordering and replacing words based on their context.
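To make the contrast concrete, here is a minimal sketch of the two families of corruption: masking, which leaves visibly artificial inputs, versus reordering and replacement, which keep the corrupted input looking like a real, fully-worded sentence. The window size, corruption rates, and the `vocab_by_context` lookup are illustrative assumptions.

```python
import random

MASK = "<mask>"

def mask_tokens(tokens, p=0.15, rng=random):
    """MLM-style corruption: hide a fraction of tokens behind a mask
    symbol; the decoder learns to reconstruct the originals."""
    return [MASK if rng.random() < p else t for t in tokens]

def shuffle_locally(tokens, window=3, rng=random):
    """Reordering corruption: permute tokens within small windows, so
    the input stays a sequence of real words."""
    out = list(tokens)
    for i in range(0, len(out), window):
        chunk = out[i:i + window]
        rng.shuffle(chunk)
        out[i:i + window] = chunk
    return out

def replace_in_context(tokens, vocab_by_context, p=0.1, rng=random):
    """Replacement corruption: swap some tokens for plausible in-context
    alternatives (drawn here from a hypothetical `vocab_by_context`
    table), again yielding natural-looking input."""
    return [rng.choice(vocab_by_context.get(t, [t])) if rng.random() < p else t
            for t in tokens]
```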
arXiv Detail & Related papers (2021-06-10T10:18:23Z)
- ExpMRC: Explainability Evaluation for Machine Reading Comprehension [42.483940360860096]
We propose a new benchmark called ExpMRC for evaluating the explainability of machine reading comprehension systems.
We use state-of-the-art pre-trained language models to build baseline systems and adopt various unsupervised approaches to extract evidence without a human-annotated training set.
arXiv Detail & Related papers (2021-05-10T06:00:20Z)
- Towards Interpretable Natural Language Understanding with Explanations as Latent Variables [146.83882632854485]
We develop a framework for interpretable natural language understanding that requires only a small set of human annotated explanations for training.
Our framework treats natural language explanations as latent variables that model the underlying reasoning process of a neural model.
arXiv Detail & Related papers (2020-10-24T02:05:56Z)
- ALICE: Active Learning with Contrastive Natural Language Explanations [69.03658685761538]
We propose Active Learning with Contrastive Explanations (ALICE) to improve data efficiency in learning.
ALICE learns to first use active learning to select the most informative pairs of label classes to elicit contrastive natural language explanations.
It then extracts knowledge from these explanations using a semantic parser.
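As a toy illustration of the selection step, the sketch below ranks label-class pairs by how often the current model confuses them and elicits contrastive explanations for the top pairs; this confusion-count heuristic is an assumed stand-in for ALICE's actual informativeness criterion.

```python
from collections import Counter

def most_confused_pairs(y_true, y_pred, k=3):
    """Rank label-class pairs by how often the current model mixes
    them up; these are the pairs where a contrastive explanation
    ('an A differs from a B in that ...') should help the most."""
    confusion = Counter()
    for true, pred in zip(y_true, y_pred):
        if true != pred:
            confusion[tuple(sorted((true, pred)))] += 1
    return [pair for pair, _ in confusion.most_common(k)]

# Example: elicit contrastive explanations for the top confused pairs.
if __name__ == "__main__":
    y_true = ["owl", "hawk", "owl", "crow", "hawk", "owl"]
    y_pred = ["hawk", "owl", "owl", "hawk", "owl", "crow"]
    for a, b in most_confused_pairs(y_true, y_pred):
        print(f"How do you tell a {a} from a {b}?")
```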
arXiv Detail & Related papers (2020-09-22T01:02:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.