Explaining Question Answering Models through Text Generation
- URL: http://arxiv.org/abs/2004.05569v1
- Date: Sun, 12 Apr 2020 09:06:46 GMT
- Title: Explaining Question Answering Models through Text Generation
- Authors: Veronica Latcinnik, Jonathan Berant
- Abstract summary: Large pre-trained language models (LMs) have been shown to perform surprisingly well when fine-tuned on tasks that require commonsense and world knowledge.
In end-to-end architectures, it is difficult to explain what knowledge in the LM allows it to make a correct prediction.
We show on several tasks that our model reaches performance that is comparable to end-to-end architectures.
- Score: 42.36596190720944
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large pre-trained language models (LMs) have been shown to perform
surprisingly well when fine-tuned on tasks that require commonsense and world
knowledge. However, in end-to-end architectures, it is difficult to explain
what knowledge in the LM allows it to make a correct prediction. In this
work, we propose a model for multi-choice question answering, where an
LM-based generator generates a textual hypothesis that is later used by a
classifier to answer the question. The hypothesis provides a window into the
information used by the fine-tuned LM that can be inspected by humans. A key
challenge in this setup is how to constrain the model to generate hypotheses
that are meaningful to humans. We tackle this by (a) joint training with a
simple similarity classifier that encourages meaningful hypotheses, and (b)
adding loss functions that encourage natural text without repetitions. We show
on several tasks that our model reaches performance that is comparable to
on several tasks that our model reaches performance that is comparable to
end-to-end architectures, while producing hypotheses that elucidate the
knowledge used by the LM for answering the question.
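To make the generate-then-classify setup concrete, here is a minimal sketch of the pipeline the abstract describes, assembled from off-the-shelf Hugging Face checkpoints rather than the paper's jointly trained components; the model names, the prompt format, and the cosine-similarity classifier are illustrative assumptions, not the authors' exact architecture.
```python
# Sketch (not the authors' code): an LM-based generator writes a textual hypothesis
# for the question, and a simple similarity classifier scores each answer candidate
# against that hypothesis. All checkpoint names here are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoModel, AutoTokenizer

gen_tok = AutoTokenizer.from_pretrained("gpt2")                # hypothesis generator (assumed)
gen_lm = AutoModelForCausalLM.from_pretrained("gpt2")
enc_tok = AutoTokenizer.from_pretrained("bert-base-uncased")   # encoder for similarity (assumed)
encoder = AutoModel.from_pretrained("bert-base-uncased")

def generate_hypothesis(question: str, max_new_tokens: int = 40) -> str:
    """Condition the generator LM on the question and decode a textual hypothesis."""
    prompt = f"Question: {question}\nHypothesis:"
    ids = gen_tok(prompt, return_tensors="pt").input_ids
    out = gen_lm.generate(
        ids,
        max_new_tokens=max_new_tokens,
        do_sample=False,
        no_repeat_ngram_size=3,          # crude stand-in for the paper's anti-repetition losses
        pad_token_id=gen_tok.eos_token_id,
    )
    return gen_tok.decode(out[0, ids.shape[1]:], skip_special_tokens=True).strip()

def embed(text: str) -> torch.Tensor:
    """Mean-pooled encoder representation used by the toy similarity classifier."""
    batch = enc_tok(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state
    return hidden.mean(dim=1).squeeze(0)

def answer(question: str, choices: list[str]) -> str:
    """Pick the choice whose embedding is most similar to the generated hypothesis."""
    hypothesis = generate_hypothesis(question)
    h = embed(hypothesis)
    scores = [torch.cosine_similarity(h, embed(c), dim=0).item() for c in choices]
    best = max(range(len(choices)), key=lambda i: scores[i])
    print(f"hypothesis: {hypothesis!r}")   # the human-inspectable explanation
    return choices[best]

print(answer("What do plants need to produce their own food?",
             ["sunlight and carbon dioxide", "a refrigerator", "loud music"]))
```
Because the hypothesis is plain text passed between the two modules, a human can read it to see what information the generator relied on; in the paper the generator and classifier are trained jointly with the losses described above, rather than composed from pre-trained checkpoints as in this sketch.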
Related papers
- What Do Language Models Learn in Context? The Structured Task Hypothesis [89.65045443150889]
Large language models (LLMs) learn a novel task from in-context examples presented in a demonstration, termed in-context learning (ICL).
One popular hypothesis explains ICL by task selection.
Another popular hypothesis is that ICL is a form of meta-learning, i.e., the models learn a learning algorithm at pre-training time and apply it to the demonstration.
arXiv Detail & Related papers (2024-06-06T16:15:34Z)
- I've got the "Answer"! Interpretation of LLMs Hidden States in Question Answering [0.0]
This paper investigates the interpretation of large language models (LLMs) in the context of knowledge-based question answering.
The main hypothesis of the study is that correct and incorrect model behavior can be distinguished at the level of hidden states.
arXiv Detail & Related papers (2024-06-04T07:43:12Z)
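A rough sketch of the kind of probe that hypothesis suggests, assuming a small open LM, last-token pooling, and a logistic-regression probe; none of these choices come from the paper, and the labels below are placeholders for whether the LLM actually answered each question correctly.
```python
# Sketch (not the authors' code): train a linear probe on an LLM's hidden states to
# predict whether the model's answer will be correct. Model, layer, and pooling are
# illustrative assumptions; the labels are placeholders.
import numpy as np
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2", output_hidden_states=True)

def last_token_state(text: str, layer: int = -1) -> np.ndarray:
    """Hidden state of the final token at a chosen layer, used as the probe feature."""
    ids = tok(text, return_tensors="pt")
    with torch.no_grad():
        states = model(**ids).hidden_states[layer]
    return states[0, -1].numpy()

# questions paired with 1/0 labels: did the LLM answer this question correctly?
# (real labels would come from evaluating the LLM on a QA dataset)
questions = ["Who wrote Hamlet?", "What is the capital of France?",
             "Who discovered penicillin?", "What is 17 * 23?"]
labels = [1, 1, 0, 0]

X = np.stack([last_token_state(q) for q in questions])
probe = LogisticRegression(max_iter=1000).fit(X, labels)
print(probe.predict_proba(X)[:, 1])   # probability the model will answer correctly
```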
- Crafting Interpretable Embeddings by Asking LLMs Questions [89.49960984640363]
Large language models (LLMs) have rapidly improved text embeddings for a growing array of natural-language processing tasks.
We introduce question-answering embeddings (QA-Emb), embeddings where each feature represents an answer to a yes/no question asked to an LLM.
We use QA-Emb to flexibly generate interpretable models for predicting fMRI voxel responses to language stimuli.
arXiv Detail & Related papers (2024-05-26T22:30:29Z)
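The QA-Emb construction above is simple enough to sketch: each embedding dimension is the (soft) answer to one yes/no question about the input. In the sketch below, a zero-shot NLI classifier stands in for the LLM that the paper would query, and the question list is a made-up example.
```python
# Sketch (not the authors' code) of the QA-Emb idea: each embedding dimension is the
# answer to a yes/no question posed about the input text. The probe questions and the
# zero-shot answerer are illustrative assumptions.
import numpy as np
from transformers import pipeline

# hypothetical probe questions; each one becomes one interpretable feature
QUESTIONS = [
    "Does the text mention a person?",
    "Is the text about food?",
    "Does the text describe movement?",
]

# zero-shot classifier as a cheap stand-in for asking an instruction-tuned LLM
answerer = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

def qa_embed(text: str) -> np.ndarray:
    """Return a vector whose i-th entry is P('yes') for the i-th question."""
    feats = []
    for q in QUESTIONS:
        out = answerer(text,
                       candidate_labels=["yes", "no"],
                       hypothesis_template="The answer to the question '" + q + "' is {}.")
        feats.append(out["scores"][out["labels"].index("yes")])
    return np.array(feats)

print(qa_embed("The chef carried the soup across the kitchen."))
```
Because every dimension corresponds to a human-readable question, a linear model fit on these features stays interpretable, which is what makes them usable for the fMRI encoding models mentioned above.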
- Can Small Language Models Help Large Language Models Reason Better?: LM-Guided Chain-of-Thought [51.240387516059535]
We introduce a novel framework, LM-Guided CoT, that leverages a lightweight (i.e., 1B) language model (LM) for guiding a black-box large (i.e., >10B) LM in reasoning tasks.
We optimize the model through 1) knowledge distillation and 2) reinforcement learning from rationale-oriented and task-oriented reward signals.
arXiv Detail & Related papers (2024-04-04T12:46:37Z)
- A Hypothesis-Driven Framework for the Analysis of Self-Rationalising Models [0.8702432681310401]
We use a Bayesian network to implement a hypothesis about how a task is solved.
The resulting models do not exhibit a strong similarity to GPT-3.5.
We discuss the implications of this as well as the framework's potential to approximate LLM decisions better in future work.
arXiv Detail & Related papers (2024-02-07T12:26:12Z)
- Text Modular Networks: Learning to Decompose Tasks in the Language of Existing Models [61.480085460269514]
We propose a framework for building interpretable systems that learn to solve complex tasks by decomposing them into simpler ones solvable by existing models.
We use this framework to build ModularQA, a system that can answer multi-hop reasoning questions by decomposing them into sub-questions answerable by a neural factoid single-span QA model and a symbolic calculator.
arXiv Detail & Related papers (2020-09-01T23:45:42Z)
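A minimal sketch of the decomposition idea behind ModularQA: a multi-hop question is answered by routing sub-questions to a neural factoid QA model and arithmetic to a symbolic helper. The decomposition here is hand-written for a single made-up example, whereas ModularQA learns to generate it.
```python
# Sketch (not the ModularQA system): answer a multi-hop question by routing
# sub-questions to an extractive QA model and the final arithmetic to a symbolic step.
import re
from transformers import pipeline

qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")

CONTEXT = (
    "The Eiffel Tower was completed in 1889. "
    "The Empire State Building was completed in 1931."
)

def extract_year(answer: str) -> int:
    """Symbolic helper: pull the first 4-digit year out of an extractive-QA answer."""
    return int(re.search(r"\d{4}", answer).group())

# hand-written decomposition of the multi-hop question:
# "How many years after the Eiffel Tower was the Empire State Building completed?"
step1 = qa(question="When was the Eiffel Tower completed?", context=CONTEXT)["answer"]
step2 = qa(question="When was the Empire State Building completed?", context=CONTEXT)["answer"]
difference = extract_year(step2) - extract_year(step1)   # the "calculator" module

print(step1, step2, difference)   # expected: 1889 1931 42
```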
- Leap-Of-Thought: Teaching Pre-Trained Models to Systematically Reason Over Implicit Knowledge [96.92252296244233]
Large pre-trained language models (LMs) acquire some reasoning capacity, but this ability is difficult to control.
We show that LMs can be trained to reliably perform systematic reasoning combining both implicit, pre-trained knowledge and explicit natural language statements.
Our work paves a path towards open-domain systems that constantly improve by interacting with users who can instantly correct a model by adding simple natural language statements.
arXiv Detail & Related papers (2020-06-11T17:02:20Z)