Subjective Question Answering: Deciphering the inner workings of
Transformers in the realm of subjectivity
- URL: http://arxiv.org/abs/2006.08342v2
- Date: Wed, 14 Oct 2020 07:47:52 GMT
- Title: Subjective Question Answering: Deciphering the inner workings of
Transformers in the realm of subjectivity
- Authors: Lukas Muttenthaler
- Abstract summary: I've exploited a recently released dataset for span-selection Question Answering, namely SubjQA.
SubjQA is the first dataset that contains questions that ask for subjective opinions corresponding to review paragraphs from six different domains.
I've investigated the inner workings of a Transformer-based architecture to contribute to a better understanding of these not yet well understood "black-box" models.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding subjectivity demands reasoning skills beyond the realm of
common knowledge. It requires a machine learning model to process sentiment and
to perform opinion mining. In this work, I've exploited a recently released
dataset for span-selection Question Answering, namely SubjQA. SubjQA is the
first QA dataset that contains questions that ask for subjective opinions
corresponding to review paragraphs from six different domains. Hence, to answer
these subjective questions, a learner must extract opinions and process
sentiment for various domains, and additionally, align the knowledge extracted
from a paragraph with the natural language utterances in the corresponding
question, which together increase the difficulty of the QA task. The primary goal
of this thesis was to investigate the inner workings (i.e., latent
representations) of a Transformer-based architecture to contribute to a better
understanding of these not yet well understood "black-box" models.
The Transformer's hidden representations of the true answer span are clustered
more closely in vector space than the representations corresponding to
erroneous predictions. This observation holds across the top three Transformer
layers for both objective and subjective questions and generally becomes more
pronounced as a function of layer dimensions. Moreover, the probability of
achieving high cosine similarity among hidden representations of the true
answer span tokens in latent space is significantly higher for correct than for
incorrect answer span predictions. These results have decisive implications for
downstream applications where it is crucial to know why a neural network made a
mistake, and at which point, in space and time, the mistake happened (e.g., to
automatically predict the correctness of an answer span prediction without the
need for labeled data).
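The cosine-similarity observation above can be sketched as follows. This is a minimal illustration, not code from the thesis: `mean_pairwise_cosine` is a hypothetical helper, and it assumes a layer's hidden states for the tokens of a predicted answer span are available as a NumPy array of shape `(n_tokens, hidden_dim)`.

```python
import numpy as np

def mean_pairwise_cosine(hidden_states: np.ndarray) -> float:
    """Mean pairwise cosine similarity among token representations.

    hidden_states: (n_tokens, hidden_dim) array holding one Transformer
    layer's hidden representations for the tokens of an answer span.
    """
    # L2-normalise each token vector, then average the off-diagonal
    # entries of the resulting similarity matrix.
    norms = np.linalg.norm(hidden_states, axis=1, keepdims=True)
    unit = hidden_states / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T
    n = sim.shape[0]
    off_diag = sim[~np.eye(n, dtype=bool)]
    return float(off_diag.mean())

# Toy illustration of the finding: span representations that form a
# tight cluster score higher than scattered ones, so a score like this
# could flag likely-incorrect span predictions without labels.
rng = np.random.default_rng(0)
center = rng.normal(size=768)
tight = center + 0.1 * rng.normal(size=(5, 768))      # clustered span tokens
scattered = rng.normal(size=(5, 768))                  # unrelated vectors
print(mean_pairwise_cosine(tight) > mean_pairwise_cosine(scattered))  # True
```

In practice, the hidden states could be taken from one of the top layers of a fine-tuned QA model, where the thesis reports the clustering effect is strongest.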
Related papers
- Towards Understanding the Word Sensitivity of Attention Layers: A Study via Random Features [19.261178173399784]
Our work studies word sensitivity (WS) in the prototypical setting of random features.
We show that attention layers enjoy high WS, namely, there exists a vector in the space of embeddings that largely perturbs the random attention features map.
We then translate these results on the word sensitivity into generalization bounds.
arXiv Detail & Related papers (2024-02-05T12:47:19Z)
- How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding [56.222097640468306]
We provide mechanistic understanding of how transformers learn "semantic structure"
We show, through a combination of mathematical analysis and experiments on Wikipedia data, that the embedding layer and the self-attention layer encode the topical structure.
arXiv Detail & Related papers (2023-03-07T21:42:17Z)
- Interpretable by Design: Learning Predictors by Composing Interpretable Queries [8.054701719767293]
We argue that machine learning algorithms should be interpretable by design.
We minimize the expected number of queries needed for accurate prediction.
Experiments on vision and NLP tasks demonstrate the efficacy of our approach.
arXiv Detail & Related papers (2022-07-03T02:40:34Z)
- Knowledge-Routed Visual Question Reasoning: Challenges for Deep Representation Embedding [140.5911760063681]
We propose a novel dataset named Knowledge-Routed Visual Question Reasoning for VQA model evaluation.
We generate the question-answer pair based on both the Visual Genome scene graph and an external knowledge base with controlled programs.
arXiv Detail & Related papers (2020-12-14T00:33:44Z)
- A Wrong Answer or a Wrong Question? An Intricate Relationship between Question Reformulation and Answer Selection in Conversational Question Answering [15.355557454305776]
We show that question rewriting (QR) of the conversational context helps shed more light on this phenomenon.
We present the results of this analysis on the TREC CAsT and QuAC (CANARD) datasets.
arXiv Detail & Related papers (2020-10-13T06:29:51Z)
- FAT ALBERT: Finding Answers in Large Texts using Semantic Similarity Attention Layer based on BERT [0.5772546394254112]
We develop a model based on BERT, a state-of-the-art transformer network.
We rank first on the leaderboard with a test accuracy of 87.79%.
arXiv Detail & Related papers (2020-08-22T08:04:21Z)
- ClarQ: A large-scale and diverse dataset for Clarification Question Generation [67.1162903046619]
We devise a novel bootstrapping framework that assists in the creation of a diverse, large-scale dataset of clarification questions based on post comments extracted from StackExchange.
We quantitatively demonstrate the utility of the newly created dataset by applying it to the downstream task of question-answering.
We release this dataset in order to foster research into the field of clarification question generation with the larger goal of enhancing dialog and question answering systems.
arXiv Detail & Related papers (2020-06-10T17:56:50Z)
- Visual Question Answering with Prior Class Semantics [50.845003775809836]
We show how to exploit additional information pertaining to the semantics of candidate answers.
We extend the answer prediction process with a regression objective in a semantic space.
Our method brings improvements in consistency and accuracy over a range of question types.
arXiv Detail & Related papers (2020-05-04T02:46:31Z)
- Knowledgeable Dialogue Reading Comprehension on Key Turns [84.1784903043884]
Multi-choice machine reading comprehension (MRC) requires models to choose the correct answer from candidate options given a passage and a question.
Our research focuses on dialogue-based MRC, where the passages are multi-turn dialogues.
This setting suffers from two challenges: the answer selection decision is made without the support of latently helpful commonsense, and the multi-turn context may hide considerable irrelevant information.
arXiv Detail & Related papers (2020-04-29T07:04:43Z)
- SQuINTing at VQA Models: Introspecting VQA Models with Sub-Questions [66.86887670416193]
We show that state-of-the-art VQA models have comparable performance in answering perception and reasoning questions, but suffer from consistency problems.
To address this shortcoming, we propose an approach called Sub-Question-aware Network Tuning (SQuINT).
We show that SQuINT improves model consistency by 5% and marginally improves performance on the Reasoning questions in VQA, while also producing better attention maps.
arXiv Detail & Related papers (2020-01-20T01:02:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.