A Semantic-based Method for Unsupervised Commonsense Question Answering
- URL: http://arxiv.org/abs/2105.14781v1
- Date: Mon, 31 May 2021 08:21:52 GMT
- Title: A Semantic-based Method for Unsupervised Commonsense Question Answering
- Authors: Yilin Niu, Fei Huang, Jiaming Liang, Wenkai Chen, Xiaoyan Zhu, Minlie
Huang
- Abstract summary: Unsupervised commonsense question answering is appealing since it does not rely on any labeled task data.
We present a novel SEmantic-based Question Answering method (SEQA) for unsupervised commonsense question answering.
- Score: 40.18557352036813
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unsupervised commonsense question answering is appealing since it does not
rely on any labeled task data. Among existing work, a popular solution is to
use pre-trained language models to score candidate choices directly conditioned
on the question or context. However, such scores from language models can be
easily affected by irrelevant factors, such as word frequencies, sentence
structures, etc. These distracting factors may not only mislead the model to
choose a wrong answer but also make it oversensitive to lexical perturbations
in candidate answers.
In this paper, we present a novel SEmantic-based Question Answering method
(SEQA) for unsupervised commonsense question answering. Instead of directly
scoring each answer choice, our method first generates a set of plausible
answers with generative models (e.g., GPT-2), and then uses these plausible
answers to select the correct choice by considering the semantic similarity
between each plausible answer and each choice. We devise a simple, yet sound
formalism for this idea and verify its effectiveness and robustness with
extensive experiments. We evaluate the proposed method on four benchmark
datasets, and our method achieves the best results in unsupervised settings.
Moreover, when attacked by TextFooler with synonym replacement, SEQA
suffers much smaller performance drops than the baselines, indicating
stronger robustness.
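The selection step described in the abstract — generate plausible answers, then pick the choice with the greatest aggregate semantic similarity to them — can be sketched as a simple voting scheme. This is a minimal illustration under stated assumptions, not the authors' implementation: `embed` is a toy bag-of-letters placeholder standing in for a real sentence encoder, and the unweighted similarity sum is a simplification of the paper's formalism.

```python
import math

def embed(text):
    # Placeholder encoder: a real system would use a sentence encoder
    # (e.g. SentenceBERT). Here we use a toy bag-of-letters vector
    # purely to make the sketch self-contained.
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord('a')] += 1.0
    return vec

def cosine(u, v):
    # Standard cosine similarity; 0.0 if either vector is all-zero.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def select_choice(choices, generated_answers):
    # Each generated plausible answer "votes" for every candidate
    # choice, weighted by semantic similarity; the choice with the
    # highest total support is selected.
    scores = []
    for choice in choices:
        c_vec = embed(choice)
        support = sum(cosine(c_vec, embed(g)) for g in generated_answers)
        scores.append(support)
    best = max(range(len(choices)), key=lambda i: scores[i])
    return choices[best], scores
```

In a full pipeline, `generated_answers` would come from sampling a generative model such as GPT-2 conditioned on the question; because every sampled answer contributes only through its similarity to each choice, a single lexically perturbed choice cannot dominate the score, which is the intuition behind the robustness claim.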
Related papers
- Look at the Text: Instruction-Tuned Language Models are More Robust Multiple Choice Selectors than You Think [27.595110330513567]
We show that text answers are more robust to question perturbations than first-token probabilities.
Our findings provide further evidence for the benefits of text answer evaluation over first token probability evaluation.
arXiv Detail & Related papers (2024-04-12T10:36:15Z)
- "My Answer is C": First-Token Probabilities Do Not Match Text Answers in Instruction-Tuned Language Models [40.867655189493924]
The open-ended nature of language generation makes evaluating large language models (LLMs) challenging.
One common evaluation approach uses multiple-choice questions (MCQ) to limit the response space.
We evaluate how aligned first-token evaluation is with the text output along several dimensions.
arXiv Detail & Related papers (2024-02-22T12:47:33Z)
- Answering Ambiguous Questions via Iterative Prompting [84.3426020642704]
In open-domain question answering, due to the ambiguity of questions, multiple plausible answers may exist.
One approach is to directly predict all valid answers, but this can struggle with balancing relevance and diversity.
We present AmbigPrompt to address the imperfections of existing approaches to answering ambiguous questions.
arXiv Detail & Related papers (2023-07-08T04:32:17Z)
- Momentum Contrastive Pre-training for Question Answering [54.57078061878619]
MCROSS introduces a momentum contrastive learning framework to align the answer probability between cloze-like and natural query-passage sample pairs.
Our method achieves noticeable improvement compared with all baselines in both supervised and zero-shot scenarios.
arXiv Detail & Related papers (2022-12-12T08:28:22Z)
- TASA: Deceiving Question Answering Models by Twin Answer Sentences Attack [93.50174324435321]
We present Twin Answer Sentences Attack (TASA), an adversarial attack method for question answering (QA) models.
TASA produces fluent and grammatical adversarial contexts while maintaining gold answers.
arXiv Detail & Related papers (2022-10-27T07:16:30Z)
- A Mutual Information Maximization Approach for the Spurious Solution Problem in Weakly Supervised Question Answering [60.768146126094955]
Weakly supervised question answering usually has only the final answers as supervision signals.
There may exist many spurious solutions that coincidentally derive the correct answer, but training on such solutions can hurt model performance.
We propose to explicitly exploit such semantic correlations by maximizing the mutual information between question-answer pairs and predicted solutions.
arXiv Detail & Related papers (2021-06-14T05:47:41Z)
- Generative Context Pair Selection for Multi-hop Question Answering [60.74354009152721]
We propose a generative context selection model for multi-hop question answering.
Our generative passage selection model performs better (4.9% above the baseline) on an adversarial held-out set.
arXiv Detail & Related papers (2021-04-18T07:00:48Z)
- A Wrong Answer or a Wrong Question? An Intricate Relationship between Question Reformulation and Answer Selection in Conversational Question Answering [15.355557454305776]
We show that question rewriting (QR) of the conversational context sheds more light on this phenomenon.
We present the results of this analysis on the TREC CAsT and QuAC (CANARD) datasets.
arXiv Detail & Related papers (2020-10-13T06:29:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.