CASE: Commonsense-Augmented Score with an Expanded Answer Space
- URL: http://arxiv.org/abs/2311.01684v1
- Date: Fri, 3 Nov 2023 03:15:26 GMT
- Title: CASE: Commonsense-Augmented Score with an Expanded Answer Space
- Authors: Wenkai Chen and Sahithya Ravi and Vered Shwartz
- Abstract summary: We propose CASE, a Commonsense-Augmented Score with an Expanded Answer Space.
CASE addresses this limitation by assigning importance weights to individual words based on their semantic relations to other words in the input.
We then also follow prior work in expanding the answer space by generating lexically-divergent answers that are conceptually-similar to the choices.
- Score: 13.915710684653174
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: LLMs have demonstrated impressive zero-shot performance on NLP tasks thanks
to the knowledge they acquired in their training. In multiple-choice QA tasks,
the LM probabilities are used as an imperfect measure of the plausibility of
each answer choice. One of the major limitations of the basic score is that it
treats all words as equally important. We propose CASE, a Commonsense-Augmented
Score with an Expanded Answer Space. CASE addresses this limitation by
assigning importance weights for individual words based on their semantic
relations to other words in the input. The dynamic weighting approach
outperforms basic LM scores, not only because it reduces noise from unimportant
words, but also because it informs the model of implicit commonsense knowledge
that may be useful for answering the question. We then also follow prior work
in expanding the answer space by generating lexically-divergent answers that
are conceptually-similar to the choices. When combined with answer space
expansion, our method outperforms strong baselines on 5 commonsense benchmarks.
We further show these two approaches are complementary and may be especially
beneficial when using smaller LMs.
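The core of the dynamic weighting idea can be illustrated with a short sketch: instead of treating every token equally, each token's LM log-probability is multiplied by an importance weight before summing. The weight values, the stubbed log-probabilities, and the length normalization below are hypothetical illustrations, not the paper's exact formulation.

```python
def weighted_lm_score(token_log_probs, weights):
    """CASE-style answer score: weight each token's LM log-probability
    by an importance weight instead of treating all tokens equally.
    The weighting scheme and normalization here are hypothetical."""
    assert len(token_log_probs) == len(weights)
    total = sum(w * lp for w, lp in zip(weights, token_log_probs))
    return total / sum(weights)  # normalize so answers of different lengths compare

# Toy per-token log-probs for two answer choices (stubbed, not from a real LM).
# The low first weight de-emphasizes an unimportant (e.g. stopword) token.
score_a = weighted_lm_score([-1.2, -0.3, -2.0], [0.2, 1.0, 1.0])
score_b = weighted_lm_score([-1.1, -0.9, -1.8], [0.2, 1.0, 1.0])
best = "A" if score_a > score_b else "B"
```

With uniform weights the two choices would be ranked purely by raw LM probability; the down-weighted first token is what lets the content words decide.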
Related papers
- Not All LLM Reasoners Are Created Equal [58.236453890457476]
We study the depth of grade-school math problem-solving capabilities of LLMs.
We evaluate their performance on pairs of existing math word problems solved together, where the answer to the second problem depends on correctly answering the first.
arXiv Detail & Related papers (2024-10-02T17:01:10Z)
- Crafting Interpretable Embeddings by Asking LLMs Questions [89.49960984640363]
Large language models (LLMs) have rapidly improved text embeddings for a growing array of natural-language processing tasks.
We introduce question-answering embeddings (QA-Emb), embeddings where each feature represents an answer to a yes/no question asked to an LLM.
We use QA-Emb to flexibly generate interpretable models for predicting fMRI voxel responses to language stimuli.
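A minimal sketch of the QA-Emb idea: the embedding has one dimension per yes/no question, each set by an LLM's answer. Here a trivial keyword heuristic (`toy_ask`) stands in for the LLM call, and the questions are invented for illustration.

```python
def qa_embed(text, questions, ask):
    """Interpretable embedding: one dimension per yes/no question,
    1.0 if the judge answers yes, else 0.0. `ask` stands in for an LLM."""
    return [1.0 if ask(text, q) else 0.0 for q in questions]

questions = [
    "Does the text mention an animal?",
    "Is the text about food?",
]

def toy_ask(text, question):
    # Hypothetical stand-in for an LLM judgment: a crude keyword match.
    keywords = {"animal": ["dog", "cat"], "food": ["pizza", "bread"]}
    key = "animal" if "animal" in question else "food"
    return any(k in text.lower() for k in keywords[key])

vec = qa_embed("The dog ate my pizza.", questions, toy_ask)
```

Because every dimension is tied to a human-readable question, the resulting vectors can be inspected directly, which is what makes them usable for interpretable downstream models.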
arXiv Detail & Related papers (2024-05-26T22:30:29Z)
- Graph Elicitation for Guiding Multi-Step Reasoning in Large Language Models [16.432208223793666]
Chain-of-Thought prompting along with sub-question generation and answering has enhanced multi-step reasoning capabilities.
We propose a GE-Reasoning method, which directs Large Language Models to generate proper sub-questions and corresponding answers.
Our approach outperforms previous CoT prompting methods and their variants on multi-hop question answering benchmark datasets.
arXiv Detail & Related papers (2023-11-16T10:36:08Z)
- Clarify When Necessary: Resolving Ambiguity Through Interaction with LMs [58.620269228776294]
We propose a task-agnostic framework for resolving ambiguity by asking users clarifying questions.
We evaluate systems across three NLP applications: question answering, machine translation and natural language inference.
We find that intent-sim is robust, demonstrating improvements across a wide range of NLP tasks and LMs.
arXiv Detail & Related papers (2023-11-16T00:18:50Z)
- Improving Zero-shot Visual Question Answering via Large Language Models with Reasoning Question Prompts [22.669502403623166]
We present Reasoning Question Prompts for VQA tasks, which can further activate the potential of Large Language Models.
We generate self-contained questions as reasoning question prompts via an unsupervised question edition module.
Each reasoning question prompt clearly indicates the intent of the original question.
Then, the candidate answers, along with their confidence scores acting as answer heuristics, are fed into LLMs.
arXiv Detail & Related papers (2023-11-15T15:40:46Z)
- Shifting Attention to Relevance: Towards the Predictive Uncertainty Quantification of Free-Form Large Language Models [27.491408293411734]
Large Language Models (LLMs) show promising results in language generation and instruction following, but frequently "hallucinate".
Our research builds on a simple observation: not all tokens in auto-regressively generated text equally represent the underlying meaning.
arXiv Detail & Related papers (2023-07-03T22:17:16Z)
- Allies: Prompting Large Language Model with Beam Search [107.38790111856761]
In this work, we propose a novel method called ALLIES.
Given an input query, ALLIES leverages LLMs to iteratively generate new queries related to the original query.
By iteratively refining and expanding the scope of the original query, ALLIES captures and utilizes hidden knowledge that may not be directly obtainable through retrieval.
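The iterative expansion loop can be sketched in a beam-search style: generate related queries, answer each, keep the top-scoring queries for the next round, and return the best answer seen. Both callbacks (`generate_related`, `answer_with_evidence`) and the toy scoring are hypothetical stand-ins for LLM generation and retrieval-backed answering, not the paper's implementation.

```python
def allies(query, generate_related, answer_with_evidence, rounds=2, beam=2):
    """Beam-search-style query expansion in the spirit of ALLIES:
    expand the frontier of queries, score each candidate answer,
    and keep only the top-`beam` queries per round."""
    frontier = [query]
    best_answer, best_score = None, float("-inf")
    for _ in range(rounds):
        scored = []
        for q in frontier:
            for rq in generate_related(q):
                answer, score = answer_with_evidence(rq)
                scored.append((score, rq, answer))
                if score > best_score:
                    best_score, best_answer = score, answer
        scored.sort(key=lambda t: t[0], reverse=True)
        frontier = [rq for _, rq, _ in scored[:beam]]  # prune to the beam
    return best_answer

# Toy callbacks: each refinement appends detail, longer queries score higher.
related = lambda q: [q + " details", q + " context"]
answer = lambda q: (f"answer to: {q}", len(q))
result = allies("why is the sky blue", related, answer, rounds=2, beam=2)
```

The beam keeps the search tractable: without pruning, the frontier would grow exponentially with the number of rounds.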
arXiv Detail & Related papers (2023-05-24T06:16:44Z)
- Increasing Probability Mass on Answer Choices Does Not Always Improve Accuracy [60.18632773935895]
Spreading probability mass across multiple surface forms with identical meaning, known as surface form competition (SFC), is thought to cause an underestimation of a model's true performance.
We propose a mathematical formalism for SFC which allows us to quantify and bound its impact for the first time.
We identify a simple method for reducing it -- namely, increasing probability mass on the given answer choices by a) including them in the prompt and b) using in-context learning with even just one example.
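The two mitigations just described (listing the answer choices verbatim in the prompt, and prepending a single in-context example) amount to simple prompt construction; the template below is an assumed illustration, not the paper's exact prompt.

```python
def build_prompt(question, choices, demo=None):
    """Put the answer choices verbatim in the prompt, optionally with one
    in-context example, so probability mass concentrates on the given
    surface forms. The template is a hypothetical illustration."""
    lines = []
    if demo is not None:  # one-shot example: (question, choices, answer)
        dq, dc, da = demo
        lines += [f"Question: {dq}"] + [f"- {c}" for c in dc] + [f"Answer: {da}", ""]
    lines += [f"Question: {question}"] + [f"- {c}" for c in choices] + ["Answer:"]
    return "\n".join(lines)

prompt = build_prompt(
    "What do you use to cut paper?",
    ["scissors", "a hammer"],
    demo=("What do you drink from?", ["a cup", "a shoe"], "a cup"),
)
```

Seeing the exact answer strings in context discourages the model from spreading mass over paraphrases that never appear among the choices.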
arXiv Detail & Related papers (2023-05-24T00:27:00Z)
- Learning to Ask Conversational Questions by Optimizing Levenshtein Distance [83.53855889592734]
We introduce a Reinforcement Iterative Sequence Editing (RISE) framework that optimizes the minimum Levenshtein distance (MLD) through explicit editing actions.
RISE is able to pay attention to tokens that are related to conversational characteristics.
Experimental results on two benchmark datasets show that RISE significantly outperforms state-of-the-art methods.
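For reference, the Levenshtein distance that RISE minimizes is the smallest number of single-element insertions, deletions, and substitutions turning one sequence into another; the standard dynamic-programming implementation below is a textbook version, not the paper's code.

```python
def levenshtein(a, b):
    """Minimum edits (insert, delete, substitute) turning sequence a
    into sequence b. Classic DP with O(len(a) * len(b)) time and
    a rolling row for O(len(b)) memory."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]

# Works on strings or token lists alike, matching token-level editing.
d = levenshtein("kitten", "sitting")
```

Because the function accepts any sequences, it applies equally to the token-level edit actions an editing framework operates over.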
arXiv Detail & Related papers (2021-06-30T08:44:19Z)
- A Semantic-based Method for Unsupervised Commonsense Question Answering [40.18557352036813]
Unsupervised commonsense question answering is appealing since it does not rely on any labeled task data.
We present a novel SEmantic-based Question Answering method (SEQA) for unsupervised commonsense question answering.
arXiv Detail & Related papers (2021-05-31T08:21:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.