The Role of the Availability Heuristic in Multiple-Choice Answering Behaviour
- URL: http://arxiv.org/abs/2602.17377v1
- Date: Thu, 19 Feb 2026 13:58:48 GMT
- Title: The Role of the Availability Heuristic in Multiple-Choice Answering Behaviour
- Authors: Leonidas Zotos, Hedderik van Rijn, Malvina Nissim
- Abstract summary: Using Wikipedia as the retrieval corpus, we find that always selecting the most available option leads to scores 13.5% to 32.9% above the random-guess baseline. Our findings suggest that availability should be considered in current and future work when modelling student behaviour.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When students are unsure of the correct answer to a multiple-choice question (MCQ), guessing is common practice. The availability heuristic, proposed by A. Tversky and D. Kahneman in 1973, suggests that the ease with which relevant instances come to mind, typically operationalised by the mere frequency of exposure, can offer a mental shortcut for problems in which the test-taker does not know the exact answer. Is simply choosing the option that comes most readily to mind a good strategy for answering MCQs? We propose a computational method of assessing the cognitive availability of MCQ options operationalised by concepts' prevalence in large corpora. The key finding, across three large question sets, is that correct answers, independently of the question stem, are significantly more available than incorrect MCQ options. Specifically, using Wikipedia as the retrieval corpus, we find that always selecting the most available option leads to scores 13.5% to 32.9% above the random-guess baseline. We further find that LLM-generated MCQ options show similar patterns of availability compared to expert-created options, despite the LLMs' frequentist nature and their training on large collections of textual data. Our findings suggest that availability should be considered in current and future work when computationally modelling student behaviour.
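The availability strategy the abstract describes, always selecting the option that is most prevalent in a reference corpus, can be sketched in a few lines. The following is a minimal illustration only: the paper estimates prevalence from Wikipedia retrieval, whereas this toy version (the function name, corpus, and question are hypothetical) simply counts case-insensitive substring occurrences.

```python
def most_available_option(options, corpus_text):
    """Pick the MCQ option that occurs most often in the corpus.

    A toy stand-in for corpus-based availability: the paper derives
    prevalence from Wikipedia, whereas here we simply count raw
    (case-insensitive) substring occurrences in a small text.
    """
    text = corpus_text.lower()
    return max(options, key=lambda opt: text.count(opt.lower()))


# Toy corpus and question: "Which city hosts the Eiffel Tower?"
corpus = (
    "Paris is the capital of France. Paris hosts the Eiffel Tower. "
    "Lyon and Marseille are also cities in France."
)
options = ["Paris", "Lyon", "Bordeaux"]
print(most_available_option(options, corpus))  # -> Paris
```

Note that this strategy ignores the question stem entirely, which is precisely the point of the paper's finding: option availability alone beats random guessing.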
Related papers
- Reasoning Models are Test Exploiters: Rethinking Multiple-Choice [12.317748510370238]
Large Language Models (LLMs) are asked to choose among a fixed set of choices. Multiple-choice question answering (MCQA) is a good proxy for the downstream performance of models. This paper investigates the extent to which this trend continues to hold for state-of-the-art reasoning models.
arXiv Detail & Related papers (2025-07-21T07:49:32Z)
- Right Answer, Wrong Score: Uncovering the Inconsistencies of LLM Evaluation in Multiple-Choice Question Answering [78.89231943329885]
Multiple-Choice Question Answering (MCQA) is widely used to evaluate Large Language Models (LLMs). We show that multiple factors can significantly impact the reported performance of LLMs. We analyze whether existing answer extraction methods are aligned with human judgment.
arXiv Detail & Related papers (2025-03-19T08:45:03Z)
- Differentiating Choices via Commonality for Multiple-Choice Question Answering [54.04315943420376]
In multiple-choice question answering, the choices themselves can provide valuable clues for identifying the right answer.
Existing models often rank each choice separately, overlooking the context provided by other choices.
We propose DCQA, a novel model that differentiates choices by identifying and eliminating their commonality.
arXiv Detail & Related papers (2024-08-21T12:05:21Z)
- Artifacts or Abduction: How Do LLMs Answer Multiple-Choice Questions Without the Question? [15.308093827770474]
We probe if large language models (LLMs) can perform multiple-choice question answering (MCQA) with choices-only prompts.
These choices-only prompts beat a majority baseline in 11 of 12 cases, with accuracy gains of up to 0.33.
We conduct an in-depth, black-box analysis on memorization, choice dynamics, and question inference.
arXiv Detail & Related papers (2024-02-19T19:38:58Z)
- Large Language Models Are Not Robust Multiple Choice Selectors [117.72712117510953]
Multiple choice questions (MCQs) serve as a common yet important task format in the evaluation of large language models (LLMs). This work shows that modern LLMs are vulnerable to option position changes due to their inherent "selection bias".
We propose a label-free, inference-time debiasing method, called PriDe, which separates the model's prior bias for option IDs from the overall prediction distribution.
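The separation PriDe performs can be illustrated schematically: estimate the model's prior over option positions by averaging its predicted distributions across permutations of the option contents, then divide that prior out of each new prediction. The sketch below is a simplified illustration of this general idea, not the paper's actual PriDe procedure; `pred_fn` is a hypothetical stand-in for a model that returns a probability distribution over option positions.

```python
import itertools


def estimate_position_prior(pred_fn, question, options):
    """Average the model's predicted distribution over option positions
    across all permutations of the option contents; content effects
    cancel out, leaving an estimate of the positional prior."""
    n = len(options)
    perms = list(itertools.permutations(range(n)))
    prior = [0.0] * n
    for perm in perms:
        probs = pred_fn(question, [options[i] for i in perm])
        for pos in range(n):
            prior[pos] += probs[pos] / len(perms)
    return prior


def debias(probs, prior):
    """Divide out the positional prior and renormalise."""
    adjusted = [p / q for p, q in zip(probs, prior)]
    total = sum(adjusted)
    return [a / total for a in adjusted]
```

For a model whose bias is purely positional (e.g. it always inflates option A regardless of content), the estimated prior captures that inflation and `debias` flattens it back out.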
arXiv Detail & Related papers (2023-09-07T17:44:56Z)
- Leveraging Large Language Models for Multiple Choice Question Answering [6.198523595657983]
We show that a model with high multiple-choice symbol binding (MCSB) ability performs much better with the natural approach than with the traditional approach.
arXiv Detail & Related papers (2022-10-22T05:04:54Z)
- Generative Context Pair Selection for Multi-hop Question Answering [60.74354009152721]
We propose a generative context selection model for multi-hop question answering.
Our proposed generative passage selection model performs better (4.9% above the baseline) on the adversarial held-out set.
arXiv Detail & Related papers (2021-04-18T07:00:48Z)
- Unsupervised Multiple Choices Question Answering: Start Learning from Basic Knowledge [75.7135212362517]
We study the possibility of almost unsupervised Multiple Choices Question Answering (MCQA).
The proposed method is shown to outperform the baseline approaches on RACE and is even comparable with some supervised learning approaches on MC500.
arXiv Detail & Related papers (2020-10-21T13:44:35Z)
- MS-Ranker: Accumulating Evidence from Potentially Correct Candidates for Answer Selection [59.95429407899612]
We propose MS-Ranker, a novel reinforcement-learning-based multi-step ranking model.
We explicitly consider the potential correctness of candidates and update the evidence with a gating mechanism.
Our model significantly outperforms existing methods that do not rely on external resources.
arXiv Detail & Related papers (2020-10-10T10:36:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.