Related papers: Condition-Gated Reasoning for Context-Dependent Biomedical Question Answering

URL: http://arxiv.org/abs/2602.17911v1
Date: Fri, 20 Feb 2026 00:17:14 GMT
Title: Condition-Gated Reasoning for Context-Dependent Biomedical Question Answering
Authors: Jash Rajesh Parekh, Wonbin Kweon, Joey Chan, Rezarta Islamaj, Robert Leaman, Pengcheng Jiang, Chih-Hsuan Wei, Zhizheng Wang, Zhiyong Lu, Jiawei Han
Abstract summary: We propose CondMedQA, the first benchmark for conditional biomedical QA. We also propose Condition-Gated Reasoning (CGR), a novel framework that constructs condition-aware knowledge graphs. Our findings show that CGR more reliably selects condition-appropriate answers while matching or exceeding state-of-the-art performance.
Score: 21.630894843470156
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Current biomedical question answering (QA) systems often assume that medical knowledge applies uniformly, yet real-world clinical reasoning is inherently conditional: nearly every decision depends on patient-specific factors such as comorbidities and contraindications. Existing benchmarks do not evaluate such conditional reasoning, and retrieval-augmented or graph-based methods lack explicit mechanisms to ensure that retrieved knowledge is applicable to the given context. To address this gap, we propose CondMedQA, the first benchmark for conditional biomedical QA, consisting of multi-hop questions whose answers vary with patient conditions. Furthermore, we propose Condition-Gated Reasoning (CGR), a novel framework that constructs condition-aware knowledge graphs and selectively activates or prunes reasoning paths based on query conditions. Our findings show that CGR more reliably selects condition-appropriate answers while matching or exceeding state-of-the-art performance on biomedical QA benchmarks, highlighting the importance of explicitly modeling conditionality for robust medical reasoning.
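
The idea of activating or pruning reasoning paths based on query conditions can be illustrated with a toy sketch. This is not the authors' implementation: the `Edge` type, the `requires` and `blocked_by` gate fields, the `gate` and `reachable_answers` helpers, and all entity names (`symptom_S`, `drug_A`, `condition_X`, and so on) are hypothetical placeholders chosen for illustration. The sketch assumes each knowledge-graph edge carries the conditions under which it applies, filters edges against the patient's conditions, and then follows a multi-hop relation path over the surviving edges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """A knowledge-graph edge with hypothetical condition gates."""
    head: str
    relation: str
    tail: str
    requires: frozenset = frozenset()    # conditions that must all be present
    blocked_by: frozenset = frozenset()  # conditions that prune this edge


def gate(edges, patient_conditions):
    """Keep only edges whose gates are compatible with the patient's conditions."""
    conditions = frozenset(patient_conditions)
    return [
        e for e in edges
        if e.requires <= conditions and not (e.blocked_by & conditions)
    ]


def reachable_answers(edges, start, relations, patient_conditions):
    """Follow a multi-hop relation path over the gated (active) edges."""
    active = gate(edges, patient_conditions)
    frontier = {start}
    for relation in relations:
        frontier = {
            e.tail for e in active
            if e.head in frontier and e.relation == relation
        }
    return sorted(frontier)


# Toy graph with placeholder entities (not real clinical knowledge).
EDGES = [
    Edge("symptom_S", "suggests", "disease_D"),
    Edge("disease_D", "treated_with", "drug_A",
         blocked_by=frozenset({"condition_X"})),
    Edge("disease_D", "treated_with", "drug_B",
         requires=frozenset({"condition_X"})),
]

PATH = ["suggests", "treated_with"]
print(reachable_answers(EDGES, "symptom_S", PATH, set()))            # ['drug_A']
print(reachable_answers(EDGES, "symptom_S", PATH, {"condition_X"}))  # ['drug_B']
```

The same two-hop question yields a different answer once `condition_X` is present, which is the kind of condition-dependent behavior the abstract describes. How CGR actually represents conditions and scores paths is not specified in the abstract.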

Related papers

QIME: Constructing Interpretable Medical Text Embeddings via Ontology-Grounded Questions [29.87717901839441]
We propose QIME, an ontology-grounded framework for constructing interpretable medical text embeddings. QIME generates semantically atomic questions that capture fine-grained distinctions in biomedical text. We show that QIME consistently outperforms prior interpretable embedding methods and substantially narrows the gap to strong black-box biomedical encoders.
arXiv Detail & Related papers (2026-03-02T10:18:06Z)
MedRAGChecker: Claim-Level Verification for Biomedical Retrieval-Augmented Generation [8.37586466142299]
We introduce MedRAGChecker, a claim-level verification and diagnostic framework for biomedical RAG. Given a question, retrieved evidence, and a generated answer, MedRAGChecker decomposes the answer into atomic claims and estimates claim support. We show that MedRAGChecker reliably flags unsupported and contradicted claims and reveals distinct risk profiles across generators.
arXiv Detail & Related papers (2026-01-10T10:40:42Z)
Simulating Viva Voce Examinations to Evaluate Clinical Reasoning in Large Language Models [51.91760712805404]
We introduce VivaBench, a benchmark for evaluating sequential clinical reasoning in large language models (LLMs). Our dataset consists of 1762 physician-curated clinical vignettes structured as interactive scenarios that simulate a viva voce (oral) examination in medical training. Our analysis identified several failure modes that mirror common cognitive errors in clinical practice.
arXiv Detail & Related papers (2025-10-11T16:24:35Z)
Hierarchical Modeling for Medical Visual Question Answering with Cross-Attention Fusion [4.821565717653691]
Medical Visual Question Answering (Med-VQA) answers clinical questions using medical images, aiding diagnosis. This study proposes the HiCA-VQA method, which comprises two modules: Hierarchical Prompting for fine-grained medical questions and Hierarchical Answer Decoders. Experiments on the Rad-Restruct benchmark demonstrate that the HiCA-VQA framework outperforms existing state-of-the-art methods in answering hierarchical fine-grained questions.
arXiv Detail & Related papers (2025-04-04T03:03:12Z)
MedCoT: Medical Chain of Thought via Hierarchical Expert [48.91966620985221]
This paper presents MedCoT, a novel hierarchical expert verification reasoning chain method. It is designed to enhance interpretability and accuracy in biomedical imaging inquiries. Experimental evaluations on four standard Med-VQA datasets demonstrate that MedCoT surpasses existing state-of-the-art approaches.
arXiv Detail & Related papers (2024-12-18T11:14:02Z)
Comprehensive and Practical Evaluation of Retrieval-Augmented Generation Systems for Medical Question Answering [70.44269982045415]
Retrieval-augmented generation (RAG) has emerged as a promising approach to enhance the performance of large language models (LLMs). We introduce the Medical Retrieval-Augmented Generation Benchmark (MedRGB), which provides various supplementary elements to four medical QA datasets. Our experimental results reveal current models' limited ability to handle noise and misinformation in the retrieved documents.
arXiv Detail & Related papers (2024-11-14T06:19:18Z)
MedLogic-AQA: Enhancing Medical Question Answering with Abstractive Models Focusing on Logical Structures [24.262037382512975]
We propose a novel Abstractive QA system MedLogic-AQA that harnesses First Order Logic (FOL) based rules extracted from both context and questions to generate well-grounded answers. This distinctive fusion of logical reasoning with abstractive QA equips our system to produce answers that are logically sound, relevant, and engaging.
arXiv Detail & Related papers (2024-10-20T18:29:38Z)
Word-Sequence Entropy: Towards Uncertainty Estimation in Free-Form Medical Question Answering Applications and Beyond [52.246494389096654]
This paper introduces Word-Sequence Entropy (WSE), a method that calibrates uncertainty at both the word and sequence levels. We compare WSE with six baseline methods on five free-form medical QA datasets, utilizing seven popular large language models (LLMs).
arXiv Detail & Related papers (2024-02-22T03:46:08Z)
Generating Explanations in Medical Question-Answering by Expectation Maximization Inference over Evidence [33.018873142559286]
We propose a novel approach for generating natural language explanations for answers predicted by medical QA systems. Our system extracts knowledge from medical textbooks to enhance the quality of explanations during the explanation generation process.
arXiv Detail & Related papers (2023-10-02T16:00:37Z)
COSMO: Conditional SEQ2SEQ-based Mixture Model for Zero-Shot Commonsense Question Answering [50.65816570279115]
Identification of the implicit causes and effects of a social context is the driving capability which can enable machines to perform commonsense reasoning. Current approaches in this realm lack the ability to perform commonsense reasoning upon facing an unseen situation. We present Conditional SEQ2SEQ-based Mixture model (COSMO), which provides us with the capabilities of dynamic and diverse content generation.
arXiv Detail & Related papers (2020-11-02T07:08:19Z)
Interpretable Multi-Step Reasoning with Knowledge Extraction on Complex Healthcare Question Answering [89.76059961309453]
The HeadQA dataset contains multiple-choice questions authorized for the public healthcare specialization exam. These questions are the most challenging for current QA systems. We present a Multi-step reasoning with Knowledge extraction framework (MurKe). We are striving to make full use of off-the-shelf pre-trained models.
arXiv Detail & Related papers (2020-08-06T02:47:46Z)

This list is automatically generated from the titles and abstracts of the papers on this site.