MedMCQA: A Large-scale Multi-Subject Multi-Choice Dataset for Medical Domain Question Answering
- URL: http://arxiv.org/abs/2203.14371v1
- Date: Sun, 27 Mar 2022 18:59:16 GMT
- Title: MedMCQA: A Large-scale Multi-Subject Multi-Choice Dataset for Medical Domain Question Answering
- Authors: Ankit Pal, Logesh Kumar Umapathi and Malaikannan Sankarasubbu
- Abstract summary: More than 194k high-quality AIIMS & NEET PG entrance exam MCQs covering 2.4k healthcare topics and 21 medical subjects are collected.
Each sample contains a question, the correct answer(s), and distractor options, which together require deeper language understanding.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper introduces MedMCQA, a new large-scale Multiple-Choice Question Answering (MCQA) dataset designed to address real-world medical entrance exam questions. More than 194k high-quality AIIMS & NEET PG entrance exam MCQs covering 2.4k healthcare topics and 21 medical subjects are collected, with an average token length of 12.77 and high topical diversity. Each sample contains a question, the correct answer(s), and other options, and answering requires deeper language understanding, as the dataset tests more than 10 reasoning abilities of a model across a wide range of medical subjects & topics. A detailed explanation of the solution, along with the above information, is provided in this study.
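As a concrete illustration of the sample format the abstract describes (a question, the correct answer, and distractor options with an explanation), the sketch below shows what one MedMCQA-style record might look like. The field names (`question`, `opa`-`opd`, `cop`, `exp`, `subject_name`) follow a common distribution layout and should be treated as assumptions, not the authoritative schema; the example question is purely illustrative.

```python
# Hypothetical MedMCQA-style record; field names are assumed, not confirmed
# by the paper. "cop" is taken to be the zero-based index of the correct option.
sample = {
    "question": "Which vitamin deficiency causes scurvy?",
    "opa": "Vitamin A",
    "opb": "Vitamin B12",
    "opc": "Vitamin C",
    "opd": "Vitamin D",
    "cop": 2,  # zero-based index of the correct option
    "exp": "Scurvy results from a deficiency of vitamin C (ascorbic acid).",
    "subject_name": "Biochemistry",
}

def correct_answer(record):
    """Map the correct-option index back to the option text."""
    options = [record["opa"], record["opb"], record["opc"], record["opd"]]
    return options[record["cop"]]

print(correct_answer(sample))  # Vitamin C
```

A QA model is then evaluated on whether it selects the option at index `cop` given only the question and the four options.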
Related papers
- MediQAl: A French Medical Question Answering Dataset for Knowledge and Reasoning Evaluation [0.7770029179741429]
MediQAl contains 32,603 questions sourced from French medical examinations across 41 medical subjects.
The dataset includes three tasks: (i) Multiple-Choice Question with Unique answer, (ii) Multiple-Choice Question with Multiple answers, and (iii) Open-Ended Question with Short Answer.
arXiv Detail & Related papers (2025-07-28T15:17:48Z)
- Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning [57.873833577058]
We build a multimodal dataset enriched with extensive medical knowledge.
We then introduce our medical-specialized MLLM: Lingshu.
Lingshu undergoes multi-stage training to embed medical expertise and enhance its task-solving capabilities.
arXiv Detail & Related papers (2025-06-08T08:47:30Z)
- MedXpertQA: Benchmarking Expert-Level Medical Reasoning and Understanding [20.83722922095852]
MedXpertQA includes 4,460 questions spanning 17 specialties and 11 body systems.
MM introduces expert-level exam questions with diverse images and rich clinical information.
arXiv Detail & Related papers (2025-01-30T14:07:56Z)
- MediFact at MEDIQA-M3G 2024: Medical Question Answering in Dermatology with Multimodal Learning [0.0]
This paper addresses the limitations of traditional methods by proposing a weakly supervised learning approach for open-ended medical question answering (QA).
Our system leverages readily available MEDIQA-M3G images via a VGG16-CNN-SVM model, enabling multilingual learning of informative skin condition representations.
This work advances medical QA research, paving the way for clinical decision support systems and ultimately improving healthcare delivery.
arXiv Detail & Related papers (2024-04-27T20:03:47Z) - Large Language Models for Multi-Choice Question Classification of Medical Subjects [0.2020207586732771]
We train deep neural networks for multi-class classification of questions into the inferred medical subjects.
We show the capability of AI and LLMs in particular for multi-classification tasks in the Healthcare domain.
arXiv Detail & Related papers (2024-03-21T17:36:08Z) - OmniMedVQA: A New Large-Scale Comprehensive Evaluation Benchmark for Medical LVLM [48.16696073640864]
We introduce OmniMedVQA, a novel comprehensive medical Visual Question Answering (VQA) benchmark.
All images in this benchmark are sourced from authentic medical scenarios.
We have found that existing LVLMs struggle to address these medical VQA problems effectively.
arXiv Detail & Related papers (2024-02-14T13:51:56Z) - Contributions to the Improvement of Question Answering Systems in the
Biomedical Domain [0.951828574518325]
This thesis work falls within the framework of question answering (QA) in the biomedical domain.
We propose four contributions to improve the performance of QA in the biomedical domain.
We develop a fully automated semantic biomedical QA system called SemBioNLQA.
arXiv Detail & Related papers (2023-07-25T16:31:20Z) - PMC-VQA: Visual Instruction Tuning for Medical Visual Question Answering [56.25766322554655]
Medical Visual Question Answering (MedVQA) presents a significant opportunity to enhance diagnostic accuracy and healthcare delivery.
We propose a generative-based model for medical visual understanding by aligning visual information from a pre-trained vision encoder with a large language model.
We train the proposed model on PMC-VQA and then fine-tune it on multiple public benchmarks, e.g., VQA-RAD, SLAKE, and Image-Clef 2019.
arXiv Detail & Related papers (2023-05-17T17:50:16Z) - Large Language Models Need Holistically Thought in Medical
Conversational QA [24.2230289885612]
The Holistically Thought (HoT) method is designed to guide the LLMs to perform the diffused and focused thinking for generating high-quality medical responses.
The proposed HoT method has been evaluated through automated and manual assessments in three different medical CQA datasets.
arXiv Detail & Related papers (2023-05-09T12:57:28Z) - PMC-LLaMA: Towards Building Open-source Language Models for Medicine [62.39105735933138]
Large Language Models (LLMs) have showcased remarkable capabilities in natural language understanding.
LLMs struggle in domains that require precision, such as medical applications, due to their lack of domain-specific knowledge.
We describe the procedure for building a powerful, open-source language model specifically designed for medical applications, termed PMC-LLaMA.
arXiv Detail & Related papers (2023-04-27T18:29:05Z) - FrenchMedMCQA: A French Multiple-Choice Question Answering Dataset for
Medical domain [4.989459243399296]
This paper introduces FrenchMedMCQA, the first publicly available Multiple-Choice Question Answering (MCQA) dataset in French for medical domain.
It is composed of 3,105 questions taken from real exams of the French medical specialization diploma in pharmacy.
arXiv Detail & Related papers (2023-04-09T16:57:40Z) - Medical Visual Question Answering: A Survey [55.53205317089564]
Medical Visual Question Answering(VQA) is a combination of medical artificial intelligence and popular VQA challenges.
Given a medical image and a clinically relevant question in natural language, the medical VQA system is expected to predict a plausible and convincing answer.
arXiv Detail & Related papers (2021-11-19T05:55:15Z) - MedDG: An Entity-Centric Medical Consultation Dataset for Entity-Aware
Medical Dialogue Generation [86.38736781043109]
We build and release MedDG, a large-scale, high-quality medical dialogue dataset covering 12 types of common gastrointestinal diseases.
We propose two kinds of medical dialogue tasks based on the MedDG dataset: one is next entity prediction and the other is doctor response generation.
Experimental results show that pre-trained language models and other baselines struggle on both tasks, with poor performance on our dataset.
arXiv Detail & Related papers (2020-10-15T03:34:33Z) - Interpretable Multi-Step Reasoning with Knowledge Extraction on Complex
Healthcare Question Answering [89.76059961309453]
HeadQA dataset contains multiple-choice questions authorized for the public healthcare specialization exam.
These questions are the most challenging for current QA systems.
We present a Multi-step reasoning with Knowledge extraction framework (MurKe)
We are striving to make full use of off-the-shelf pre-trained models.
arXiv Detail & Related papers (2020-08-06T02:47:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences.