PathVQA: 30000+ Questions for Medical Visual Question Answering
- URL: http://arxiv.org/abs/2003.10286v1
- Date: Sat, 7 Mar 2020 17:55:41 GMT
- Title: PathVQA: 30000+ Questions for Medical Visual Question Answering
- Authors: Xuehai He, Yichen Zhang, Luntian Mou, Eric Xing, Pengtao Xie
- Abstract summary: To the best of our knowledge, this is the first dataset for pathology VQA. Our dataset will be released publicly to promote research in medical VQA.
- Score: 15.343890121216335
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Is it possible to develop an "AI Pathologist" to pass the board-certified
examination of the American Board of Pathology? To achieve this goal, the first
step is to create a visual question answering (VQA) dataset where the AI agent
is presented with a pathology image together with a question and is asked to
give the correct answer. Our work makes the first attempt to build such a
dataset. Different from creating general-domain VQA datasets where the images
are widely accessible and there are many crowdsourcing workers available and
capable of generating question-answer pairs, developing a medical VQA dataset
is much more challenging. First, due to privacy concerns, pathology images are
usually not publicly available. Second, only well-trained pathologists can
understand pathology images, but they barely have time to help create datasets
for AI research. To address these challenges, we resort to pathology textbooks
and online digital libraries. We develop a semi-automated pipeline to extract
pathology images and captions from textbooks and generate question-answer pairs
from captions using natural language processing. We collect 32,799 open-ended
questions from 4,998 pathology images where each question is manually checked
to ensure correctness. To the best of our knowledge, this is the first dataset for
pathology VQA. Our dataset will be released publicly to promote research in
medical VQA.
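The abstract describes a semi-automated pipeline that extracts image-caption pairs from textbooks and turns captions into question-answer pairs with natural language processing. The paper's actual generation rules are not given in this abstract, so the following is only an illustrative sketch of how simple pattern-based rules could convert a declarative pathology caption into open-ended QA pairs; the rule patterns and function name are assumptions, not the authors' method.

```python
import re

def caption_to_qa(caption: str):
    """Illustrative rule-based caption-to-QA conversion (hypothetical rules,
    not the published PathVQA pipeline)."""
    qa_pairs = []
    # Rule 1: "Gross/Microscopic appearance|image of X" -> ask what is shown.
    m = re.match(r"^(Gross|Microscopic) (?:appearance|image) of (.+)$",
                 caption, flags=re.IGNORECASE)
    if m:
        qa_pairs.append((f"What does this {m.group(1).lower()} image show?",
                         m.group(2).rstrip(".")))
    # Rule 2: "... shows/showing X" -> ask about the finding X.
    m = re.search(r"(?:shows|showing) (.+)$", caption, flags=re.IGNORECASE)
    if m:
        qa_pairs.append(("What does the image show?", m.group(1).rstrip(".")))
    # Fallback: answer with the whole caption.
    if not qa_pairs:
        qa_pairs.append(("What is present?", caption.rstrip(".")))
    return qa_pairs
```

In the paper, each generated pair is then manually checked for correctness; a sketch like this would only produce candidate pairs for that review step.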
Related papers
- Ask Questions with Double Hints: Visual Question Generation with Answer-awareness and Region-reference [107.53380946417003]
We propose a novel learning paradigm to generate visual questions with answer-awareness and region-reference.
We develop a simple methodology to self-learn the visual hints without introducing any additional human annotations.
arXiv Detail & Related papers (2024-07-06T15:07:32Z)
- UNK-VQA: A Dataset and a Probe into the Abstention Ability of Multi-modal Large Models [55.22048505787125]
This paper contributes a comprehensive dataset, called UNK-VQA.
We first augment the existing data via deliberate perturbations on either the image or question.
We then extensively evaluate the zero- and few-shot performance of several emerging multi-modal large models.
arXiv Detail & Related papers (2023-10-17T02:38:09Z)
- Expert Knowledge-Aware Image Difference Graph Representation Learning for Difference-Aware Medical Visual Question Answering [45.058569118999436]
Given a pair of main and reference images, this task attempts to answer several questions on both diseases.
We collect a new dataset, namely MIMIC-Diff-VQA, including 700,703 QA pairs from 164,324 pairs of main and reference images.
arXiv Detail & Related papers (2023-07-22T05:34:18Z)
- PMC-VQA: Visual Instruction Tuning for Medical Visual Question Answering [56.25766322554655]
Medical Visual Question Answering (MedVQA) presents a significant opportunity to enhance diagnostic accuracy and healthcare delivery.
We propose a generative-based model for medical visual understanding by aligning visual information from a pre-trained vision encoder with a large language model.
We train the proposed model on PMC-VQA and then fine-tune it on multiple public benchmarks, e.g., VQA-RAD, SLAKE, and Image-Clef 2019.
arXiv Detail & Related papers (2023-05-17T17:50:16Z)
- Interpretable Medical Image Visual Question Answering via Multi-Modal Relationship Graph Learning [45.746882253686856]
Medical visual question answering (VQA) aims to answer clinically relevant questions regarding input medical images.
We first collected a comprehensive and large-scale medical VQA dataset, focusing on chest X-ray images.
Based on this dataset, we also propose a novel baseline method by constructing three different relationship graphs.
arXiv Detail & Related papers (2023-02-19T17:46:16Z)
- Diagnosis of Paratuberculosis in Histopathological Images Based on Explainable Artificial Intelligence and Deep Learning [0.0]
This study examines a new and original dataset using a deep learning algorithm and visualizes the output with gradient-weighted class activation mapping (Grad-CAM).
Both the decision-making processes and the explanations were verified, and the accuracy of the output was tested.
The research results greatly help pathologists in the diagnosis of paratuberculosis.
arXiv Detail & Related papers (2022-08-02T18:05:26Z)
- A-OKVQA: A Benchmark for Visual Question Answering using World Knowledge [39.788346536244504]
A-OKVQA is a crowdsourced dataset composed of about 25K questions.
We demonstrate the potential of this new dataset through a detailed analysis of its contents.
arXiv Detail & Related papers (2022-06-03T17:52:27Z)
- Medical Visual Question Answering: A Survey [55.53205317089564]
Medical Visual Question Answering (VQA) is a combination of medical artificial intelligence and popular VQA challenges.
Given a medical image and a clinically relevant question in natural language, the medical VQA system is expected to predict a plausible and convincing answer.
arXiv Detail & Related papers (2021-11-19T05:55:15Z)
- Knowledge-Routed Visual Question Reasoning: Challenges for Deep Representation Embedding [140.5911760063681]
We propose a novel dataset named Knowledge-Routed Visual Question Reasoning for VQA model evaluation.
We generate the question-answer pair based on both the Visual Genome scene graph and an external knowledge base with controlled programs.
arXiv Detail & Related papers (2020-12-14T00:33:44Z)
- Pathological Visual Question Answering [14.816825480418588]
We need to create a visual question answering (VQA) dataset where the AI agent is presented with a pathology image together with a question and is asked to give the correct answer.
Due to privacy concerns, pathology images are usually not publicly available.
It is difficult to hire highly experienced pathologists to create pathology visual questions and answers.
The medical concepts and knowledge covered in pathology question-answer (QA) pairs are very diverse.
arXiv Detail & Related papers (2020-10-06T00:36:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.