BESTMVQA: A Benchmark Evaluation System for Medical Visual Question
Answering
- URL: http://arxiv.org/abs/2312.07867v1
- Date: Wed, 13 Dec 2023 03:08:48 GMT
- Title: BESTMVQA: A Benchmark Evaluation System for Medical Visual Question
Answering
- Authors: Xiaojie Hong, Zixin Song, Liangzhi Li, Xiaoli Wang, Feiyan Liu
- Abstract summary: This paper develops a Benchmark Evaluation SysTem for Medical Visual Question Answering, denoted by BESTMVQA.
Our system provides a tool for users to automatically build Med-VQA datasets, which helps overcome the data insufficiency problem.
With simple configurations, our system automatically trains and evaluates the selected models over a benchmark dataset.
- Score: 8.547600133510551
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Medical Visual Question Answering (Med-VQA) is an important task in
the healthcare industry: answering a natural language question about a medical
image. Existing VQA techniques from information systems can be applied directly
to the task. However, they often suffer from (i) the data insufficiency
problem, which makes it difficult to train state-of-the-art (SOTA) models for
this domain-specific task, and (ii) the reproducibility problem, in that many
existing models have not been thoroughly evaluated in a unified experimental
setup. To address these issues, this paper develops a Benchmark Evaluation
SysTem for Medical Visual Question Answering, denoted BESTMVQA. Given
self-collected clinical data, our system provides a tool for users to
automatically build Med-VQA datasets, which helps overcome the data
insufficiency problem. Users can also conveniently select a wide spectrum of
SOTA models from our model library to perform a comprehensive empirical study.
With simple configurations, our system automatically trains and evaluates the
selected models on a benchmark dataset and reports comprehensive results that
help users develop new techniques or inform medical practice. Limitations of
existing work are overcome (i) by the data generation tool, which automatically
constructs new datasets from unstructured clinical data, and (ii) by evaluating
SOTAs on benchmark datasets in a unified experimental setup. A demonstration
video of our system can be found at https://youtu.be/QkEeFlu1x4A. Our code and
data will be available soon.
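The paper's configuration format is not yet public (code and data "available soon"), but the workflow it describes - pick a benchmark dataset, pick models from a library, then train and evaluate them under one setup - maps onto a simple harness loop. The sketch below is a hypothetical illustration of that workflow; `BenchmarkConfig`, the model names, and the metrics are assumptions for illustration, not BESTMVQA's actual API.

```python
# Hypothetical sketch of a unified Med-VQA benchmark loop in the spirit of
# BESTMVQA; the real API is unreleased, so all names here are illustrative.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class BenchmarkConfig:
    dataset: str = "VQA-RAD"          # benchmark dataset shared by all models
    models: List[str] = field(
        default_factory=lambda: ["MEVF", "MMQ", "PubMedCLIP"])
    epochs: int = 20
    seed: int = 42                    # one fixed seed -> comparable runs

def train_and_evaluate(model_name: str, cfg: BenchmarkConfig) -> Dict[str, float]:
    """Stub for one train/eval run: a real system would load the model from
    its library, train it on cfg.dataset, and score the held-out split."""
    # ... training loop elided ...
    return {"accuracy": 0.0, "bleu": 0.0}   # placeholder metrics

def run_benchmark(cfg: BenchmarkConfig) -> None:
    # Every model sees the same dataset, split, and seed; this uniformity is
    # exactly what addresses the reproducibility problem the paper names.
    for name in cfg.models:
        print(f"{cfg.dataset} | {name}: {train_and_evaluate(name, cfg)}")

if __name__ == "__main__":
    run_benchmark(BenchmarkConfig())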
Related papers
- Medical Vision-Language Pre-Training for Brain Abnormalities [96.1408455065347]
We show how to automatically collect aligned medical image-text data for pre-training from public resources such as PubMed.
In particular, we present a pipeline that streamlines the pre-training process by initially collecting a large brain image-text dataset.
We also investigate the unique challenge of mapping subfigures to subcaptions in the medical domain.
arXiv Detail & Related papers (2024-04-27T05:03:42Z)
- UNK-VQA: A Dataset and a Probe into the Abstention Ability of Multi-modal Large Models [55.22048505787125]
This paper contributes a comprehensive dataset, called UNK-VQA.
We first augment the existing data via deliberate perturbations on either the image or question.
We then extensively evaluate the zero- and few-shot performance of several emerging multi-modal large models.
arXiv Detail & Related papers (2023-10-17T02:38:09Z)
- Visual Question Answering in the Medical Domain [13.673890873313354]
We present a novel contrastive learning pre-training method to mitigate the problem of small datasets for the Med-VQA task; a generic sketch of such a loss follows this entry.
Our proposed model obtained an accuracy of 60% on the VQA-Med 2019 test set, comparable to other state-of-the-art Med-VQA models.
arXiv Detail & Related papers (2023-09-20T06:06:10Z)
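The blurb above does not detail the loss, but contrastive pre-training over image-text pairs is typically a CLIP-style InfoNCE objective. The following is a minimal generic sketch of that objective, not the paper's exact method; the batch size, embedding width, and temperature are placeholders.

```python
# Minimal CLIP-style contrastive (InfoNCE) loss over image-text pairs: the
# generic form of contrastive pre-training. The paper's exact variant may differ.
import torch
import torch.nn.functional as F

def info_nce(img_emb: torch.Tensor, txt_emb: torch.Tensor,
             temperature: float = 0.07) -> torch.Tensor:
    # Normalize so dot products become cosine similarities.
    img = F.normalize(img_emb, dim=-1)
    txt = F.normalize(txt_emb, dim=-1)
    logits = img @ txt.t() / temperature      # (B, B) similarity matrix
    targets = torch.arange(img.size(0))       # i-th image matches i-th text
    # Symmetric loss: images-to-texts plus texts-to-images.
    return (F.cross_entropy(logits, targets)
            + F.cross_entropy(logits.t(), targets)) / 2

# Toy usage with random 512-d embeddings for a batch of 8 pairs:
loss = info_nce(torch.randn(8, 512), torch.randn(8, 512))
```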
- PMC-VQA: Visual Instruction Tuning for Medical Visual Question Answering [56.25766322554655]
Medical Visual Question Answering (MedVQA) presents a significant opportunity to enhance diagnostic accuracy and healthcare delivery.
We propose a generative model for medical visual understanding that aligns visual information from a pre-trained vision encoder with a large language model (a generic adapter sketch follows this entry).
We train the proposed model on PMC-VQA and then fine-tune it on multiple public benchmarks, e.g., VQA-RAD, SLAKE, and Image-Clef 2019.
arXiv Detail & Related papers (2023-05-17T17:50:16Z)
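Aligning a pre-trained vision encoder with a large language model is commonly done with a small learned projection that maps patch features into the LLM's token-embedding space, so visual tokens can be prepended to the question. A minimal sketch of that generic pattern follows; the class name and dimensions are illustrative assumptions, and PMC-VQA's actual architecture may differ.

```python
# Generic sketch of coupling a pre-trained vision encoder to an LLM through a
# learned projection (a common MedVQA recipe; dimensions are assumptions).
import torch
import torch.nn as nn

class VisionToLLMAdapter(nn.Module):
    def __init__(self, vision_dim: int = 768, llm_dim: int = 4096):
        super().__init__()
        # Maps patch features into the LLM's token-embedding space.
        self.proj = nn.Linear(vision_dim, llm_dim)

    def forward(self, patch_feats: torch.Tensor,
                question_embs: torch.Tensor) -> torch.Tensor:
        visual_tokens = self.proj(patch_feats)       # (B, P, llm_dim)
        # Prepend visual tokens to the question embeddings; the LLM then
        # generates the answer conditioned on both.
        return torch.cat([visual_tokens, question_embs], dim=1)

# Toy shapes: 2 images of 49 patches each, a 16-token question.
adapter = VisionToLLMAdapter()
fused = adapter(torch.randn(2, 49, 768), torch.randn(2, 16, 4096))
print(fused.shape)   # torch.Size([2, 65, 4096])
```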
- Huatuo-26M, a Large-scale Chinese Medical QA Dataset [29.130166934474044]
In this paper, we release the largest medical Question Answering (QA) dataset to date, with 26 million QA pairs.
We benchmark many existing approaches on our dataset in terms of both retrieval and generation.
We believe that this dataset will not only contribute to medical research but also benefit both patients and clinicians.
arXiv Detail & Related papers (2023-05-02T15:33:01Z)
- Automated Medical Coding on MIMIC-III and MIMIC-IV: A Critical Review and Replicability Study [60.56194508762205]
We reproduce, compare, and analyze state-of-the-art automated medical coding machine learning models.
We show that several models underperform due to weak configurations, poorly sampled train-test splits, and insufficient evaluation.
We present the first comprehensive results on the newly released MIMIC-IV dataset using the reproduced models.
arXiv Detail & Related papers (2023-04-21T11:54:44Z)
- A Real Use Case of Semi-Supervised Learning for Mammogram Classification in a Local Clinic of Costa Rica [0.5541644538483946]
Training a deep learning model requires a considerable amount of labeled images.
A number of publicly available datasets have been built with data from different hospitals and clinics.
The use of MixMatch, a semi-supervised deep learning approach, is proposed and evaluated to leverage unlabeled data (a sketch of its label-guessing step follows this entry).
arXiv Detail & Related papers (2021-07-24T22:26:50Z)
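MixMatch leverages unlabeled images by averaging a model's predictions over several augmented views and sharpening the result into a low-entropy pseudo-label; the full method additionally mixes labeled and unlabeled examples with mixup. Below is a minimal sketch of the label-guessing step only, with toy shapes as assumptions.

```python
# Core of MixMatch's label guessing: average a model's predictions over K
# augmented views of an unlabeled image, then sharpen with temperature T.
import torch

def sharpen(probs: torch.Tensor, T: float = 0.5) -> torch.Tensor:
    p = probs ** (1.0 / T)    # T < 1 pushes mass toward the argmax
    return p / p.sum(dim=-1, keepdim=True)

def guess_labels(model: torch.nn.Module,
                 views: "list[torch.Tensor]") -> torch.Tensor:
    with torch.no_grad():
        avg = torch.stack([model(v).softmax(dim=-1) for v in views]).mean(dim=0)
    return sharpen(avg)       # low-entropy target for the unlabeled batch

# Toy usage: a linear classifier over 28x28 images, K = 2 augmented views.
toy_model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))
targets = guess_labels(toy_model, [torch.randn(4, 1, 28, 28) for _ in range(2)])
```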
- Multiple Meta-model Quantifying for Medical Visual Question Answering [17.263363346756854]
We present a new multiple meta-model method that effectively learns meta-annotation and leverages meaningful features for the medical VQA task.
Our proposed method is designed to increase meta-data by auto-annotation, handle noisy labels, and output meta-models that provide robust features for medical VQA tasks.
arXiv Detail & Related papers (2021-05-19T04:06:05Z)
- Active Selection of Classification Features [0.0]
Auxiliary data, such as demographics, might help in selecting a smaller sample that comprises the individuals with the most informative MRI scans.
We propose two utility-based approaches for this problem, and evaluate their performance on three public real-world benchmark datasets.
arXiv Detail & Related papers (2021-02-26T18:19:08Z)
- Diverse Complexity Measures for Dataset Curation in Self-driving [80.55417232642124]
We propose a new data selection method that exploits a diverse set of criteria quantifying the interestingness of traffic scenes (a generic scoring sketch follows this entry).
Our experiments show that the proposed curation pipeline selects datasets that lead to better generalization and higher performance.
arXiv Detail & Related papers (2021-01-16T23:45:02Z)
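Score-based curation of this kind reduces to ranking candidate scenes by a combined interestingness score and keeping the top of the ranking. The sketch below shows that generic shape with placeholder criteria; the paper's actual complexity measures and selection procedure are richer.

```python
# Generic shape of score-based dataset curation: rank candidate scenes by a
# combined interestingness score and keep the top-k. Criteria are placeholders.
from typing import Callable, Dict, List

def curate(scenes: List[Dict], criteria: List[Callable[[Dict], float]],
           budget: int) -> List[Dict]:
    # Sum the (assumed pre-normalized) criterion scores for each scene.
    scored = [(sum(c(s) for c in criteria), s) for s in scenes]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [scene for _, scene in scored[:budget]]

# Toy usage: two placeholder criteria over dict-encoded scenes.
scenes = [{"num_actors": n, "night": n % 2} for n in range(10)]
picked = curate(scenes,
                criteria=[lambda s: s["num_actors"] / 10.0,
                          lambda s: float(s["night"])],
                budget=3)
```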
- Self-Training with Improved Regularization for Sample-Efficient Chest X-Ray Classification [80.00316465793702]
We present a deep learning framework that enables robust modeling in challenging scenarios.
Our results show that, using 85% less labeled data, we can build predictive models that match the performance of classifiers trained in a large-scale data setting.
arXiv Detail & Related papers (2020-05-03T02:36:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.