MedINST: Meta Dataset of Biomedical Instructions
- URL: http://arxiv.org/abs/2410.13458v1
- Date: Thu, 17 Oct 2024 11:38:54 GMT
- Title: MedINST: Meta Dataset of Biomedical Instructions
- Authors: Wenhan Han, Meng Fang, Zihan Zhang, Yu Yin, Zirui Song, Ling Chen, Mykola Pechenizkiy, Qingyu Chen,
- Abstract summary: MedINST comprises 133 biomedical NLP tasks and over 7 million training samples.
We fine-tune several LLMs on MedINST and evaluate on MedINST32, showcasing enhanced cross-task generalization.
- Score: 47.7146913542186
- License:
- Abstract: The integration of large language model (LLM) techniques in the field of medical analysis has brought about significant advancements, yet the scarcity of large, diverse, and well-annotated datasets remains a major challenge. Medical data and tasks, which vary in format, size, and other parameters, require extensive preprocessing and standardization for effective use in training LLMs. To address these challenges, we introduce MedINST, the Meta Dataset of Biomedical Instructions, a novel multi-domain, multi-task instructional meta-dataset. MedINST comprises 133 biomedical NLP tasks and over 7 million training samples, making it the most comprehensive biomedical instruction dataset to date. Using MedINST as the meta dataset, we curate MedINST32, a challenging benchmark with different task difficulties aiming to evaluate LLMs' generalization ability. We fine-tune several LLMs on MedINST and evaluate on MedINST32, showcasing enhanced cross-task generalization.
Related papers
- A Survey of Medical Vision-and-Language Applications and Their Techniques [48.268198631277315]
Medical vision-and-language models (MVLMs) have attracted substantial interest due to their capability to offer a natural language interface for interpreting complex medical data.
Here, we provide a comprehensive overview of MVLMs and the various medical tasks to which they have been applied.
We also examine the datasets used for these tasks and compare the performance of different models based on standardized evaluation metrics.
arXiv Detail & Related papers (2024-11-19T03:27:05Z) - Large Language Model Benchmarks in Medical Tasks [11.196196955468992]
This paper presents a survey of various benchmark datasets employed in medical large language models (LLMs) tasks.
The survey categorizes the datasets by modality, discussing their significance, data structure, and impact on the development of LLMs.
The paper emphasizes the need for datasets with a greater degree of language diversity, structured omics data, and innovative approaches to synthesis.
arXiv Detail & Related papers (2024-10-28T11:07:33Z) - Demystifying Large Language Models for Medicine: A Primer [50.83806796466396]
Large language models (LLMs) represent a transformative class of AI tools capable of revolutionizing various aspects of healthcare.
This tutorial aims to equip healthcare professionals with the tools necessary to effectively integrate LLMs into clinical practice.
arXiv Detail & Related papers (2024-10-24T15:41:56Z) - Repurposing Foundation Model for Generalizable Medical Time Series Classification [16.21546283978257]
FORMED is a foundation classification model that leverages a pre-trained backbone.
It can adapt seamlessly to unseen MedTS datasets, regardless of the number of channels, sample lengths, or medical tasks.
Our results highlight FORMED as a versatile and scalable model for a wide range of MedTS classification tasks, positioning it as a strong foundation model for future research in MedTS analysis.
arXiv Detail & Related papers (2024-10-03T23:50:04Z) - Towards Evaluating and Building Versatile Large Language Models for Medicine [57.49547766838095]
We present MedS-Bench, a benchmark designed to evaluate the performance of large language models (LLMs) in clinical contexts.
MedS-Bench spans 11 high-level clinical tasks, including clinical report summarization, treatment recommendations, diagnosis, named entity recognition, and medical concept explanation.
MedS-Ins comprises 58 medically oriented language corpora, totaling 13.5 million samples across 122 tasks.
arXiv Detail & Related papers (2024-08-22T17:01:34Z) - MedTsLLM: Leveraging LLMs for Multimodal Medical Time Series Analysis [6.30440420617113]
We introduce MedTsLLM, a general multimodal large language model (LLM) framework that integrates time series data and rich contextual information in the form of text to analyze physiological signals.
We perform three tasks with clinical relevance: semantic segmentation, boundary detection, and anomaly detection in time series.
Our model outperforms state-of-the-art baselines, including deep learning models, other LLMs, and clinical methods across multiple medical domains.
arXiv Detail & Related papers (2024-08-14T18:57:05Z) - A comprehensive and easy-to-use multi-domain multi-task medical imaging meta-dataset (MedIMeta) [1.3641191496021943]
We introduce the Medical Imaging Meta-Dataset (MedIMeta), a novel multi-domain, multi-task meta-dataset.
MedIMeta contains 19 medical imaging datasets spanning 10 different domains and encompassing 54 distinct medical tasks.
We perform a technical validation of MedIMeta, demonstrating its utility through fully supervised and cross-domain few-shot learning baselines.
arXiv Detail & Related papers (2024-04-24T17:27:57Z) - Asclepius: A Spectrum Evaluation Benchmark for Medical Multi-Modal Large
Language Models [59.60384461302662]
We introduce Asclepius, a novel benchmark for evaluating Medical Multi-Modal Large Language Models (Med-MLLMs)
Asclepius rigorously and comprehensively assesses model capability in terms of distinct medical specialties and different diagnostic capacities.
We also provide an in-depth analysis of 6 Med-MLLMs and compare them with 5 human specialists.
arXiv Detail & Related papers (2024-02-17T08:04:23Z) - MedAlign: A Clinician-Generated Dataset for Instruction Following with
Electronic Medical Records [60.35217378132709]
Large language models (LLMs) can follow natural language instructions with human-level fluency.
evaluating LLMs on realistic text generation tasks for healthcare remains challenging.
We introduce MedAlign, a benchmark dataset of 983 natural language instructions for EHR data.
arXiv Detail & Related papers (2023-08-27T12:24:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.