Learning A Multi-Task Transformer Via Unified And Customized Instruction
Tuning For Chest Radiograph Interpretation
- URL: http://arxiv.org/abs/2311.01092v2
- Date: Mon, 4 Mar 2024 04:28:11 GMT
- Title: Learning A Multi-Task Transformer Via Unified And Customized Instruction
Tuning For Chest Radiograph Interpretation
- Authors: Lijian Xu, Ziyu Ni, Xinglong Liu, Xiaosong Wang, Hongsheng Li, and
Shaoting Zhang
- Abstract summary: We demonstrate a unified transformer model specifically designed for multi-modal clinical tasks by incorporating customized instruction tuning.
We first compose a multi-task training dataset comprising 13.4 million instruction and ground-truth pairs.
We can unify the various vision-intensive tasks in a single training framework with homogeneous model inputs and outputs to increase clinical interpretability in one reading.
- Score: 35.87795950781491
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The emergence of multi-modal deep learning models has made significant
impacts on clinical applications in the last decade. However, the majority of
models are limited to a single task, without considering that disease diagnosis is
indeed a multi-task procedure. Here, we demonstrate a unified transformer model
specifically designed for multi-modal clinical tasks by incorporating
customized instruction tuning. We first compose a multi-task training dataset
comprising 13.4 million instruction and ground-truth pairs (with approximately
one million radiographs) for the customized tuning, involving both image- and
pixel-level tasks. Thus, we can unify the various vision-intensive tasks in a
single training framework with homogeneous model inputs and outputs to increase
clinical interpretability in one reading. Finally, we demonstrate the overall
superior performance of our model compared to prior methods on various chest X-ray
benchmarks across multiple tasks in both direct inference and fine-tuning settings.
Three radiologists further evaluate the generated reports against the recorded
ones, an assessment that also demonstrates the enhanced explainability of our multi-task model.
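To make the idea of homogeneous model inputs and outputs concrete, the following is a minimal PyTorch sketch of an instruction-conditioned multi-task training step. It is an illustration only, not the authors' implementation; the toy vocabulary, patch encoder, and random data are assumptions.

```python
# Minimal sketch of instruction-conditioned multi-task training: every task
# (report generation, classification, even pixel-level outputs serialized as
# tokens) shares one (image, instruction) -> token-sequence interface.
# All names, sizes, and the toy data are illustrative assumptions.
import torch
import torch.nn as nn

VOCAB, D_MODEL, MAX_LEN = 1000, 256, 32

class UnifiedMultiTaskModel(nn.Module):
    def __init__(self):
        super().__init__()
        # Tiny image encoder: 16x16 patches of a 224x224 grayscale radiograph.
        self.patch_embed = nn.Conv2d(1, D_MODEL, kernel_size=16, stride=16)
        self.token_embed = nn.Embedding(VOCAB, D_MODEL)
        self.decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(D_MODEL, nhead=8, batch_first=True),
            num_layers=2,
        )
        self.lm_head = nn.Linear(D_MODEL, VOCAB)

    def forward(self, image, instruction_ids, target_ids):
        # Visual tokens + instruction tokens form one homogeneous context.
        vis = self.patch_embed(image).flatten(2).transpose(1, 2)   # (B, 196, D)
        ins = self.token_embed(instruction_ids)                    # (B, Li, D)
        memory = torch.cat([vis, ins], dim=1)
        tgt = self.token_embed(target_ids)                         # (B, Lt, D)
        causal = nn.Transformer.generate_square_subsequent_mask(tgt.size(1))
        hidden = self.decoder(tgt, memory, tgt_mask=causal)
        return self.lm_head(hidden)                                # (B, Lt, V)

model = UnifiedMultiTaskModel()
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

# One toy instruction/ground-truth pair per task, mixed in the same batch.
batch = {
    "image": torch.randn(2, 1, 224, 224),
    "instruction_ids": torch.randint(0, VOCAB, (2, 16)),  # e.g. "generate report" / "segment left lung"
    "target_ids": torch.randint(0, VOCAB, (2, MAX_LEN)),
}
logits = model(batch["image"], batch["instruction_ids"], batch["target_ids"][:, :-1])
loss = nn.functional.cross_entropy(
    logits.reshape(-1, VOCAB), batch["target_ids"][:, 1:].reshape(-1)
)
loss.backward()
opt.step()
print(f"toy loss: {loss.item():.3f}")
```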
Related papers
- PMT: Progressive Mean Teacher via Exploring Temporal Consistency for Semi-Supervised Medical Image Segmentation [51.509573838103854]
We propose a semi-supervised learning framework, termed Progressive Mean Teachers (PMT), for medical image segmentation.
Our PMT generates high-fidelity pseudo labels by learning robust and diverse features in the training process.
Experimental results on two datasets with different modalities, i.e., CT and MRI, demonstrate that our method outperforms the state-of-the-art medical image segmentation approaches.
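For context on the mean-teacher family this builds on, here is a minimal sketch of the generic scheme (not PMT's code): the teacher is an exponential moving average of the student, and unlabeled images contribute a consistency loss against the teacher's predictions. The stand-in network, EMA decay, and loss weight are assumptions.

```python
# Minimal sketch of the generic mean-teacher scheme for semi-supervised
# segmentation (not PMT itself): the teacher is an EMA copy of the student,
# and unlabeled scans are supervised by the teacher's pseudo labels.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

student = nn.Sequential(  # stand-in for a segmentation network (e.g. a U-Net)
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.Conv2d(16, 2, 3, padding=1)
)
teacher = copy.deepcopy(student)
for p in teacher.parameters():
    p.requires_grad_(False)
opt = torch.optim.SGD(student.parameters(), lr=1e-2)

def ema_update(student, teacher, decay=0.99):
    # Teacher weights track the student as an exponential moving average.
    with torch.no_grad():
        for ps, pt in zip(student.parameters(), teacher.parameters()):
            pt.mul_(decay).add_(ps, alpha=1 - decay)

labeled_x = torch.randn(2, 1, 64, 64)
labeled_y = torch.randint(0, 2, (2, 64, 64))
unlabeled_x = torch.randn(4, 1, 64, 64)

for step in range(3):
    sup = F.cross_entropy(student(labeled_x), labeled_y)
    with torch.no_grad():
        # Teacher sees a lightly perturbed view to produce pseudo labels.
        pseudo = teacher(unlabeled_x + 0.05 * torch.randn_like(unlabeled_x))
    cons = F.mse_loss(student(unlabeled_x).softmax(1), pseudo.softmax(1))
    loss = sup + 0.1 * cons          # consistency weight is an assumption
    opt.zero_grad(); loss.backward(); opt.step()
    ema_update(student, teacher)
    print(f"step {step}: sup={sup.item():.3f} cons={cons.item():.4f}")
```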
arXiv Detail & Related papers (2024-09-08T15:02:25Z)
- M3H: Multimodal Multitask Machine Learning for Healthcare [7.4489490661717355]
M3H is an explainable Multimodal Multitask Machine Learning framework for healthcare.
It consolidates learning from data for supervised binary/multiclass classification, regression, and unsupervised clustering.
It consistently outperforms single-task models by an average of 11.6% across 40 disease diagnoses from 16 medical departments, three hospital operation forecasts, and one patient phenotyping task.
arXiv Detail & Related papers (2024-04-29T14:39:15Z)
- Med-MoE: Mixture of Domain-Specific Experts for Lightweight Medical Vision-Language Models [17.643421997037514]
We propose a novel framework that tackles both discriminative and generative multimodal medical tasks.
The learning of Med-MoE consists of three steps: multimodal medical alignment, instruction tuning and routing, and domain-specific MoE tuning.
Our model can achieve performance superior to or on par with state-of-the-art baselines.
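A minimal sketch of the mixture-of-experts routing idea (illustrative only, not Med-MoE's architecture): a learned gate sends each token to its top-k expert MLPs and mixes their outputs. Expert count, top-k, and layer sizes are assumptions.

```python
# Minimal sketch of token-level routing over domain-specific experts
# (illustrative only, not Med-MoE's implementation). A gating network
# scores the experts and each token is processed by its top-k experts.
import torch
import torch.nn as nn

class TinyMoELayer(nn.Module):
    def __init__(self, d_model=256, n_experts=4, top_k=2):
        super().__init__()
        self.gate = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):                       # x: (batch, seq, d_model)
        scores = self.gate(x).softmax(dim=-1)   # (B, S, E) gating weights
        topv, topi = scores.topk(self.top_k, dim=-1)
        topv = topv / topv.sum(dim=-1, keepdim=True)  # renormalize over top-k
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = (topi[..., k] == e)      # tokens routed to expert e
                if mask.any():
                    out[mask] += topv[..., k][mask].unsqueeze(-1) * expert(x[mask])
        return out

layer = TinyMoELayer()
tokens = torch.randn(2, 10, 256)                # e.g. fused image/text tokens
print(layer(tokens).shape)                      # torch.Size([2, 10, 256])
```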
arXiv Detail & Related papers (2024-04-16T02:35:17Z)
- MUSCLE: Multi-task Self-supervised Continual Learning to Pre-train Deep Models for X-ray Images of Multiple Body Parts [63.30352394004674]
Multi-task Self-supervised Continual Learning (MUSCLE) is a novel self-supervised pre-training pipeline for medical imaging tasks.
MUSCLE aggregates X-rays collected from multiple body parts for representation learning, and adopts a well-designed continual learning procedure.
We evaluate MUSCLE using 9 real-world X-ray datasets with various tasks, including pneumonia classification, skeletal abnormality classification, lung segmentation, and tuberculosis (TB) detection.
arXiv Detail & Related papers (2023-10-03T12:19:19Z)
- A Transformer-based representation-learning model with unified processing of multimodal input for clinical diagnostics [63.106382317917344]
We report a Transformer-based representation-learning model as a clinical diagnostic aid that processes multimodal input in a unified manner.
The unified model outperformed an image-only model and non-unified multimodal diagnosis models in the identification of pulmonary diseases.
arXiv Detail & Related papers (2023-06-01T16:23:47Z)
- Ambiguous Medical Image Segmentation using Diffusion Models [60.378180265885945]
We introduce a single diffusion model-based approach that produces multiple plausible outputs by learning a distribution over group insights.
Our proposed model generates a distribution of segmentation masks by leveraging the inherent sampling process of diffusion.
Comprehensive results show that our proposed approach outperforms existing state-of-the-art ambiguous segmentation networks.
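To illustrate the sampling idea, here is a minimal sketch (not the authors' model) that draws several stochastic mask samples for one image and summarizes them as a consensus mask plus a pixelwise variance map; the diffusion sampler itself is replaced by a placeholder, and all names are assumptions.

```python
# Minimal sketch of using a stochastic sampler to express segmentation
# ambiguity: draw several mask samples for the same image and report the
# pixelwise mean (consensus) and variance (ambiguity). The diffusion
# sampler is replaced by a placeholder.
import torch

def sample_mask(image: torch.Tensor) -> torch.Tensor:
    """Placeholder for one reverse-diffusion pass that returns a soft mask.
    A trained model would start from Gaussian noise and iteratively denoise
    conditioned on `image`; here we just return a random soft mask."""
    return torch.rand_like(image)

def segment_with_uncertainty(image: torch.Tensor, n_samples: int = 8):
    # Each call uses fresh noise, so samples differ -> a distribution of masks.
    samples = torch.stack([sample_mask(image) for _ in range(n_samples)])
    mean_mask = samples.mean(dim=0)        # consensus segmentation
    variance = samples.var(dim=0)          # high where annotators would disagree
    return mean_mask, variance

image = torch.randn(1, 256, 256)           # one toy scan tensor
mean_mask, variance = segment_with_uncertainty(image)
print(mean_mask.shape, variance.max().item())
```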
arXiv Detail & Related papers (2023-04-10T17:58:22Z)
- Specialty-Oriented Generalist Medical AI for Chest CT Screening [14.31187762890342]
We propose the first-of-its-kind medical multimodal-multitask foundation model (M3FM) with application in lung cancer screening and related tasks.
M3FM consistently outperforms the state-of-the-art single-modal task-specific models.
As a specialty-oriented generalist medical AI model, M3FM paves the way for similar breakthroughs in other areas of medicine.
arXiv Detail & Related papers (2023-04-03T20:19:56Z)
- Efficient Extraction of Pathologies from C-Spine Radiology Reports using Multi-Task Learning [3.0473556982158625]
We show that a multi-task model can match or exceed the performance of multiple BERT-based models fine-tuned separately on the various tasks.
We validate our method on our internal dataset of radiologist reports on the cervical spine.
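A minimal sketch of the shared-encoder multi-task setup this alludes to (not the authors' code): one text encoder feeds one lightweight head per pathology, so a single model covers all extraction tasks. The toy bag-of-words encoder and the task list are assumptions.

```python
# Minimal sketch of multi-task pathology extraction from report text:
# one shared encoder, one lightweight head per pathology, losses summed.
# The toy bag-of-words encoder and the task list are illustrative assumptions.
import torch
import torch.nn as nn

TASKS = ["stenosis", "fracture", "cord_compression"]   # hypothetical labels
VOCAB, D = 5000, 128

class MultiTaskReportClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.EmbeddingBag(VOCAB, D)           # stands in for BERT
        self.heads = nn.ModuleDict({t: nn.Linear(D, 2) for t in TASKS})

    def forward(self, token_ids):
        shared = self.embed(token_ids)                   # (B, D) shared features
        return {t: head(shared) for t, head in self.heads.items()}

model = MultiTaskReportClassifier()
opt = torch.optim.AdamW(model.parameters(), lr=1e-3)

token_ids = torch.randint(0, VOCAB, (4, 64))             # 4 toy reports
labels = {t: torch.randint(0, 2, (4,)) for t in TASKS}

logits = model(token_ids)
loss = sum(nn.functional.cross_entropy(logits[t], labels[t]) for t in TASKS)
loss.backward(); opt.step()
print(f"joint loss over {len(TASKS)} tasks: {loss.item():.3f}")
```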
arXiv Detail & Related papers (2022-04-09T20:29:48Z)
- Multi-Domain Balanced Sampling Improves Out-of-Distribution Generalization of Chest X-ray Pathology Prediction Models [67.2867506736665]
We propose a simple balanced batch sampling technique to improve out-of-distribution generalization of chest X-ray pathology prediction models.
We observed that balanced sampling between the multiple training datasets improves the performance over baseline models trained without balancing.
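The balanced-sampling idea is simple enough to sketch directly (an illustration, not the authors' code): weight each example so that every source dataset is drawn with equal probability, regardless of its size. Dataset sizes below are toy assumptions.

```python
# Minimal sketch of balanced batch sampling across multiple chest X-ray
# datasets: each batch draws an equal share from every source dataset,
# regardless of dataset size. Datasets here are toy tensors.
import torch
from torch.utils.data import TensorDataset, DataLoader, ConcatDataset, WeightedRandomSampler

# Three source datasets of very different sizes.
sizes = [20000, 5000, 800]
datasets = [TensorDataset(torch.randn(n, 1, 32, 32), torch.randint(0, 2, (n,)))
            for n in sizes]
combined = ConcatDataset(datasets)

# Weight each example by 1/len(its dataset) so every dataset is sampled
# with equal probability, i.e. balanced across domains.
weights = torch.cat([torch.full((n,), 1.0 / n) for n in sizes])
sampler = WeightedRandomSampler(weights, num_samples=len(combined), replacement=True)
loader = DataLoader(combined, batch_size=96, sampler=sampler)

images, labels = next(iter(loader))
print(images.shape)   # roughly 32 examples per source dataset in expectation
```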
arXiv Detail & Related papers (2021-12-27T15:28:01Z)
- MultiMix: Sparingly Supervised, Extreme Multitask Learning From Medical Images [13.690075845927606]
We propose a novel multitask learning model, namely MultiMix, which jointly learns disease classification and anatomical segmentation in a sparingly supervised manner.
Our experiments justify the effectiveness of our multitasking model for the classification of pneumonia and segmentation of lungs from chest X-ray images.
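A minimal sketch of joint classification-and-segmentation training in the spirit described above (not MultiMix itself): a shared convolutional encoder feeds both heads, and the segmentation loss is applied only to images that actually have masks, mimicking sparse supervision. Layer sizes and loss weights are assumptions.

```python
# Minimal sketch of joint disease classification + lung segmentation with a
# shared encoder (illustrative, not MultiMix's architecture). Sparse
# supervision is mimicked by masking out the segmentation loss for images
# that have no mask annotation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointClsSegNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.cls_head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                      nn.Linear(32, 2))            # disease yes/no
        self.seg_head = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 2, stride=2),                 # lung mask logits
        )

    def forward(self, x):
        feats = self.encoder(x)
        return self.cls_head(feats), self.seg_head(feats)

model = JointClsSegNet()
x = torch.randn(4, 1, 64, 64)
cls_labels = torch.randint(0, 2, (4,))
masks = torch.randint(0, 2, (4, 1, 64, 64)).float()
has_mask = torch.tensor([1., 0., 1., 0.])       # only some images have masks

cls_logits, seg_logits = model(x)
cls_loss = F.cross_entropy(cls_logits, cls_labels)
seg_loss_per_img = F.binary_cross_entropy_with_logits(
    seg_logits, masks, reduction="none").mean(dim=(1, 2, 3))
seg_loss = (seg_loss_per_img * has_mask).sum() / has_mask.sum().clamp(min=1)
loss = cls_loss + seg_loss                      # equal weighting is an assumption
loss.backward()
print(f"cls={cls_loss.item():.3f} seg={seg_loss.item():.3f}")
```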
arXiv Detail & Related papers (2020-10-28T03:47:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.