Fair Text to Medical Image Diffusion Model with Subgroup Distribution Aligned Tuning
- URL: http://arxiv.org/abs/2406.14847v1
- Date: Fri, 21 Jun 2024 03:23:37 GMT
- Title: Fair Text to Medical Image Diffusion Model with Subgroup Distribution Aligned Tuning
- Authors: Xu Han, Fangfang Fan, Jingzhao Rong, Xiaofeng Liu
- Abstract summary: The text-to-medical-image (T2MedI) model with latent diffusion has great potential to alleviate the scarcity of medical imaging data.
However, as with text-to-natural-image models, we show that the T2MedI model can also be biased toward certain subgroups, overlooking the minority ones in the training set.
In this work, we first build a T2MedI model based on the pre-trained Imagen model, which has a fixed contrastive language-image pre-training (CLIP) text encoder.
Its decoder has been fine-tuned on medical images from the Radiology Objects in COntext (ROCO) dataset.
- Score: 12.064840522920251
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The text-to-medical-image (T2MedI) model with latent diffusion has great potential to alleviate the scarcity of medical imaging data and to explore the underlying appearance distribution of lesions for a specific patient-status description. However, as with text-to-natural-image models, we show that the T2MedI model can also be biased toward certain subgroups and overlook the minority ones in the training set. In this work, we first build a T2MedI model based on the pre-trained Imagen model, which has a fixed contrastive language-image pre-training (CLIP) text encoder, while its decoder is fine-tuned on medical images from the Radiology Objects in COntext (ROCO) dataset. Its gender bias is analyzed qualitatively and quantitatively. To address this issue, we propose fine-tuning the T2MedI model toward the target application dataset to align their sensitive-subgroup distributions. Specifically, the alignment loss for fine-tuning is guided by an off-the-shelf sensitivity-subgroup classifier to match the classification probabilities between the generated images and the expected target dataset. In addition, image quality is maintained by a CLIP-consistency regularization term following a knowledge distillation scheme. For evaluation, we set the target dataset to be enhanced as the BraTS18 dataset and trained a brain magnetic resonance (MR) slice-based gender classifier on it. With our method, the generated MR images markedly reduce the inconsistency with the gender proportion in the BraTS18 dataset.
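The subgroup-alignment idea in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `subgroup_alignment_loss`, the list-based probability format, and the choice of KL divergence as the matching criterion are all assumptions; the paper only states that a frozen sensitivity-subgroup classifier guides the match between predictions on generated images and the target dataset's subgroup proportions.

```python
import math

def subgroup_alignment_loss(gen_probs, target_probs, eps=1e-8):
    """Hypothetical alignment term: KL divergence between the target
    subgroup proportions and the average subgroup probability that a
    frozen sensitivity-subgroup classifier assigns to generated images.

    gen_probs:    list of per-image subgroup probability vectors
                  (one vector per generated image)
    target_probs: subgroup proportions measured on the target dataset
                  (e.g. the gender ratio in BraTS18)
    """
    n = len(gen_probs)
    k = len(target_probs)
    # Average the per-image classifier predictions into one empirical
    # subgroup distribution for the generated batch.
    avg = [sum(p[j] for p in gen_probs) / n for j in range(k)]
    # KL(target || generated): penalizes subgroups the generator
    # under-produces relative to the target dataset.
    return sum(t * math.log((t + eps) / (avg[j] + eps))
               for j, t in enumerate(target_probs))
```

In a full fine-tuning loop, a term like this would be added to the diffusion objective alongside the CLIP-consistency regularizer, with gradients reaching the generator through the classifier's predictions; the sketch above only illustrates the matching criterion itself.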
Related papers
- Cross-model Mutual Learning for Exemplar-based Medical Image Segmentation [25.874281336821685]
We introduce a novel Cross-model Mutual learning framework for Exemplar-based Medical image Segmentation (CMEMS).
arXiv Detail & Related papers (2024-04-18T00:18:07Z)
- FDDM: Unsupervised Medical Image Translation with a Frequency-Decoupled Diffusion Model [2.2726755789556794]
We introduce the Frequency Decoupled Diffusion Model for MR-to-CT conversion.
Our model uses a dual-path reverse diffusion process for low-frequency and high-frequency information.
It can generate high-quality target domain images while maintaining the accuracy of translated anatomical structures.
arXiv Detail & Related papers (2023-11-19T19:44:44Z)
- Semi-Supervised Medical Image Segmentation with Co-Distribution Alignment [16.038016822861092]
This paper proposes Co-Distribution Alignment (Co-DA) for semi-supervised medical image segmentation.
Co-DA aligns marginal predictions on unlabeled data to marginal predictions on labeled data in a class-wise manner.
We show that the proposed approach outperforms existing state-of-the-art semi-supervised medical image segmentation methods.
arXiv Detail & Related papers (2023-07-24T09:08:30Z)
- Vision-Language Modelling For Radiological Imaging and Reports In The Low Data Regime [70.04389979779195]
This paper explores training medical vision-language models (VLMs) where the visual and language inputs are embedded into a common space.
We explore several candidate methods to improve low-data performance, including adapting generic pre-trained models to novel image and text domains.
Using text-to-image retrieval as a benchmark, we evaluate the performance of these methods with variable sized training datasets of paired chest X-rays and radiological reports.
arXiv Detail & Related papers (2023-03-30T18:20:00Z)
- Learning to Exploit Temporal Structure for Biomedical Vision-Language Processing [53.89917396428747]
Self-supervised learning in vision-language processing exploits semantic alignment between imaging and text modalities.
We explicitly account for prior images and reports when available during both training and fine-tuning.
Our approach, named BioViL-T, uses a CNN-Transformer hybrid multi-image encoder trained jointly with a text model.
arXiv Detail & Related papers (2023-01-11T16:35:33Z)
- HealthyGAN: Learning from Unannotated Medical Images to Detect Anomalies Associated with Human Disease [13.827062843105365]
Typical techniques in the current medical imaging literature focus on deriving diagnostic models from healthy subjects only.
HealthyGAN learns to translate the images from the mixed dataset to only healthy images.
Being one-directional, HealthyGAN relaxes the requirement of cycle consistency of existing unpaired image-to-image translation methods.
arXiv Detail & Related papers (2022-09-05T08:10:52Z)
- Mixing-AdaSIN: Constructing a de-biased dataset using Adaptive Structural Instance Normalization and texture Mixing [6.976822832216875]
We propose Mixing-AdaSIN, a bias mitigation method that uses a generative model to generate de-biased images.
To demonstrate the efficacy of our method, we construct a biased COVID-19 vs. bacterial pneumonia dataset.
arXiv Detail & Related papers (2021-03-26T04:40:14Z)
- G-MIND: An End-to-End Multimodal Imaging-Genetics Framework for Biomarker Identification and Disease Classification [49.53651166356737]
We propose a novel deep neural network architecture to integrate imaging and genetics data, as guided by diagnosis, that provides interpretable biomarkers.
We have evaluated our model on a population study of schizophrenia that includes two functional MRI (fMRI) paradigms and Single Nucleotide Polymorphism (SNP) data.
arXiv Detail & Related papers (2021-01-27T19:28:04Z)
- Improved Slice-wise Tumour Detection in Brain MRIs by Computing Dissimilarities between Latent Representations [68.8204255655161]
Anomaly detection for Magnetic Resonance Images (MRIs) can be solved with unsupervised methods.
We have proposed a slice-wise semi-supervised method for tumour detection based on the computation of a dissimilarity function in the latent space of a Variational AutoEncoder.
We show that by training the models on higher resolution images and by improving the quality of the reconstructions, we obtain results which are comparable with different baselines.
arXiv Detail & Related papers (2020-07-24T14:02:09Z)
- ATSO: Asynchronous Teacher-Student Optimization for Semi-Supervised Medical Image Segmentation [99.90263375737362]
We propose ATSO, an asynchronous version of teacher-student optimization.
ATSO partitions the unlabeled data into two subsets and alternately uses one subset to fine-tune the model and updates the label on the other subset.
We evaluate ATSO on two popular medical image segmentation datasets and show its superior performance in various semi-supervised settings.
arXiv Detail & Related papers (2020-06-24T04:05:12Z)
- Semi-supervised Medical Image Classification with Relation-driven Self-ensembling Model [71.80319052891817]
We present a relation-driven semi-supervised framework for medical image classification.
It exploits the unlabeled data by encouraging the prediction consistency of given input under perturbations.
Our method outperforms many state-of-the-art semi-supervised learning methods on both single-label and multi-label image classification scenarios.
arXiv Detail & Related papers (2020-05-15T06:57:54Z)
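The prediction-consistency idea in the last entry above can be sketched as a generic consistency term. This is a hypothetical minimal version: the relation-driven formulation in that paper additionally matches relations between samples, which this sketch omits, and the names `consistency_loss`, `predict`, and `perturb` are illustrative.

```python
import random

def consistency_loss(predict, x, perturb, n_samples=4, seed=0):
    """Mean squared difference between a model's probability outputs on
    an input and on randomly perturbed copies of it.

    predict: function mapping an input to a probability vector
    x:       one (unlabeled) input example
    perturb: function (x, rng) -> stochastically perturbed copy of x
    """
    rng = random.Random(seed)
    base = predict(x)
    total = 0.0
    for _ in range(n_samples):
        pert = predict(perturb(x, rng))
        # Per-sample mean squared deviation from the clean prediction.
        total += sum((b - p) ** 2 for b, p in zip(base, pert)) / len(base)
    return total / n_samples
```

A model whose predictions are invariant to the perturbations incurs zero loss, so minimizing this term on unlabeled data pushes the model toward perturbation-stable predictions, which is the common core of the self-ensembling methods listed here.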
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.