Medical Multimodal Model Stealing Attacks via Adversarial Domain Alignment
- URL: http://arxiv.org/abs/2502.02438v1
- Date: Tue, 04 Feb 2025 16:04:48 GMT
- Title: Medical Multimodal Model Stealing Attacks via Adversarial Domain Alignment
- Authors: Yaling Shen, Zhixiong Zhuang, Kun Yuan, Maria-Irina Nicolae, Nassir Navab, Nicolas Padoy, Mario Fritz
- Abstract summary: Medical multimodal large language models (MLLMs) are becoming an instrumental part of healthcare systems.
As medical data is scarce and protected by privacy regulations, medical MLLMs represent valuable intellectual property.
We introduce Adversarial Domain Alignment (ADA-STEAL), the first stealing attack against medical MLLMs.
- Score: 79.41098832007819
- Abstract: Medical multimodal large language models (MLLMs) are becoming an instrumental part of healthcare systems, assisting medical personnel with decision making and results analysis. Models for radiology report generation are able to interpret medical imagery, thus reducing the workload of radiologists. As medical data is scarce and protected by privacy regulations, medical MLLMs represent valuable intellectual property. However, these assets are potentially vulnerable to model stealing, where attackers aim to replicate their functionality via black-box access. So far, model stealing for the medical domain has focused on classification; however, existing attacks are not effective against MLLMs. In this paper, we introduce Adversarial Domain Alignment (ADA-STEAL), the first stealing attack against medical MLLMs. ADA-STEAL relies on natural images, which are public and widely available, as opposed to their medical counterparts. We show that data augmentation with adversarial noise is sufficient to overcome the data distribution gap between natural images and the domain-specific distribution of the victim MLLM. Experiments on the IU X-RAY and MIMIC-CXR radiology datasets demonstrate that Adversarial Domain Alignment enables attackers to steal the medical MLLM without any access to medical data.
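The core idea above is that adversarially perturbing natural images can bridge the distribution gap to the victim's medical domain. As a minimal sketch (not the authors' implementation), the following toy FGSM-style step perturbs a natural image to reduce a surrogate "domain gap" objective; `domain_gap_loss`, `target_mean`, and the analytic gradient are illustrative stand-ins for whatever victim-guided objective ADA-STEAL actually optimizes.

```python
import numpy as np

def domain_gap_loss(img, target_mean):
    # Toy surrogate for the domain gap: squared distance between the
    # image's mean intensity and a target-domain statistic.
    return (img.mean() - target_mean) ** 2

def domain_gap_grad(img, target_mean):
    # Analytic gradient of the toy loss w.r.t. each pixel.
    return 2.0 * (img.mean() - target_mean) / img.size * np.ones_like(img)

def fgsm_align(img, target_mean, eps=0.05):
    # One signed-gradient step toward the target domain,
    # then clip back to the valid pixel range.
    grad = domain_gap_grad(img, target_mean)
    adv = img - eps * np.sign(grad)  # descend the gap loss
    return np.clip(adv, 0.0, 1.0)

rng = np.random.default_rng(0)
natural = rng.uniform(0.4, 0.9, size=(8, 8))  # stand-in "natural image"
aligned = fgsm_align(natural, target_mean=0.25)
print(domain_gap_loss(aligned, 0.25) < domain_gap_loss(natural, 0.25))  # True
```

In the actual attack the objective would be derived from black-box queries to the victim MLLM rather than a hand-picked statistic; this sketch only shows the augmentation mechanics.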
Related papers
- UniMed-CLIP: Towards a Unified Image-Text Pretraining Paradigm for Diverse Medical Imaging Modalities [68.12889379702824]
Vision-Language Models (VLMs) trained via contrastive learning have achieved notable success in natural image tasks.
UniMed is a large-scale, open-source multi-modal medical dataset comprising over 5.3 million image-text pairs.
We trained UniMed-CLIP, a unified VLM for six modalities, achieving notable gains in zero-shot evaluations.
arXiv Detail & Related papers (2024-12-13T18:59:40Z)
- MaskMedPaint: Masked Medical Image Inpainting with Diffusion Models for Mitigation of Spurious Correlations [13.599251610827539]
We propose Masked Medical Image Inpainting (MaskMedPaint), which uses text-to-image diffusion models to augment training images by inpainting areas outside key classification regions to match the target domain.
We demonstrate that MaskMedPaint enhances generalization to target domains across both natural (Waterbirds, iWildCam) and medical (ISIC 2018, Chest X-ray) datasets, given limited unlabeled target images.
arXiv Detail & Related papers (2024-11-16T03:23:06Z)
- MediConfusion: Can you trust your AI radiologist? Probing the reliability of multimodal medical foundation models [20.781551849965357]
We introduce MediConfusion, a challenging medical Visual Question Answering (VQA) benchmark dataset.
We reveal that state-of-the-art models are easily confused by image pairs that are visually dissimilar and clearly distinct to medical experts.
We also extract common patterns of model failure that may help the design of a new generation of more trustworthy and reliable MLLMs in healthcare.
arXiv Detail & Related papers (2024-09-23T18:59:37Z)
- BAPLe: Backdoor Attacks on Medical Foundational Models using Prompt Learning [71.60858267608306]
Medical foundation models are susceptible to backdoor attacks.
This work introduces a method to embed a backdoor into the medical foundation model during the prompt learning phase.
Our method, BAPLe, requires only a minimal subset of data to adjust the noise trigger and the text prompts for downstream tasks.
arXiv Detail & Related papers (2024-08-14T10:18:42Z)
- Medical Unlearnable Examples: Securing Medical Data from Unauthorized Training via Sparsity-Aware Local Masking [24.850260039814774]
Fears of unauthorized use, like training commercial AI models, hinder researchers from sharing their valuable datasets.
We propose the Sparsity-Aware Local Masking (SALM) method, which selectively perturbs significant pixel regions rather than the entire image.
Our experiments demonstrate that SALM effectively prevents unauthorized training of different models and outperforms previous SoTA data protection methods.
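The "selectively perturbs significant pixel regions" idea can be sketched as follows. This is a schematic, not the authors' code: `saliency` stands in for whatever pixel-importance score SALM computes, and the top-k selection and noise model are illustrative choices.

```python
import numpy as np

def salm_perturb(img, saliency, k_frac=0.1, eps=0.1, rng=None):
    # Perturb only the top-k most "significant" pixels (by saliency),
    # leaving the rest of the image untouched.
    rng = rng or np.random.default_rng(0)
    k = max(1, int(k_frac * img.size))
    idx = np.argpartition(saliency.ravel(), -k)[-k:]  # k largest saliencies
    mask = np.zeros(img.size, dtype=bool)
    mask[idx] = True
    mask = mask.reshape(img.shape)
    noise = eps * np.sign(rng.standard_normal(img.shape))
    out = img + np.where(mask, noise, 0.0)            # sparse, local perturbation
    return np.clip(out, 0.0, 1.0), mask

rng = np.random.default_rng(1)
img = rng.uniform(size=(16, 16))
saliency = np.abs(rng.standard_normal((16, 16)))
protected, mask = salm_perturb(img, saliency, k_frac=0.1, eps=0.1)
```

Restricting the perturbation budget to a sparse mask is what keeps the protected image close to the original while still degrading unauthorized training.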
arXiv Detail & Related papers (2024-03-15T02:35:36Z)
- MITS-GAN: Safeguarding Medical Imaging from Tampering with Generative Adversarial Networks [48.686454485328895]
This study introduces MITS-GAN, a novel approach to prevent tampering in medical images.
The approach disrupts the output of the attacker's CT-GAN architecture by introducing finely tuned perturbations that are imperceptible to the human eye.
Experimental results on CT scans demonstrate MITS-GAN's superior performance.
arXiv Detail & Related papers (2024-01-17T22:30:41Z)
- Medical Report Generation based on Segment-Enhanced Contrastive Representation Learning [39.17345313432545]
We propose MSCL (Medical image Segmentation with Contrastive Learning) to segment organs, abnormalities, bones, etc.
We introduce a supervised contrastive loss that assigns more weight to reports that are semantically similar to the target while training.
Experimental results demonstrate the effectiveness of our proposed model, where we achieve state-of-the-art performance on the IU X-Ray public dataset.
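The weighted supervised contrastive loss described above can be sketched roughly as follows; the weighting scheme, names, and toy embeddings are illustrative, not taken from the paper.

```python
import numpy as np

def weighted_contrastive_loss(anchor, candidates, weights, tau=1.0):
    # Supervised contrastive NLL where each candidate report embedding
    # contributes in proportion to its semantic-similarity weight.
    sims = candidates @ anchor / tau                  # scaled similarity scores
    log_probs = sims - np.log(np.exp(sims).sum())     # log-softmax over candidates
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                                   # normalize the weights
    return float(-(w * log_probs).sum())              # weighted negative log-likelihood

anchor = np.array([1.0, 0.0])
candidates = np.array([[1.0, 0.0],    # semantically close report
                       [0.0, 1.0]])   # unrelated report
uniform = weighted_contrastive_loss(anchor, candidates, [0.5, 0.5])
skewed = weighted_contrastive_loss(anchor, candidates, [0.9, 0.1])
print(skewed < uniform)  # upweighting the similar report lowers the loss -> True
```

Assigning larger weights to semantically similar reports pulls their embeddings toward the target while downplaying dissimilar ones, which is the effect the summary describes.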
arXiv Detail & Related papers (2023-12-26T03:33:48Z)
- LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets.
We have collected approximately 1.3 million medical images from 55 publicly available datasets.
LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
arXiv Detail & Related papers (2023-06-20T22:21:34Z)
- XrayGPT: Chest Radiographs Summarization using Medical Vision-Language Models [60.437091462613544]
We introduce XrayGPT, a novel conversational medical vision-language model.
It can analyze and answer open-ended questions about chest radiographs.
We generate 217k interactive and high-quality summaries from free-text radiology reports.
arXiv Detail & Related papers (2023-06-13T17:59:59Z)
- Privacy-preserving Machine Learning for Medical Image Classification [0.0]
Image classification is an important use case of Machine Learning (ML) in the medical industry.
Automated systems like these raise privacy concerns over patient data.
In this study, we address these concerns in the context of a medical image classification task: detecting pneumonia from chest X-ray images.
arXiv Detail & Related papers (2021-08-29T10:50:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.