Assessing Risk of Stealing Proprietary Models for Medical Imaging Tasks
- URL: http://arxiv.org/abs/2506.19464v1
- Date: Tue, 24 Jun 2025 09:46:01 GMT
- Title: Assessing Risk of Stealing Proprietary Models for Medical Imaging Tasks
- Authors: Ankita Raj, Harsh Swaika, Deepankar Varma, Chetan Arora,
- Abstract summary: This paper investigates the vulnerability of black-box medical imaging models to model stealing attacks. We demonstrate that adversaries can effectively execute MS attacks by using publicly available datasets. We propose a two-step model stealing approach termed QueryWise to enhance MS capabilities with limited query budgets.
- Score: 5.34146886237413
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The success of deep learning in medical imaging applications has led several companies to deploy proprietary models in diagnostic workflows, offering monetized services. Even though model weights are hidden to protect the intellectual property of the service provider, these models are exposed to model stealing (MS) attacks, where adversaries can clone the model's functionality by querying it with a proxy dataset and training a thief model on the acquired predictions. While extensively studied on general vision tasks, the susceptibility of medical imaging models to MS attacks remains inadequately explored. This paper investigates the vulnerability of black-box medical imaging models to MS attacks under realistic conditions where the adversary lacks access to the victim model's training data and operates with limited query budgets. We demonstrate that adversaries can effectively execute MS attacks by using publicly available datasets. To further enhance MS capabilities with limited query budgets, we propose a two-step model stealing approach termed QueryWise. This method capitalizes on unlabeled data obtained from a proxy distribution to train the thief model without incurring additional queries. Evaluation on two medical imaging models for Gallbladder Cancer and COVID-19 classification substantiates the effectiveness of the proposed attack. The source code is available at https://github.com/rajankita/QueryWise.
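The attack pipeline described in the abstract, querying a black-box victim with a proxy dataset and training a thief model on the acquired predictions, can be sketched minimally as follows. This is an illustrative toy (a linear softmax victim and thief, plain NumPy gradient descent), not the paper's QueryWise method; all names and dimensions are assumptions for the sake of the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical black-box victim: the adversary sees only output
# probabilities, never the weights (a fixed linear scorer stands in
# for the proprietary model).
W_victim = rng.normal(size=(5, 2))

def query_victim(x):
    logits = x @ W_victim
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Step 1: spend the query budget on a public proxy dataset.
proxy = rng.normal(size=(200, 5))
soft_labels = query_victim(proxy)  # acquired predictions

# Step 2: train a thief model on (proxy, soft_labels) by minimizing
# cross-entropy against the victim's soft outputs (distillation).
W_thief = np.zeros((5, 2))
for _ in range(500):
    logits = proxy @ W_thief
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    p = e / e.sum(axis=1, keepdims=True)
    grad = proxy.T @ (p - soft_labels) / len(proxy)
    W_thief -= 0.5 * grad

# The clone should now agree with the victim on unseen inputs.
test = rng.normal(size=(100, 5))
agreement = np.mean(
    query_victim(test).argmax(axis=1) == (test @ W_thief).argmax(axis=1)
)
print(f"thief/victim agreement: {agreement:.2f}")
```

In a realistic setting the victim is a deep network, the proxy data comes from a different distribution than the victim's training set, and the query budget is the binding constraint; QueryWise additionally exploits unlabeled proxy data that is never sent to the victim.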
Related papers
- Explore the vulnerability of black-box models via diffusion models [12.444628438522702]
In this study, we uncover a novel security threat where an attacker leverages diffusion model APIs to generate synthetic images. This enables the attacker to execute model extraction and transfer-based adversarial attacks on black-box classification models. Our method shows an average improvement of 27.37% over state-of-the-art methods while using just 0.01 times the query budget.
arXiv Detail & Related papers (2025-06-09T09:36:31Z) - Where's the liability in the Generative Era? Recovery-based Black-Box Detection of AI-Generated Content [42.68683643671603]
We introduce a novel black-box detection framework that requires only API access. We measure the likelihood that the image was generated by the model itself. For black-box models that do not support masked image inputs, we incorporate a cost-efficient surrogate model trained to align with the target model distribution.
arXiv Detail & Related papers (2025-05-02T05:11:35Z) - Model-Guardian: Protecting against Data-Free Model Stealing Using Gradient Representations and Deceptive Predictions [5.6731655991880965]
Model stealing is increasingly threatening the confidentiality of machine learning models deployed in the cloud. This paper introduces a novel defense framework named Model-Guardian. It addresses the shortcomings of current defenses by exploiting the artifact properties of synthetic samples and gradient representations of samples.
arXiv Detail & Related papers (2025-03-23T14:14:36Z) - Medical Multimodal Model Stealing Attacks via Adversarial Domain Alignment [79.41098832007819]
Medical multimodal large language models (MLLMs) are becoming an instrumental part of healthcare systems. As medical data is scarce and protected by privacy regulations, medical MLLMs represent valuable intellectual property. We introduce Adversarial Domain Alignment (ADA-STEAL), the first stealing attack against medical MLLMs.
arXiv Detail & Related papers (2025-02-04T16:04:48Z) - BAPLe: Backdoor Attacks on Medical Foundational Models using Prompt Learning [71.60858267608306]
Medical foundation models are susceptible to backdoor attacks.
This work introduces a method to embed a backdoor into the medical foundation model during the prompt learning phase.
Our method, BAPLe, requires only a minimal subset of data to adjust the noise trigger and the text prompts for downstream tasks.
arXiv Detail & Related papers (2024-08-14T10:18:42Z) - DiffExplainer: Unveiling Black Box Models Via Counterfactual Generation [11.201840101870808]
We propose an agent model capable of generating counterfactual images that prompt different decisions when plugged into a black box model.
By employing this agent model, we can uncover influential image patterns that impact the black-box model's final predictions.
We validated our approach in the rigorous domain of medical prognosis tasks.
arXiv Detail & Related papers (2024-06-21T14:27:02Z) - Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models [112.48136829374741]
In this paper, we unveil a new vulnerability: the privacy backdoor attack.
When a victim fine-tunes a backdoored model, their training data will be leaked at a significantly higher rate than if they had fine-tuned a typical model.
Our findings highlight a critical privacy concern within the machine learning community and call for a reevaluation of safety protocols in the use of open-source pre-trained models.
arXiv Detail & Related papers (2024-04-01T16:50:54Z) - Beyond Labeling Oracles: What does it mean to steal ML models? [52.63413852460003]
Model extraction attacks are designed to steal trained models with only query access.
We investigate factors influencing the success of model extraction attacks.
Our findings urge the community to redefine the adversarial goals of ME attacks.
arXiv Detail & Related papers (2023-10-03T11:10:21Z) - Isolation and Induction: Training Robust Deep Neural Networks against Model Stealing Attacks [51.51023951695014]
Existing model stealing defenses add deceptive perturbations to the victim's posterior probabilities to mislead the attackers.
This paper proposes Isolation and Induction (InI), a novel and effective training framework for model stealing defenses.
In contrast to adding perturbations over model predictions that harm the benign accuracy, we train models to produce uninformative outputs against stealing queries.
arXiv Detail & Related papers (2023-08-02T05:54:01Z) - Careful What You Wish For: on the Extraction of Adversarially Trained Models [2.707154152696381]
Recent attacks on Machine Learning (ML) models pose several security and privacy threats.
We propose a framework to assess extraction attacks on adversarially trained models.
We show that adversarially trained models are more vulnerable to extraction attacks than models obtained under natural training circumstances.
arXiv Detail & Related papers (2022-07-21T16:04:37Z) - Defending against Model Stealing via Verifying Embedded External Features [90.29429679125508]
Adversaries can "steal" deployed models even when they have no training samples and cannot get access to the model parameters or structures.
We explore the defense from another angle by verifying whether a suspicious model contains the knowledge of defender-specified external features.
Our method is effective in detecting different types of model stealing simultaneously, even if the stolen model is obtained via a multi-stage stealing process.
arXiv Detail & Related papers (2021-12-07T03:51:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.