P-MIA: A Profiled-Based Membership Inference Attack on Cognitive Diagnosis Models
- URL: http://arxiv.org/abs/2511.04716v1
- Date: Thu, 06 Nov 2025 01:53:04 GMT
- Title: P-MIA: A Profiled-Based Membership Inference Attack on Cognitive Diagnosis Models
- Authors: Mingliang Hou, Yinuo Wang, Teng Guo, Zitao Liu, Wenzhou Dou, Jiaqi Zheng, Renqiang Luo, Mi Tian, Weiqi Luo
- Abstract summary: This paper is the first to systematically investigate membership inference attacks (MIA) against cognitive diagnosis models (CDMs). We introduce a novel and realistic grey-box threat model that exploits the explainability features of these platforms. We propose a profile-based MIA framework that leverages both the model's final prediction probabilities and the exposed internal knowledge state vectors as features.
- Score: 22.027021891488683
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cognitive diagnosis models (CDMs) are pivotal for creating fine-grained learner profiles in modern intelligent education platforms. However, these models are trained on sensitive student data, raising significant privacy concerns. While membership inference attacks (MIA) have been studied in various domains, their application to CDMs remains a critical research gap, leaving their privacy risks unquantified. This paper is the first to systematically investigate MIA against CDMs. We introduce a novel and realistic grey-box threat model that exploits the explainability features of these platforms, where a model's internal knowledge state vectors are exposed to users through visualizations such as radar charts. We demonstrate that these vectors can be accurately reverse-engineered from such visualizations, creating a potent attack surface. Based on this threat model, we propose a profile-based MIA (P-MIA) framework that leverages both the model's final prediction probabilities and the exposed internal knowledge state vectors as features. Extensive experiments on three real-world datasets against mainstream CDMs show that our grey-box attack significantly outperforms standard black-box baselines. Furthermore, we showcase the utility of P-MIA as an auditing tool by successfully evaluating the efficacy of machine unlearning techniques and revealing their limitations.
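To make the profile-based attack concrete, the following is a minimal, hypothetical sketch of the approach the abstract describes: statistics derived from the CDM's predicted response probabilities are concatenated with the exposed knowledge state vector (which, per the threat model, can be read back from a radar-chart visualization, since each axis plots one value per knowledge concept), and a binary attack classifier is trained on shadow examples with known membership. All names, dimensions, and the choice of classifier are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch of a profile-based MIA (P-MIA style) attack classifier.
# Feature vector = prediction-derived statistics + exposed knowledge state.
# Dimensions, feature choices, and the classifier are illustrative only.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def build_features(pred_probs: np.ndarray, knowledge_state: np.ndarray) -> np.ndarray:
    """Concatenate prediction statistics with the learner's knowledge profile."""
    stats = np.array([
        pred_probs.mean(),   # average predicted correctness over answered items
        pred_probs.max(),
        pred_probs.min(),
        (-pred_probs * np.log(pred_probs + 1e-12)).mean(),  # mean -p*log(p), a confidence proxy
    ])
    return np.concatenate([stats, knowledge_state])

# Shadow-style training data: in a real audit these rows would come from shadow
# CDMs trained on records with known member (1) / non-member (0) status.
rng = np.random.default_rng(0)
X = np.stack([
    build_features(rng.uniform(0.05, 0.95, size=20),  # 20 answered items (assumed)
                   rng.uniform(0.0, 1.0, size=8))      # 8 knowledge concepts (assumed)
    for _ in range(200)
])
y = rng.integers(0, 2, size=200)  # placeholder membership labels

attack_model = RandomForestClassifier(n_estimators=100, random_state=0)
attack_model.fit(X, y)
membership_score = attack_model.predict_proba(X[:1])[:, 1]
```

Under this sketch, a black-box baseline would use only the prediction-derived statistics; the grey-box advantage reported in the abstract comes from appending the knowledge state profile. The same scoring function also supports the unlearning audit mentioned above: compare membership scores on forgotten records before and after unlearning, where an effective unlearning run should push scores toward the non-member range.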
Related papers
- Neural Breadcrumbs: Membership Inference Attacks on LLMs Through Hidden State and Attention Pattern Analysis [9.529147118376464]
Membership inference attacks (MIAs) reveal whether specific data was used to train machine learning models. Our work explores how examining internal representations, rather than just model outputs, may provide additional insight into potential membership inference signals. Our findings suggest that internal model behaviors can reveal aspects of training data exposure even when output-based signals appear protected.
arXiv Detail & Related papers (2025-09-05T19:05:49Z)
- Evaluating the Defense Potential of Machine Unlearning against Membership Inference Attacks [0.0]
Membership Inference Attacks (MIAs) enable adversaries to determine whether a specific data point was included in the training dataset of a model. While Machine Unlearning is not inherently a countermeasure against MIA, the unlearning algorithm and data characteristics can significantly affect a model's vulnerability. This work provides essential insights into the interplay between Machine Unlearning and MIAs, offering guidance for the design of privacy-preserving machine learning systems.
arXiv Detail & Related papers (2025-08-22T07:19:33Z)
- A Systematic Survey of Model Extraction Attacks and Defenses: State-of-the-Art and Perspectives [65.3369988566853]
Recent studies have demonstrated that adversaries can replicate a target model's functionality. Model Extraction Attacks (MEAs) pose threats to intellectual property, privacy, and system security. We propose a novel taxonomy that classifies MEAs according to attack mechanisms, defense approaches, and computing environments.
arXiv Detail & Related papers (2025-08-20T19:49:59Z)
- On Transfer-based Universal Attacks in Pure Black-box Setting [94.92884394009288]
We study the role of prior knowledge of the target model's data and number of classes in attack performance. We also provide several interesting insights based on our analysis, and demonstrate that priors cause overestimation in transferability scores.
arXiv Detail & Related papers (2025-04-11T10:41:20Z)
- Membership Inference Attacks on Large-Scale Models: A Survey [5.795582095405318]
Membership Inference Attacks (MIAs) are an important technique for exposing or assessing privacy risks. We provide the first comprehensive review of MIAs targeting Large Language Models (LLMs) and Large Multimodal Models (LMMs). Unlike prior surveys, we further examine MIAs across multiple stages of the model pipeline, including pre-training, fine-tuning, alignment, and Retrieval-Augmented Generation (RAG).
arXiv Detail & Related papers (2025-03-25T04:11:47Z)
- A Survey of Model Extraction Attacks and Defenses in Distributed Computing Environments [55.60375624503877]
Model Extraction Attacks (MEAs) threaten modern machine learning systems by enabling adversaries to steal models, exposing intellectual property and training data. This survey is motivated by the urgent need to understand how the unique characteristics of cloud, edge, and federated deployments shape attack vectors and defense requirements. We systematically examine the evolution of attack methodologies and defense mechanisms across these environments, demonstrating how environmental factors influence security strategies in critical sectors such as autonomous vehicles, healthcare, and financial services.
arXiv Detail & Related papers (2025-02-22T03:46:50Z)
- Unsupervised Model Diagnosis [49.36194740479798]
This paper proposes Unsupervised Model Diagnosis (UMO) to produce semantic counterfactual explanations without any user guidance.
Our approach identifies and visualizes changes in semantics, and then matches these changes to attributes from wide-ranging text sources.
arXiv Detail & Related papers (2024-10-08T17:59:03Z)
- Model Inversion Attack via Dynamic Memory Learning [41.742953947551364]
Model Inversion (MI) attacks aim to recover the private training data from the target model.
Recent advances in generative adversarial models have rendered them particularly effective in MI attacks.
We propose a novel Dynamic Memory Model Inversion Attack (DMMIA) to leverage historically learned knowledge.
arXiv Detail & Related papers (2023-08-24T02:32:59Z)
- MF-CLIP: Leveraging CLIP as Surrogate Models for No-box Adversarial Attacks [65.86360607693457]
No-box attacks, where adversaries have no prior knowledge, remain relatively underexplored despite their practical relevance. This work presents a systematic investigation into leveraging large-scale Vision-Language Models (VLMs) as surrogate models for executing no-box attacks. Our theoretical and empirical analyses reveal a key limitation in the execution of no-box attacks stemming from the insufficient discriminative capability of vanilla CLIP when applied directly as a surrogate model. We propose MF-CLIP: a novel framework that enhances CLIP's effectiveness as a surrogate model through margin-aware feature space optimization.
arXiv Detail & Related papers (2023-07-13T08:10:48Z)
- ML-Doctor: Holistic Risk Assessment of Inference Attacks Against Machine Learning Models [64.03398193325572]
Inference attacks against Machine Learning (ML) models allow adversaries to learn about training data, model parameters, etc.
We concentrate on four attacks: membership inference, model inversion, attribute inference, and model stealing.
Our analysis relies on modular, reusable software, ML-Doctor, which enables ML model owners to assess the risks of deploying their models.
arXiv Detail & Related papers (2021-02-04T11:35:13Z)
- Improving Robustness to Model Inversion Attacks via Mutual Information Regularization [12.079281416410227]
This paper studies defense mechanisms against model inversion (MI) attacks.
MI is a type of privacy attack that aims to infer information about the training data distribution given access to a target machine learning model.
We propose the Mutual Information Regularization based Defense (MID) against MI attacks; a minimal sketch of the general regularization idea appears after this list.
arXiv Detail & Related papers (2020-09-11T06:02:44Z)
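The MID entry above defends by restricting how much input information a model's internal representation retains. Below is a minimal sketch of that general idea, assuming a variational, information-bottleneck-style upper bound on I(x; z) as the regularizer; the architecture, the penalty form, and the beta weight are illustrative assumptions, not the paper's exact formulation.

```python
# Illustrative mutual-information-style regularization (not the exact MID method):
# the encoder outputs a stochastic representation z, and a KL penalty bounds how
# much information z can carry about the input x, limiting what an inversion
# attack can recover. Layer sizes and beta are assumed for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MIRegularizedNet(nn.Module):
    def __init__(self, in_dim: int = 784, z_dim: int = 64, n_classes: int = 10):
        super().__init__()
        self.encoder = nn.Linear(in_dim, 2 * z_dim)  # predicts mean and log-variance
        self.classifier = nn.Linear(z_dim, n_classes)

    def forward(self, x):
        mu, logvar = self.encoder(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        return self.classifier(z), mu, logvar

def loss_fn(logits, y, mu, logvar, beta: float = 1e-3):
    ce = F.cross_entropy(logits, y)
    # KL(q(z|x) || N(0, I)): a variational upper bound on I(x; z)
    kl = 0.5 * (mu.pow(2) + logvar.exp() - 1.0 - logvar).sum(dim=-1).mean()
    return ce + beta * kl

model = MIRegularizedNet()
x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
logits, mu, logvar = model(x)
loss_fn(logits, y, mu, logvar).backward()
```

Raising beta trades task accuracy for a tighter information budget, which is the utility-privacy tension such defenses navigate.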
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.