Noisy Neighbors: Efficient membership inference attacks against LLMs
- URL: http://arxiv.org/abs/2406.16565v1
- Date: Mon, 24 Jun 2024 12:02:20 GMT
- Title: Noisy Neighbors: Efficient membership inference attacks against LLMs
- Authors: Filippo Galli, Luca Melis, Tommaso Cucinotta
- Abstract summary: This paper introduces an efficient methodology that generates noisy neighbors for a target sample by adding noise in the embedding space.
Our findings demonstrate that this approach closely matches the effectiveness of employing shadow models, showing its usability in practical privacy auditing scenarios.
- Score: 2.666596421430287
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The potential of transformer-based LLMs risks being hindered by privacy concerns due to their reliance on extensive datasets, possibly including sensitive information. Regulatory measures like GDPR and CCPA call for using robust auditing tools to address potential privacy issues, with Membership Inference Attacks (MIA) being the primary method for assessing LLMs' privacy risks. Differently from traditional MIA approaches, often requiring computationally intensive training of additional models, this paper introduces an efficient methodology that generates \textit{noisy neighbors} for a target sample by adding stochastic noise in the embedding space, requiring operating the target model in inference mode only. Our findings demonstrate that this approach closely matches the effectiveness of employing shadow models, showing its usability in practical privacy auditing scenarios.
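Based only on the abstract, the attack can be read as follows: perturb the target sample's token embeddings with stochastic noise, run the target model in inference mode on the original and on the perturbed "noisy neighbors", and compare the losses; a sample the model has memorized should sit in a sharper loss minimum than its neighborhood. The sketch below illustrates this idea under stated assumptions: the Hugging Face model name, the noise scale `sigma`, the number of neighbors, and the exact score definition are illustrative choices, not details taken from the paper.

```python
# Minimal sketch of an embedding-space "noisy neighbors" membership score.
# Assumes a Hugging Face causal LM; model name, sigma, and n_neighbors are
# illustrative placeholders, not values from the paper.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder target model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

@torch.no_grad()
def lm_loss_from_embeddings(inputs_embeds, labels, attention_mask):
    """Cross-entropy loss of the target model, run in inference mode only."""
    out = model(inputs_embeds=inputs_embeds,
                attention_mask=attention_mask,
                labels=labels)
    return out.loss.item()

@torch.no_grad()
def noisy_neighbor_score(text, n_neighbors=25, sigma=0.01):
    """Higher score -> the target's loss is lower than that of its noisy
    neighborhood, taken here as evidence of training-set membership."""
    enc = tokenizer(text, return_tensors="pt")
    input_ids, attention_mask = enc["input_ids"], enc["attention_mask"]
    embeds = model.get_input_embeddings()(input_ids)

    target_loss = lm_loss_from_embeddings(embeds, input_ids, attention_mask)

    neighbor_losses = []
    for _ in range(n_neighbors):
        noise = sigma * torch.randn_like(embeds)  # stochastic noise in embedding space
        neighbor_losses.append(
            lm_loss_from_embeddings(embeds + noise, input_ids, attention_mask)
        )
    return sum(neighbor_losses) / len(neighbor_losses) - target_loss

# Decide membership by thresholding the score on a calibration set, e.g.:
# is_member = noisy_neighbor_score(sample_text) > tau
```

In practice the threshold tau would be calibrated on samples known to be non-members, playing the role that shadow models play in traditional MIA pipelines but without any additional training.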
Related papers
- Active Learning for Robust and Representative LLM Generation in Safety-Critical Scenarios [32.16984263644299]
Large Language Models (LLMs) can generate valuable data for safety measures, but often exhibit distributional biases.
We propose a novel framework that integrates active learning with clustering to guide LLM generation.
Our results show that the proposed framework produces a more representative set of safety scenarios without requiring prior knowledge of the underlying data distribution.
arXiv Detail & Related papers (2024-10-14T21:48:14Z)
- Ingest-And-Ground: Dispelling Hallucinations from Continually-Pretrained LLMs with RAG [2.7972592976232833]
We continually pre-train the base LLM model with a privacy-specific knowledge base and then augment it with a semantic RAG layer.
Our evaluations demonstrate that this approach enhances model performance (with metrics up to double those of the out-of-the-box LLM) in handling privacy-related queries.
arXiv Detail & Related papers (2024-09-30T20:32:29Z)
- Robust Utility-Preserving Text Anonymization Based on Large Language Models [80.5266278002083]
Text anonymization is crucial for sharing sensitive data while maintaining privacy.
Existing techniques face the emerging challenge of re-identification attacks enabled by Large Language Models.
This paper proposes a framework composed of three LLM-based components -- a privacy evaluator, a utility evaluator, and an optimization component.
arXiv Detail & Related papers (2024-07-16T14:28:56Z)
- Exposing Privacy Gaps: Membership Inference Attack on Preference Data for LLM Alignment [8.028743532294532]
We introduce PREMIA, a novel reference-based attack framework specifically for analyzing preference data.
We provide empirical evidence that DPO models are more vulnerable to MIA compared to PPO models.
arXiv Detail & Related papers (2024-07-08T22:53:23Z)
- Unveiling the Misuse Potential of Base Large Language Models via In-Context Learning [61.2224355547598]
Open-sourcing of large language models (LLMs) accelerates application development, innovation, and scientific progress.
Our investigation exposes a critical oversight in this belief.
By deploying carefully designed demonstrations, our research demonstrates that base LLMs could effectively interpret and execute malicious instructions.
arXiv Detail & Related papers (2024-04-16T13:22:54Z)
- Practical Membership Inference Attacks against Fine-tuned Large Language Models via Self-prompt Calibration [32.15773300068426]
Membership Inference Attacks aim to infer whether a target data record has been utilized for model training.
We propose a Membership Inference Attack based on Self-calibrated Probabilistic Variation (SPV-MIA).
arXiv Detail & Related papers (2023-11-10T13:55:05Z)
- Assessing Privacy Risks in Language Models: A Case Study on Summarization Tasks [65.21536453075275]
We focus on the summarization task and investigate the membership inference (MI) attack.
We exploit text similarity and the model's resistance to document modifications as potential MI signals.
We discuss several safeguards for training summarization models to protect against MI attacks and discuss the inherent trade-off between privacy and utility.
arXiv Detail & Related papers (2023-10-20T05:44:39Z)
- ReEval: Automatic Hallucination Evaluation for Retrieval-Augmented Large Language Models via Transferable Adversarial Attacks [91.55895047448249]
This paper presents ReEval, an LLM-based framework using prompt chaining to perturb the original evidence for generating new test cases.
We implement ReEval using ChatGPT and evaluate the resulting variants of two popular open-domain QA datasets.
The generated data is human-readable and useful for triggering hallucinations in large language models.
arXiv Detail & Related papers (2023-10-19T06:37:32Z)
- A Differentially Private Weighted Empirical Risk Minimization Procedure and its Application to Outcome Weighted Learning [4.322221694511603]
Differential privacy (DP) is an appealing framework for addressing data privacy issues.
DP provides mathematically provable bounds on the privacy loss incurred when releasing information from sensitive data.
We propose the first differentially private algorithm for general wERM, with theoretical DP guarantees.
arXiv Detail & Related papers (2023-07-24T21:03:25Z)
- MAPS: A Noise-Robust Progressive Learning Approach for Source-Free Domain Adaptive Keypoint Detection [76.97324120775475]
Cross-domain keypoint detection methods always require accessing the source data during adaptation.
This paper considers source-free domain adaptive keypoint detection, where only the well-trained source model is provided to the target domain.
arXiv Detail & Related papers (2023-02-09T12:06:08Z)
- Privacy-Constrained Policies via Mutual Information Regularized Policy Gradients [54.98496284653234]
We consider the task of training a policy that maximizes reward while minimizing disclosure of certain sensitive state variables through the actions.
We solve this problem by introducing a regularizer based on the mutual information between the sensitive state and the actions.
We develop a model-based estimator for optimization of privacy-constrained policies.
arXiv Detail & Related papers (2020-12-30T03:22:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the accuracy of this information and is not responsible for any consequences arising from its use.