Embedding-Space Data Augmentation to Prevent Membership Inference Attacks in Clinical Time Series Forecasting
- URL: http://arxiv.org/abs/2511.05289v1
- Date: Fri, 07 Nov 2025 14:49:18 GMT
- Title: Embedding-Space Data Augmentation to Prevent Membership Inference Attacks in Clinical Time Series Forecasting
- Authors: Marius Fracarolli, Michael Staniek, Stefan Riezler
- Abstract summary: We show how data augmentation can mitigate Membership Inference Attacks (MIA) on TSF models. The key challenge is generating synthetic samples that closely resemble the original training data to confuse the attacker. Experiments show that ZOO-PCA yields the best reductions in TPR/FPR ratio for MIA attacks without sacrificing performance on test data.
- Score: 6.217506701223212
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Balancing strong privacy guarantees with high predictive performance is critical for time series forecasting (TSF) tasks involving Electronic Health Records (EHR). In this study, we explore how data augmentation can mitigate Membership Inference Attacks (MIA) on TSF models. We show that retraining with synthetic data can substantially reduce the effectiveness of loss-based MIAs by reducing the attacker's true-positive to false-positive ratio. The key challenge is generating synthetic samples that closely resemble the original training data to confuse the attacker, while also introducing enough novelty to enhance the model's ability to generalize to unseen data. We examine multiple augmentation strategies - Zeroth-Order Optimization (ZOO), a variant of ZOO constrained by Principal Component Analysis (ZOO-PCA), and MixUp - to strengthen model resilience without sacrificing accuracy. Our experimental results show that ZOO-PCA yields the best reductions in TPR/FPR ratio for MIA attacks without sacrificing performance on test data.
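The abstract's two central quantities can be illustrated with a small sketch. It is not the authors' implementation: the function names, the fixed loss threshold, and the toy numbers are assumptions made for illustration. The first function scores a loss-threshold MIA, which guesses "member" whenever a sample's loss falls below a threshold, by its true-positive and false-positive rates. The second shows the standard MixUp interpolation between two samples; ZOO and ZOO-PCA are not sketched because the abstract does not describe them in enough detail.

```python
from typing import List, Sequence, Tuple


def tpr_fpr(member_losses: Sequence[float],
            nonmember_losses: Sequence[float],
            threshold: float) -> Tuple[float, float]:
    """Score a loss-threshold attack (hypothetical helper).

    The attacker predicts 'member' when a sample's loss is below
    `threshold`. TPR is the fraction of true members flagged; FPR is
    the fraction of non-members flagged.
    """
    tp = sum(1 for loss in member_losses if loss < threshold)
    fp = sum(1 for loss in nonmember_losses if loss < threshold)
    return tp / len(member_losses), fp / len(nonmember_losses)


def mixup(x_a: Sequence[float], x_b: Sequence[float], lam: float) -> List[float]:
    """Standard MixUp: convex combination lam * x_a + (1 - lam) * x_b."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]


# Toy losses (made up): training members tend to have lower loss.
member_losses = [0.1, 0.2, 0.3, 0.9]
nonmember_losses = [0.25, 0.8, 1.0, 1.2]

tpr, fpr = tpr_fpr(member_losses, nonmember_losses, threshold=0.5)
ratio = tpr / fpr  # the defense aims to push this ratio toward 1

# A synthetic sample lying between two (made-up) training samples.
synthetic = mixup([0.0, 2.0], [4.0, 6.0], lam=0.25)
```

A TPR/FPR ratio near 1 means the attacker flags members and non-members at about the same rate, which is the sense in which a lower ratio indicates a more effective defense.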
Related papers
- Adaptive Defense against Harmful Fine-Tuning for Large Language Models via Bayesian Data Scheduler [67.24175911858312]
Harmful fine-tuning poses critical safety risks to fine-tuning-as-a-service for large language models.
Bayesian Data Scheduler (BDS) is an adaptive tuning-stage defense strategy with no need for attack simulation.
BDS learns the posterior distribution of each data point's safety attribute, conditioned on the fine-tuning and alignment datasets.
arXiv Detail & Related papers (2025-10-31T04:49:37Z)
- Empirical Comparison of Membership Inference Attacks in Deep Transfer Learning [4.877819365490361]
Membership inference attacks (MIAs) provide an empirical estimate of the privacy leakage by machine learning models.
We compare performance of diverse MIAs in transfer learning settings to help practitioners identify the most efficient attacks for privacy risk evaluation.
arXiv Detail & Related papers (2025-10-07T10:21:05Z)
- Privacy Auditing Synthetic Data Release through Local Likelihood Attacks [7.780592134085148]
The paper proposes the Generative Likelihood Ratio Attack (Gen-LRA).
Gen-LRA formulates its attack by evaluating the influence a test observation has in a surrogate model's estimation of a local likelihood ratio over the synthetic data.
Results underscore Gen-LRA's effectiveness as a privacy auditing tool for the release of synthetic data.
arXiv Detail & Related papers (2025-08-28T18:27:40Z)
- Unveiling Impact of Frequency Components on Membership Inference Attacks for Diffusion Models [51.179816451161635]
Membership Inference Attacks (MIAs) are designed to ascertain whether specific data were utilized during a model's training phase.
We formalize them into a unified general paradigm which computes the membership score for membership identification.
Under this paradigm, we empirically find that existing attacks overlook the inherent deficiency in how diffusion models process high-frequency information.
We propose a plug-and-play high-frequency filter module to mitigate the adverse effects of the deficiency.
arXiv Detail & Related papers (2025-05-27T09:50:11Z)
- SEVA: Leveraging Single-Step Ensemble of Vicinal Augmentations for Test-Time Adaptation [29.441669360316418]
Test-time adaptation (TTA) aims to enhance model robustness against distribution shifts through rapid model adaptation during inference.
Augmentation strategies can effectively unleash the potential of reliable samples, but the rapidly growing computational cost impedes their real-time application.
We propose a novel TTA approach named Single-step Ensemble of Vicinal Augmentations (SEVA) which can take advantage of data augmentations without increasing the computational burden.
arXiv Detail & Related papers (2025-05-07T02:58:37Z)
- Winning the MIDST Challenge: New Membership Inference Attacks on Diffusion Models for Tabular Data Synthesis [10.682673935815547]
Existing privacy evaluations often rely on metrics or weak membership inference attacks (MIA).
In this work, we conduct a rigorous MIA study on diffusion-based synthesis, revealing that state-of-the-art attacks designed for image models fail in this setting.
Our method, implemented with a lightweight-driven approach, effectively learns membership signals, eliminating the need for manual optimization.
arXiv Detail & Related papers (2025-03-15T06:13:27Z)
- Towards Robust Federated Learning via Logits Calibration on Non-IID Data [49.286558007937856]
Federated learning (FL) is a privacy-preserving distributed management framework based on collaborative model training of distributed devices in edge networks.
Recent studies have shown that FL is vulnerable to adversarial examples, leading to a significant drop in its performance.
In this work, we adopt the adversarial training (AT) framework to improve the robustness of FL models against adversarial example (AE) attacks.
arXiv Detail & Related papers (2024-03-05T09:18:29Z)
- Low-Cost High-Power Membership Inference Attacks [15.240271537329534]
Membership inference attacks aim to detect if a particular data point was used in training a model.
We design a novel statistical test to perform robust membership inference attacks with low computational overhead.
RMIA lays the groundwork for practical yet accurate data privacy risk assessment in machine learning.
arXiv Detail & Related papers (2023-12-06T03:18:49Z)
- RelaxLoss: Defending Membership Inference Attacks without Losing Utility [68.48117818874155]
We propose a novel training framework based on a relaxed loss with a more achievable learning target.
RelaxLoss is applicable to any classification model with added benefits of easy implementation and negligible overhead.
Our approach consistently outperforms state-of-the-art defense mechanisms in terms of resilience against MIAs.
arXiv Detail & Related papers (2022-07-12T19:34:47Z)
- Interpolated Joint Space Adversarial Training for Robust and Generalizable Defenses [82.3052187788609]
Adversarial training (AT) is considered to be one of the most reliable defenses against adversarial attacks.
Recent works show generalization improvement with adversarial samples under novel threat models.
We propose a novel threat model called Joint Space Threat Model (JSTM).
Under JSTM, we develop novel adversarial attacks and defenses.
arXiv Detail & Related papers (2021-12-12T21:08:14Z)
- Deceive D: Adaptive Pseudo Augmentation for GAN Training with Limited Data [125.7135706352493]
Generative adversarial networks (GANs) typically require ample data for training in order to synthesize high-fidelity images.
Recent studies have shown that training GANs with limited data remains formidable due to discriminator overfitting.
This paper introduces a novel strategy called Adaptive Pseudo Augmentation (APA) to encourage healthy competition between the generator and the discriminator.
arXiv Detail & Related papers (2021-11-12T18:13:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.