LeakBoost: Perceptual-Loss-Based Membership Inference Attack
- URL: http://arxiv.org/abs/2602.05748v1
- Date: Thu, 05 Feb 2026 15:15:35 GMT
- Title: LeakBoost: Perceptual-Loss-Based Membership Inference Attack
- Authors: Amit Kravchik Taub, Fred M. Grabovski, Guy Amit, Yisroel Mirsky
- Abstract summary: LeakBoost is a perceptual-loss-based interrogation framework that actively probes a model's internal representations to expose hidden membership signals. It achieves substantial improvements at low false-positive rates across multiple image classification datasets and diverse neural network architectures.
- Score: 4.82560917771631
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Membership inference attacks (MIAs) aim to determine whether a sample was part of a model's training set, posing serious privacy risks for modern machine-learning systems. Existing MIAs primarily rely on static indicators, such as loss or confidence, and do not fully leverage the dynamic behavior of models when actively probed. We propose LeakBoost, a perceptual-loss-based interrogation framework that actively probes a model's internal representations to expose hidden membership signals. Given a candidate input, LeakBoost synthesizes an interrogation image by optimizing a perceptual (activation-space) objective, amplifying representational differences between members and non-members. This image is then analyzed by an off-the-shelf membership detector, without modifying the detector itself. When combined with existing membership inference methods, LeakBoost achieves substantial improvements at low false-positive rates across multiple image classification datasets and diverse neural network architectures. In particular, it raises AUC from near-chance levels (0.53-0.62) to 0.81-0.88, and increases TPR at 1 percent FPR by over an order of magnitude compared to strong baseline attacks. A detailed sensitivity analysis reveals that deeper layers and short, low-learning-rate optimization produce the strongest leakage, and that improvements concentrate in gradient-based detectors. LeakBoost thus offers a modular and computationally efficient way to assess privacy risks in white-box settings, advancing the study of dynamic membership inference.
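The abstract's two key ingredients, synthesizing an interrogation input by gradient descent on an activation-space objective and evaluating TPR at a fixed low FPR, can be sketched as follows. This is a minimal illustration, not LeakBoost's implementation: the tiny tanh "layer", the activation-matching objective, the perturbed starting point, and all hyperparameters are assumptions made for the sketch; the paper's actual model, layer choice, and loss are not specified here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for one intermediate layer of a target model: f(x) = tanh(W x).
# A real attack would probe a trained classifier's deep activations.
W = rng.standard_normal((16, 8)) * 0.5

def activations(x: np.ndarray) -> np.ndarray:
    """Deep-layer (perceptual) features of input x."""
    return np.tanh(W @ x)

def perceptual_loss(x: np.ndarray, target: np.ndarray) -> float:
    """Squared distance in activation space."""
    d = activations(x) - target
    return float(d @ d)

def interrogate(x0: np.ndarray, target: np.ndarray,
                steps: int = 100, lr: float = 0.05) -> np.ndarray:
    """Synthesize an interrogation input by short, low-learning-rate gradient
    descent on the activation-space objective (gradient derived by hand for
    f(x) = tanh(W x))."""
    x = x0.copy()
    for _ in range(steps):
        f = np.tanh(W @ x)
        # d/dx ||f(x) - target||^2 = 2 W^T [(f - target) * (1 - f^2)]
        x -= lr * 2.0 * (W.T @ ((f - target) * (1.0 - f ** 2)))
    return x

def tpr_at_fpr(member_scores: np.ndarray, nonmember_scores: np.ndarray,
               fpr: float = 0.01) -> float:
    """TPR at a fixed low FPR: set the threshold at the (1 - fpr) quantile of
    non-member scores, then count members scoring above it."""
    thresh = np.quantile(nonmember_scores, 1.0 - fpr)
    return float(np.mean(member_scores > thresh))

# Example: interrogate a candidate input from a perturbed starting point.
x_cand = rng.standard_normal(8)
target = activations(x_cand)
x_start = x_cand + 0.5 * rng.standard_normal(8)
x_int = interrogate(x_start, target)
```

After optimization, `x_int` sits much closer to the candidate in activation space than `x_start` did; in the actual attack, per-sample scores derived from such interrogation images are then fed to an off-the-shelf membership detector and summarized with `tpr_at_fpr`-style low-FPR metrics.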
Related papers
- AttenMIA: LLM Membership Inference Attack through Attention Signals [8.170623979629953]
We introduce AttenMIA, a new MIA framework that exploits self-attention patterns inside the transformer model to infer membership. We show that attention-based features consistently outperform baselines, particularly under the important low-false-positive metric. We also show that using AttenMIA to replace other membership inference attacks in a data extraction framework results in training data extraction attacks that outperform the state of the art.
arXiv Detail & Related papers (2026-01-26T03:45:56Z) - Exposing and Defending Membership Leakage in Vulnerability Prediction Models [13.905375956316632]
Membership Inference Attacks (MIAs) aim to infer whether a particular code sample was used during training. Noise-based Membership Inference Defense (NMID) is a lightweight defense module that applies output masking and Gaussian noise injection to disrupt adversarial inference. Our study highlights critical privacy risks in code analysis and offers actionable defense strategies for securing AI-powered software systems.
arXiv Detail & Related papers (2025-12-09T06:40:51Z) - DeLeaker: Dynamic Inference-Time Reweighting For Semantic Leakage Mitigation in Text-to-Image Models [55.30555646945055]
Text-to-Image (T2I) models are vulnerable to semantic leakage. We introduce DeLeaker, a lightweight approach that mitigates leakage by directly intervening on the model's attention maps. SLIM is the first dataset dedicated to semantic leakage.
arXiv Detail & Related papers (2025-10-16T17:39:21Z) - Source-Free Object Detection with Detection Transformer [59.33653163035064]
Source-Free Object Detection (SFOD) enables knowledge transfer from a source domain to an unsupervised target domain for object detection without access to source data. Most existing SFOD approaches are either confined to conventional object detection (OD) models like Faster R-CNN or designed as general solutions without tailored adaptations for novel OD architectures, especially Detection Transformer (DETR). In this paper, we introduce Feature Reweighting ANd Contrastive Learning NetworK (FRANCK), a novel SFOD framework specifically designed to perform query-centric feature enhancement for DETRs.
arXiv Detail & Related papers (2025-10-13T07:35:04Z) - Knowledge Regularized Negative Feature Tuning of Vision-Language Models for Out-of-Distribution Detection [54.433899174017185]
Out-of-distribution (OOD) detection is crucial for building reliable machine learning models. We propose a novel method called Knowledge Regularized Negative Feature Tuning (KR-NFT). NFT applies distribution-aware transformations to pre-trained text features, effectively separating positive and negative features into distinct spaces. When trained with few-shot samples from the ImageNet dataset, KR-NFT not only improves ID classification accuracy and OOD detection but also significantly reduces the FPR95 by 5.44%.
arXiv Detail & Related papers (2025-07-26T07:44:04Z) - Evaluating Ensemble and Deep Learning Models for Static Malware Detection with Dimensionality Reduction Using the EMBER Dataset [0.0]
This study investigates the effectiveness of several machine learning algorithms for static malware detection using the EMBER dataset. We evaluate eight classification models: LightGBM, XGBoost, CatBoost, Random Forest, Extra Trees, HistGradientBoosting, k-Nearest Neighbors (KNN), and TabNet. The models are assessed on accuracy, precision, recall, F1 score, and AUC to examine both predictive performance and robustness.
arXiv Detail & Related papers (2025-07-22T18:45:10Z) - Interpretable Temporal Class Activation Representation for Audio Spoofing Detection [7.476305130252989]
We utilize the wav2vec 2.0 model and attentive utterance-level features to integrate interpretability directly into the model's architecture.
Our model achieves state-of-the-art results, with an EER of 0.51% and a min t-DCF of 0.0165 on the ASVspoof 2019-LA set.
arXiv Detail & Related papers (2024-06-13T05:36:01Z) - Enhancing Infrared Small Target Detection Robustness with Bi-Level Adversarial Framework [61.34862133870934]
We propose a bi-level adversarial framework to promote the robustness of detection in the presence of distinct corruptions.
Our scheme improves IoU by a remarkable 21.96% across a wide array of corruptions and by a notable 4.97% on the general benchmark.
arXiv Detail & Related papers (2023-09-03T06:35:07Z) - Robust and Accurate Object Detection via Adversarial Learning [111.36192453882195]
This work augments the fine-tuning stage for object detectors by exploring adversarial examples.
Our approach boosts the performance of state-of-the-art EfficientDets by +1.1 mAP on the object detection benchmark.
arXiv Detail & Related papers (2021-03-23T19:45:26Z) - Adversarial Feature Augmentation and Normalization for Visual Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models.
Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings.
We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
arXiv Detail & Related papers (2021-03-22T20:36:34Z) - Transformer-Encoder Detector Module: Using Context to Improve Robustness to Adversarial Attacks on Object Detection [12.521662223741673]
This article proposes a new context module that can be applied to an object detector to improve the labeling of object instances.
The proposed model achieves mAP, F1, and average AUC scores up to 13% higher than those of the baseline Faster-RCNN detector.
arXiv Detail & Related papers (2020-11-13T15:52:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.