Related papers: From Research to Reality: Feasibility of Gradient Inversion Attacks in Federated Learning

URL: http://arxiv.org/abs/2508.19819v1
Date: Wed, 27 Aug 2025 12:07:23 GMT
Title: From Research to Reality: Feasibility of Gradient Inversion Attacks in Federated Learning
Authors: Viktor Valadi, Mattias Åkesson, Johan Östman, Salman Toor, Andreas Hellander
Abstract summary: We systematically analyze how architecture and training behavior affect vulnerability, including the first in-depth study of inference-mode clients. We introduce two novel attacks against models in training-mode with varying attacker knowledge, achieving state-of-the-art performance under realistic training conditions. We conclude this work by offering the first comprehensive mapping of settings, clarifying which combinations of architectural choices and operational modes meaningfully impact privacy.
Score: 3.6055874544834445
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Gradient inversion attacks have garnered attention for their ability to compromise privacy in federated learning. However, many studies consider attacks with the model in inference mode, where training-time behaviors like dropout are disabled and batch normalization relies on fixed statistics. In this work, we systematically analyze how architecture and training behavior affect vulnerability, including the first in-depth study of inference-mode clients, which we show dramatically simplifies inversion. To assess attack feasibility under more realistic conditions, we turn to clients operating in standard training mode. In this setting, we find that successful attacks are only possible when several architectural conditions are met simultaneously: models must be shallow and wide, use skip connections, and, critically, employ pre-activation normalization. We introduce two novel attacks against models in training-mode with varying attacker knowledge, achieving state-of-the-art performance under realistic training conditions. We extend these efforts by presenting the first attack on a production-grade object-detection model. Here, to enable any visibly identifiable leakage, we revert to the lenient inference mode setting and make multiple architectural modifications to increase model vulnerability, with the extent of required changes highlighting the strong inherent robustness of such architectures. We conclude this work by offering the first comprehensive mapping of settings, clarifying which combinations of architectural choices and operational modes meaningfully impact privacy. Our analysis provides actionable insight into when models are likely vulnerable, when they appear robust, and where subtle leakage may persist. Together, these findings reframe how gradient inversion risk should be assessed in future research and deployment scenarios.

Related papers

Deep Leakage with Generative Flow Matching Denoiser [54.05993847488204]
We introduce a new deep leakage (DL) attack that integrates a generative Flow Matching (FM) prior into the reconstruction process. Our approach consistently outperforms state-of-the-art attacks across pixel-level, perceptual, and feature-based similarity metrics.
arXiv Detail & Related papers (2026-01-21T14:51:01Z)
Exploiting Edge Features for Transferable Adversarial Attacks in Distributed Machine Learning [54.26807397329468]
This work explores a previously overlooked vulnerability in distributed deep learning systems. An adversary who intercepts the intermediate features transmitted between them can still pose a serious threat. We propose an exploitation strategy specifically designed for distributed settings.
arXiv Detail & Related papers (2025-07-09T20:09:00Z)
SecureFed: A Two-Phase Framework for Detecting Malicious Clients in Federated Learning [0.0]
Federated Learning (FL) protects data privacy while providing a decentralized method for training models. Because of the distributed schema, it is susceptible to adversarial clients that could alter results or sabotage model performance. This study presents SecureFed, a two-phase FL framework for identifying and reducing the impact of such attackers.
arXiv Detail & Related papers (2025-06-19T16:52:48Z)
Task-Agnostic Attacks Against Vision Foundation Models [12.487589700031661]
It has become standard practice for machine learning practitioners to adopt publicly available pre-trained vision foundation models. The study of attacks on such foundation models and their impact on multiple downstream tasks remains vastly unexplored. This work proposes a general framework that forges task-agnostic adversarial examples by maximally disrupting the feature representation obtained with foundation models.
arXiv Detail & Related papers (2025-03-05T19:15:14Z)
Unlearning Backdoor Threats: Enhancing Backdoor Defense in Multimodal Contrastive Learning via Local Token Unlearning [49.242828934501986]
Multimodal contrastive learning has emerged as a powerful paradigm for building high-quality features. Backdoor attacks subtly embed malicious behaviors within the model during training. We introduce an innovative token-based localized forgetting training regime.
arXiv Detail & Related papers (2024-03-24T18:33:15Z)
Bounding Reconstruction Attack Success of Adversaries Without Data Priors [53.41619942066895]
Reconstruction attacks on machine learning (ML) models pose a strong risk of leakage of sensitive data. In this work, we provide formal upper bounds on reconstruction success under realistic adversarial settings.
arXiv Detail & Related papers (2024-02-20T09:52:30Z)
Enhancing Multiple Reliability Measures via Nuisance-extended Information Bottleneck [77.37409441129995]
In practical scenarios where training data is limited, many predictive signals in the data may instead stem from biases in data acquisition. We consider an adversarial threat model under a mutual information constraint to cover a wider class of perturbations in training. We propose an autoencoder-based training to implement the objective, as well as practical encoder designs to facilitate the proposed hybrid discriminative-generative training.
arXiv Detail & Related papers (2023-03-24T16:03:21Z)
Improving robustness of jet tagging algorithms with adversarial training [56.79800815519762]
We investigate the vulnerability of flavor tagging algorithms via application of adversarial attacks. We present an adversarial training strategy that mitigates the impact of such simulated attacks.
arXiv Detail & Related papers (2022-03-25T19:57:19Z)
Beyond Gradients: Exploiting Adversarial Priors in Model Inversion Attacks [7.49320945341034]
Collaborative machine learning settings can be susceptible to adversarial interference and attacks. One class of such attacks is termed model inversion attacks, characterised by the adversary reverse-engineering the model to extract representations. We propose a novel model inversion framework that builds on the foundations of gradient-based model inversion attacks.
arXiv Detail & Related papers (2022-03-01T14:22:29Z)
Thief, Beware of What Get You There: Towards Understanding Model Extraction Attack [13.28881502612207]
In some scenarios, AI models are trained proprietarily, where neither pre-trained models nor sufficient in-distribution data is publicly available. We find the effectiveness of existing techniques significantly affected by the absence of pre-trained models. We formulate model extraction attacks into an adaptive framework that captures these factors with deep reinforcement learning.
arXiv Detail & Related papers (2021-04-13T03:46:59Z)
Explainable Adversarial Attacks in Deep Neural Networks Using Activation Profiles [69.9674326582747]
This paper presents a visual framework to investigate neural network models subjected to adversarial examples. We show how observing these elements can quickly pinpoint exploited areas in a model.
arXiv Detail & Related papers (2021-03-18T13:04:21Z)

This list is automatically generated from the titles and abstracts of the papers on this site.
