MAUI: Reconstructing Private Client Data in Federated Transfer Learning
- URL: http://arxiv.org/abs/2509.11451v1
- Date: Sun, 14 Sep 2025 21:52:47 GMT
- Title: MAUI: Reconstructing Private Client Data in Federated Transfer Learning
- Authors: Ahaan Dabholkar, Atul Sharma, Z. Berkay Celik, Saurabh Bagchi
- Abstract summary: Data reconstruction attacks (DRAs) in federated learning (FL) show two key weaknesses. We propose MAUI, a stealthy DRA that does not require any overt manipulations to the model architecture or weights. MAUI significantly outperforms prior DRAs in reconstruction quality, achieving 40-120% higher PSNR scores.
- Score: 18.64239297166542
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent works in federated learning (FL) have shown the utility of leveraging transfer learning for balancing the benefits of FL and centralized learning. In this setting, federated training happens after a stable point has been reached through conventional training. Global model weights are first centrally pretrained by the server on a public dataset, after which only the last few linear layers (the classification head) of the model are finetuned across clients. In this scenario, existing data reconstruction attacks (DRAs) in FL show two key weaknesses. First, strongly input-correlated gradient information from the initial model layers is never shared, significantly degrading reconstruction accuracy. Second, DRAs in which the server makes highly specific, handcrafted manipulations to the model structure or parameters (e.g., layers with all-zero weights, identity mappings, and rows with identical weight patterns) are easily detectable by an active client. Improving on these, we propose MAUI, a stealthy DRA that does not require any overt manipulations to the model architecture or weights, and relies solely on the gradients of the classification head. MAUI first extracts "robust" feature representations of the input batch from the gradients of the classification head and subsequently inverts these representations to the original inputs. We report highly accurate reconstructions on the CIFAR10 and ImageNet datasets on a variety of model architectures including convolutional networks (CNN, VGG11), ResNets (18, 50), ShuffleNet-V2 and Vision Transformer (ViT B-32), regardless of the batch size. MAUI significantly outperforms prior DRAs in reconstruction quality, achieving 40-120% higher PSNR scores.
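To make the abstract's claim concrete that classification-head gradients alone can leak feature representations, here is a minimal sketch of a well-known identity. It is not MAUI's method. For a single sample passing through a linear layer z = W x + b, the weight gradient is the outer product of dL/dz and x, and the bias gradient is dL/dz, so dividing a row of the weight gradient by the matching bias-gradient entry returns x. The function names (head_gradients, recover_features), the dimensions, and the softmax cross-entropy loss are illustrative assumptions. The sketch covers batch size 1 only and does not show the batch handling or the feature-to-image inversion the abstract describes.

```python
import math
import random


def head_gradients(W, b, x, label):
    """Gradients of softmax cross-entropy w.r.t. a linear head (W, b) for ONE sample.

    Hypothetical helper for illustration; returns (dL/dW, dL/db).
    """
    logits = [sum(w * xj for w, xj in zip(row, x)) + bi for row, bi in zip(W, b)]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    dz = [e / total for e in exps]  # softmax probabilities
    dz[label] -= 1.0                # dL/dlogits = softmax - one_hot(label)
    grad_W = [[d * xj for xj in x] for d in dz]  # outer product dz * x^T
    return grad_W, dz               # dL/db equals dL/dlogits


def recover_features(grad_W, grad_b):
    """Recover the head's input from its gradients (single-sample case only)."""
    # Use the row with the largest bias gradient to avoid dividing by ~0.
    i = max(range(len(grad_b)), key=lambda k: abs(grad_b[k]))
    return [g / grad_b[i] for g in grad_W[i]]


random.seed(0)
num_classes, feat_dim = 10, 32  # assumed sizes, chosen for illustration
W = [[random.gauss(0, 1) for _ in range(feat_dim)] for _ in range(num_classes)]
b = [random.gauss(0, 1) for _ in range(num_classes)]
x = [random.gauss(0, 1) for _ in range(feat_dim)]  # stand-in for backbone features

grad_W, grad_b = head_gradients(W, b, x, label=3)
x_hat = recover_features(grad_W, grad_b)
max_err = max(abs(a - c) for a, c in zip(x, x_hat))
```

With more than one sample per batch, the shared gradient is a sum of such outer products and this simple division no longer isolates individual feature vectors, which is the harder setting the abstract targets.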
Related papers
- Update Your Transformer to the Latest Release: Re-Basin of Task Vectors [27.63078324151366]
Foundation models serve as the backbone for numerous specialized models developed through fine-tuning. When the underlying pretrained model is updated or retrained, the fine-tuned model becomes obsolete. This raises the question: is it possible to transfer fine-tuning to a new release of the model? In this work, we investigate how to transfer fine-tuning to a new checkpoint without having to re-train, in a data-free manner.
arXiv Detail & Related papers (2025-05-28T13:55:12Z) - Is Self-Supervision Enough? Benchmarking Foundation Models Against End-to-End Training for Mitotic Figure Classification [0.37334049820361814]
Foundation models (FMs) have become popular and available recently for the domain of histopathology. In this work, we investigate to which degree this also holds for mitotic figure classification. We found that the end-to-end-trained baseline outperformed all FM-based classifiers, regardless of the amount of data provided.
arXiv Detail & Related papers (2024-12-09T10:35:39Z) - Sparse Concept Bottleneck Models: Gumbel Tricks in Contrastive Learning [86.15009879251386]
We propose a novel architecture and method of explainable classification with Concept Bottleneck Models (CBMs).
CBMs require an additional set of concepts to leverage.
We show a significant increase in accuracy using sparse hidden layers in CLIP-based bottleneck models.
arXiv Detail & Related papers (2024-04-04T09:43:43Z) - MixtureGrowth: Growing Neural Networks by Recombining Learned Parameters [19.358670728803336]
Most deep neural networks are trained under fixed network architectures and require retraining when the architecture changes.
To avoid this, one can grow from a small network by adding random weights over time to gradually achieve the target network size.
This naive approach falls short in practice as it brings too much noise to the growing process.
arXiv Detail & Related papers (2023-11-07T11:37:08Z) - Maximum Knowledge Orthogonality Reconstruction with Gradients in Federated Learning [12.709670487307294]
Federated learning (FL) aims at keeping client data local to preserve privacy.
Most existing FL approaches assume an FL setting with unrealistically small batch size.
We propose a novel and completely analytical approach to reconstruct clients' input data.
arXiv Detail & Related papers (2023-10-30T02:01:48Z) - NORM: Knowledge Distillation via N-to-One Representation Matching [18.973254404242507]
We present a new two-stage knowledge distillation method, which relies on a simple Feature Transform (FT) module consisting of two linear layers.
In view of preserving the intact information learnt by the teacher network, our FT module is merely inserted after the last convolutional layer of the student network.
By sequentially splitting the expanded student representation into N non-overlapping feature segments having the same number of feature channels as the teacher's, they can be readily forced to approximate the intact teacher representation simultaneously.
arXiv Detail & Related papers (2023-05-23T08:15:45Z) - Revisiting Class-Incremental Learning with Pre-Trained Models: Generalizability and Adaptivity are All You Need [84.3507610522086]
Class-incremental learning (CIL) aims to adapt to emerging new classes without forgetting old ones.
Recent pre-training has achieved substantial progress, making vast pre-trained models (PTMs) accessible for CIL.
We argue that the core factors in CIL are adaptivity for model updating and generalizability for knowledge transferring.
arXiv Detail & Related papers (2023-03-13T17:59:02Z) - BAFFLE: A Baseline of Backpropagation-Free Federated Learning [71.09425114547055]
Federated learning (FL) is a general principle for decentralized clients to train a server model collectively without sharing local data.
We develop backpropagation-free federated learning, dubbed BAFFLE, in which backpropagation is replaced by multiple forward processes to estimate gradients.
BAFFLE is 1) memory-efficient and easily fits uploading bandwidth; 2) compatible with inference-only hardware optimization and model quantization or pruning; and 3) well-suited to trusted execution environments.
arXiv Detail & Related papers (2023-01-28T13:34:36Z) - Improved Convergence Guarantees for Shallow Neural Networks [91.3755431537592]
We prove convergence of depth 2 neural networks, trained via gradient descent, to a global minimum.
Our model has the following features: regression with quadratic loss function, fully connected feedforward architecture, ReLU activations, Gaussian data instances, adversarial labels.
They strongly suggest that, at least in our model, the convergence phenomenon extends well beyond the NTK regime.
arXiv Detail & Related papers (2022-12-05T14:47:52Z) - Fast-iTPN: Integrally Pre-Trained Transformer Pyramid Network with Token Migration [138.24994198567794]
iTPN is born with two elaborated designs: 1) The first pre-trained feature pyramid upon vision transformer (ViT)
Fast-iTPN can accelerate the inference procedure by up to 70%, with negligible performance loss.
arXiv Detail & Related papers (2022-11-23T06:56:12Z) - FedAvg with Fine Tuning: Local Updates Lead to Representation Learning [54.65133770989836]
Federated Averaging (FedAvg) algorithm consists of alternating between a few local gradient updates at client nodes, followed by a model averaging update at the server.
We show that the reason behind generalizability of the FedAvg's output is its power in learning the common data representation among the clients' tasks.
We also provide empirical evidence demonstrating FedAvg's representation learning ability in federated image classification with heterogeneous data.
arXiv Detail & Related papers (2022-05-27T00:55:24Z) - How Well Do Sparse Imagenet Models Transfer? [75.98123173154605]
Transfer learning is a classic paradigm by which models pretrained on large "upstream" datasets are adapted to yield good results on "downstream" datasets.
In this work, we perform an in-depth investigation of this phenomenon in the context of convolutional neural networks (CNNs) trained on the ImageNet dataset.
We show that sparse models can match or even outperform the transfer performance of dense models, even at high sparsities.
arXiv Detail & Related papers (2021-11-26T11:58:51Z) - Train your classifier first: Cascade Neural Networks Training from upper layers to lower layers [54.47911829539919]
We develop a novel top-down training method which can be viewed as an algorithm for searching for high-quality classifiers.
We tested this method on automatic speech recognition (ASR) tasks and language modelling tasks.
The proposed method consistently improves recurrent neural network ASR models on Wall Street Journal, self-attention ASR models on Switchboard, and AWD-LSTM language models on WikiText-2.
arXiv Detail & Related papers (2021-02-09T08:19:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.