Transformer-based Federated Learning for Multi-Label Remote Sensing Image Classification
- URL: http://arxiv.org/abs/2405.15405v1
- Date: Fri, 24 May 2024 10:13:49 GMT
- Title: Transformer-based Federated Learning for Multi-Label Remote Sensing Image Classification
- Authors: Barış Büyüktaş, Kenneth Weitzel, Sebastian Völkers, Felix Zailskas, Begüm Demir
- Abstract summary: We investigate the capability of state-of-the-art transformer architectures to address the challenges related to non-IID training data across various clients.
The considered transformer architectures increase the generalization ability at the cost of higher local training and aggregation complexities.
- Score: 2.3255040478777755
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Federated learning (FL) aims to collaboratively learn deep learning model parameters from decentralized data archives (i.e., clients) without accessing training data on the clients. However, the training data across clients might not be independent and identically distributed (non-IID), which can make it difficult to achieve optimal model convergence. In this work, we investigate the capability of state-of-the-art transformer architectures (MLP-Mixer, ConvMixer, and PoolFormer) to address the challenges related to non-IID training data across various clients in the context of FL for multi-label classification (MLC) problems in remote sensing (RS). The considered transformer architectures are compared among themselves and with the ResNet-50 architecture in terms of their: 1) robustness to training data heterogeneity; 2) local training complexity; and 3) aggregation complexity under different non-IID levels. The experimental results obtained on the BigEarthNet-S2 benchmark archive demonstrate that the considered architectures increase the generalization ability at the cost of higher local training and aggregation complexities. On the basis of our analysis, some guidelines are derived for the proper selection of a transformer architecture in the context of FL for RS MLC. The code of this work is publicly available at https://git.tu-berlin.de/rsim/FL-Transformer.
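In FL, each communication round ends with the server aggregating the locally updated model parameters; the paper's complexity comparison concerns exactly this local training and aggregation step. As a point of reference, here is a minimal FedAvg-style sketch of size-weighted parameter averaging, assuming models are exchanged as dicts of numpy arrays; the names and layout are illustrative and not taken from the paper's released code.

```python
import numpy as np

def fedavg_aggregate(client_params, client_sizes):
    """Size-weighted average of client parameters (FedAvg-style sketch).

    client_params: list of dicts mapping parameter name -> np.ndarray
    client_sizes:  number of local training samples per client, used
                   as the aggregation weights.
    """
    total = float(sum(client_sizes))
    weights = [n / total for n in client_sizes]
    return {
        name: sum(w * p[name] for w, p in zip(weights, client_params))
        for name in client_params[0]
    }

# Toy example: two clients sharing a single linear layer.
clients = [
    {"fc.weight": np.ones((4, 8)), "fc.bias": np.zeros(4)},
    {"fc.weight": np.zeros((4, 8)), "fc.bias": np.ones(4)},
]
global_params = fedavg_aggregate(clients, client_sizes=[300, 100])
print(global_params["fc.weight"][0, 0])  # 0.75 = 0.75*1 + 0.25*0
```

The cost of this step scales with the number of parameters exchanged per round, which ties the choice of backbone directly to the aggregation complexity the paper measures.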
Related papers
- Modality Alignment Meets Federated Broadcasting [9.752555511824593]
Federated learning (FL) has emerged as a powerful approach to safeguard data privacy by training models across distributed edge devices without centralizing local data.
This paper introduces a novel FL framework leveraging modality alignment, where a text encoder resides on the server, and image encoders operate on local devices.
arXiv Detail & Related papers (2024-11-24T13:30:03Z)
- FLASH: Federated Learning Across Simultaneous Heterogeneities [54.80435317208111]
FLASH (Federated Learning Across Simultaneous Heterogeneities) is a lightweight and flexible client selection algorithm.
It outperforms state-of-the-art FL frameworks under extensive sources of heterogeneity, achieving substantial and consistent improvements over state-of-the-art baselines.
arXiv Detail & Related papers (2024-02-13T20:04:39Z)
- FedConv: Enhancing Convolutional Neural Networks for Handling Data Heterogeneity in Federated Learning [34.37155882617201]
Federated learning (FL) is an emerging paradigm in machine learning, where a shared model is collaboratively learned using data from multiple devices.
We systematically investigate the impact of different architectural elements, such as activation functions and normalization layers, on the performance within heterogeneous FL.
Our findings indicate that with strategic architectural modifications, pure CNNs can achieve a level of robustness that either matches or even exceeds that of ViTs.
arXiv Detail & Related papers (2023-10-06T17:57:50Z)
- Towards Instance-adaptive Inference for Federated Learning [80.38701896056828]
Federated learning (FL) is a distributed learning paradigm that enables multiple clients to learn a powerful global model by aggregating local training updates.
In this paper, we present a novel FL algorithm, i.e., FedIns, to handle intra-client data heterogeneity by enabling instance-adaptive inference in the FL framework.
Our experiments show that our FedIns outperforms state-of-the-art FL algorithms, e.g., a 6.64% improvement against the top-performing method with less than 15% communication cost on Tiny-ImageNet.
arXiv Detail & Related papers (2023-08-11T09:58:47Z)
- SIM-Trans: Structure Information Modeling Transformer for Fine-grained Visual Categorization [59.732036564862796]
We propose the Structure Information Modeling Transformer (SIM-Trans) to incorporate object structure information into the transformer for enhanced discriminative representation learning.
The two proposed modules are lightweight, can be plugged into any transformer network, and are easily trained end-to-end.
Experiments and analyses demonstrate that the proposed SIM-Trans achieves state-of-the-art performance on fine-grained visual categorization benchmarks.
arXiv Detail & Related papers (2022-08-31T03:00:07Z)
- Heterogeneous Ensemble Knowledge Transfer for Training Large Models in Federated Learning [22.310090483499035]
Federated learning (FL) enables edge-devices to collaboratively learn a model without disclosing their private data to a central aggregating server.
Most existing FL algorithms require models of identical architecture to be deployed across the clients and server.
We propose a novel ensemble knowledge transfer method named Fed-ET, in which small models are trained on clients and then used to train a larger model at the server.
arXiv Detail & Related papers (2022-04-27T05:18:32Z)
- Build a Robust QA System with Transformer-based Mixture of Experts [0.29005223064604074]
We build a robust question answering system that can adapt to out-of-domain datasets.
We show that our combination of best architecture and data augmentation techniques achieves a 53.477 F1 score in the out-of-domain evaluation.
arXiv Detail & Related papers (2022-03-20T02:38:29Z)
- Robust Semi-supervised Federated Learning for Images Automatic Recognition in Internet of Drones [57.468730437381076]
We present a Semi-supervised Federated Learning (SSFL) framework for privacy-preserving UAV image recognition.
There are significant differences in the number, features, and distribution of local data collected by UAVs using different camera modules.
We propose an aggregation rule based on the frequency of each client's participation in training, namely the FedFreq aggregation rule (see the sketch below).
arXiv Detail & Related papers (2022-01-03T16:49:33Z)
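The FedFreq rule above weights clients by how often they have participated in training. A minimal sketch of one such weighting is given below; the proportional scheme and the function names are assumptions, since the paper's exact formula is not reproduced here.

```python
import numpy as np

def fedfreq_weights(participation_counts):
    # Assumption: weights proportional to each client's participation
    # count; the actual FedFreq formula may differ.
    counts = np.asarray(participation_counts, dtype=float)
    return counts / counts.sum()

def aggregate(client_params, weights):
    """Weighted average of per-client parameter dicts."""
    return {
        name: sum(w * p[name] for w, p in zip(weights, client_params))
        for name in client_params[0]
    }

# Client 0 has participated in 5 rounds, client 1 in only 1,
# so client 0's update dominates the aggregate.
params = [{"w": np.full(3, 1.0)}, {"w": np.full(3, -1.0)}]
print(aggregate(params, fedfreq_weights([5, 1]))["w"])  # ~[0.667 0.667 0.667]
```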
Federated Learning [53.73083199055093]
We show that attention-based architectures (e.g., Transformers) are fairly robust to distribution shifts.
Our experiments show that replacing convolutional networks with Transformers can greatly reduce catastrophic forgetting of previous devices.
arXiv Detail & Related papers (2021-06-10T21:04:18Z)
- HeteroFL: Computation and Communication Efficient Federated Learning for Heterogeneous Clients [42.365530133003816]
We propose a new federated learning framework named HeteroFL to address heterogeneous clients equipped with very different computation and communication capabilities.
Our solution can enable the training of heterogeneous local models with varying complexities.
We show that adaptively distributing subnetworks according to clients' capabilities is both computation and communication efficient (see the width-scaling sketch below).
arXiv Detail & Related papers (2020-10-03T02:55:33Z)
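HeteroFL lets weak clients train narrow slices of the global model. Below is a minimal width-scaling sketch: a client receives the leading channels of each global weight matrix according to its capability ratio. The slicing convention and names are assumptions, not the paper's released implementation.

```python
import numpy as np

def extract_submodel(global_weight, width_ratio):
    """Take the leading rows/columns of a global weight matrix so that a
    low-capability client trains a proportionally narrower layer."""
    out_dim = max(1, int(round(global_weight.shape[0] * width_ratio)))
    in_dim = max(1, int(round(global_weight.shape[1] * width_ratio)))
    return global_weight[:out_dim, :in_dim]

W_global = np.zeros((64, 128))
W_quarter = extract_submodel(W_global, width_ratio=0.25)
print(W_quarter.shape)  # (16, 32): a quarter-width slice for a weak client
```

On the server side, each parameter entry is then averaged over the clients whose slice actually contains it.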
- Ensemble Distillation for Robust Model Fusion in Federated Learning [72.61259487233214]
Federated Learning (FL) is a machine learning setting where many devices collaboratively train a machine learning model.
In most current training schemes, the central model is refined by averaging the parameters of the server model with the updated parameters from the client side.
We propose ensemble distillation for model fusion, i.e., training the central classifier on unlabeled data using the outputs of the client models (see the sketch below).
arXiv Detail & Related papers (2020-06-12T14:49:47Z)
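The fusion step described above can be sketched in a few lines: the central model is fitted, on unlabeled data, to the averaged soft predictions of the client models. The linear scorers and the plain gradient step below are illustrative stand-ins, not the paper's setup.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def distill_step(server_W, client_Ws, x_unlabeled, lr=0.1):
    """One gradient step pulling the server model toward the averaged
    client soft labels (cross-entropy on soft targets)."""
    teacher = np.mean([softmax(x_unlabeled @ W) for W in client_Ws], axis=0)
    student = softmax(x_unlabeled @ server_W)
    # For softmax cross-entropy, d(loss)/d(logits) = student - teacher.
    grad = x_unlabeled.T @ (student - teacher) / len(x_unlabeled)
    return server_W - lr * grad

rng = np.random.default_rng(0)
x_u = rng.normal(size=(32, 8))                 # unlabeled server-side data
client_Ws = [rng.normal(size=(8, 3)) for _ in range(4)]
W_server = np.zeros((8, 3))
for _ in range(200):
    W_server = distill_step(W_server, client_Ws, x_u)

teacher = np.mean([softmax(x_u @ W) for W in client_Ws], axis=0)
print(np.abs(softmax(x_u @ W_server) - teacher).max())  # residual shrinks as the server fits the ensemble
```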