Related papers: Navigating Data Heterogeneity in Federated Learning A Semi-Supervised Federated Object Detection

Navigating Data Heterogeneity in Federated Learning A Semi-Supervised Federated Object Detection

URL: http://arxiv.org/abs/2310.17097v3
Date: Wed, 3 Jan 2024 01:03:58 GMT
Title: Navigating Data Heterogeneity in Federated Learning A Semi-Supervised Federated Object Detection
Authors: Taehyeon Kim, Eric Lin, Junu Lee, Christian Lau, Vaikkunth Mugunthan
Abstract summary: Federated Learning (FL) has emerged as a potent framework for training models across distributed data sources. It faces challenges with limited high-quality labels and non-IID client data, particularly in applications like autonomous driving. We present a pioneering SSFOD framework, designed for scenarios where labeled data reside only at the server while clients possess unlabeled data.
Score: 3.7398615061365206
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Federated Learning (FL) has emerged as a potent framework for training models across distributed data sources while maintaining data privacy. Nevertheless, it faces challenges with limited high-quality labels and non-IID client data, particularly in applications like autonomous driving. To address these hurdles, we navigate the uncharted waters of Semi-Supervised Federated Object Detection (SSFOD). We present a pioneering SSFOD framework, designed for scenarios where labeled data reside only at the server while clients possess unlabeled data. Notably, our method represents the inaugural implementation of SSFOD for clients with 0% labeled non-IID data, a stark contrast to previous studies that maintain some subset of labels at each client. We propose FedSTO, a two-stage strategy encompassing Selective Training followed by Orthogonally enhanced full-parameter training, to effectively address data shift (e.g. weather conditions) between server and clients. Our contributions include selectively refining the backbone of the detector to avert overfitting, orthogonality regularization to boost representation divergence, and local EMA-driven pseudo label assignment to yield high-quality pseudo labels. Extensive validation on prominent autonomous driving datasets (BDD100K, Cityscapes, and SODA10M) attests to the efficacy of our approach, demonstrating state-of-the-art results. Remarkably, FedSTO, using just 20-30% of labels, performs nearly as well as fully-supervised centralized training methods.

Related papers

Semi-Supervised Federated Learning via Dual Contrastive Learning and Soft Labeling for Intelligent Fault Diagnosis [30.60728200709919]
This paper proposes a semi-supervised federated learning framework, SSFL-DCSL.<n>It integrates dual contrastive loss and soft labeling to address data and label scarcity for distributed clients.<n>It can improve accuracy by 1.15% to 7.85% over state-of-the-art methods.
arXiv Detail & Related papers (2025-07-12T10:54:23Z)
Diffusion Model-Based Data Synthesis Aided Federated Semi-Supervised Learning [33.570347678194494]
Federated semi-supervised learning (FSSL) is primarily challenged by two factors: the scarcity of labeled data across clients and the non-independent and identically distribution (non-IID) nature of data among clients. We propose a novel approach, diffusion model-based data synthesis aided FSSL (DDSA-FSSL), which utilizes a diffusion model (DM) to generate synthetic data.
arXiv Detail & Related papers (2025-01-04T07:38:15Z)
Optimizing Federated Learning by Entropy-Based Client Selection [13.851391819710367]
Deep learning domains typically require an extensive amount of data for optimal performance. FedOptEnt is designed to mitigate performance issues caused by label distribution skew. The proposed method outperforms several state-of-the-art algorithms by up to 6% in classification accuracy.
arXiv Detail & Related papers (2024-11-02T13:31:36Z)
FedAnchor: Enhancing Federated Semi-Supervised Learning with Label Contrastive Loss for Unlabeled Clients [19.3885479917635]
Federated learning (FL) is a distributed learning paradigm that facilitates collaborative training of a shared global model across devices. We propose FedAnchor, an innovative FSSL method that introduces a unique double-head structure, called anchor head, paired with the classification head trained exclusively on labeled anchor data on the server. Our approach mitigates the confirmation bias and overfitting issues associated with pseudo-labeling techniques based on high-confidence model prediction samples.
arXiv Detail & Related papers (2024-02-15T18:48:21Z)
Investigation of Federated Learning Algorithms for Retinal Optical Coherence Tomography Image Classification with Statistical Heterogeneity [6.318288071829899]
We investigate the effectiveness of FedAvg and FedProx to train an OCT image classification model in a decentralized fashion. We partitioned a publicly available OCT dataset across multiple clients under IID and Non-IID settings and conducted local training on the subsets for each client.
arXiv Detail & Related papers (2024-02-15T15:58:42Z)
Combating Data Imbalances in Federated Semi-supervised Learning with Dual Regulators [40.12377870379059]
Federated semi-supervised learning (FSSL) emerges to train models from a small fraction of labeled data. We propose a novel FSSL framework with dual regulators, FedDure. We show that FedDure is superior to the existing methods across a wide range of settings.
arXiv Detail & Related papers (2023-07-11T15:45:03Z)
Exploring One-shot Semi-supervised Federated Learning with A Pre-trained Diffusion Model [40.83058938096914]
We propose FedDISC, a Federated Diffusion-Inspired Semi-supervised Co-training method. We first extract prototypes of the labeled server data and use these prototypes to predict pseudo-labels of the client data. For each category, we compute the cluster centroids and domain-specific representations to signify the semantic and stylistic information of their distributions. These representations are sent back to the server, which uses the pre-trained to generate synthetic datasets complying with the client distributions and train a global model on it.
arXiv Detail & Related papers (2023-05-06T14:22:33Z)
Cluster-level pseudo-labelling for source-free cross-domain facial expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER) Our method exploits self-supervised pretraining to learn good feature representations from the target data. We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
arXiv Detail & Related papers (2022-10-11T08:24:50Z)
Rethinking Data Heterogeneity in Federated Learning: Introducing a New Notion and Standard Benchmarks [65.34113135080105]
We show that not only the issue of data heterogeneity in current setups is not necessarily a problem but also in fact it can be beneficial for the FL participants. Our observations are intuitive. Our code is available at https://github.com/MMorafah/FL-SC-NIID.
arXiv Detail & Related papers (2022-09-30T17:15:19Z)
Towards Fair Federated Learning with Zero-Shot Data Augmentation [123.37082242750866]
Federated learning has emerged as an important distributed learning paradigm, where a server aggregates a global model from many client-trained models while having no access to the client data. We propose a novel federated learning system that employs zero-shot data augmentation on under-represented data to mitigate statistical heterogeneity and encourage more uniform accuracy performance across clients in federated networks. We study two variants of this scheme, Fed-ZDAC (federated learning with zero-shot data augmentation at the clients) and Fed-ZDAS (federated learning with zero-shot data augmentation at the server).
arXiv Detail & Related papers (2021-04-27T18:23:54Z)
Dual-Refinement: Joint Label and Feature Refinement for Unsupervised Domain Adaptive Person Re-Identification [51.98150752331922]
Unsupervised domain adaptive (UDA) person re-identification (re-ID) is a challenging task due to the missing of labels for the target domain data. We propose a novel approach, called Dual-Refinement, that jointly refines pseudo labels at the off-line clustering phase and features at the on-line training phase. Our method outperforms the state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2020-12-26T07:35:35Z)
Uncertainty-aware Self-training for Text Classification with Few Labels [54.13279574908808]
We study self-training as one of the earliest semi-supervised learning approaches to reduce the annotation bottleneck. We propose an approach to improve self-training by incorporating uncertainty estimates of the underlying neural network. We show our methods leveraging only 20-30 labeled samples per class for each task for training and for validation can perform within 3% of fully supervised pre-trained language models.
arXiv Detail & Related papers (2020-06-27T08:13:58Z)
Federated Semi-Supervised Learning with Inter-Client Consistency & Disjoint Learning [78.88007892742438]
We study two essential scenarios of Federated Semi-Supervised Learning (FSSL) based on the location of the labeled data. We propose a novel method to tackle the problems, which we refer to as Federated Matching (FedMatch)
arXiv Detail & Related papers (2020-06-22T09:43:41Z)

This list is automatically generated from the titles and abstracts of the papers in this site.