FedSA: A Unified Representation Learning via Semantic Anchors for Prototype-based Federated Learning
- URL: http://arxiv.org/abs/2501.05496v1
- Date: Thu, 09 Jan 2025 16:10:03 GMT
- Title: FedSA: A Unified Representation Learning via Semantic Anchors for Prototype-based Federated Learning
- Authors: Yanbing Zhou, Xiangmou Qu, Chenlong You, Jiyang Zhou, Jingyue Tang, Xin Zheng, Chunmao Cai, Yingbo Wu
- Abstract summary: We propose a novel framework named Federated Learning via Semantic Anchors (FedSA) to decouple the generation of prototypes from local representation learning.
FedSA significantly outperforms existing prototype-based FL methods on various classification tasks.
- Score: 4.244188591221394
- Abstract: Prototype-based federated learning has emerged as a promising approach that shares lightweight prototypes to transfer knowledge among clients with data heterogeneity in a model-agnostic manner. However, existing methods often collect prototypes directly from local models, which inevitably introduce inconsistencies into representation learning due to the biased data distributions and differing model architectures among clients. In this paper, we identify that both statistical and model heterogeneity create a vicious cycle of representation inconsistency, classifier divergence, and skewed prototype alignment, which negatively impacts the performance of clients. To break the vicious cycle, we propose a novel framework named Federated Learning via Semantic Anchors (FedSA) to decouple the generation of prototypes from local representation learning. We introduce a novel perspective that uses simple yet effective semantic anchors serving as prototypes to guide local models in learning consistent representations. By incorporating semantic anchors, we further propose anchor-based regularization with margin-enhanced contrastive learning and anchor-based classifier calibration to correct feature extractors and calibrate classifiers across clients, achieving intra-class compactness and inter-class separability of prototypes while ensuring consistent decision boundaries. We then update the semantic anchors with these consistent and discriminative prototypes, which iteratively encourage clients to collaboratively learn a unified data representation with robust generalization. Extensive experiments under both statistical and model heterogeneity settings show that FedSA significantly outperforms existing prototype-based FL methods on various classification tasks.
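The abstract describes anchor-based regularization with margin-enhanced contrastive learning but does not give the loss itself. The following is a minimal sketch of one common way such a term could be written, not the authors' formulation: each feature is compared to every class anchor by cosine similarity, a margin is subtracted from the similarity to its own class anchor, and a softmax cross-entropy is taken over the anchors. The function names, the `margin` and `temperature` defaults, and the use of class-wise feature means as prototypes are all assumptions made for illustration.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def class_prototypes(features, labels, num_classes):
    """Hypothetical helper: mean feature per class.

    Classes with no local samples are left as NaN rows so the caller
    can decide how to handle them.
    """
    dim = features.shape[1]
    protos = np.full((num_classes, dim), np.nan)
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            protos[c] = features[mask].mean(axis=0)
    return protos


def anchor_margin_contrastive_loss(features, labels, anchors,
                                   margin=0.2, temperature=0.5):
    """Assumed form of a margin-based contrastive loss against class anchors.

    features: (N, d) local representations
    labels:   (N,)   integer class ids
    anchors:  (C, d) one semantic anchor per class
    """
    z = l2_normalize(features)
    a = l2_normalize(anchors)
    sims = z @ a.T                        # (N, C) cosine similarities
    rows = np.arange(len(labels))
    sims[rows, labels] -= margin          # penalize the true-class similarity
    logits = sims / temperature
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-log_probs[rows, labels].mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    anchors = rng.normal(size=(3, 8))             # 3 classes, 8-dim features
    labels = np.array([0, 1, 2, 0])
    features = anchors[labels] + 0.1 * rng.normal(size=(4, 8))
    print(anchor_margin_contrastive_loss(features, labels, anchors))
    print(class_prototypes(features, labels, num_classes=3).shape)
```

In this sketch the margin makes the objective stricter: a feature must be closer to its own anchor than to the others by at least `margin` before its loss falls to the no-margin level, which is one way to encourage the intra-class compactness and inter-class separability the abstract mentions.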
Related papers
- Learning Clustering-based Prototypes for Compositional Zero-shot Learning [56.57299428499455]
ClusPro is a robust clustering-based prototype mining framework for Compositional Zero-Shot Learning.
It defines the conceptual boundaries of primitives through a set of diversified prototypes.
ClusPro efficiently performs prototype clustering in a non-parametric fashion without the introduction of additional learnable parameters.
arXiv Detail & Related papers (2025-02-10T14:20:01Z)
- Task Groupings Regularization: Data-Free Meta-Learning with Heterogeneous Pre-trained Models [83.02797560769285]
Data-Free Meta-Learning (DFML) aims to derive knowledge from a collection of pre-trained models without accessing their original data.
Current methods often overlook the heterogeneity among pre-trained models, which leads to performance degradation due to task conflicts.
arXiv Detail & Related papers (2024-05-26T13:11:55Z)
- A Probabilistic Model Behind Self-Supervised Learning [53.64989127914936]
In self-supervised learning (SSL), representations are learned via an auxiliary task without annotated labels.
We present a generative latent variable model for self-supervised learning.
We show that several families of discriminative SSL, including contrastive methods, induce a comparable distribution over representations.
arXiv Detail & Related papers (2024-02-02T13:31:17Z)
- Dynamic Heterogeneous Federated Learning with Multi-Level Prototypes [45.13348636579529]
We study the new task, i.e., Dynamic Heterogeneous Federated Learning (DHFL), which addresses the practical scenario where heterogeneous data distributions exist among different clients and dynamic tasks within the client.
To mitigate concept drift, we construct prototypes and semantic prototypes to provide fruitful generalization knowledge and ensure the continuity of prototype spaces.
Extensive experiments show that the proposed method achieves state-of-the-art performance in various settings.
arXiv Detail & Related papers (2023-12-15T15:28:25Z)
- A Prototypical Semantic Decoupling Method via Joint Contrastive Learning for Few-Shot Name Entity Recognition [24.916377682689955]
Few-shot named entity recognition (NER) aims at identifying named entities based on only few labeled instances.
We propose a Prototypical Semantic Decoupling method via joint Contrastive learning (PSDC) for few-shot NER.
Experimental results on two few-shot NER benchmarks demonstrate that PSDC consistently outperforms the previous SOTA methods in terms of overall performance.
arXiv Detail & Related papers (2023-02-27T09:20:00Z)
- ProtoVAE: A Trustworthy Self-Explainable Prototypical Variational Model [18.537838366377915]
ProtoVAE is a variational autoencoder-based framework that learns class-specific prototypes in an end-to-end manner.
It enforces trustworthiness and diversity by regularizing the representation space and introducing an orthonormality constraint.
arXiv Detail & Related papers (2022-10-15T00:42:13Z)
- Generalization Properties of Retrieval-based Models [50.35325326050263]
Retrieval-based machine learning methods have enjoyed success on a wide range of problems.
Despite growing literature showcasing the promise of these models, the theoretical underpinning for such models remains underexplored.
We present a formal treatment of retrieval-based models to characterize their generalization ability.
arXiv Detail & Related papers (2022-10-06T00:33:01Z)
- A Closer Look at Personalization in Federated Image Classification [33.27317065917578]
Federated Learning (FL) is developed to learn a single global model across the decentralized data.
This paper shows that it is possible to achieve flexible personalization after the convergence of the global model.
We propose RepPer, an independent two-stage personalized FL framework.
arXiv Detail & Related papers (2022-04-22T06:32:18Z)
- Prototypical Representation Learning for Relation Extraction [56.501332067073065]
This paper aims to learn predictive, interpretable, and robust relation representations from distantly-labeled data.
We learn prototypes for each relation from contextual information to best explore the intrinsic semantics of relations.
Results on several relation learning tasks show that our model significantly outperforms the previous state-of-the-art relational models.
arXiv Detail & Related papers (2021-03-22T08:11:43Z)
- Prototypical Contrastive Learning of Unsupervised Representations [171.3046900127166]
Prototypical Contrastive Learning (PCL) is an unsupervised representation learning method.
PCL implicitly encodes semantic structures of the data into the learned embedding space.
PCL outperforms state-of-the-art instance-wise contrastive learning methods on multiple benchmarks.
arXiv Detail & Related papers (2020-05-11T09:53:36Z)
- From Anchor Generation to Distribution Alignment: Learning a Discriminative Embedding Space for Zero-Shot Recognition [46.47620562161315]
In zero-shot learning (ZSL), the samples to be classified are usually projected into side information templates such as attributes.
We propose a novel framework called Discriminative Anchor Generation and Distribution Alignment Model (DAGDA).
Firstly, in order to rectify the distribution of original templates, a diffusion based graph convolutional network, which can explicitly model the interaction between class and side information, is proposed to produce discriminative anchors.
Secondly, to further align the samples with the corresponding anchors in anchor space, which aims to refine the distribution in a fine-grained manner, we introduce a semantic relation regularization
arXiv Detail & Related papers (2020-02-10T05:25:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.