One-shot Robust Federated Learning of Independent Component Analysis
- URL: http://arxiv.org/abs/2505.20532v1
- Date: Mon, 26 May 2025 21:37:19 GMT
- Title: One-shot Robust Federated Learning of Independent Component Analysis
- Authors: Dian Jin, Xin Bing, Yuqian Zhang
- Abstract summary: We propose a geometric median-based aggregation algorithm that leverages $k$-means clustering to resolve the permutation ambiguity in local client estimations. Our method first performs $k$-means to partition client-provided estimators into clusters and then aggregates the estimators within each cluster using the geometric median.
- Score: 16.462282750354408
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper investigates a general robust one-shot aggregation framework for the distributed and federated Independent Component Analysis (ICA) problem. We propose a geometric median-based aggregation algorithm that leverages $k$-means clustering to resolve the permutation ambiguity in local client estimations. Our method first performs $k$-means to partition client-provided estimators into clusters and then aggregates the estimators within each cluster using the geometric median. This approach provably remains effective even in highly heterogeneous scenarios where up to half of the clients observe only a minimal number of samples. The key theoretical contribution lies in the combined analysis of the geometric median's error bound, aided by sample quantiles, and the maximum misclustering rate of the resulting $k$-means solution. The effectiveness of the proposed approach is further supported by simulation studies conducted under various heterogeneous settings.
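To make the aggregation pipeline concrete, the sketch below illustrates one way to implement the two stages described in the abstract: pool the component estimates returned by the clients, cluster them with $k$-means to undo the column permutation, and aggregate within each cluster via the geometric median (computed with Weiszfeld iterations). The function names, the use of scikit-learn's KMeans, and the omission of sign alignment are assumptions made for illustration, not the authors' exact implementation.

```python
# Illustrative sketch only; not the authors' implementation.
import numpy as np
from sklearn.cluster import KMeans


def geometric_median(points, n_iter=100, tol=1e-8):
    """Weiszfeld iterations for the geometric median of the rows of `points`."""
    y = points.mean(axis=0)
    for _ in range(n_iter):
        d = np.linalg.norm(points - y, axis=1)
        d = np.maximum(d, 1e-12)  # guard against division by zero
        w = 1.0 / d
        y_new = (w[:, None] * points).sum(axis=0) / w.sum()
        if np.linalg.norm(y_new - y) < tol:
            break
        y = y_new
    return y


def aggregate_ica_estimates(client_estimates, r, seed=0):
    """One-shot aggregation of client ICA estimates (hypothetical helper).

    client_estimates: list of (d, r) matrices whose columns are estimated
    components, each in an unknown column order (sign ambiguity ignored here).
    """
    # Pool all estimated components as rows of a single array.
    pooled = np.vstack([est.T for est in client_estimates])  # (m * r, d)
    # k-means with k = r groups copies of the same component together.
    labels = KMeans(n_clusters=r, n_init=10, random_state=seed).fit_predict(pooled)
    # Robust aggregation inside each cluster via the geometric median.
    return np.column_stack(
        [geometric_median(pooled[labels == c]) for c in range(r)]
    )  # (d, r)
```

Because the geometric median has a breakdown point of one half, the per-cluster aggregate stays informative even when a large fraction of the pooled estimates come from clients with very few samples, which is the heterogeneous regime the abstract targets.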
Related papers
- A structured regression approach for evaluating model performance across intersectional subgroups [53.91682617836498]
Disaggregated evaluation is a central task in AI fairness assessment, where the goal is to measure an AI system's performance across different subgroups.
We introduce a structured regression approach to disaggregated evaluation that we demonstrate can yield reliable system performance estimates even for very small subgroups.
arXiv Detail & Related papers (2024-01-26T14:21:45Z) - Dirichlet Process-based Robust Clustering using the Median-of-Means Estimator [16.774378814288806]
We propose an efficient and automatic clustering technique by integrating the strengths of model-based and centroid-based methodologies. Our method mitigates the effect of noise on the quality of clustering while, at the same time, estimating the number of clusters.
arXiv Detail & Related papers (2023-11-26T19:01:15Z) - Evolution of $K$-means solution landscapes with the addition of dataset outliers and a robust clustering comparison measure for their analysis [0.0]
We use the energy landscape approach to map the change in $K$-means solution space as a result of increasing dataset outliers.
Kinetic analysis reveals that in all cases the overall funnel is composed of shallow locally-funnelled regions.
We propose that the rates obtained from kinetic analysis provide a novel measure of clustering similarity.
arXiv Detail & Related papers (2023-06-25T21:22:21Z) - Simple and Scalable Algorithms for Cluster-Aware Precision Medicine [0.0]
We propose a simple and scalable approach to joint clustering and embedding.
This novel, cluster-aware embedding approach overcomes the complexity and limitations of current joint embedding and clustering methods.
Our approach does not require the user to choose the desired number of clusters, but instead yields interpretable dendrograms of hierarchically clustered embeddings.
arXiv Detail & Related papers (2022-11-29T19:27:26Z) - A One-shot Framework for Distributed Clustered Learning in Heterogeneous Environments [54.172993875654015]
The paper proposes a family of communication-efficient methods for distributed learning in heterogeneous environments.
A one-shot approach, based on local computations at the users and a clustering-based aggregation step at the server, is shown to provide strong learning guarantees.
For strongly convex problems, it is shown that, as long as the number of data points per user is above a threshold, the proposed approach achieves order-optimal mean-squared error rates in terms of the sample size.
arXiv Detail & Related papers (2022-09-22T09:04:10Z) - Gradient Based Clustering [72.15857783681658]
We propose a general approach for distance-based clustering, using the gradient of the cost function that measures clustering quality.
The approach is an iterative two-step procedure, alternating between cluster assignment and cluster-center updates, and is applicable to a wide range of cost functions; a minimal sketch of this alternation appears after this list.
arXiv Detail & Related papers (2022-02-01T19:31:15Z) - Differentially-Private Clustering of Easy Instances [67.04951703461657]
In differentially private clustering, the goal is to identify $k$ cluster centers without disclosing information on individual data points.
We give implementable differentially private clustering algorithms that provide utility when the data is "easy".
We propose a framework that allows us to apply non-private clustering algorithms to the easy instances and privately combine the results.
arXiv Detail & Related papers (2021-12-29T08:13:56Z) - Uniform Concentration Bounds toward a Unified Framework for Robust Clustering [21.789311405437573]
Recent advances in center-based clustering continue to improve upon the drawbacks of Lloyd's celebrated $k$-means algorithm.
Various methods seek to address poor local minima, sensitivity to outliers, and data that are not well-suited to Euclidean measures of fit.
This paper proposes a cohesive robust framework for center-based clustering under a general class of dissimilarity measures.
arXiv Detail & Related papers (2021-10-27T03:43:44Z) - Tackling the Objective Inconsistency Problem in Heterogeneous Federated Optimization [93.78811018928583]
This paper provides a framework to analyze the convergence of federated heterogeneous optimization algorithms.
We propose FedNova, a normalized averaging method that eliminates objective inconsistency while preserving fast error convergence.
arXiv Detail & Related papers (2020-07-15T05:01:23Z) - Computationally efficient sparse clustering [67.95910835079825]
We provide a finite sample analysis of a new clustering algorithm based on PCA.
We show that it achieves the minimax optimal misclustering rate in the regime $\|\theta\| \to \infty$.
arXiv Detail & Related papers (2020-05-21T17:51:30Z) - Statistical power for cluster analysis [0.0]
Cluster algorithms are increasingly popular in biomedical research.
We estimate power and accuracy for common analyses through simulation.
We recommend that researchers only apply cluster analysis when large subgroup separation is expected.
arXiv Detail & Related papers (2020-03-01T02:43:15Z)
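As referenced in the "Gradient Based Clustering" entry above, the alternation between an assignment step and a gradient step on the clustering cost can be sketched as follows. The squared-Euclidean cost, the step size, and the function name are assumptions chosen for illustration; the cited paper covers a broader class of cost functions.

```python
# Illustrative sketch only, assuming a squared-Euclidean clustering cost.
import numpy as np


def gradient_clustering(X, k, lr=0.2, n_iter=200, seed=0):
    """Alternate nearest-center assignment with a gradient step on the cost
    sum_i ||x_i - c_{z_i}||^2, instead of Lloyd's exact mean update."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest current center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Update step: one averaged gradient step per center.
        for j in range(k):
            members = X[labels == j]
            if len(members) == 0:
                continue
            grad = 2.0 * (centers[j] - members.mean(axis=0))
            centers[j] -= lr * grad
    return centers, labels
```

With lr = 0.5 the update coincides with the usual $k$-means mean update; smaller step sizes move each center more conservatively per iteration.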