Federated Learning Algorithms for Generalized Mixed-effects Model (GLMM)
on Horizontally Partitioned Data from Distributed Sources
- URL: http://arxiv.org/abs/2109.14046v1
- Date: Tue, 28 Sep 2021 21:01:30 GMT
- Title: Federated Learning Algorithms for Generalized Mixed-effects Model (GLMM)
on Horizontally Partitioned Data from Distributed Sources
- Authors: Wentao Li, Jiayi Tong, Md. Monowar Anjum, Noman Mohammed, Yong Chen,
  Xiaoqian Jiang
- Abstract summary: This paper develops two algorithms to achieve federated generalized linear mixed effect models (GLMM).
The log-likelihood function of the GLMM is approximated by two numerical methods, which support a federated decomposition of the GLMM that brings computation to the data.
Experiment results demonstrate comparable (Laplace) and superior (Gaussian-Hermite) performance on simulated and real-world data.
- Score: 10.445522754737496
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Objectives: This paper develops two algorithms to achieve federated
generalized linear mixed effect models (GLMM) and compares the developed
models' outcomes with each other, as well as with those from the standard R
package (`lme4').
Methods: The log-likelihood function of the GLMM is approximated by two
numerical methods (Laplace approximation and Gaussian-Hermite approximation),
which support a federated decomposition of the GLMM that brings computation to
the data.
Results: Our methods can fit GLMMs that accommodate hierarchical data with
multiple non-independent levels of observations in a federated setting. The
experiment results demonstrate comparable (Laplace) and superior
(Gaussian-Hermite) performance on simulated and real-world data.
Conclusion: We developed and compared federated GLMMs with different
approximations, which can support researchers in analyzing biomedical data,
accommodating mixed effects and addressing non-independence due to
hierarchical structures (e.g., institute, region, country).
Related papers
- Fake It Till Make It: Federated Learning with Consensus-Oriented
Generation [52.82176415223988]
We propose federated learning with consensus-oriented generation (FedCOG).
FedCOG consists of two key components at the client side: complementary data generation and knowledge-distillation-based model training.
Experiments on classical and real-world FL datasets show that FedCOG consistently outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-12-10T18:49:59Z) - Distributed Linear Regression with Compositional Covariates [5.085889377571319]
We focus on the distributed sparse penalized linear log-contrast model in massive compositional data.
Two distributed optimization techniques are proposed for solving the two different constrained convex optimization problems.
In the decentralized topology, we introduce a distributed coordinate-wise descent algorithm for obtaining a communication-efficient regularized estimation.
arXiv Detail & Related papers (2023-10-21T11:09:37Z) - Bridging Distribution Learning and Image Clustering in High-dimensional
Space [9.131712404284876]
Distribution learning focuses on learning the probability density function from a set of data samples, while clustering aims to group similar objects together in an unsupervised manner.
In this paper, we use an autoencoder to encode images into a high-dimensional latent space.
arXiv Detail & Related papers (2023-08-29T23:35:36Z) - Learning to Bound Counterfactual Inference in Structural Causal Models
from Observational and Randomised Data [64.96984404868411]
We derive a likelihood characterisation for the overall data that leads us to extend a previous EM-based algorithm.
The new algorithm learns to approximate the (unidentifiability) region of model parameters from such mixed data sources.
It delivers interval approximations to counterfactual results, which collapse to points in the identifiable case.
arXiv Detail & Related papers (2022-12-06T12:42:11Z) - Kernel Biclustering algorithm in Hilbert Spaces [8.303238963864885]
We develop a new model-free biclustering algorithm in abstract spaces using the notions of energy distance and the maximum mean discrepancy.
The proposed method can learn more general and complex cluster shapes than most existing approaches in the literature.
Our results are similar to state-of-the-art methods in their optimal scenarios, assuming a proper kernel choice.
arXiv Detail & Related papers (2022-08-07T08:41:46Z) - Tackling Data Heterogeneity: A New Unified Framework for Decentralized
SGD with Sample-induced Topology [6.6682038218782065]
We develop a general framework unifying several gradient-based optimization methods for empirical risk minimization problems.
We provide a unified perspective for variance-reduction (VR) and gradient-tracking (GT) methods such as SAGA, Local-SVRG and GT-SAGA.
The rate results reveal that VR and GT methods can effectively eliminate data heterogeneity within and across devices, respectively, enabling exact convergence of the algorithm to the optimal solution.
arXiv Detail & Related papers (2022-07-08T07:50:08Z) - A Robust and Flexible EM Algorithm for Mixtures of Elliptical
Distributions with Missing Data [71.9573352891936]
This paper tackles the problem of missing data imputation for noisy and non-Gaussian data.
A new EM algorithm is investigated for mixtures of elliptical distributions with the property of handling potential missing data.
Experimental results on synthetic data demonstrate that the proposed algorithm is robust to outliers and can be used with non-Gaussian data.
arXiv Detail & Related papers (2022-01-28T10:01:37Z) - BCDAG: An R package for Bayesian structure and Causal learning of
Gaussian DAGs [77.34726150561087]
We introduce the BCDAG R package for causal discovery and causal effect estimation from observational data.
Our implementation scales efficiently with the number of observations and, whenever the DAGs are sufficiently sparse, the number of variables in the dataset.
We then illustrate the main functions and algorithms on both real and simulated datasets.
arXiv Detail & Related papers (2022-01-28T09:30:32Z) - Counterfactual Maximum Likelihood Estimation for Training Deep Networks [83.44219640437657]
Deep learning models are prone to learning spurious correlations that should not be learned as predictive clues.
We propose a causality-based training framework to reduce the spurious correlations caused by observable confounders.
We conduct experiments on two real-world tasks: Natural Language Inference (NLI) and Image Captioning.
arXiv Detail & Related papers (2021-06-07T17:47:16Z) - Clustering Binary Data by Application of Combinatorial Optimization
Heuristics [52.77024349608834]
We study clustering methods for binary data, first defining aggregation criteria that measure the compactness of clusters.
Five new and original methods are introduced, using neighborhoods and population behavior optimization metaheuristics.
Using a set of 16 data tables generated by a quasi-Monte Carlo experiment, the methods are compared, for one of the aggregation criteria under the L1 dissimilarity, with hierarchical clustering and with a k-means variant, partitioning around medoids (PAM).
arXiv Detail & Related papers (2020-01-06T23:33:31Z) - Projection pursuit based on Gaussian mixtures and evolutionary
algorithms [0.0]
We propose a projection pursuit (PP) algorithm based on Gaussian mixture models (GMMs).
We show that this semi-parametric approach to PP is flexible and allows highly informative structures to be detected.
The performance of the proposed approach is shown on both artificial and real datasets.
arXiv Detail & Related papers (2019-12-27T10:25:41Z)