Adversarially-regularized mixed effects deep learning (ARMED) models for
improved interpretability, performance, and generalization on clustered data
- URL: http://arxiv.org/abs/2202.11783v1
- Date: Wed, 23 Feb 2022 20:58:22 GMT
- Title: Adversarially-regularized mixed effects deep learning (ARMED) models for
improved interpretability, performance, and generalization on clustered data
- Authors: Kevin P. Nguyen, Albert Montillo (for the Alzheimer's Disease
Neuroimaging Initiative)
- Abstract summary: Mixed effects models separate cluster-invariant, population-level fixed effects from cluster-specific random effects.
We propose a general-purpose framework for building Adversarially-Regularized Mixed Effects Deep learning (ARMED) models through 3 non-intrusive additions to existing networks.
We apply this framework to dense feedforward neural networks (DFNNs), convolutional neural networks, and autoencoders on 4 applications including simulations, dementia prognosis and diagnosis, and cell microscopy.
- Score: 0.974672460306765
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Data in the natural sciences frequently violate assumptions of independence.
Such datasets have samples with inherent clustering (e.g. by study site,
subject, experimental batch), which may lead to spurious associations, poor
model fitting, and confounded analyses. While largely unaddressed in deep
learning, mixed effects models have been used in traditional statistics for
clustered data. Mixed effects models separate cluster-invariant,
population-level fixed effects from cluster-specific random effects. We propose
a general-purpose framework for building Adversarially-Regularized Mixed
Effects Deep learning (ARMED) models through 3 non-intrusive additions to
existing networks: 1) a domain adversarial classifier constraining the original
model to learn only cluster-invariant features, 2) a random effects subnetwork
capturing cluster-specific features, and 3) a cluster-inferencing approach to
predict on clusters unseen during training. We apply this framework to dense
feedforward neural networks (DFNNs), convolutional neural networks, and
autoencoders on 4 applications including simulations, dementia prognosis and
diagnosis, and cell microscopy. We compare to conventional models, domain
adversarial-only models, and the naive inclusion of cluster membership as a
covariate. Our models better distinguish confounded from true associations in
simulations and emphasize more biologically plausible features in clinical
applications. ARMED DFNNs quantify inter-cluster variance in clinical data
while ARMED autoencoders visualize batch effects in cell images. Finally, ARMED
improves accuracy on data from clusters seen during training (up to 28% vs.
conventional models) and generalizes better to unseen clusters (up to 9% vs.
conventional models). By incorporating powerful mixed effects modeling into
deep learning, ARMED increases performance, interpretability, and
generalization on clustered data.
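
To make the three additions concrete, the sketch below shows one way the ARMED idea could be wired into a dense feedforward classifier. It is a minimal illustration in PyTorch, not the authors' implementation: the layer sizes, the gradient-reversal adversary, and the linear per-cluster slope/intercept random-effects head are all illustrative assumptions inferred from the abstract.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; flips and scales gradients on the backward pass."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.clone()

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None


class ARMEDSketch(nn.Module):
    def __init__(self, n_features, n_clusters, hidden=32, lam=1.0):
        super().__init__()
        self.lam = lam
        # Conventional "fixed effects" network: the original model, left unchanged.
        self.fixed = nn.Sequential(nn.Linear(n_features, hidden), nn.ReLU())
        self.fixed_head = nn.Linear(hidden, 1)
        # Addition 1: domain adversarial classifier. It tries to predict cluster
        # membership from the fixed-effects features; the reversed gradient pushes
        # the fixed-effects branch toward cluster-invariant representations.
        self.adversary = nn.Linear(hidden, n_clusters)
        # Addition 2: random effects subnetwork. Maps a cluster one-hot vector to a
        # cluster-specific slope over the fixed-effects features plus an intercept.
        self.random_effects = nn.Linear(n_clusters, hidden + 1, bias=False)

    def forward(self, x, cluster_onehot):
        z = self.fixed(x)                          # cluster-invariant features
        fe_logit = self.fixed_head(z)              # fixed-effects-only prediction
        adv_logits = self.adversary(GradReverse.apply(z, self.lam))
        re = self.random_effects(cluster_onehot)   # per-cluster slope and intercept
        slope, intercept = re[:, :-1], re[:, -1:]
        me_logit = fe_logit + (slope * z).sum(dim=1, keepdim=True) + intercept
        return me_logit, fe_logit, adv_logits
```

Training such a model would combine a task loss on the mixed-effects output, a cross-entropy loss encouraging the adversary to recover cluster membership (reversed into the fixed-effects branch), and a regularizer on the random-effects weights. For clusters unseen at training time, the paper's third addition infers a cluster assignment; this simplified sketch would instead fall back to the fixed-effects-only output.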
Related papers
- Dynamic Post-Hoc Neural Ensemblers [55.15643209328513]
In this study, we explore employing neural networks as ensemble methods.
Motivated by the risk of learning low-diversity ensembles, we propose regularizing the model by randomly dropping base model predictions.
We demonstrate that this approach lower-bounds the diversity within the ensemble, reducing overfitting and improving generalization capabilities.
arXiv Detail & Related papers (2024-10-06T15:25:39Z)
- Artificial Data Point Generation in Clustered Latent Space for Small Medical Datasets [4.542616945567623]
This paper introduces a novel method, Artificial Data Point Generation in Clustered Latent Space (AGCL).
AGCL is designed to enhance classification performance on small medical datasets through synthetic data generation.
It was applied to Parkinson's disease screening, utilizing facial expression data.
arXiv Detail & Related papers (2024-09-26T09:51:08Z)
- Enabling Mixed Effects Neural Networks for Diverse, Clustered Data Using Monte Carlo Methods [9.035959289139102]
Mixed effects neural networks (MENNs) separate cluster-specific 'random effects' from cluster-invariant 'fixed effects'.
We present MC-GMENN, a novel approach employing Monte Carlo methods to train Generalized Mixed Effects Neural Networks.
arXiv Detail & Related papers (2024-07-01T09:24:04Z)
- Fast leave-one-cluster-out cross-validation using clustered Network Information Criterion (NICc) [0.32141666878560626]
It is crucial to use cluster-based validation to assess model generalizability on unseen clusters.
This paper introduces a clustered estimator of the Network Information Criterion (NICc) to approximate leave-one-cluster-out deviance.
arXiv Detail & Related papers (2024-05-30T18:10:02Z)
- The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation.
We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of a challenging problem in healthcare.
Within this framework, we train 15 predictive models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z)
- Unified Multi-View Orthonormal Non-Negative Graph Based Clustering Framework [74.25493157757943]
We formulate a novel clustering model, which exploits the non-negative feature property and incorporates the multi-view information into a unified joint learning framework.
We also explore, for the first time, the multi-model non-negative graph-based approach to clustering data based on deep features.
arXiv Detail & Related papers (2022-11-03T08:18:27Z)
- Adaptive Personlization in Federated Learning for Highly Non-i.i.d. Data [37.667379000751325]
Federated learning (FL) is a distributed learning method that offers medical institutes the prospect of collaborating on a global model.
In this work, we investigate an adaptive hierarchical clustering method for FL to produce intermediate semi-global models.
Our experiments demonstrate a significant gain in classification accuracy on heterogeneous data distributions compared to standard FL methods.
arXiv Detail & Related papers (2022-07-07T17:25:04Z)
- On the Generalization and Adaption Performance of Causal Models [99.64022680811281]
Differentiable causal discovery proposes factorizing the data generating process into a set of modules.
We study the generalization and adaption performance of such modular neural causal models.
Our analysis shows that the modular neural causal models outperform other models on both zero and few-shot adaptation in low data regimes.
arXiv Detail & Related papers (2022-06-09T17:12:32Z)
- Bayesian community detection for networks with covariates [16.230648949593153]
"Community detection" has arguably received the most attention in the scientific community.
We propose a block model with a co-dependent random partition prior.
Our model has the ability to learn the number of the communities via posterior inference without having to assume it to be known.
arXiv Detail & Related papers (2022-03-04T01:58:35Z)
- No Fear of Heterogeneity: Classifier Calibration for Federated Learning with Non-IID Data [78.69828864672978]
A central challenge in training classification models in the real-world federated system is learning with non-IID data.
We propose a novel and simple algorithm called Classifier Calibration with Virtual Representations (CCVR), which adjusts the classifier using virtual representations sampled from an approximated Gaussian mixture model.
Experimental results demonstrate that CCVR achieves state-of-the-art performance on popular federated learning benchmarks including CIFAR-10, CIFAR-100, and CINIC-10.
arXiv Detail & Related papers (2021-06-09T12:02:29Z)
- Robust Finite Mixture Regression for Heterogeneous Targets [70.19798470463378]
We propose an FMR model that finds sample clusters and jointly models multiple incomplete mixed-type targets.
We provide non-asymptotic oracle performance bounds for our model under a high-dimensional learning framework.
The results show that our model can achieve state-of-the-art performance.
arXiv Detail & Related papers (2020-10-12T03:27:07Z)