Federated Survival Analysis with Discrete-Time Cox Models
- URL: http://arxiv.org/abs/2006.08997v1
- Date: Tue, 16 Jun 2020 08:53:19 GMT
- Title: Federated Survival Analysis with Discrete-Time Cox Models
- Authors: Mathieu Andreux, Andre Manoel, Romuald Menuet, Charlie Saillard, Chloé Simpson
- Abstract summary: We build machine learning models from decentralized datasets located in different centers with federated learning (FL).
We show that the naive per-center approximation of the Cox loss may suffer from an important performance loss in some adverse settings.
Using a discrete-time reformulation instead, we train survival models with standard FL techniques on synthetic data as well as real-world datasets from The Cancer Genome Atlas (TCGA).
- Score: 0.46331617589391827
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Building machine learning models from decentralized datasets located in
different centers with federated learning (FL) is a promising approach to
circumvent local data scarcity while preserving privacy. However, the prominent
Cox proportional hazards (PH) model, used for survival analysis, does not fit
the FL framework, as its loss function is non-separable with respect to the
samples. The na\"ive method to bypass this non-separability consists in
calculating the losses per center, and minimizing their sum as an approximation
of the true loss. We show that the resulting model may suffer from important
performance loss in some adverse settings. Instead, we leverage the
discrete-time extension of the Cox PH model to formulate survival analysis as a
classification problem with a separable loss function. Using this approach, we
train survival models using standard FL techniques on synthetic data, as well
as real-world datasets from The Cancer Genome Atlas (TCGA), showing similar
performance to a Cox PH model trained on aggregated data. Compared to previous
works, the proposed method is more communication-efficient, more generic, and
more amenable to using privacy-preserving techniques.
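
To make the separability argument concrete, here is a brief sketch in standard notation (not taken verbatim from the paper): the Cox negative log partial likelihood couples samples through the risk set, while the discrete-time hazard model reduces to per-sample binary cross-entropy terms.

```latex
% Cox PH negative log partial likelihood: the inner sum over the risk set
% R(t_i) = {j : T_j >= t_i} mixes samples from all centers, so the loss does
% not decompose into a sum of per-center terms.
\ell_{\mathrm{Cox}}(\beta) =
  -\sum_{i:\,\delta_i = 1}
   \Big[ x_i^\top \beta
         - \log \sum_{j \in R(t_i)} \exp\!\big(x_j^\top \beta\big) \Big]

% Discrete-time extension: partition time into intervals k = 1, ..., K and
% model the conditional hazard, e.g. with a logistic link (one common choice),
%   h_k(x) = \sigma(\alpha_k + x^\top \beta).
% Sample i contributes one Bernoulli term per interval it is observed in
% (y_{ik} = 1 only if its event occurs in interval k), so the loss is a plain
% sum over samples and splits exactly across centers.
\ell_{\mathrm{disc}}(\alpha, \beta) =
  -\sum_i \sum_{k=1}^{K_i}
   \Big[ y_{ik} \log h_k(x_i) + (1 - y_{ik}) \log\big(1 - h_k(x_i)\big) \Big]
```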
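Below is a minimal, self-contained Python sketch of how such a separable discrete-time loss can be trained with plain FedAvg across simulated centers. The logistic-hazard link, the interval grid, the learning rate, and the synthetic center data are illustrative assumptions for this sketch, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): discrete-time survival loss that is
# separable across samples, plus FedAvg rounds over simulated centers.
# Assumptions: logistic hazard h_k(x) = sigmoid(alpha_k + x.beta); the interval
# grid, feature dimension, and center data are made up for illustration.
import numpy as np

rng = np.random.default_rng(0)
K, D = 10, 5                      # number of time intervals, feature dimension


def expand(X, time_bin, event):
    """Expand (x_i, T_i, delta_i) into per-interval binary classification rows.

    For each sample, intervals 0..T_i get label 0, except that interval T_i
    gets label 1 if the event (not censoring) occurred in that interval.
    """
    rows, bins, labels = [], [], []
    for x, t, d in zip(X, time_bin, event):
        for k in range(t + 1):
            rows.append(x)
            bins.append(k)
            labels.append(1.0 if (d and k == t) else 0.0)
    return np.array(rows), np.array(bins), np.array(labels)


def loss_and_grads(params, X, time_bin, event):
    """Average negative log-likelihood of the logistic-hazard model.

    The loss is a plain sum over (sample, interval) pairs, so each center can
    compute its own gradients -- no cross-center risk set is needed.
    """
    alpha, beta = params
    Xe, ke, ye = expand(X, time_bin, event)
    logits = alpha[ke] + Xe @ beta
    h = 1.0 / (1.0 + np.exp(-logits))            # per-interval hazard
    nll = -np.mean(ye * np.log(h + 1e-12) + (1 - ye) * np.log(1 - h + 1e-12))
    resid = (h - ye) / len(ye)                   # d nll / d logits
    g_alpha = np.bincount(ke, weights=resid, minlength=K)
    g_beta = Xe.T @ resid
    return nll, (g_alpha, g_beta)


def local_update(params, data, lr=0.5, steps=20):
    """A few local gradient steps on one center's data (FedAvg client step)."""
    alpha, beta = params[0].copy(), params[1].copy()
    for _ in range(steps):
        _, (ga, gb) = loss_and_grads((alpha, beta), *data)
        alpha -= lr * ga
        beta -= lr * gb
    return alpha, beta


# Simulated centers: each holds features, the interval of last observation,
# and an event indicator (1 = event observed, 0 = censored).
centers = []
true_beta = rng.normal(size=D)
for n in (120, 80, 200):
    X = rng.normal(size=(n, D))
    risk = X @ true_beta
    t = np.clip((K - 1) * rng.random(n) * np.exp(-0.3 * risk), 0, K - 1).astype(int)
    d = (rng.random(n) < 0.7).astype(int)        # ~70% uncensored
    centers.append((X, t, d))

params = (np.zeros(K), np.zeros(D))
for rnd in range(10):                            # FedAvg rounds
    updates = [local_update(params, data) for data in centers]
    weights = np.array([len(data[0]) for data in centers], dtype=float)
    weights /= weights.sum()
    params = tuple(sum(w * u[i] for w, u in zip(weights, updates)) for i in range(2))
    losses = [loss_and_grads(params, *data)[0] for data in centers]
    print(f"round {rnd}: mean local NLL = {np.mean(losses):.4f}")
```

Because each center only computes gradients of its own per-sample terms and exchanges model parameters, a loop of this kind is compatible with standard FL machinery and with privacy-preserving add-ons such as secure aggregation or differential privacy.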
Related papers
- Mitigating Embedding Collapse in Diffusion Models for Categorical Data [52.90687881724333]
We introduce CATDM, a continuous diffusion framework within the embedding space that stabilizes training.
Experiments on benchmarks show that CATDM mitigates embedding collapse, yielding superior results on FFHQ, LSUN Churches, and LSUN Bedrooms.
arXiv Detail & Related papers (2024-10-18T09:12:33Z) - Fairness in Survival Analysis with Distributionally Robust Optimization [13.159777131162965]
We propose a general approach for encouraging fairness in survival analysis models based on minimizing a worst-case error across all subpopulations.
This approach can be used to convert many existing survival analysis models into ones that simultaneously encourage fairness.
arXiv Detail & Related papers (2024-08-31T15:03:20Z) - Rejection via Learning Density Ratios [50.91522897152437]
Classification with rejection emerges as a learning paradigm which allows models to abstain from making predictions.
We propose a different distributional perspective, where we seek to find an idealized data distribution which maximizes a pretrained model's performance.
Our framework is tested empirically over clean and noisy datasets.
arXiv Detail & Related papers (2024-05-29T01:32:17Z) - Variational Deep Survival Machines: Survival Regression with Censored Outcomes [11.82370259688716]
Survival regression aims to predict the time when an event of interest will take place, typically a death or a failure.
We present a novel method to predict the survival time by better clustering the survival data and combine primitive distributions.
arXiv Detail & Related papers (2024-04-24T02:16:00Z) - Towards Theoretical Understandings of Self-Consuming Generative Models [56.84592466204185]
This paper tackles the emerging challenge of training generative models within a self-consuming loop.
We construct a theoretical framework to rigorously evaluate how this training procedure impacts the data distributions learned by future models.
We present results for kernel density estimation, delivering nuanced insights such as the impact of mixed data training on error propagation.
arXiv Detail & Related papers (2024-02-19T02:08:09Z) - Fake It Till Make It: Federated Learning with Consensus-Oriented
Generation [52.82176415223988]
We propose federated learning with consensus-oriented generation (FedCOG)
FedCOG consists of two key components at the client side: complementary data generation and knowledge-distillation-based model training.
Experiments on classical and real-world FL datasets show that FedCOG consistently outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-12-10T18:49:59Z) - A Federated Learning-based Industrial Health Prognostics for
Heterogeneous Edge Devices using Matched Feature Extraction [16.337207503536384]
We propose a pioneering FL-based health prognostic model with a feature similarity-matched parameter aggregation algorithm.
We show that the proposed method yields accuracy improvements as high as 44.5% and 39.3% for state-of-health estimation and remaining useful life estimation.
arXiv Detail & Related papers (2023-05-13T07:20:31Z) - FedPseudo: Pseudo value-based Deep Learning Models for Federated
Survival Analysis [9.659041001051415]
We propose a first-of-its-kind, pseudo value-based deep learning model for federated survival analysis called FedPseudo.
Our proposed FL framework achieves similar performance as the best centrally trained deep survival analysis model.
arXiv Detail & Related papers (2022-07-12T01:10:36Z) - Learn from Unpaired Data for Image Restoration: A Variational Bayes
Approach [18.007258270845107]
We propose LUD-VAE, a deep generative method to learn the joint probability density function from data sampled from marginal distributions.
We apply our method to real-world image denoising and super-resolution tasks and train the models using the synthetic data generated by the LUD-VAE.
arXiv Detail & Related papers (2022-04-21T13:27:17Z) - Contrastive Model Inversion for Data-Free Knowledge Distillation [60.08025054715192]
We propose Contrastive Model Inversion, where the data diversity is explicitly modeled as an optimizable objective.
Our main observation is that, under the constraint of the same amount of data, higher data diversity usually indicates stronger instance discrimination.
Experiments on CIFAR-10, CIFAR-100, and Tiny-ImageNet demonstrate that CMI achieves significantly superior performance when the generated data are used for knowledge distillation.
arXiv Detail & Related papers (2021-05-18T15:13:00Z) - Differentially Private Federated Learning with Laplacian Smoothing [72.85272874099644]
Federated learning aims to protect data privacy by collaboratively learning a model without sharing private data among users.
An adversary may still be able to infer the private training data by attacking the released model.
Differential privacy provides a statistical protection against such attacks at the price of significantly degrading the accuracy or utility of the trained models.
arXiv Detail & Related papers (2020-05-01T04:28:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality or accuracy of the information and is not responsible for any consequences of its use.