Efficient Training of Probabilistic Neural Networks for Survival Analysis
- URL: http://arxiv.org/abs/2404.06421v3
- Date: Wed, 19 Jun 2024 00:21:50 GMT
- Title: Efficient Training of Probabilistic Neural Networks for Survival Analysis
- Authors: Christian Marius Lillelund, Martin Magris, Christian Fischer Pedersen
- Abstract summary: Variational Inference (VI) is a commonly used technique for approximate Bayesian inference and uncertainty estimation in deep learning models.
It comes at a computational cost, as it doubles the number of trainable parameters to represent uncertainty.
We investigate how to train deep probabilistic survival models in large datasets without introducing additional overhead in model complexity.
- Score: 0.6437284704257459
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Variational Inference (VI) is a commonly used technique for approximate Bayesian inference and uncertainty estimation in deep learning models, yet it comes at a computational cost, as it doubles the number of trainable parameters to represent uncertainty. This rapidly becomes challenging in high-dimensional settings and motivates the use of alternative techniques for inference, such as Monte Carlo Dropout (MCD) or Spectral-normalized Neural Gaussian Process (SNGP). However, such methods have seen little adoption in survival analysis, and VI remains the prevalent approach for training probabilistic neural networks. In this paper, we investigate how to train deep probabilistic survival models in large datasets without introducing additional overhead in model complexity. To achieve this, we adopt three probabilistic approaches, namely VI, MCD, and SNGP, and evaluate them in terms of their prediction performance, calibration performance, and model complexity. In the context of probabilistic survival analysis, we investigate whether non-VI techniques can offer comparable or possibly improved prediction performance and uncertainty calibration compared to VI. In the MIMIC-IV dataset, we find that MCD aligns with VI in terms of the concordance index (0.748 vs. 0.743) and mean absolute error (254.9 vs. 254.7) using hinge loss, while providing C-calibrated uncertainty estimates. Moreover, our SNGP implementation provides D-calibrated survival functions in all datasets compared to VI (4/4 vs. 2/4, respectively). Our work encourages the use of techniques alternative to VI for survival analysis in high-dimensional datasets, where computational efficiency and overhead are of concern.
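To make the trade-off concrete, below is a minimal PyTorch sketch of Monte Carlo Dropout applied to a survival network. The architecture, dropout rate, and number of forward passes are illustrative assumptions, not the paper's implementation.

```python
# A minimal sketch of Monte Carlo Dropout (MCD) for a survival network.
# Hidden sizes, dropout rate, and 100 forward passes are illustrative
# assumptions, not the paper's implementation.
import torch
import torch.nn as nn

class MCDSurvivalNet(nn.Module):
    def __init__(self, n_features: int, p_drop: float = 0.1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 64), nn.ReLU(), nn.Dropout(p_drop),
            nn.Linear(64, 64), nn.ReLU(), nn.Dropout(p_drop),
            nn.Linear(64, 1),  # e.g. a risk score or event-time estimate
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

@torch.no_grad()
def mc_predict(model: MCDSurvivalNet, x: torch.Tensor, n_samples: int = 100):
    """Keep dropout active at test time and aggregate stochastic passes.

    Unlike VI, this adds no trainable parameters: the spread of the
    sampled predictions serves as the uncertainty estimate.
    """
    model.train()  # enables dropout; this toy model has no batch-norm to worry about
    preds = torch.stack([model(x) for _ in range(n_samples)])
    model.eval()
    return preds.mean(dim=0), preds.std(dim=0)
```

Because the same trained weights are reused across passes, the model complexity stays that of a standard network, in contrast to VI's doubled parameter count.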
Related papers
- Computation-Aware Gaussian Processes: Model Selection and Linear-Time Inference [55.150117654242706]
We show that model selection for computation-aware GPs trained on 1.8 million data points can be done within a few hours on a single GPU.
As a result of this work, Gaussian processes can be trained on large-scale datasets without significantly compromising their ability to quantify uncertainty.
arXiv Detail & Related papers (2024-11-01T21:11:48Z)
- fastHDMI: Fast Mutual Information Estimation for High-Dimensional Data [2.9901605297536027]
We introduce fastHDMI, a Python package designed for efficient variable screening in high-dimensional datasets.
This work pioneers the application of three mutual information estimation methods for neuroimaging variable selection.
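As a rough illustration of the underlying idea, not fastHDMI's actual API, mutual-information-based variable screening can be sketched with scikit-learn's generic estimator:

```python
# Generic sketch of mutual-information-based variable screening,
# illustrating the idea behind fastHDMI (not its actual API).
import numpy as np
from sklearn.feature_selection import mutual_info_regression

def screen_by_mutual_information(X: np.ndarray, y: np.ndarray, top_k: int = 50):
    """Rank features by estimated MI with the outcome and keep the top_k."""
    mi = mutual_info_regression(X, y)    # one MI estimate per column of X
    keep = np.argsort(mi)[::-1][:top_k]  # indices of the strongest features
    return keep, mi[keep]
```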
arXiv Detail & Related papers (2024-10-14T01:49:53Z)
- MCDFN: Supply Chain Demand Forecasting via an Explainable Multi-Channel Data Fusion Network Model [0.0]
We introduce the Multi-Channel Data Fusion Network (MCDFN), a hybrid architecture that integrates convolutional neural networks (CNN), long short-term memory networks (LSTM), and gated recurrent units (GRU).
Our comparative benchmarking demonstrates that MCDFN outperforms seven other deep-learning models.
This research advances demand forecasting methodologies and offers practical guidelines for integrating MCDFN into supply chain systems.
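A minimal sketch of such a multi-channel fusion architecture; the layer sizes and single-step forecast head are assumptions, not the authors' exact model:

```python
# Illustrative sketch of a multi-channel fusion network in the spirit of
# MCDFN: parallel CNN, LSTM, and GRU branches over the same series,
# fused by concatenation. Layer sizes are assumptions.
import torch
import torch.nn as nn

class FusionForecaster(nn.Module):
    def __init__(self, n_features: int, hidden: int = 32):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv1d(n_features, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.gru = nn.GRU(n_features, hidden, batch_first=True)
        self.head = nn.Linear(3 * hidden, 1)  # one-step-ahead demand forecast

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, time, features)
        c = self.cnn(x.transpose(1, 2)).squeeze(-1)  # Conv1d expects (batch, features, time)
        _, (h_lstm, _) = self.lstm(x)
        _, h_gru = self.gru(x)
        fused = torch.cat([c, h_lstm[-1], h_gru[-1]], dim=-1)
        return self.head(fused)
```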
arXiv Detail & Related papers (2024-05-24T14:30:00Z)
- Partially factorized variational inference for high-dimensional mixed models [0.0]
Variational inference (VI) methods are a popular way to perform the required posterior computations.
We show that standard VI (i.e., mean-field) dramatically underestimates posterior uncertainty in high dimensions.
We then show how appropriately relaxing the mean-field assumption leads to VI methods whose uncertainty quantification does not deteriorate in high dimensions.
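The contrast can be sketched with torch.distributions; the dimension and scale factor below are illustrative:

```python
# Mean-field (fully factorized) vs. correlated Gaussian variational
# families. Dimensions and the random scale are illustrative.
import torch
from torch.distributions import MultivariateNormal, Normal

d = 5
mu = torch.zeros(d)

# Mean-field family: independent coordinates, one variance per dimension.
q_mean_field = Normal(mu, torch.ones(d))

# Relaxing the factorization: a lower-triangular scale keeps correlations.
L = torch.tril(torch.randn(d, d), diagonal=-1) + torch.eye(d)
q_correlated = MultivariateNormal(mu, scale_tril=L)

# The diagonal family has no off-diagonal covariance to match, which is
# one mechanism by which posterior uncertainty gets underestimated.
print(q_mean_field.variance)
print(q_correlated.covariance_matrix)
```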
arXiv Detail & Related papers (2023-12-20T16:12:37Z)
- The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation.
We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of a challenging problem in healthcare.
Within this framework, we train 15 predictive models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z)
- Amortised Inference in Bayesian Neural Networks [0.0]
We introduce the Amortised Pseudo-Observation Variational Inference Bayesian Neural Network (APOVI-BNN).
We show that amortised inference yields posterior approximations of similar or better quality than those obtained through traditional variational inference.
We then discuss how the APOVI-BNN may be viewed as a new member of the neural process family.
arXiv Detail & Related papers (2023-09-06T14:02:33Z)
- Amortized Variational Inference: A Systematic Review [0.0]
The core principle of Variational Inference (VI) is to convert the statistical inference problem of computing complex posterior probability densities into a tractable optimization problem.
The traditional VI algorithm is not scalable to large data sets and is unable to readily infer out-of-bounds data points.
Recent developments in the field, such as black-box and amortized VI, have helped address these issues.
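For reference, classical VI maximizes the evidence lower bound (ELBO) over a chosen variational family, which is equivalent to minimizing the KL divergence to the exact posterior:

```latex
% VI solves max_{q in Q} ELBO(q); since log p(x) does not depend on q,
% maximizing the ELBO minimizes KL(q || p(z|x)).
\mathrm{ELBO}(q)
  = \mathbb{E}_{q(z)}\big[\log p(x, z) - \log q(z)\big]
  = \log p(x) - \mathrm{KL}\big(q(z) \,\|\, p(z \mid x)\big)
```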
arXiv Detail & Related papers (2022-09-22T09:45:10Z)
- Uncertainty Modeling for Out-of-Distribution Generalization [56.957731893992495]
Common methods often treat feature statistics as deterministic values measured from the learned features.
We argue that these statistics can instead be properly manipulated to improve the generalization ability of deep learning models.
We improve network generalization by modeling the uncertainty of domain shifts with synthesized feature statistics during training.
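Based on that description, the core operation might look as follows in PyTorch; the Gaussian perturbation and batch-level scales are assumptions here, not the paper's exact formulation:

```python
# Sketch: treat per-channel feature statistics as random variables and
# resample them during training to simulate domain shift. The perturbation
# distribution and scales are assumptions, not the paper's exact method.
import torch

def perturb_feature_statistics(x: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """x: (batch, channels, *spatial) feature map, at least 3 dims.

    Replaces each sample's channel-wise mean/std with values drawn around
    the observed ones; intended to be applied only during training.
    """
    dims = tuple(range(2, x.dim()))
    mu = x.mean(dim=dims, keepdim=True)
    sigma = x.std(dim=dims, keepdim=True) + eps

    # The spread of the statistics across the batch sets the perturbation scale.
    mu_scale = mu.std(dim=0, keepdim=True)
    sigma_scale = sigma.std(dim=0, keepdim=True)

    new_mu = mu + torch.randn_like(mu) * mu_scale
    new_sigma = sigma + torch.randn_like(sigma) * sigma_scale

    return (x - mu) / sigma * new_sigma + new_mu
```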
arXiv Detail & Related papers (2022-02-08T16:09:12Z)
- Improving Uncertainty Calibration via Prior Augmented Data [56.88185136509654]
Neural networks have proven successful at learning from complex data distributions by acting as universal function approximators.
However, they are often overconfident, which leads to inaccurate and miscalibrated probabilistic predictions.
We propose a solution by seeking out regions of feature space where the model is unjustifiably overconfident, and conditionally raising the entropy of those predictions towards that of the prior distribution of the labels.
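A hedged sketch of that entropy-raising penalty; identifying the overconfident regions is the paper's contribution and is not reproduced here:

```python
# Sketch: on inputs flagged as unjustifiably overconfident, penalize the
# KL divergence from the label prior to the model's predictive
# distribution, pulling prediction entropy up toward the prior's.
import torch
import torch.nn.functional as F

def entropy_raising_penalty(logits: torch.Tensor, label_prior: torch.Tensor) -> torch.Tensor:
    """logits: (batch, classes) on flagged inputs; label_prior: (classes,) probs."""
    log_probs = F.log_softmax(logits, dim=-1)
    # KL(prior || model), averaged over the batch.
    return F.kl_div(log_probs, label_prior.expand_as(log_probs),
                    reduction="batchmean")

# Usage sketch: total_loss = task_loss + lam * entropy_raising_penalty(flagged_logits, prior)
```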
arXiv Detail & Related papers (2021-02-22T07:02:37Z)
- Meta-Learning Divergences of Variational Inference [49.164944557174294]
Variational inference (VI) plays an essential role in approximate Bayesian inference.
We propose a meta-learning algorithm to learn the divergence metric suited for the task of interest.
We demonstrate our approach outperforms standard VI on Gaussian mixture distribution approximation.
arXiv Detail & Related papers (2020-07-06T17:43:01Z)
- Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction.
We explicitly optimize a diversity-inducing adversarial loss for learning latent variables and thereby obtain the diversity in output predictions necessary for modeling multi-modal data.
Compared to the most competitive baselines, we show significant improvements in classification accuracy under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)