FastSurvival: Hidden Computational Blessings in Training Cox Proportional Hazards Models
- URL: http://arxiv.org/abs/2410.19081v1
- Date: Thu, 24 Oct 2024 18:36:59 GMT
- Title: FastSurvival: Hidden Computational Blessings in Training Cox Proportional Hazards Models
- Authors: Jiachang Liu, Rui Zhang, Cynthia Rudin
- Abstract summary: The Cox proportional hazards (CPH) model is widely used for its interpretability, flexibility, and predictive performance.
Current algorithms to train the CPH model have drawbacks, preventing us from using the CPH model at its full potential.
We propose new optimization methods by constructing and minimizing surrogate functions that exploit hidden mathematical structures of the CPH model.
- Score: 24.041562124587262
- Abstract: Survival analysis is an important research topic with applications in healthcare, business, and manufacturing. One essential tool in this area is the Cox proportional hazards (CPH) model, which is widely used for its interpretability, flexibility, and predictive performance. However, for modern data science challenges such as high dimensionality (both $n$ and $p$) and high feature correlations, current algorithms to train the CPH model have drawbacks, preventing us from using the CPH model at its full potential. The root cause is that the current algorithms, based on the Newton method, have trouble converging due to vanishing second order derivatives when outside the local region of the minimizer. To circumvent this problem, we propose new optimization methods by constructing and minimizing surrogate functions that exploit hidden mathematical structures of the CPH model. Our new methods are easy to implement and ensure monotonic loss decrease and global convergence. Empirically, we verify the computational efficiency of our methods. As a direct application, we show how our optimization methods can be used to solve the cardinality-constrained CPH problem, producing very sparse high-quality models that were not previously practical to construct. We list several extensions that our breakthrough enables, including optimization opportunities, theoretical questions on CPH's mathematical structure, as well as other CPH-related applications.
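To make the optimization issue concrete, here is a minimal Python sketch of the Breslow negative log partial likelihood together with a generic majorize-minimize (MM) loop that uses a backtracked quadratic surrogate. It assumes unique event times, and the Lipschitz-style majorizer it uses is an illustrative stand-in for monotone-descent training, not the CPH-specific surrogate construction of the paper.

```python
import numpy as np

def cox_nll(beta, X, time, event):
    """Breslow negative log partial likelihood (assumes unique event times).
    X: (n, p) covariates; time: (n,) follow-up times; event: (n,) 1/0 indicator."""
    order = np.argsort(-time)                 # decreasing time: risk set R(i) = rows 0..i
    Xs, ev = X[order], event[order]
    eta = Xs @ beta
    m = eta.max()                             # stabilized log-sum-exp over risk sets
    log_risk = m + np.log(np.cumsum(np.exp(eta - m)))
    return -np.sum(ev * (eta - log_risk))

def cox_grad(beta, X, time, event):
    """Gradient: -sum over events of (x_i - risk-set-weighted covariate mean)."""
    order = np.argsort(-time)
    Xs, ev = X[order], event[order]
    eta = Xs @ beta
    w = np.exp(eta - eta.max())               # unnormalized risk weights (stable)
    denom = np.cumsum(w)                      # sum of exp(eta) over each risk set
    num = np.cumsum(w[:, None] * Xs, axis=0)  # weighted covariate sums
    return -(ev[:, None] * (Xs - num / denom[:, None])).sum(axis=0)

def fit_mm(X, time, event, n_iter=500, tol=1e-8):
    """Surrogate minimization: each step minimizes a quadratic majorizer
    f(b) + g.d + (L/2)||d||^2; backtracking on L enforces monotone descent."""
    beta = np.zeros(X.shape[1])
    L, f = 1.0, cox_nll(beta, X, time, event)
    for _ in range(n_iter):
        g = cox_grad(beta, X, time, event)
        while True:
            cand = beta - g / L               # closed-form surrogate minimizer
            f_new = cox_nll(cand, X, time, event)
            if f_new <= f - g @ g / (2 * L):  # surrogate truly majorized the loss
                break
            L *= 2.0                          # curvature estimate was too small
        if f - f_new < tol:
            return cand
        beta, f, L = cand, f_new, L / 2.0     # relax L so later steps stay large
    return beta
```

The accept test `f_new <= f - g @ g / (2 * L)` is exactly the condition that the trial surrogate majorized the loss, which is what yields the monotonic loss decrease the abstract refers to; a plain Newton step carries no such guarantee when the second-order derivatives vanish far from the minimizer.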
Related papers
- Equation Discovery with Bayesian Spike-and-Slab Priors and Efficient Kernels [57.46832672991433]
We propose a novel equation discovery method based on Kernel learning and BAyesian Spike-and-Slab priors (KBASS)
We use kernel regression to estimate the target function, which is flexible, expressive, and more robust to data sparsity and noise.
We develop an expectation-propagation expectation-maximization algorithm for efficient posterior inference and function estimation.
arXiv Detail & Related papers (2023-10-09T03:55:09Z)
- Unconstrained Stochastic CCA: Unifying Multiview and Self-Supervised Learning [0.13654846342364307]
We present a family of fast algorithms for PLS, CCA, and Deep CCA on all standard CCA and Deep CCA benchmarks.
Our algorithms show far faster convergence and recover higher correlations than the previous state-of-the-art benchmarks.
These improvements allow us to perform a first-of-its-kind PLS analysis of an extremely large biomedical dataset.
arXiv Detail & Related papers (2023-10-02T09:03:59Z)
- Understanding Augmentation-based Self-Supervised Representation Learning via RKHS Approximation and Regression [53.15502562048627]
Recent work has built the connection between self-supervised learning and the approximation of the top eigenspace of a graph Laplacian operator.
This work delves into a statistical analysis of augmentation-based pretraining.
arXiv Detail & Related papers (2023-06-01T15:18:55Z)
- Promises and Pitfalls of the Linearized Laplace in Bayesian Optimization [73.80101701431103]
The linearized-Laplace approximation (LLA) has been shown to be effective and efficient in constructing Bayesian neural networks.
We study the usefulness of the LLA in Bayesian optimization and highlight its strong performance and flexibility.
arXiv Detail & Related papers (2023-04-17T14:23:43Z)
- A Statistical Learning Take on the Concordance Index for Survival Analysis [0.29005223064604074]
We provide C-index Fisher-consistency results and excess risk bounds for several commonly used cost functions in survival analysis.
We also study the general case where no model assumption is made and present a new, off-the-shelf method that is shown to be consistent with the C-index.
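For reference, the C-index measures the fraction of comparable pairs whose predicted risks are ordered consistently with their survival times. Below is a quick O(n^2) sketch of Harrell's estimator (it skips pairs with tied survival times); the function name and interface are illustrative, not from the paper.

```python
def harrell_c_index(risk, time, event):
    """Harrell's C: among pairs where subject i is observed to fail before
    time[j], count how often the model assigns i the higher risk.
    Ties in the risk score count as half-concordant; tied times are skipped."""
    num = den = 0.0
    for i in range(len(time)):
        if not event[i]:                      # censored subjects can't be
            continue                          # the earlier, observed failure
        for j in range(len(time)):
            if time[i] < time[j]:             # comparable pair: i fails first
                den += 1.0
                if risk[i] > risk[j]:
                    num += 1.0
                elif risk[i] == risk[j]:
                    num += 0.5
    return num / den
```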
arXiv Detail & Related papers (2023-02-23T14:33:54Z)
- NODAGS-Flow: Nonlinear Cyclic Causal Structure Learning [8.20217860574125]
We propose a novel framework for learning nonlinear cyclic causal models from interventional data, called NODAGS-Flow.
We show significant performance improvements with our approach compared to state-of-the-art methods with respect to structure recovery and predictive performance.
arXiv Detail & Related papers (2023-01-04T23:28:18Z)
- FastCPH: Efficient Survival Analysis for Neural Networks [57.03275837523063]
We propose FastCPH, a new method that runs in linear time and supports both the standard Breslow and Efron methods for tied events.
We also demonstrate the performance of FastCPH combined with LassoNet, a neural network that provides interpretability through feature sparsity.
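To illustrate the tie handling this entry mentions: both methods adjust the log-partial-likelihood denominator for a group of d events tied at the same time. Breslow reuses the full risk-set sum for all d events, while Efron peels the tied mass off in fractions. A rough sketch for a single tied group (the name and interface are assumptions, not FastCPH's API):

```python
import numpy as np

def tied_group_loglik(eta_tied, risk_sum, method="efron"):
    """Log partial likelihood contribution of d events tied at one time.
    eta_tied: linear predictors of the d tied failures;
    risk_sum: sum of exp(eta) over the full risk set at that time."""
    tied = np.exp(eta_tied)
    d = len(tied)
    ll = eta_tied.sum()                       # shared numerator term
    if method == "breslow":
        ll -= d * np.log(risk_sum)            # same denominator for all d events
    else:                                     # Efron: subtract l/d of tied mass
        for l in range(d):
            ll -= np.log(risk_sum - (l / d) * tied.sum())
    return ll
```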
arXiv Detail & Related papers (2022-08-21T03:35:29Z)
- COCO Denoiser: Using Co-Coercivity for Variance Reduction in Stochastic Convex Optimization [4.970364068620608]
We exploit convexity and L-smoothness to improve the noisy estimates returned by the gradient oracle.
We show that increasing the number and proximity of the queried points leads to better gradient estimates.
We also apply COCO in vanilla settings by plugging it into existing algorithms such as SGD, Adam, or STRSAGA.
arXiv Detail & Related papers (2021-09-07T17:21:09Z)
- Offline Model-Based Optimization via Normalized Maximum Likelihood Estimation [101.22379613810881]
We consider data-driven optimization problems where one must maximize a function given only queries at a fixed set of points.
This problem setting emerges in many domains where function evaluation is a complex and expensive process.
We propose a tractable approximation that allows us to scale our method to high-capacity neural network models.
arXiv Detail & Related papers (2021-02-16T06:04:27Z)
- High-Dimensional Bayesian Optimization via Tree-Structured Additive Models [40.497123136157946]
We consider generalized additive models in which low-dimensional functions with overlapping subsets of variables are composed to model a high-dimensional target function.
Our goal is to lower the computational resources required and facilitate faster model learning.
We demonstrate and discuss the efficacy of our approach via a range of experiments on synthetic functions and real-world datasets.
arXiv Detail & Related papers (2020-12-24T03:56:44Z)
- Efficient Model-Based Reinforcement Learning through Optimistic Policy Search and Planning [93.1435980666675]
We show how optimistic exploration can be easily combined with state-of-the-art reinforcement learning algorithms.
Our experiments demonstrate that optimistic exploration significantly speeds up learning when there are penalties on actions.
arXiv Detail & Related papers (2020-06-15T18:37:38Z)