FasterRisk: Fast and Accurate Interpretable Risk Scores
- URL: http://arxiv.org/abs/2210.05846v1
- Date: Wed, 12 Oct 2022 00:36:01 GMT
- Title: FasterRisk: Fast and Accurate Interpretable Risk Scores
- Authors: Jiachang Liu, Chudi Zhong, Boxuan Li, Margo Seltzer, Cynthia Rudin
- Abstract summary: Risk scores are sparse linear models with integer coefficients.
Recent work used mathematical programming, which is computationally slow.
We introduce an approach for efficiently producing a collection of high-quality risk scores learned from data.
- Score: 17.015744876065195
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Over the last century, risk scores have been the most popular form of
predictive model used in healthcare and criminal justice. Risk scores are
sparse linear models with integer coefficients; often these models can be
memorized or placed on an index card. Typically, risk scores have been created
either without data or by rounding logistic regression coefficients, but these
methods do not reliably produce high-quality risk scores. Recent work used
mathematical programming, which is computationally slow. We introduce an
approach for efficiently producing a collection of high-quality risk scores
learned from data. Specifically, our approach produces a pool of almost-optimal
sparse continuous solutions, each with a different support set, using a
beam-search algorithm. Each of these continuous solutions is transformed into a
separate risk score through a "star ray" search, where a range of multipliers
are considered before rounding the coefficients sequentially to maintain low
logistic loss. Our algorithm returns all of these high-quality risk scores for
the user to consider. This method completes within minutes and can be valuable
in a broad variety of applications.
Related papers
- Diversification as Risk Minimization [55.503117695676416]
We show that diversification algorithms are no more robust than a naive, non-diversified algorithm.<n>We introduce VRisk, which measures the expected risk faced by the least-served fraction of intents in a query.<n>We then propose VRisker, a fast greedy re-ranker with provable approximation guarantees.
arXiv Detail & Related papers (2025-10-26T13:51:45Z) - Risk-averse Fair Multi-class Classification [0.42970700836450487]
We develop a new classification framework based on the theory of coherent risk measures and systemic risk.<n>The proposed approach is suitable for multi-class problems when the data is noisy, scarce (relative to the dimension of the problem), and the labeling might be unreliable.
arXiv Detail & Related papers (2025-09-06T16:54:00Z) - Asymptotically Optimal Linear Best Feasible Arm Identification with Fixed Budget [55.938644481736446]
We introduce a novel algorithm for best feasible arm identification that guarantees an exponential decay in the error probability.<n>We validate our algorithm through comprehensive empirical evaluations across various problem instances with different levels of complexity.
arXiv Detail & Related papers (2025-06-03T02:56:26Z) - SOREL: A Stochastic Algorithm for Spectral Risks Minimization [1.6574413179773761]
spectral risk has wide applications in machine learning, especially in real-world decision-making.
By assigning different weights to the losses of different sample points, it allows the model's performance to lie between the average performance and the worst-case performance.
We propose SOREL, the first gradient-based algorithm with convergence guarantees for the spectral risk minimization.
arXiv Detail & Related papers (2024-07-19T18:20:53Z) - Deep Ensembles Meets Quantile Regression: Uncertainty-aware Imputation for Time Series [45.76310830281876]
We propose Quantile Sub-Ensembles, a novel method to estimate uncertainty with ensemble of quantile-regression-based task networks.
Our method not only produces accurate imputations that is robust to high missing rates, but also is computationally efficient due to the fast training of its non-generative model.
arXiv Detail & Related papers (2023-12-03T05:52:30Z) - Best Subset Selection in Reduced Rank Regression [1.4699455652461724]
We show that our algorithm can achieve the reduced rank estimation with a significant probability.
The numerical studies and an application in the cancer studies demonstrate effectiveness and scalability.
arXiv Detail & Related papers (2022-11-29T02:51:15Z) - Distributed Dynamic Safe Screening Algorithms for Sparse Regularization [73.85961005970222]
We propose a new distributed dynamic safe screening (DDSS) method for sparsity regularized models and apply it on shared-memory and distributed-memory architecture respectively.
We prove that the proposed method achieves the linear convergence rate with lower overall complexity and can eliminate almost all the inactive features in a finite number of iterations almost surely.
arXiv Detail & Related papers (2022-04-23T02:45:55Z) - Minimax rate of consistency for linear models with missing values [0.0]
Missing values arise in most real-world data sets due to the aggregation of multiple sources and intrinsically missing information (sensor failure, unanswered questions in surveys...).
In this paper, we focus on the extensively-studied linear models, but in presence of missing values, which turns out to be quite a challenging task.
This eventually requires to solve a number of learning tasks, exponential in the number of input features, which makes predictions impossible for current real-world datasets.
arXiv Detail & Related papers (2022-02-03T08:45:34Z) - Mean-based Best Arm Identification in Stochastic Bandits under Reward
Contamination [80.53485617514707]
This paper proposes two algorithms, a gap-based algorithm and one based on the successive elimination, for best arm identification in sub-Gaussian bandits.
Specifically, for the gap-based algorithm, the sample complexity is optimal up to constant factors, while for the successive elimination, it is optimal up to logarithmic factors.
arXiv Detail & Related papers (2021-11-14T21:49:58Z) - Efficient First-Order Contextual Bandits: Prediction, Allocation, and
Triangular Discrimination [82.52105963476703]
A recurring theme in statistical learning, online learning, and beyond is that faster convergence rates are possible for problems with low noise.
First-order guarantees are relatively well understood in statistical and online learning.
We show that the logarithmic loss and an information-theoretic quantity called the triangular discrimination play a fundamental role in obtaining first-order guarantees.
arXiv Detail & Related papers (2021-07-05T19:20:34Z) - Local policy search with Bayesian optimization [73.0364959221845]
Reinforcement learning aims to find an optimal policy by interaction with an environment.
Policy gradients for local search are often obtained from random perturbations.
We develop an algorithm utilizing a probabilistic model of the objective function and its gradient.
arXiv Detail & Related papers (2021-06-22T16:07:02Z) - Estimating leverage scores via rank revealing methods and randomization [50.591267188664666]
We study algorithms for estimating the statistical leverage scores of rectangular dense or sparse matrices of arbitrary rank.
Our approach is based on combining rank revealing methods with compositions of dense and sparse randomized dimensionality reduction transforms.
arXiv Detail & Related papers (2021-05-23T19:21:55Z) - Run2Survive: A Decision-theoretic Approach to Algorithm Selection based
on Survival Analysis [75.64261155172856]
survival analysis (SA) naturally supports censored data and offers appropriate ways to use such data for learning distributional models of algorithm runtime.
We leverage such models as a basis of a sophisticated decision-theoretic approach to algorithm selection, which we dub Run2Survive.
In an extensive experimental study with the standard benchmark ASlib, our approach is shown to be highly competitive and in many cases even superior to state-of-the-art AS approaches.
arXiv Detail & Related papers (2020-07-06T15:20:17Z) - Fast Risk Assessment for Autonomous Vehicles Using Learned Models of
Agent Futures [10.358493658420173]
This paper presents fast non-sampling based methods to assess the risk of trajectories for autonomous vehicles.
The presented methods address a wide range of representations for uncertain predictions including both Gaussian and non-Gaussian mixture models.
The presented methods are demonstrated on realistic predictions from propagates trained on the Argoverse and CARLA datasets.
arXiv Detail & Related papers (2020-05-27T16:16:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.