A Huber loss-based super learner with applications to healthcare
expenditures
- URL: http://arxiv.org/abs/2205.06870v1
- Date: Fri, 13 May 2022 19:57:50 GMT
- Title: A Huber loss-based super learner with applications to healthcare
expenditures
- Authors: Ziyue Wu, David Benkeser
- Abstract summary: We propose a super learner based on the Huber loss, a "robust" loss function that combines squared error loss with absolute loss to down-weight the influence of outliers.
We show that the proposed method can be used both directly to optimize the Huber risk and in finite-sample settings where optimizing mean squared error is the ultimate goal.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Complex distributions of healthcare expenditures pose challenges to
statistical modeling via a single model. Super learning, an ensemble method
that combines a range of candidate models, is a promising alternative for cost
estimation and has shown benefits over a single model. However, standard
approaches to super learning may have poor performance in settings where
extreme values are present, such as healthcare expenditure data. We propose a
super learner based on the Huber loss, a "robust" loss function that combines
squared error loss with absolute loss to down-weight the influence of outliers.
We derive oracle inequalities that establish bounds on the finite-sample and
asymptotic performance of the method. We show that the proposed method can be
used both directly to optimize the Huber risk and in finite-sample settings
where optimizing mean squared error is the ultimate goal. For this latter
scenario, we provide two methods for performing a grid search for values of the
robustification parameter indexing the Huber loss. Simulations and real data
analysis demonstrate appreciable finite-sample gains in cost prediction and
causal effect estimation using our proposed method.
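For intuition: the Huber loss with robustification parameter $\delta$ is $\ell_\delta(r) = r^2/2$ for residuals $|r| \le \delta$ and $\delta(|r| - \delta/2)$ otherwise, so it matches squared error near zero but grows only linearly in the tails. The sketch below is a hypothetical illustration of the second setting described in the abstract, restricted to two candidate learners for brevity: for each candidate $\delta$, a convex ensemble weight is chosen to minimize cross-validated Huber risk, and $\delta$ itself is then selected by cross-validated mean squared error. All function names and the two-learner simplification are ours, not the authors' implementation.

```python
import numpy as np

def huber_loss(residuals, delta):
    """Huber loss: 0.5 * r^2 for |r| <= delta, delta * (|r| - delta/2) beyond."""
    r = np.abs(residuals)
    return np.where(r <= delta, 0.5 * r**2, delta * (r - 0.5 * delta))

def grid_search_delta(y, oof_a, oof_b, deltas, alphas=np.linspace(0.0, 1.0, 21)):
    """Toy two-learner super learner with a grid search over delta.

    y: observed outcomes; oof_a, oof_b: out-of-fold (cross-validated)
    predictions from two candidate learners. For each delta, the convex
    weight alpha is chosen under the Huber risk indexed by that delta;
    delta is then selected by cross-validated MSE, mirroring the setting
    where mean squared error is the ultimate goal.
    """
    best = None  # (cv_mse, delta, alpha)
    for delta in deltas:
        # ensemble weight minimizing the cross-validated Huber risk
        alpha = min(
            alphas,
            key=lambda a: huber_loss(y - (a * oof_a + (1 - a) * oof_b), delta).mean(),
        )
        ensemble = alpha * oof_a + (1 - alpha) * oof_b
        cv_mse = float(np.mean((y - ensemble) ** 2))
        if best is None or cv_mse < best[0]:
            best = (cv_mse, delta, alpha)
    return best  # e.g. grid_search_delta(y, oof_a, oof_b, deltas=[1.0, 5.0, 25.0])
```

The paper's actual super learner combines an arbitrary library of candidate learners and derives oracle inequalities for the resulting estimator; the two-learner convex grid above only illustrates the interplay between Huber-risk weighting and the MSE-based choice of $\delta$.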
Related papers
- Regret Minimization and Statistical Inference in Online Decision Making with High-dimensional Covariates [7.21848268647674]
We integrate the $\varepsilon$-greedy bandit algorithm for decision-making with a hard thresholding algorithm for estimating sparse bandit parameters.
Under a margin condition, our method achieves either $O(T^{1/2})$ regret or classical $O(T^{1/2})$-consistent inference.
arXiv Detail & Related papers (2024-11-10T01:47:11Z)
- Rejection via Learning Density Ratios [50.91522897152437]
Classification with rejection emerges as a learning paradigm which allows models to abstain from making predictions.
We propose a different distributional perspective, where we seek to find an idealized data distribution which maximizes a pretrained model's performance.
Our framework is tested empirically over clean and noisy datasets.
arXiv Detail & Related papers (2024-05-29T01:32:17Z)
- Equation Discovery with Bayesian Spike-and-Slab Priors and Efficient Kernels [57.46832672991433]
We propose a novel equation discovery method based on Kernel learning and BAyesian Spike-and-Slab priors (KBASS).
We use kernel regression to estimate the target function, which is flexible, expressive, and more robust to data sparsity and noise.
We develop an expectation-propagation expectation-maximization algorithm for efficient posterior inference and function estimation.
arXiv Detail & Related papers (2023-10-09T03:55:09Z)
- Asymptotic Characterisation of Robust Empirical Risk Minimisation Performance in the Presence of Outliers [18.455890316339595]
We study robust linear regression in high dimensions, when both the dimension $d$ and the number of data points $n$ diverge with a fixed ratio $\alpha = n/d$, and study a data model that includes outliers.
We provide exact asymptotics for the performance of empirical risk minimisation (ERM) using $\ell_2$-regularised $\ell_2$, $\ell_1$, and Huber losses.
arXiv Detail & Related papers (2023-05-30T12:18:39Z)
- Optimal Sparse Recovery with Decision Stumps [7.24496247221802]
We show that tree-based methods attain strong feature selection properties under a wide variety of settings.
As a byproduct of our analysis, we show that we can provably guarantee recovery even when the number of active features $s$ is unknown.
arXiv Detail & Related papers (2023-03-08T00:43:06Z)
- On the Pitfalls of Heteroscedastic Uncertainty Estimation with Probabilistic Neural Networks [23.502721524477444]
We present a synthetic example illustrating how this approach can lead to very poor but stable estimates.
We identify the culprit to be the log-likelihood loss, along with certain conditions that exacerbate the issue.
We present an alternative formulation, termed $\beta$-NLL, in which each data point's contribution to the loss is weighted by the $\beta$-exponentiated variance estimate.
arXiv Detail & Related papers (2022-03-17T08:46:17Z)
- Error-based Knockoffs Inference for Controlled Feature Selection [49.99321384855201]
We propose an error-based knockoff inference method by integrating the knockoff features, the error-based feature importance statistics, and the stepdown procedure together.
The proposed inference procedure does not require specifying a regression model and can handle feature selection with theoretical guarantees.
arXiv Detail & Related papers (2022-03-09T01:55:59Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts accuracy as the fraction of unlabeled examples whose confidence exceeds that threshold.
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
- Risk Minimization from Adaptively Collected Data: Guarantees for Supervised and Policy Learning [57.88785630755165]
Empirical risk minimization (ERM) is the workhorse of machine learning, but its model-agnostic guarantees can fail when we use adaptively collected data.
We study a generic importance sampling weighted ERM algorithm for using adaptively collected data to minimize the average of a loss function over a hypothesis class.
For policy learning, we provide rate-optimal regret guarantees that close an open gap in the existing literature whenever exploration decays to zero.
arXiv Detail & Related papers (2021-06-03T09:50:13Z)
- Scalable Personalised Item Ranking through Parametric Density Estimation [53.44830012414444]
Learning from implicit feedback is challenging because of the difficult nature of the one-class problem.
Most conventional methods use a pairwise ranking approach and negative samplers to cope with the one-class problem.
We propose a learning-to-rank approach, which achieves convergence speed comparable to the pointwise counterpart.
arXiv Detail & Related papers (2021-05-11T03:38:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.