Multi-Study Boosting: Theoretical Considerations for Merging vs. Ensembling
- URL: http://arxiv.org/abs/2207.04588v2
- Date: Wed, 13 Jul 2022 02:29:10 GMT
- Title: Multi-Study Boosting: Theoretical Considerations for Merging vs. Ensembling
- Authors: Cathy Shyr, Pragya Sur, Giovanni Parmigiani and Prasad Patil
- Abstract summary: Cross-study replicability is a powerful model evaluation criterion that emphasizes generalizability of predictions.
We study boosting algorithms in the presence of potential heterogeneity in predictor-outcome relationships across studies.
We compare two multi-study learning strategies: 1) merging all the studies and training a single model, and 2) multi-study ensembling.
- Score: 2.252304836689618
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cross-study replicability is a powerful model evaluation criterion that
emphasizes generalizability of predictions. When training cross-study
replicable prediction models, it is critical to decide between merging and
treating the studies separately. We study boosting algorithms in the presence
of potential heterogeneity in predictor-outcome relationships across studies
and compare two multi-study learning strategies: 1) merging all the studies and
training a single model, and 2) multi-study ensembling, which involves training
a separate model on each study and ensembling the resulting predictions. In the
regression setting, we provide theoretical guidelines based on an analytical
transition point to determine whether it is more beneficial to merge or to
ensemble for boosting with linear learners. In addition, we characterize a
bias-variance decomposition of estimation error for boosting with
component-wise linear learners. We verify the theoretical transition point
result in simulation and illustrate how it can guide the decision on merging
vs. ensembling in an application to breast cancer gene expression data.
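To make the two strategies concrete, here is a minimal simulation sketch, not the authors' code: L2-boosting with component-wise linear learners is implemented directly, and merging is compared against equal-weight ensembling on studies whose coefficients are perturbed around a shared signal. The heterogeneity scale `tau`, the study sizes, the shrinkage `nu`, and the number of boosting steps are all illustrative assumptions rather than values from the paper.

```python
# Minimal sketch (not the authors' code) of merging vs. multi-study ensembling
# with L2-boosting and component-wise linear learners. Study sizes, the
# heterogeneity scale tau, nu, and n_steps are illustrative assumptions.
import numpy as np

def boost_cw_linear(X, y, n_steps=200, nu=0.1):
    """L2-boosting with component-wise linear learners: at each step, fit
    simple least squares on every predictor and keep the one that most
    reduces the residual sum of squares, applying shrinkage nu."""
    n, p = X.shape
    intercept, coef = y.mean(), np.zeros(p)
    resid = y - intercept
    for _ in range(n_steps):
        best_j, best_b, best_rss = 0, 0.0, np.inf
        for j in range(p):
            xj = X[:, j]
            b = (xj @ resid) / (xj @ xj)
            rss = np.sum((resid - b * xj) ** 2)
            if rss < best_rss:
                best_j, best_b, best_rss = j, b, rss
        coef[best_j] += nu * best_b
        resid -= nu * best_b * X[:, best_j]
    return intercept, coef

rng = np.random.default_rng(0)
p, beta = 5, np.array([1.0, -2.0, 0.5, 0.0, 0.0])

def simulate_study(n, tau):
    """One study; tau scales the study-specific perturbation of beta."""
    X = rng.normal(size=(n, p))
    b = beta + rng.normal(scale=tau, size=p)  # heterogeneous coefficients
    return X, X @ b + rng.normal(size=n)

studies = [simulate_study(200, tau=0.5) for _ in range(4)]
X_test, y_test = simulate_study(2000, tau=0.5)

# Strategy 1: merge all studies and train a single model.
a_m, c_m = boost_cw_linear(np.vstack([X for X, _ in studies]),
                           np.concatenate([y for _, y in studies]))
mse_merge = np.mean((y_test - (a_m + X_test @ c_m)) ** 2)

# Strategy 2: multi-study ensembling: train per study, average predictions.
fits = [boost_cw_linear(X, y) for X, y in studies]
pred_ens = np.mean([a + X_test @ c for a, c in fits], axis=0)
mse_ens = np.mean((y_test - pred_ens) ** 2)

print(f"merged MSE: {mse_merge:.3f}   ensemble MSE: {mse_ens:.3f}")
```

Re-running the comparison across values of `tau` reproduces the qualitative transition the paper formalizes: merging tends to win when heterogeneity is low, ensembling when it is high.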
Related papers
- Cross-Entropy Is All You Need To Invert the Data Generating Process [29.94396019742267]
Empirical phenomena suggest that supervised models can learn interpretable factors of variation in a linear fashion.
Recent advances in self-supervised learning have shown that these methods can recover latent structures by inverting the data generating process.
We prove that even in standard classification tasks, models learn representations of ground-truth factors of variation up to a linear transformation.
arXiv Detail & Related papers (2024-10-29T09:03:57Z) - Generative vs. Discriminative modeling under the lens of uncertainty quantification [0.929965561686354]
In this paper, we undertake a comparative analysis of generative and discriminative approaches.
We compare the ability of both approaches to leverage information from various sources in uncertainty-aware inference.
We propose a general sampling scheme enabling supervised learning for both approaches, as well as semi-supervised learning when compatible with the considered modeling approach.
arXiv Detail & Related papers (2024-06-13T14:32:43Z) - Convergence Behavior of an Adversarial Weak Supervision Method [10.409652277630133]
Weak Supervision is a paradigm that subsumes several subareas of machine learning.
By training modern machine learning methods on weakly labeled data produced by rules-of-thumb, the cost of acquiring large amounts of hand-labeled data can be reduced.
Approaches to combining the rules-of-thumb fall into two camps, reflecting different ideologies of statistical estimation.
arXiv Detail & Related papers (2024-05-25T02:33:17Z) - Selective Nonparametric Regression via Testing [54.20569354303575]
We develop an abstention procedure by testing a hypothesis about the value of the conditional variance at a given point.
Unlike existing methods, the proposed one allows one to account not only for the value of the variance itself but also for the uncertainty of the corresponding variance predictor.
arXiv Detail & Related papers (2023-09-28T13:04:11Z) - Structured Radial Basis Function Network: Modelling Diversity for
Multiple Hypotheses Prediction [51.82628081279621]
Multi-modal regression is important when forecasting nonstationary processes or processes with a complex mixture of distributions.
A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems.
It is proved that this structured model can efficiently interpolate the resulting tessellation and approximate the multiple-hypotheses target distribution.
arXiv Detail & Related papers (2023-09-02T01:27:53Z) - Advancing Counterfactual Inference through Nonlinear Quantile Regression [77.28323341329461]
We propose a framework for efficient and effective counterfactual inference implemented with neural networks.
The proposed approach enhances the capacity to generalize estimated counterfactual outcomes to unseen data.
Empirical results on multiple datasets offer compelling support for our theoretical assertions.
arXiv Detail & Related papers (2023-06-09T08:30:51Z) - Less is More: Mitigate Spurious Correlations for Open-Domain Dialogue
Response Generation Models by Causal Discovery [52.95935278819512]
We conduct the first study of spurious correlations in open-domain response generation models, based on CGDIALOG, a corpus curated in our work.
Inspired by causal discovery algorithms, we propose a novel model-agnostic method for training and inference of response generation models.
arXiv Detail & Related papers (2023-03-02T06:33:48Z) - Benign-Overfitting in Conditional Average Treatment Effect Prediction
with Linear Regression [14.493176427999028]
We study benign overfitting theory in the prediction of the conditional average treatment effect (CATE) with linear regression models.
We show that the T-learner fails to achieve consistency except under random assignment, while the IPW-learner's risk converges to zero if the propensity score is known (a hedged sketch contrasting the two learners appears after this list).
arXiv Detail & Related papers (2022-02-10T18:51:52Z) - Nonparametric Estimation of Heterogeneous Treatment Effects: From Theory
to Learning Algorithms [91.3755431537592]
We analyze four broad meta-learning strategies which rely on plug-in estimation and pseudo-outcome regression.
We highlight how this theoretical reasoning can be used to guide principled algorithm design and translate our analyses into practice.
arXiv Detail & Related papers (2021-01-26T17:11:40Z) - Double Robust Representation Learning for Counterfactual Prediction [68.78210173955001]
We propose a novel scalable method to learn double-robust representations for counterfactual predictions.
We make robust and efficient counterfactual predictions for both individual and average treatment effects.
The algorithm shows competitive performance with the state-of-the-art on real-world and synthetic data.
arXiv Detail & Related papers (2020-10-15T16:39:26Z) - Merging versus Ensembling in Multi-Study Prediction: Theoretical Insight from Random Effects [1.2065918767980095]
We compare two multi-study prediction approaches in the presence of potential heterogeneity in predictor-outcome relationships across datasets.
For ridge regression, we show analytically and confirm via simulation that merging yields lower prediction error than ensembling when inter-study heterogeneity is low, with the advantage reversing beyond a transition point.
We provide analytic expressions for the transition point in various scenarios, study its properties, and illustrate how transition point theory can be used to decide whether studies should be combined, with an application from metagenomics.
arXiv Detail & Related papers (2019-05-17T17:28:39Z)
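The random-effects paper listed last studies the same merging-vs.-ensembling question for ridge regression. A hedged sketch of that comparison follows; the random-slope data model, the fixed penalty `lam`, and the equal-weight ensemble are illustrative assumptions, not the paper's exact setup.

```python
# Hedged sketch of the ridge-regression comparison in the random-effects paper
# above. The random-slope data model, penalty lam, and equal ensemble weights
# are illustrative assumptions, not the paper's exact setup.
import numpy as np

rng = np.random.default_rng(1)
p, beta, lam = 5, np.array([1.0, -2.0, 0.5, 0.0, 0.0]), 1.0

def ridge_fit(X, y):
    """Closed-form ridge estimate (no intercept, for brevity)."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def simulate_study(n, tau):
    X = rng.normal(size=(n, p))
    b = beta + rng.normal(scale=tau, size=p)  # random-effects perturbation
    return X, X @ b + rng.normal(size=n)

for tau in (0.0, 0.3, 1.0):  # increasing cross-study heterogeneity
    studies = [simulate_study(100, tau) for _ in range(4)]
    X_test, y_test = simulate_study(2000, tau)
    # Merging: pool the studies and fit one ridge model.
    coef_m = ridge_fit(np.vstack([X for X, _ in studies]),
                       np.concatenate([y for _, y in studies]))
    # Ensembling: average per-study ridge predictions with equal weights.
    pred_e = np.mean([X_test @ ridge_fit(X, y) for X, y in studies], axis=0)
    print(f"tau={tau:.1f}  merged MSE={np.mean((y_test - X_test @ coef_m) ** 2):.3f}"
          f"  ensemble MSE={np.mean((y_test - pred_e) ** 2):.3f}")
```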
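And the sketch referenced in the benign-overfitting CATE entry above: a minimal contrast of the T-learner (one regression per treatment arm) and the IPW-learner (one regression of the inverse-propensity-weighted outcome). The linear models, the known propensity score of 0.5, and the simulation design are invented for illustration.

```python
# Sketch referenced in the benign-overfitting CATE entry: T-learner vs.
# IPW-learner with linear models. The known propensity score e = 0.5 and the
# simulation design are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(2)
n, p, e = 500, 3, 0.5                       # e: known propensity score
X = rng.normal(size=(n, p))
tau = X @ np.array([1.0, 0.0, -0.5])        # true CATE
T = rng.binomial(1, e, size=n)              # random treatment assignment
y = X @ np.array([0.5, 1.0, 0.0]) + T * tau + rng.normal(size=n)

def ols(A, b):
    return np.linalg.lstsq(A, b, rcond=None)[0]

# T-learner: fit one regression per treatment arm, take the difference.
tau_t = X @ (ols(X[T == 1], y[T == 1]) - ols(X[T == 0], y[T == 0]))

# IPW-learner: regress the IPW-transformed outcome z on X in one pass;
# E[z | X] equals the CATE when the propensity score e is known.
z = y * (T / e - (1 - T) / (1 - e))
tau_ipw = X @ ols(X, z)

print("T-learner   CATE MSE:", np.mean((tau_t - tau) ** 2))
print("IPW-learner CATE MSE:", np.mean((tau_ipw - tau) ** 2))
```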
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.