Inference post Selection of Group-sparse Regression Models
- URL: http://arxiv.org/abs/2012.15664v1
- Date: Thu, 31 Dec 2020 15:43:26 GMT
- Title: Inference post Selection of Group-sparse Regression Models
- Authors: Snigdha Panigrahi, Peter W. MacDonald, Daniel Kessler
- Abstract summary: Conditional inference provides a rigorous approach to counter bias when data from automated model selections is reused for inference.
We develop in this paper a statistically consistent Bayesian framework to assess uncertainties within linear models.
These models find wide applications when genes, proteins, genetic variants, or neuroimaging measurements are grouped respectively by their biological pathways, molecular functions, regulatory regions, or cognitive roles; they are selected through a useful class of group-sparse learning algorithms.
- Score: 2.1485350418225244
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Conditional inference provides a rigorous approach to counter bias when data
from automated model selections is reused for inference. We develop in this
paper a statistically consistent Bayesian framework to assess uncertainties
within linear models that are informed by grouped sparsities in covariates.
Finding wide applications when genes, proteins, genetic variants, or
neuroimaging measurements are grouped respectively by their biological
pathways, molecular functions, regulatory regions, or cognitive roles, these
models are selected through a useful class of group-sparse learning
algorithms. The centerpiece of our new methods is an adjustment factor that
accounts precisely for the selection of promising groups and is deployed with
a generalized version of Laplace-type approximations. Accommodating well-known
group-sparse models, such as those selected by the Group LASSO, the
overlapping Group LASSO, and the sparse Group LASSO, we illustrate the
efficacy of our methodology in extensive experiments and on data from a human
neuroimaging application.
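To make the selection step concrete, here is a minimal sketch of the standard Group LASSO solved by proximal gradient descent (a generic textbook algorithm, not the authors' implementation). The block soft-thresholding step zeroes out entire groups at once, and the resulting set of surviving groups is exactly the selection event the adjustment factor must account for.

```python
# Minimal Group LASSO sketch via proximal gradient descent (generic textbook
# algorithm, not the paper's implementation). Entire groups of coefficients
# are shrunk jointly, so a group is either kept or zeroed out as a whole.
import numpy as np

def group_lasso(X, y, groups, lam, n_iter=500):
    """Minimize 0.5*||y - X b||^2 + lam * sum_g ||b_g||_2."""
    step = 1.0 / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant of the gradient
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        z = b - step * (X.T @ (X @ b - y))  # gradient step on the squared loss
        for g in groups:                    # proximal step: block soft-thresholding
            norm_g = np.linalg.norm(z[g])
            z[g] = 0.0 if norm_g <= step * lam else (1 - step * lam / norm_g) * z[g]
        b = z
    return b

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 12))
groups = [list(range(i, i + 3)) for i in range(0, 12, 3)]  # four groups of three
beta = np.zeros(12); beta[:3] = 2.0                        # only group 0 is active
y = X @ beta + rng.standard_normal(100)
b_hat = group_lasso(X, y, groups, lam=5.0)
selected = [i for i, g in enumerate(groups) if np.linalg.norm(b_hat[g]) > 1e-8]
print("selected groups:", selected)  # naive inference on these would be biased
```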
Related papers
- Selective inference using randomized group lasso estimators for general models [3.4034453928075865]
The method covers exponential-family distributions as well as quasi-likelihood modeling for overdispersed count data.
A randomized group-regularized optimization problem is studied.
Confidence regions for the regression parameters in the selected model take the form of Wald-type regions and are shown to have bounded volume.
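Schematically, a randomized group-lasso estimator perturbs the fitting objective with a Gaussian draw before selecting groups; this is the common form in the randomized selective-inference literature, and the paper's exact formulation may differ:

```latex
\[
\hat{\beta}(\omega) \;\in\; \operatorname*{arg\,min}_{\beta}\;
\ell(\beta)
\;+\; \lambda \sum_{g \in \mathcal{G}} \lVert \beta_g \rVert_2
\;+\; \frac{\epsilon}{2}\,\lVert \beta \rVert_2^2
\;-\; \omega^{\top}\beta,
\qquad \omega \sim N(0, \Omega),
\]
```

where \(\ell\) is a (quasi-)likelihood loss; inference is then carried out conditional on the observed set of groups with \(\hat{\beta}_g(\omega) \neq 0\).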
arXiv Detail & Related papers (2023-06-24T01:14:26Z) - Leveraging Structure for Improved Classification of Grouped Biased Data [8.121462458089143]
We consider semi-supervised binary classification for applications in which data points are naturally grouped.
We derive a semi-supervised algorithm that explicitly leverages the structure to learn an optimal, group-aware classifier that outputs probabilities.
arXiv Detail & Related papers (2022-12-07T15:18:21Z) - Composite Feature Selection using Deep Ensembles [130.72015919510605]
We investigate the problem of discovering groups of predictive features without predefined grouping.
We introduce a novel deep learning architecture that uses an ensemble of feature selection models to find predictive groups.
We propose a new metric to measure similarity between discovered groups and the ground truth.
arXiv Detail & Related papers (2022-11-01T17:49:40Z) - Improving Group Lasso for high-dimensional categorical data [0.90238471756546]
Group Lasso is a well-known, efficient algorithm for selecting continuous or categorical variables.
We propose a two-step procedure to obtain a sparse solution of the Group Lasso.
We show that our method performs better than state-of-the-art algorithms with respect to prediction accuracy or model dimension.
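For intuition on why the Group Lasso fits categorical data, here is a hedged sketch of the standard grouping (plain dummy coding, not the paper's two-step refinement): all one-hot columns of a categorical variable form one group, so the variable is kept or dropped as a whole.

```python
# Hedged sketch: dummy-code each categorical variable and treat its columns
# as one group for the Group Lasso (standard setup, not the paper's two-step
# procedure, which further refines the factor levels).
import pandas as pd

df = pd.DataFrame({
    "color": ["red", "green", "blue", "red"],
    "size":  ["S", "M", "L", "M"],
})
X = pd.get_dummies(df)  # one indicator column per factor level
groups = [
    [j for j, c in enumerate(X.columns) if c.startswith(col + "_")]
    for col in df.columns
]
print(X.columns.tolist())
print("groups:", groups)  # one group of indicator columns per categorical variable
```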
arXiv Detail & Related papers (2022-10-25T13:43:57Z) - Data-IQ: Characterizing subgroups with heterogeneous outcomes in tabular data [81.43750358586072]
We propose Data-IQ, a framework to systematically stratify examples into subgroups with respect to their outcomes.
We experimentally demonstrate the benefits of Data-IQ on four real-world medical datasets.
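As a rough illustration of outcome-based stratification in the spirit of Data-IQ (the statistics and thresholds below are illustrative assumptions, not the authors' exact rule):

```python
# Illustrative stratification sketch (assumed thresholds, not Data-IQ's exact
# rule): track each example's predicted probability of its true label across
# training checkpoints, then stratify by the mean and spread of that trajectory.
import numpy as np

rng = np.random.default_rng(0)
probs = rng.random((1000, 20))  # P(true label), shape (examples, checkpoints)
conf = probs.mean(axis=1)       # average confidence over training
unc = probs.std(axis=1)         # variability of that confidence
strata = np.where(unc > 0.3, "ambiguous",
                  np.where(conf > 0.5, "easy", "hard"))
print({s: int((strata == s).sum()) for s in ("easy", "ambiguous", "hard")})
```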
arXiv Detail & Related papers (2022-10-24T08:57:55Z) - Adversarial Sample Enhanced Domain Adaptation: A Case Study on Predictive Modeling with Electronic Health Records [57.75125067744978]
We propose a data augmentation method to facilitate domain adaptation.
Adversarially generated samples are used during domain adaptation.
Results confirm the effectiveness of our method and its generality across different tasks.
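One standard way to generate such adversarial samples is the fast gradient sign method (FGSM); the sketch below is a generic illustration, and the paper's generation scheme for EHR inputs may differ.

```python
# Generic FGSM sketch for adversarial augmentation (illustrative; the paper's
# generation scheme for EHR data may differ): perturb each input a small step
# in the direction that increases the classification loss.
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps=0.01):
    x = x.clone().requires_grad_(True)
    F.cross_entropy(model(x), y).backward()
    return (x + eps * x.grad.sign()).detach()

model = torch.nn.Linear(10, 2)                          # stand-in classifier
x, y = torch.randn(32, 10), torch.randint(0, 2, (32,))
x_adv = fgsm(model, x, y)                               # use alongside x when adapting
print(x_adv.shape)
```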
arXiv Detail & Related papers (2021-01-13T03:20:20Z) - Characterizing Fairness Over the Set of Good Models Under Selective Labels [69.64662540443162]
We develop a framework for characterizing predictive fairness properties over the set of models that deliver similar overall performance.
We provide tractable algorithms to compute the range of attainable group-level predictive disparities.
We extend our framework to address the empirically relevant challenge of selectively labelled data.
arXiv Detail & Related papers (2021-01-02T02:11:37Z) - LieTransformer: Equivariant self-attention for Lie Groups [49.9625160479096]
Group equivariant neural networks are used as building blocks of group invariant neural networks.
We extend the scope of the literature to self-attention, which is emerging as a prominent building block of deep learning models.
We propose the LieTransformer, an architecture composed of LieSelfAttention layers that are equivariant to arbitrary Lie groups and their discrete subgroups.
arXiv Detail & Related papers (2020-12-20T11:02:49Z) - LOGAN: Local Group Bias Detection by Clustering [86.38331353310114]
We argue that evaluating bias at the corpus level is not enough for understanding how biases are embedded in a model.
We propose LOGAN, a new bias detection technique based on clustering.
Experiments on toxicity classification and object classification tasks show that LOGAN identifies bias within local regions.
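A hedged sketch of the local evaluation idea (plain k-means plus per-cluster metric gaps; LOGAN's actual clustering objective is bias-aware rather than purely geometric):

```python
# Hedged sketch of local bias detection: cluster examples in feature space,
# then compare the model's accuracy across demographic groups within each
# cluster. LOGAN itself uses a bias-aware clustering objective; plain k-means
# here is a simplification.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
features = rng.standard_normal((300, 8))  # e.g. sentence embeddings
group = rng.integers(0, 2, size=300)      # protected attribute (0/1)
correct = rng.random(300) < 0.8           # whether the model's prediction was right

labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(features)
for k in range(5):
    in_k = labels == k
    acc0 = correct[in_k & (group == 0)].mean()
    acc1 = correct[in_k & (group == 1)].mean()
    print(f"cluster {k}: acc gap = {abs(acc0 - acc1):.2f}")  # large gap => local bias
```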
arXiv Detail & Related papers (2020-10-06T16:42:51Z) - Semi-nonparametric Latent Class Choice Model with a Flexible Class Membership Component: A Mixture Model Approach [6.509758931804479]
The proposed model formulates the latent classes using mixture models as an alternative approach to the traditional random utility specification.
Results show that mixture models improve the overall performance of latent class choice models.
arXiv Detail & Related papers (2020-07-06T13:19:26Z) - Robust Grouped Variable Selection Using Distributionally Robust Optimization [11.383869751239166]
We propose a Distributionally Robust Optimization (DRO) formulation with a Wasserstein-based uncertainty set for selecting grouped variables under perturbations.
We prove probabilistic bounds on the out-of-sample loss and the estimation bias, and establish the grouping effect of our estimator.
We show that our formulation produces an interpretable and parsimonious model that encourages sparsity at a group level.
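Schematically, a Wasserstein DRO estimator guards against all distributions within a budget \(\rho\) of the empirical distribution \(\hat{\mathbb{P}}_n\) (the paper's loss and ground metric may differ):

```latex
\[
\min_{\beta}\;
\sup_{\mathbb{Q}\,:\,W(\mathbb{Q},\,\hat{\mathbb{P}}_n)\le\rho}\;
\mathbb{E}_{\mathbb{Q}}\!\left[\ell(\beta; Z)\right]
\]
```

For many losses this worst-case problem is equivalent to the empirical loss plus \(\rho\) times a norm of \(\beta\), and choosing a grouped norm in that reformulation is what encourages sparsity at the group level.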
arXiv Detail & Related papers (2020-06-10T22:32:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.