Related papers: Robustly estimating heterogeneity in factorial data using Rashomon Partitions

URL: http://arxiv.org/abs/2404.02141v4
Date: Tue, 19 Aug 2025 05:45:12 GMT
Title: Robustly estimating heterogeneity in factorial data using Rashomon Partitions
Authors: Aparajithan Venkateswaran, Anirudh Sankar, Arun G. Chandrasekhar, Tyler H. McCormick
Abstract summary: We propose a novel framework for model uncertainty called Rashomon Partition Sets (RPS). RPS consists of all models that have posterior density close to the maximum a posteriori (MAP) model. We give simulation evidence along with three empirical examples: price effects on charitable giving, heterogeneity in chromosomal structure, and the introduction of microfinance.
Score: 4.76518127830168
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: In both observational data and randomized control trials, researchers select statistical models to articulate how the outcome of interest varies with combinations of observable covariates. Choosing a model that is too simple can obfuscate important heterogeneity in outcomes between covariate groups, while too much complexity risks identifying spurious patterns. In this paper, we propose a novel Bayesian framework for model uncertainty called Rashomon Partition Sets (RPSs). The RPS consists of all models that have posterior density close to the maximum a posteriori (MAP) model. We construct the RPS by enumeration, rather than sampling, which ensures that we explore all models with high evidence in the data, even if they offer dramatically different substantive explanations. We use an $\ell_0$ prior, which allows us to capture complex heterogeneity without imposing strong assumptions about the associations between effects, and we show that this prior is minimax optimal from an information-theoretic perspective. We characterize the approximation error of (functions of) parameters computed conditional on being in the RPS relative to the entire posterior. We propose an algorithm to enumerate the RPS from the class of models that are interpretable and unique, then provide bounds on the size of the RPS. We give simulation evidence along with three empirical examples: price effects on charitable giving, heterogeneity in chromosomal structure, and the introduction of microfinance.
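The idea of keeping every model whose posterior is close to the MAP model, found by enumeration rather than sampling, can be illustrated with a small Python sketch. Everything here is an assumption made for illustration and is not the authors' algorithm: the function names (`set_partitions`, `toy_log_posterior`, `rashomon_set`) are hypothetical, the score is a toy unnormalized log-posterior (within-pool squared error plus a per-pool penalty standing in for an $\ell_0$-style prior), "close" is read as an additive tolerance on that score, and the enumeration is brute force over all partitions rather than over the restricted class of interpretable partitions described in the abstract.

```python
def set_partitions(items):
    """Yield every partition of the list `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Either `first` forms its own block ...
        yield [[first]] + partition
        # ... or it joins each existing block in turn.
        for i, block in enumerate(partition):
            yield partition[:i] + [[first] + block] + partition[i + 1:]


def toy_log_posterior(partition, means, penalty):
    """Toy score (an assumption): negative within-pool squared error
    minus a fixed penalty for every pool in the partition."""
    sse = 0.0
    for block in partition:
        pooled = sum(means[k] for k in block) / len(block)
        sse += sum((means[k] - pooled) ** 2 for k in block)
    return -sse - penalty * len(partition)


def rashomon_set(means, penalty, tolerance):
    """Enumerate all partitions and keep those whose score is within
    `tolerance` of the best (MAP) score, sorted best first."""
    keys = sorted(means)
    scored = [(toy_log_posterior(p, means, penalty), p)
              for p in set_partitions(keys)]
    best = max(score for score, _ in scored)
    kept = [(score, p) for score, p in scored if score >= best - tolerance]
    return sorted(kept, key=lambda pair: -pair[0])


# Hypothetical observed mean outcomes for four feature combinations.
example_means = {"a": 0.0, "b": 0.1, "c": 1.0, "d": 1.1}
for score, partition in rashomon_set(example_means, penalty=0.1, tolerance=0.1):
    print(round(score, 3), partition)
```

With these made-up numbers, the retained set contains the best-scoring pooling together with nearby alternatives that split one of its pools, which is the kind of substantively different but similarly supported explanation the abstract refers to. Brute-force enumeration grows with the Bell numbers, so this sketch is only workable for a handful of feature combinations.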

Related papers

Covariate-assisted Grade of Membership Models via Shared Latent Geometry [0.7939348535496568]
The grade of membership model is a flexible latent variable model for analyzing multivariate categorical data through individual-level mixed membership scores. Traditional approaches to incorporating auxiliary covariates typically rely on fully specified joint likelihoods, which are computationally intensive and sensitive to misspecification. We introduce a covariate-assisted grade of membership model that integrates response and covariate information by exploiting their shared low-rank simplex geometry.
arXiv Detail & Related papers (2026-01-24T02:30:36Z)
Model Correlation Detection via Random Selection Probing [62.093777777813756]
Existing similarity-based methods require access to model parameters or produce scores without thresholds. We introduce Random Selection Probing (RSP), a hypothesis-testing framework that formulates model correlation detection as a statistical test. RSP produces rigorous p-values that quantify evidence of correlation.
arXiv Detail & Related papers (2025-09-29T01:40:26Z)
Going from a Representative Agent to Counterfactuals in Combinatorial Choice [1.7074019866492325]
We study decision-making problems where data comprises points from a collection of binary polytopes. We propose a nonparametric approach for counterfactual inference in this setting based on a representative agent model.
arXiv Detail & Related papers (2025-05-29T15:24:23Z)
Representation Learning Preserving Ignorability and Covariate Matching for Treatment Effects [18.60804431844023]
Estimating treatment effects from observational data is challenging due to hidden confounding. A common framework to address both hidden confounding and selection bias is missing.
arXiv Detail & Related papers (2025-04-29T09:33:56Z)
Towards Self-Supervised Covariance Estimation in Deep Heteroscedastic Regression [102.24287051757469]
We study self-supervised covariance estimation in deep heteroscedastic regression. We derive an upper bound on the 2-Wasserstein distance between normal distributions. Experiments over a wide range of synthetic and real datasets demonstrate that the proposed 2-Wasserstein bound coupled with pseudo label annotations results in a computationally cheaper yet accurate deep heteroscedastic regression.
arXiv Detail & Related papers (2025-02-14T22:37:11Z)
Semiparametric conformal prediction [79.6147286161434]
Risk-sensitive applications require well-calibrated prediction sets over multiple, potentially correlated target variables. We treat the scores as random vectors and aim to construct the prediction set accounting for their joint correlation structure. We report desired coverage and competitive efficiency on a range of real-world regression problems.
arXiv Detail & Related papers (2024-11-04T14:29:02Z)
Diffusion posterior sampling for simulation-based inference in tall data settings [53.17563688225137]
Simulation-based inference (SBI) is capable of approximating the posterior distribution that relates input parameters to a given observation. In this work, we consider a tall data extension in which multiple observations are available to better infer the parameters of the model. We compare our method to recently proposed competing approaches on various numerical experiments and demonstrate its superiority in terms of numerical stability and computational cost.
arXiv Detail & Related papers (2024-04-11T09:23:36Z)
Conditional Generative Models are Sufficient to Sample from Any Causal Effect Estimand [9.460857822923842]
Causal inference from observational data plays a critical role in many applications in trustworthy machine learning. We show how to sample from any identifiable interventional distribution given an arbitrary causal graph. We also generate high-dimensional interventional samples from the MIMIC-CXR dataset involving text and image variables.
arXiv Detail & Related papers (2024-02-12T05:48:31Z)
Random Models for Fuzzy Clustering Similarity Measures [0.0]
The Adjusted Rand Index (ARI) is a widely used method for comparing hard clusterings. We propose a single framework for computing the ARI with three random models that are intuitive and explainable for both hard and fuzzy clusterings.
arXiv Detail & Related papers (2023-12-16T00:07:04Z)
Optimal Multi-Distribution Learning [88.3008613028333]
Multi-distribution learning seeks to learn a shared model that minimizes the worst-case risk across $k$ distinct data distributions. We propose a novel algorithm that yields an $\varepsilon$-optimal randomized hypothesis with a sample complexity on the order of $(d+k)/\varepsilon^2$.
arXiv Detail & Related papers (2023-12-08T16:06:29Z)
TIC-TAC: A Framework for Improved Covariance Estimation in Deep Heteroscedastic Regression [109.69084997173196]
Deep heteroscedastic regression involves jointly optimizing the mean and covariance of the predicted distribution using the negative log-likelihood. Recent works show that this may result in sub-optimal convergence due to the challenges associated with covariance estimation. We study two questions: (1) Does the predicted covariance truly capture the randomness of the predicted mean? Our results show that not only does TIC accurately learn the covariance, it additionally facilitates an improved convergence of the negative log-likelihood.
arXiv Detail & Related papers (2023-10-29T09:54:03Z)
Synthetic Combinations: A Causal Inference Framework for Combinatorial Interventions [8.491098180590447]
We learn unit-specific potential outcomes for any combination of interventions, i.e., $N \times 2^p$ causal parameters. Running $N \times 2^p$ experiments to estimate the various parameters is likely expensive and/or infeasible as $N$ and $p$ grow.
arXiv Detail & Related papers (2023-03-24T18:45:44Z)
Dual-sPLS: a family of Dual Sparse Partial Least Squares regressions for feature selection and prediction with tunable sparsity; evaluation on simulated and near-infrared (NIR) data [1.6099403809839032]
The variant presented in this paper, Dual-sPLS, generalizes the classical PLS1 algorithm. It provides balance between accurate prediction and efficient interpretation. Code is provided as an open-source package in R.
arXiv Detail & Related papers (2023-01-17T21:50:35Z)
Learning from aggregated data with a maximum entropy model [73.63512438583375]
We show how a new model, similar to a logistic regression, may be learned from aggregated data only by approximating the unobserved feature distribution with a maximum entropy hypothesis. We present empirical evidence on several public datasets that the model learned this way can achieve performances comparable to those of a logistic model trained with the full unaggregated data.
arXiv Detail & Related papers (2022-10-05T09:17:27Z)
Robust and Agnostic Learning of Conditional Distributional Treatment Effects [62.44901952244514]
The conditional average treatment effect (CATE) is the best point prediction of individual causal effects. In aggregate analyses, this is usually addressed by measuring the distributional treatment effect (DTE). We provide a new robust and model-agnostic methodology for learning the conditional DTE (CDTE) for a wide class of problems.
arXiv Detail & Related papers (2022-05-23T17:40:31Z)
Optimal Clustering with Bandit Feedback [57.672609011609886]
This paper considers the problem of online clustering with bandit feedback. It includes a novel stopping rule for sequential testing that circumvents the need to solve any NP-hard weighted clustering problem as its subroutines. We show through extensive simulations on synthetic and real-world datasets that BOC's performance matches the lower bound asymptotically, and significantly outperforms a non-adaptive baseline algorithm.
arXiv Detail & Related papers (2022-02-09T06:05:05Z)
Treatment Effect Risk: Bounds and Inference [58.442274475425144]
Since the average treatment effect measures the change in social welfare, even if positive, there is a risk of negative effect on, say, some 10% of the population. In this paper we consider how to nonetheless assess this important risk measure, formalized as the conditional value at risk (CVaR) of the ITE distribution. Some bounds can also be interpreted as summarizing a complex CATE function into a single metric and are of interest independently of being a bound.
arXiv Detail & Related papers (2022-01-15T17:21:26Z)
Inverting brain grey matter models with likelihood-free inference: a tool for trustable cytoarchitecture measurements [62.997667081978825]
Characterisation of the brain grey matter cytoarchitecture with quantitative sensitivity to soma density and volume remains an unsolved challenge in dMRI. We propose a new forward model, specifically a new system of equations, requiring a few relatively sparse b-shells. We then apply modern tools from Bayesian analysis known as likelihood-free inference (LFI) to invert our proposed model.
arXiv Detail & Related papers (2021-11-15T09:08:27Z)
Optimization-based Causal Estimation from Heterogenous Environments [35.74340459207312]
CoCo is an optimization algorithm that bridges the gap between pure prediction and causal inference. We describe the theoretical foundations of this approach and demonstrate its effectiveness on simulated and real datasets.
arXiv Detail & Related papers (2021-09-24T14:21:58Z)
The SKIM-FA Kernel: High-Dimensional Variable Selection and Nonlinear Interaction Discovery in Linear Time [26.11563787525079]
We show how a kernel trick can reduce computation with suitable Bayesian models to O(# covariates) time for both variable selection and estimation. Our approach outperforms existing methods used for large, high-dimensional datasets.
arXiv Detail & Related papers (2021-06-23T13:53:36Z)
Examining and Combating Spurious Features under Distribution Shift [94.31956965507085]
We define and analyze robust and spurious representations using the information-theoretic concept of minimal sufficient statistics. We prove that even when there is only bias of the input distribution, models can still pick up spurious features from their training data. Inspired by our analysis, we demonstrate that group DRO can fail when groups do not directly account for various spurious correlations.
arXiv Detail & Related papers (2021-06-14T05:39:09Z)
Robust Finite Mixture Regression for Heterogeneous Targets [70.19798470463378]
We propose an FMR model that finds sample clusters and jointly models multiple incomplete mixed-type targets simultaneously. We provide non-asymptotic oracle performance bounds for our model under a high-dimensional learning framework. The results show that our model can achieve state-of-the-art performance.
arXiv Detail & Related papers (2020-10-12T03:27:07Z)
Optimal Posteriors for Chi-squared Divergence based PAC-Bayesian Bounds and Comparison with KL-divergence based Optimal Posteriors and Cross-Validation Procedure [0.0]
We investigate optimal posteriors for chi-squared divergence based PAC-Bayesian bounds in terms of their distribution, scalability of computations, and test set performance. Chi-squared divergence based posteriors have weaker bounds and worse test errors, hinting at an underlying regularization by KL-divergence based posteriors.
arXiv Detail & Related papers (2020-08-14T03:15:23Z)

This list is automatically generated from the titles and abstracts of the papers in this site.