Related papers: Instance-Adaptive Hypothesis Tests with Heterogeneous Agents

URL: http://arxiv.org/abs/2510.21178v1
Date: Fri, 24 Oct 2025 06:00:44 GMT
Title: Instance-Adaptive Hypothesis Tests with Heterogeneous Agents
Authors: Flora C. Shi, Martin J. Wainwright, Stephen Bates
Abstract summary: We study hypothesis testing over a heterogeneous population of strategic agents with private information. We show how it is possible to design menus of statistical contracts that pair type-optimal tests with payoff structures.
Score: 18.438776242257163
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We study hypothesis testing over a heterogeneous population of strategic agents with private information. Any single test applied uniformly across the population yields statistical error that is sub-optimal relative to the performance of an oracle given access to the private information. We show how it is possible to design menus of statistical contracts that pair type-optimal tests with payoff structures, inducing agents to self-select according to their private information. This separating menu elicits agent types and enables the principal to match the oracle performance even without a priori knowledge of the agent type. Our main result fully characterizes the collection of all separating menus that are instance-adaptive, matching oracle performance for an arbitrary population of heterogeneous agents. We identify designs where information elicitation is essentially costless, requiring negligible additional expense relative to a single-test benchmark, while improving statistical performance. Our work establishes a connection between proper scoring rules and menu design, showing how the structure of the hypothesis test constrains the elicitable information. Numerical examples illustrate the geometry of separating menus and the improvements they deliver in error trade-offs. Overall, our results connect statistical decision theory with mechanism design, demonstrating how heterogeneity and strategic participation can be harnessed to improve efficiency in hypothesis testing.
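The abstract's link between proper scoring rules and self-selection can be illustrated with a toy example. The sketch below is not the paper's construction: it assumes a hypothetical menu in which each entry is a reportable probability paid by the standard quadratic (Brier-type) score, and it checks that an agent maximizing expected payoff picks the entry matching its private belief. All function and variable names are illustrative.

```python
def quadratic_score(report: float, outcome: int) -> float:
    """Quadratic (Brier-type) score for reporting probability `report` of outcome 1."""
    return 1.0 - (outcome - report) ** 2


def expected_score(belief: float, report: float) -> float:
    """Expected score when the agent privately believes outcome 1 has probability `belief`."""
    return belief * quadratic_score(report, 1) + (1.0 - belief) * quadratic_score(report, 0)


def best_menu_item(belief: float, menu: list[float]) -> float:
    """Return the menu entry that maximizes the agent's expected score."""
    return max(menu, key=lambda report: expected_score(belief, report))


# Hypothetical menu: each entry is a probability the agent may report.
menu = [0.1, 0.3, 0.5, 0.7, 0.9]

# Each agent type (private belief) selects its own entry, so the menu separates types.
choices = {belief: best_menu_item(belief, menu) for belief in menu}

if __name__ == "__main__":
    for belief, choice in choices.items():
        print(f"belief={belief:.1f} -> chosen menu entry={choice:.1f}")
```

Because the expected quadratic score equals 1 - p(1 - p) - (q - p)^2 for belief p and report q, the chosen entry is always the one closest to the agent's belief, which is the sense in which a proper scoring rule induces self-selection in this toy setting.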

Related papers

Learning Unified Representations from Heterogeneous Data for Robust Heart Rate Modeling [5.538168530326637]
Heart rate prediction is vital for personalized health monitoring and fitness, but it frequently faces a critical challenge when deployed on real-world data. We classify this challenge along two key dimensions: source heterogeneity from fragmented device markets with varying feature sets, and user heterogeneity reflecting distinct physiological patterns across individuals and activities. Existing methods either discard device-specific information or fail to model user-specific differences, limiting their real-world performance. We propose a framework that learns latent representations accounting for both kinds of heterogeneity, enabling downstream predictors to work consistently under heterogeneous data patterns.
arXiv Detail & Related papers (2025-08-29T17:03:05Z)
Sharp Results for Hypothesis Testing with Risk-Sensitive Agents [32.38246810091696]
We study a game-theoretic version of hypothesis testing in which a statistician, also known as a principal, interacts with strategic agents that can generate data. The statistician seeks to design a testing protocol with controlled error, while the data-generating agents, guided by their utility and prior information, choose whether or not to opt in.
arXiv Detail & Related papers (2024-12-21T02:51:56Z)
Detecting and Identifying Selection Structure in Sequential Data [53.24493902162797]
We argue that the selective inclusion of data points based on latent objectives is common in practical situations, such as music sequences. We show that selection structure is identifiable without any parametric assumptions or interventional experiments. We also propose a provably correct algorithm to detect and identify selection structures as well as other types of dependencies.
arXiv Detail & Related papers (2024-06-29T20:56:34Z)
Causality and Independence Enhancement for Biased Node Classification [56.38828085943763]
We propose a novel Causality and Independence Enhancement (CIE) framework, applicable to various graph neural networks (GNNs). Our approach estimates causal and spurious features at the node representation level and mitigates the influence of spurious correlations. CIE not only significantly enhances the performance of GNNs but also outperforms state-of-the-art debiased node classification methods.
arXiv Detail & Related papers (2023-10-14T13:56:24Z)
Generalization within in silico screening [19.58677466616286]
In silico screening uses predictive models to select a batch of compounds with favorable properties from a library for experimental validation. By extending learning theory, we show that the selectivity of the selection policy can significantly impact generalization. We show that generalization can be markedly enhanced when considering a model's ability to predict the fraction of desired outcomes in a batch.
arXiv Detail & Related papers (2023-07-18T16:01:01Z)
Delving into Identify-Emphasize Paradigm for Combating Unknown Bias [52.76758938921129]
We propose an effective bias-conflicting scoring method (ECS) to boost the identification accuracy. We also propose gradient alignment (GA) to balance the contributions of the mined bias-aligned and bias-conflicting samples. Experiments are conducted on multiple datasets in various settings, demonstrating that the proposed solution can mitigate the impact of unknown biases.
arXiv Detail & Related papers (2023-02-22T14:50:24Z)
Statistical discrimination in learning agents [64.78141757063142]
Statistical discrimination emerges in agent policies as a function of both the bias in the training population and of agent architecture. We show that less discrimination emerges with agents that use recurrent neural networks, and when their training environment has less bias.
arXiv Detail & Related papers (2021-10-21T18:28:57Z)
Federated Learning under Importance Sampling [49.17137296715029]
We study the effect of importance sampling and devise schemes for sampling agents and data non-uniformly guided by a performance measure. We find that in schemes involving sampling without replacement, the performance of the resulting architecture is controlled by two factors related to data variability at each agent.
arXiv Detail & Related papers (2020-12-14T10:08:55Z)
Causal Feature Selection for Algorithmic Fairness [61.767399505764736]
We consider fairness in the integration component of data management. We propose an approach to identify a sub-collection of features that ensure the fairness of the dataset.
arXiv Detail & Related papers (2020-06-10T20:20:10Z)
A Causal Direction Test for Heterogeneous Populations [10.653162005300608]
Most causal models assume a single homogeneous population, an assumption that may fail to hold in many applications. We show that when the homogeneity assumption is violated, causal models developed based on such assumption can fail to identify the correct causal direction. We propose an adjustment to a commonly used causal direction test statistic by using a $k$-means type clustering algorithm.
arXiv Detail & Related papers (2020-06-08T18:59:14Z)

This list is automatically generated from the titles and abstracts of the papers on this site.