A Unified Combination Framework for Dependent Tests with Applications to Microbiome Association Studies
- URL: http://arxiv.org/abs/2404.09353v1
- Date: Sun, 14 Apr 2024 20:33:39 GMT
- Title: A Unified Combination Framework for Dependent Tests with Applications to Microbiome Association Studies
- Authors: Xiufan Yu, Linjun Zhang, Arun Srinivasan, Min-ge Xie, Lingzhou Xue
- Abstract summary: We introduce a novel meta-analysis framework to combine dependent tests under a general setting.
We utilize it to synthesize various microbiome association tests that are calculated from the same dataset.
- Score: 12.579558827555273
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a novel meta-analysis framework to combine dependent tests under a general setting, and utilize it to synthesize various microbiome association tests that are calculated from the same dataset. Our development builds upon the classical meta-analysis methods of aggregating $p$-values and also a more recent general method of combining confidence distributions, but makes generalizations to handle dependent tests. The proposed framework ensures rigorous statistical guarantees, and we provide a comprehensive study and compare it with various existing dependent combination methods. Notably, we demonstrate that the widely used Cauchy combination method for dependent tests, referred to as the vanilla Cauchy combination in this article, can be viewed as a special case within our framework. Moreover, the proposed framework provides a way to address the problem when the distributional assumptions underlying the vanilla Cauchy combination are violated. Our numerical results demonstrate that ignoring the dependence among the to-be-combined components may lead to a severe size distortion phenomenon. Compared to the existing $p$-value combination methods, including the vanilla Cauchy combination method, the proposed combination framework can handle the dependence accurately and utilizes the information efficiently to construct tests with accurate size and enhanced power. The development is applied to Microbiome Association Studies, where we aggregate information from multiple existing tests using the same dataset. The combined tests harness the strengths of each individual test across a wide range of alternative spaces, enabling more efficient and meaningful discoveries of vital microbiome associations.
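For readers unfamiliar with the vanilla Cauchy combination mentioned in the abstract, the following is a minimal sketch of that baseline only, not of the authors' proposed framework. It assumes the standard form of the method: each $p$-value is mapped to a Cauchy variate via $\tan((0.5 - p)\pi)$, the variates are averaged with nonnegative weights summing to one, and the combined $p$-value is read off the standard Cauchy tail. The function name `cauchy_combination`, the equal-weight default, and the clipping constant are illustrative choices, not taken from the paper.

```python
import math


def cauchy_combination(p_values, weights=None):
    """Combine p-values with the vanilla Cauchy combination (illustrative sketch).

    Statistic: T = sum_i w_i * tan((0.5 - p_i) * pi)
    Combined p-value: 0.5 - arctan(T) / pi  (standard Cauchy upper tail)
    """
    n = len(p_values)
    if n == 0:
        raise ValueError("need at least one p-value")
    if weights is None:
        # Assumption for this sketch: equal weights when none are given.
        weights = [1.0 / n] * n
    if len(weights) != n:
        raise ValueError("weights and p_values must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be nonnegative and sum to 1")

    # Clip away from 0 and 1 so the tangent stays finite (hypothetical constant).
    eps = 1e-15
    clipped = [min(max(p, eps), 1.0 - eps) for p in p_values]

    statistic = sum(
        w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, clipped)
    )
    return 0.5 - math.atan(statistic) / math.pi


if __name__ == "__main__":
    # Three p-values computed from the same dataset (made-up numbers).
    print(cauchy_combination([0.01, 0.20, 0.45]))
```

As the abstract notes, this baseline relies on distributional assumptions that may fail in practice, which is the situation the proposed framework is meant to address.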
Related papers
- Hierarchical Sparse Bayesian Multitask Model with Scalable Inference for Microbiome Analysis [1.361248247831476]
This paper proposes a hierarchical Bayesian multitask learning model that is applicable to the general multi-task binary classification learning problem.
We derive a computationally efficient inference algorithm based on variational inference to approximate the posterior distribution.
We demonstrate the potential of the new approach on various synthetic datasets and for predicting human health status based on microbiome profile.
arXiv Detail & Related papers (2025-02-04T18:23:22Z) - A Unified Comparative Study with Generalized Conformity Scores for Multi-Output Conformal Regression [1.747623282473278]
We present a unified comparative study of nine conformal methods with different multi-output base models.
We also introduce two novel classes of conformity scores for multi-output regression.
One class is compatible with any generative model, while the other is computationally efficient, leveraging the properties of invertible generative models.
arXiv Detail & Related papers (2025-01-17T20:13:24Z) - A Survey on Mixup Augmentations and Beyond [59.578288906956736]
Mixup and relevant data-mixing methods that convexly combine selected samples and the corresponding labels are widely adopted.
This survey presents a comprehensive review of foundational mixup methods and their applications.
arXiv Detail & Related papers (2024-09-08T19:32:22Z) - Combine and Conquer: A Meta-Analysis on Data Shift and Out-of-Distribution Detection [30.377446496559635]
This paper introduces a universal approach to seamlessly combine out-of-distribution (OOD) detection scores.
Our framework is easily extensible for future developments in detection scores and stands as the first to combine decision boundaries in this context.
arXiv Detail & Related papers (2024-06-23T08:16:44Z) - Bayesian Joint Additive Factor Models for Multiview Learning [7.254731344123118]
A motivating application arises in the context of precision medicine where multi-omics data are collected to correlate with clinical outcomes.
We propose a joint additive factor regression model (JAFAR) with a structured additive design, accounting for shared and view-specific components.
Prediction of time-to-labor onset from immunome, metabolome, and proteome data illustrates performance gains against state-of-the-art competitors.
arXiv Detail & Related papers (2024-06-02T15:35:45Z) - CAVACHON: a hierarchical variational autoencoder to integrate multi-modal single-cell data [10.429856767305687]
We propose a novel probabilistic learning framework that explicitly incorporates conditional independence relationships between multi-modal data.
We demonstrate the versatility of our framework across various applications pertinent to single-cell multi-omics data integration.
arXiv Detail & Related papers (2024-05-28T23:44:09Z) - Rethinking Clustered Federated Learning in NOMA Enhanced Wireless
Networks [60.09912912343705]
This study explores the benefits of integrating the novel clustered federated learning (CFL) approach with non-independent and identically distributed (non-IID) datasets.
A detailed theoretical analysis of the generalization gap that measures the degree of non-IID in the data distribution is presented.
Solutions to address the challenges posed by non-IID conditions are proposed with the analysis of the properties.
arXiv Detail & Related papers (2024-03-05T17:49:09Z) - HyperImpute: Generalized Iterative Imputation with Automatic Model
Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2022-06-15T19:10:35Z) - Differential privacy and robust statistics in high dimensions [49.50869296871643]
High-dimensional Propose-Test-Release (HPTR) builds upon three crucial components: the exponential mechanism, robust statistics, and the Propose-Test-Release mechanism.
We show that HPTR nearly achieves the optimal sample complexity under several scenarios studied in the literature.
arXiv Detail & Related papers (2021-11-12T06:36:40Z) - Bloom Origami Assays: Practical Group Testing [90.2899558237778]
Group testing is a well-studied problem with several appealing solutions.
Recent biological studies impose practical constraints for COVID-19 that are incompatible with traditional methods.
We develop a new method combining Bloom filters with belief propagation to scale to larger values of n (more than 100) with good empirical results.
arXiv Detail & Related papers (2020-07-21T19:31:41Z) - Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.