Differentially private testing for relevant dependencies in high dimensions
- URL: http://arxiv.org/abs/2511.17167v1
- Date: Fri, 21 Nov 2025 11:38:40 GMT
- Title: Differentially private testing for relevant dependencies in high dimensions
- Authors: Patrick Bastian, Holger Dette, Martin Dunsche
- Abstract summary: We investigate the problem of detecting dependencies between the components of a high-dimensional vector. Instead of testing whether the coordinates are pairwise independent, we are interested in determining whether certain pairwise associations do not exceed a given threshold in absolute value. We propose a novel bootstrap-based methodology that is especially powerful in sparse settings.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We investigate the problem of detecting dependencies between the components of a high-dimensional vector. Our approach advances the existing literature in two important respects. First, we consider the problem under privacy constraints. Second, instead of testing whether the coordinates are pairwise independent, we are interested in determining whether certain pairwise associations between the components (such as all pairwise Kendall's $τ$ coefficients) do not exceed a given threshold in absolute value. Considering hypotheses of this form is motivated by the observation that in the high-dimensional regime, it is rare and perhaps impossible to have a null hypothesis that can be modeled exactly by assuming that all pairwise associations are precisely equal to zero. The formulation of the null hypothesis as a composite hypothesis makes the problem of constructing tests already non-standard in the non-private setting. Additionally, under privacy constraints, state-of-the-art procedures rely on permutation approaches that are rendered invalid under a composite null. We propose a novel bootstrap-based methodology that is especially powerful in sparse settings, develop theoretical guarantees under mild assumptions, and show that the proposed method enjoys good finite-sample properties even in the high-privacy regime. Additionally, we present applications to medical data that showcase the applicability of our methodology.
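To make the hypothesis in the abstract concrete, the following is a minimal, non-private sketch of the quantity being tested: the largest absolute pairwise Kendall's τ across coordinates, compared against a threshold. The function names (`kendall_tau`, `max_abs_pairwise_tau`, `exceeds_threshold`) and the parameter `delta` are hypothetical and chosen for illustration. The sketch is a plain plug-in comparison under the assumption of no ties; it does not reproduce the paper's bootstrap calibration or any privacy mechanism.

```python
from itertools import combinations


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def kendall_tau(x, y):
    """Kendall's tau (tau-a): mean of concordance signs over all index pairs."""
    n = len(x)
    s = sum(
        sign(x[i] - x[j]) * sign(y[i] - y[j])
        for i, j in combinations(range(n), 2)
    )
    return 2.0 * s / (n * (n - 1))


def max_abs_pairwise_tau(rows):
    """Largest |tau| over all coordinate pairs; rows are observations."""
    cols = list(zip(*rows))  # transpose: one tuple per coordinate
    return max(
        abs(kendall_tau(cols[a], cols[b]))
        for a, b in combinations(range(len(cols)), 2)
    )


def exceeds_threshold(rows, delta):
    """Plug-in check (NOT a calibrated test): is max |tau| above delta?"""
    return max_abs_pairwise_tau(rows) > delta


if __name__ == "__main__":
    # Three observations of a 3-dimensional vector (illustrative data only).
    data = [(1, 1, 3), (2, 2, 1), (3, 3, 2)]
    print(max_abs_pairwise_tau(data))
    print(exceeds_threshold(data, delta=0.5))
```

A real test would replace the direct comparison with a critical value that accounts for sampling variability, and a private version would additionally need to randomize the statistic before release.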
Related papers
- Discovering Causal Relationships using Proxy Variables under Unmeasured Confounding [42.70985072862832]
Inferring causal relationships between variable pairs in observational studies is crucial but challenging. We develop a general nonparametric approach that accommodates both discrete and continuous settings for testing causal hypotheses under unmeasured confounders. We demonstrate the effectiveness of our approach through extensive simulations and real-world data from the Intensive Care Data and World Values Survey.
arXiv Detail & Related papers (2025-10-20T05:13:12Z) - Mitigating LLM Hallucinations via Conformal Abstention [70.83870602967625]
We develop a principled procedure for determining when a large language model should abstain from responding in a general domain.
We leverage conformal prediction techniques to develop an abstention procedure that benefits from rigorous theoretical guarantees on the hallucination rate (error rate).
Experimentally, our resulting conformal abstention method reliably bounds the hallucination rate on various closed-book, open-domain generative question answering datasets.
arXiv Detail & Related papers (2024-04-04T11:32:03Z) - Non-Convex Robust Hypothesis Testing using Sinkhorn Uncertainty Sets [18.46110328123008]
We present a new framework to address the non-convex robust hypothesis testing problem.
The goal is to seek the optimal detector that minimizes the maximum numerical risk.
arXiv Detail & Related papers (2024-03-21T20:29:43Z) - Optimal Multi-Distribution Learning [88.3008613028333]
Multi-distribution learning seeks to learn a shared model that minimizes the worst-case risk across $k$ distinct data distributions. We propose a novel algorithm that yields an $\varepsilon$-optimal randomized hypothesis with a sample complexity on the order of $(d+k)/\varepsilon^2$.
arXiv Detail & Related papers (2023-12-08T16:06:29Z) - Likelihood Ratio Confidence Sets for Sequential Decision Making [51.66638486226482]
We revisit the likelihood-based inference principle and propose to use likelihood ratios to construct valid confidence sequences.
Our method is especially suitable for problems with well-specified likelihoods.
We show how to provably choose the best sequence of estimators and shed light on connections to online convex optimization.
arXiv Detail & Related papers (2023-11-08T00:10:21Z) - Model-Agnostic Covariate-Assisted Inference on Partially Identified Causal Effects [1.9253333342733674]
Many causal estimands are only partially identifiable since they depend on the unobservable joint distribution between potential outcomes.
We propose a unified and model-agnostic inferential approach for a wide class of partially identified estimands.
arXiv Detail & Related papers (2023-10-12T08:17:30Z) - Data Association Aware POMDP Planning with Hypothesis Pruning Performance Guarantees [7.928094304325113]
We introduce a pruning-based approach for planning with ambiguous data associations.
Our key contribution is to derive bounds between the value function based on the complete set of hypotheses and the value function based on a pruned subset of the hypotheses.
We demonstrate how these bounds can both be used to certify any pruning in retrospect and propose a novel approach to determine which hypotheses to prune in order to ensure a predefined limit on the loss.
arXiv Detail & Related papers (2023-03-03T18:35:01Z) - Composed Image Retrieval with Text Feedback via Multi-grained Uncertainty Regularization [73.04187954213471]
We introduce a unified learning approach to simultaneously modeling the coarse- and fine-grained retrieval.
The proposed method has achieved +4.03%, +3.38%, and +2.40% Recall@50 accuracy over a strong baseline.
arXiv Detail & Related papers (2022-11-14T14:25:40Z) - A Low Rank Promoting Prior for Unsupervised Contrastive Learning [108.91406719395417]
We construct a novel probabilistic graphical model that effectively incorporates the low rank promoting prior into the framework of contrastive learning.
Our hypothesis explicitly requires that all the samples belonging to the same instance class lie on the same subspace with small dimension.
Empirical evidence shows that the proposed algorithm clearly surpasses the state-of-the-art approaches on multiple benchmarks.
arXiv Detail & Related papers (2021-08-05T15:58:25Z) - FRITL: A Hybrid Method for Causal Discovery in the Presence of Latent Confounders [46.31784571870808]
We show that under some mild assumptions, the model is uniquely identified by a hybrid method.
Our method leverages the advantages of constraint-based methods and independent noise-based methods to handle both confounded and unconfounded situations.
arXiv Detail & Related papers (2021-03-26T03:12:14Z) - Causal Inference Under Unmeasured Confounding With Negative Controls: A Minimax Learning Approach [84.29777236590674]
We study the estimation of causal parameters when not all confounders are observed and instead negative controls are available.
Recent work has shown how these can enable identification and efficient estimation via two so-called bridge functions.
arXiv Detail & Related papers (2021-03-25T17:59:19Z) - Fundamental Limits of Testing the Independence of Irrelevant Alternatives in Discrete Choice [9.13127392774573]
The Multinomial Logit (MNL) model and the Independence of Irrelevant Alternatives (IIA) are the most widely used tools of discrete choice.
We show that any general test for IIA with low worst-case error would require a number of samples exponential in the number of alternatives of the choice problem.
Our lower bounds are structure-dependent, and as a potential cause for optimism, we find that if one restricts the test of IIA to violations that can occur in a specific collection of choice sets, one obtains structure-dependent lower bounds that are much less pessimistic.
arXiv Detail & Related papers (2020-01-20T10:15:28Z)