Statistically Valid Information Bottleneck via Multiple Hypothesis Testing
- URL: http://arxiv.org/abs/2409.07325v2
- Date: Thu, 10 Oct 2024 14:09:17 GMT
- Title: Statistically Valid Information Bottleneck via Multiple Hypothesis Testing
- Authors: Amirmohammad Farzaneh, Osvaldo Simeone
- Abstract summary: We introduce a statistically valid solution to the information bottleneck (IB) problem via multiple hypothesis testing (IB-MHT).
IB-MHT ensures that the learned features meet the IB constraints with high probability, regardless of the size of the available dataset.
Results validate the effectiveness of IB-MHT in outperforming conventional methods in terms of statistical robustness and reliability.
- Score: 35.59201763567714
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The information bottleneck (IB) problem is a widely studied framework in machine learning for extracting compressed features that are informative for downstream tasks. However, current approaches to solving the IB problem rely on a heuristic tuning of hyperparameters, offering no guarantees that the learned features satisfy information-theoretic constraints. In this work, we introduce a statistically valid solution to this problem, referred to as IB via multiple hypothesis testing (IB-MHT), which ensures that the learned features meet the IB constraints with high probability, regardless of the size of the available dataset. The proposed methodology builds on Pareto testing and learn-then-test (LTT), and it wraps around existing IB solvers to provide statistical guarantees on the IB constraints. We demonstrate the performance of IB-MHT on classical and deterministic IB formulations, including experiments on distillation of language models. The results validate the effectiveness of IB-MHT in outperforming conventional methods in terms of statistical robustness and reliability.
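The abstract describes IB-MHT as wrapping an existing IB solver with a learn-then-test (LTT) calibration step over hyperparameter candidates. The sketch below illustrates only that calibration idea in a simplified form: candidates are assumed to arrive in a pre-specified order (as Pareto testing would supply), each candidate's constraint violation is measured by bounded per-sample losses in [0, 1], and a Hoeffding p-value drives fixed-sequence testing. The function names, the synthetic losses, and the Beta-distributed data are all illustrative assumptions, not the paper's actual procedure.

```python
# Hedged sketch of the learn-then-test (LTT) calibration step behind IB-MHT.
# Assumes bounded per-sample losses and a pre-specified candidate ordering.
import math
import random

def hoeffding_p_value(mean_loss, n, alpha):
    """P-value for H0: true risk > alpha, from the empirical mean of n losses in [0, 1]."""
    gap = alpha - mean_loss
    if gap <= 0:
        return 1.0  # empirical risk already above the target level
    return math.exp(-2.0 * n * gap * gap)

def ltt_select(candidates, losses_per_candidate, alpha, delta):
    """Fixed-sequence testing: scan candidates in order, keep each one whose
    null hypothesis (constraint violated) is rejected at level delta, and stop
    at the first failure so family-wise error stays controlled."""
    valid = []
    for lam, losses in zip(candidates, losses_per_candidate):
        n = len(losses)
        p = hoeffding_p_value(sum(losses) / n, n, alpha)
        if p > delta:  # fail to reject: stop; later candidates are never tested
            break
        valid.append(lam)
    return valid

random.seed(0)
# Synthetic calibration losses: the first two candidates comfortably satisfy
# the constraint (mean ~0.05), the last two do not (mean ~0.3).
cands = [0.1, 0.2, 0.4, 0.8]
losses = [[random.betavariate(1, 19) for _ in range(1000)] for _ in cands[:2]] + \
         [[random.betavariate(3, 7) for _ in range(1000)] for _ in cands[2:]]
print(ltt_select(cands, losses, alpha=0.1, delta=0.05))
```

With these synthetic losses, only the first two candidates certify the constraint at level delta; the scan stops as soon as a candidate fails, which is what distinguishes fixed-sequence testing from testing every candidate independently.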
Related papers
- Technical Report: Facilitating the Adoption of Causal Inference Methods Through LLM-Empowered Co-Pilot [44.336297829718795]
We introduce CATE-B, an open-source co-pilot system that uses large language models (LLMs) within an agentic framework to guide users through treatment effect estimation. CATE-B assists in (i) constructing a structural causal model via causal discovery and LLM-based edge orientation, (ii) identifying robust adjustment sets through a novel Minimal Uncertainty Adjustment Set criterion, and (iii) selecting appropriate regression methods tailored to the causal structure and dataset characteristics.
arXiv Detail & Related papers (2025-08-14T12:20:51Z)
- Testing and Improving the Robustness of Amortized Bayesian Inference for Cognitive Models [0.5223954072121659]
Contaminant observations and outliers often cause problems when estimating the parameters of cognitive models.
In this study, we test and improve the robustness of parameter estimation using amortized Bayesian inference.
The proposed method is straightforward and practical to implement and has a broad applicability in fields where outlier detection or removal is challenging.
arXiv Detail & Related papers (2024-12-29T21:22:24Z)
- MedBN: Robust Test-Time Adaptation against Malicious Test Samples [11.397666167665484]
Test-time adaptation (TTA) has emerged as a promising solution to address performance decay due to unforeseen distribution shifts between training and test data.
Previous studies have uncovered security vulnerabilities within TTA even when a small proportion of the test batch is maliciously manipulated.
We propose median batch normalization (MedBN), leveraging the robustness of the median for statistics estimation within the batch normalization layer during test-time inference.
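The MedBN summary above hinges on one robust-statistics fact: a few manipulated samples can shift a batch mean arbitrarily, but barely move the median. The sketch below illustrates that idea on scalar features; the use of the median absolute deviation (MAD) for the spread is my illustrative choice, not necessarily the paper's formulation.

```python
# Hedged sketch of median-based normalization: replace the batch mean with the
# batch median so a single malicious sample cannot shift the statistics.
from statistics import median

def med_bn_1d(batch, eps=1e-5):
    """Normalize a batch of scalar features around the batch median, with a
    robust spread estimate based on the median absolute deviation (MAD)."""
    m = median(batch)
    mad = median(abs(x - m) for x in batch)
    scale = 1.4826 * mad + eps  # 1.4826 * MAD approximates the std for Gaussian data
    return [(x - m) / scale for x in batch]

clean = [0.9, 1.0, 1.1, 1.0, 0.95, 1.05]
poisoned = clean + [50.0]  # one maliciously manipulated sample
normalized = med_bn_1d(poisoned)
# The median stays at 1.0, so the clean samples are normalized almost exactly
# as they would be without the outlier, while the outlier stands out sharply.
```

A mean-based normalization of the same batch would be dominated by the outlier (the mean jumps from 1.0 to roughly 8.0), which is precisely the failure mode the median avoids.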
arXiv Detail & Related papers (2024-03-28T11:33:02Z)
- Rapid and Scalable Bayesian AB Testing [0.0]
We propose a solution that applies hierarchical Bayesian estimation to address limitations of current AB testing methodology.
We increase statistical power by exploiting correlations between factors, enabling sequential testing and progressive early stopping.
We also demonstrate how this methodology can be extended to enable the extraction of composite global learnings from past AB tests.
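The summary above concerns hierarchical Bayesian AB testing; as a minimal point of reference, the sketch below shows the flat (non-hierarchical) Beta-Binomial version of a Bayesian AB comparison, where each variant's conversion rate gets an independent posterior and P(B > A) is estimated by Monte Carlo. The counts, the uniform prior, and the function name are illustrative, not the paper's model.

```python
# Hedged sketch of a flat Beta-Binomial Bayesian AB test (a simplification of
# the hierarchical approach the abstract describes).
import random

def prob_b_beats_a(succ_a, n_a, succ_b, n_b, draws=20000, seed=0):
    """Monte Carlo estimate of P(rate_B > rate_A) under independent
    Beta(1 + successes, 1 + failures) posteriors (uniform priors)."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        pa = rng.betavariate(1 + succ_a, 1 + n_a - succ_a)
        pb = rng.betavariate(1 + succ_b, 1 + n_b - succ_b)
        wins += pb > pa
    return wins / draws

# 12% vs 15% observed conversion on 1000 users each: B is very likely better.
p = prob_b_beats_a(succ_a=120, n_a=1000, succ_b=150, n_b=1000)
print(p)
```

The hierarchical extension the abstract refers to would additionally share strength across related tests via a common prior, which is what enables the increased statistical power and sequential early stopping mentioned above.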
arXiv Detail & Related papers (2023-07-27T05:08:49Z)
- B-Learner: Quasi-Oracle Bounds on Heterogeneous Causal Effects Under Hidden Confounding [51.74479522965712]
We propose a meta-learner called the B-Learner, which can efficiently learn sharp bounds on the CATE function under limits on hidden confounding.
We prove its estimates are valid, sharp, efficient, and have a quasi-oracle property with respect to the constituent estimators under more general conditions than existing methods.
arXiv Detail & Related papers (2023-04-20T18:07:19Z)
- Validation Diagnostics for SBI algorithms based on Normalizing Flows [55.41644538483948]
This work proposes easy to interpret validation diagnostics for multi-dimensional conditional (posterior) density estimators based on NF.
It also offers theoretical guarantees based on results of local consistency.
This work should help the design of better specified models or drive the development of novel SBI-algorithms.
arXiv Detail & Related papers (2022-11-17T15:48:06Z)
- Empirical Bayesian Approaches for Robust Constraint-based Causal Discovery under Insufficient Data [38.883810061897094]
Causal discovery methods assume data sufficiency, which may not be the case in many real world datasets.
We propose Bayesian-augmented frequentist independence tests to improve the performance of constraint-based causal discovery methods under insufficient data.
Experiments show significant performance improvement in terms of both accuracy and efficiency over SOTA methods.
arXiv Detail & Related papers (2022-06-16T21:08:49Z)
- Differential privacy and robust statistics in high dimensions [49.50869296871643]
High-dimensional Propose-Test-Release (HPTR) builds upon three crucial components: the exponential mechanism, robust statistics, and the Propose-Test-Release mechanism.
We show that HPTR nearly achieves the optimal sample complexity under several scenarios studied in the literature.
arXiv Detail & Related papers (2021-11-12T06:36:40Z)
- SAMBA: Safe Model-Based & Active Reinforcement Learning [59.01424351231993]
SAMBA is a framework for safe reinforcement learning that combines aspects from probabilistic modelling, information theory, and statistics.
We evaluate our algorithm on a variety of safe dynamical system benchmarks involving both low and high-dimensional state representations.
We provide intuition as to the effectiveness of the framework by a detailed analysis of our active metrics and safety constraints.
arXiv Detail & Related papers (2020-06-12T10:40:46Z)
- The Conditional Entropy Bottleneck [8.797368310561058]
We characterize failures of robust generalization as failures of accuracy or related metrics on a held-out set.
We propose the Minimum Necessary Information (MNI) criterion for evaluating the quality of a model.
In order to train models that perform well with respect to the MNI criterion, we present a new objective function, the Conditional Entropy Bottleneck (CEB).
We experimentally test our hypothesis by comparing the performance of CEB models with deterministic models and Variational Information Bottleneck (VIB) models on a variety of different datasets.
arXiv Detail & Related papers (2020-02-13T07:46:38Z)
- The empirical duality gap of constrained statistical learning [115.23598260228587]
We study constrained statistical learning problems, the unconstrained versions of which are at the core of virtually all modern information processing.
We propose to tackle the constrained statistical problem overcoming its infinite dimensionality, unknown distributions, and constraints by leveraging finite dimensional parameterizations, sample averages, and duality theory.
We demonstrate the effectiveness and usefulness of this constrained formulation in a fair learning application.
arXiv Detail & Related papers (2020-02-12T19:12:29Z)
- Statistical Agnostic Mapping: a Framework in Neuroimaging based on Concentration Inequalities [0.0]
We derive a Statistical Agnostic (non-parametric) Mapping at voxel or multi-voxel level.
We propose a novel framework in neuroimaging based on concentration inequalities.
arXiv Detail & Related papers (2019-12-27T18:27:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.