Information-Theoretic Generalization Bounds for Transductive Learning and its Applications
- URL: http://arxiv.org/abs/2311.04561v2
- Date: Mon, 10 Jun 2024 06:50:09 GMT
- Title: Information-Theoretic Generalization Bounds for Transductive Learning and its Applications
- Authors: Huayi Tang, Yong Liu,
- Abstract summary: We develop generalization bounds for transductive learning algorithms in the context of information theory and PAC-Bayesian theory.
Our theoretic results are validated on both synthetic and real-world datasets.
- Score: 16.408850979966623
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We develop generalization bounds for transductive learning algorithms in the context of information theory and PAC-Bayesian theory, covering both the random sampling setting and the random splitting setting. We show that the transductive generalization gap can be bounded by the mutual information between training labels selection and the hypothesis. By introducing the concept of transductive supersamples, we translate results depicted by various information measures from the inductive learning setting to the transductive learning setting. We further establish PAC-Bayesian bounds with weaker assumptions on the loss function and numbers of training and test data points. Finally, we present the upper bounds for adaptive optimization algorithms and demonstrate the applications of results on semi-supervised learning and graph learning scenarios. Our theoretic results are validated on both synthetic and real-world datasets.
Related papers
- Transformation-Invariant Learning and Theoretical Guarantees for OOD Generalization [34.036655200677664]
This paper focuses on a distribution shift setting where train and test distributions can be related by classes of (data) transformation maps.
We establish learning rules and algorithmic reductions to Empirical Risk Minimization (ERM)
We highlight that the learning rules we derive offer a game-theoretic viewpoint on distribution shift.
arXiv Detail & Related papers (2024-10-30T20:59:57Z) - Distributed Markov Chain Monte Carlo Sampling based on the Alternating
Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers.
We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art.
In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
arXiv Detail & Related papers (2024-01-29T02:08:40Z) - Disentangled Representation Learning with Transmitted Information Bottleneck [57.22757813140418]
We present textbfDisTIB (textbfTransmitted textbfInformation textbfBottleneck for textbfDisd representation learning), a novel objective that navigates the balance between information compression and preservation.
arXiv Detail & Related papers (2023-11-03T03:18:40Z) - Information Theoretic Lower Bounds for Information Theoretic Upper
Bounds [14.268363583731848]
We examine the relationship between the output model and the empirical generalization and the algorithm in the context of convex optimization.
Our study reveals that, for true risk minimization, mutual information is necessary.
Existing information-theoretic generalization bounds fall short in capturing the capabilities of algorithms like SGD and regularized, which have-independent dimension sample complexity.
arXiv Detail & Related papers (2023-02-09T20:42:36Z) - Scalable PAC-Bayesian Meta-Learning via the PAC-Optimal Hyper-Posterior:
From Theory to Practice [54.03076395748459]
A central question in the meta-learning literature is how to regularize to ensure generalization to unseen tasks.
We present a generalization bound for meta-learning, which was first derived by Rothfuss et al.
We provide a theoretical analysis and empirical case study under which conditions and to what extent these guarantees for meta-learning improve upon PAC-Bayesian per-task learning bounds.
arXiv Detail & Related papers (2022-11-14T08:51:04Z) - Label Propagation Through Optimal Transport [0.0]
We tackle the transductive semi-supervised learning problem that aims to obtain label predictions for the given unlabeled data points.
Our proposed approach is based on optimal transport, a mathematical theory that has been successfully used to address various machine learning problems.
arXiv Detail & Related papers (2021-10-01T11:25:55Z) - A Scaling Law for Synthetic-to-Real Transfer: A Measure of Pre-Training [52.93808218720784]
Synthetic-to-real transfer learning is a framework in which we pre-train models with synthetically generated images and ground-truth annotations for real tasks.
Although synthetic images overcome the data scarcity issue, it remains unclear how the fine-tuning performance scales with pre-trained models.
We observe a simple and general scaling law that consistently describes learning curves in various tasks, models, and complexities of synthesized pre-training data.
arXiv Detail & Related papers (2021-08-25T02:29:28Z) - Learning Bias-Invariant Representation by Cross-Sample Mutual
Information Minimization [77.8735802150511]
We propose a cross-sample adversarial debiasing (CSAD) method to remove the bias information misused by the target task.
The correlation measurement plays a critical role in adversarial debiasing and is conducted by a cross-sample neural mutual information estimator.
We conduct thorough experiments on publicly available datasets to validate the advantages of the proposed method over state-of-the-art approaches.
arXiv Detail & Related papers (2021-08-11T21:17:02Z) - Perturbation Theory for the Information Bottleneck [6.117084972237769]
Information bottleneck (IB) method formalizes extracting relevant information from data.
nonlinearity of the IB problem makes it computationally expensive and analytically intractable in general.
We derive a perturbation theory for the IB method and report the first complete characterization of the learning onset.
arXiv Detail & Related papers (2021-05-28T16:59:01Z) - Information Complexity and Generalization Bounds [0.0]
We show a unifying picture of PAC-Bayesian and mutual information-based upper bounds on randomized learning algorithms.
We discuss two practical examples for learning with neural networks, namely, Entropy- and PAC-Bayes- SGD.
arXiv Detail & Related papers (2021-05-04T20:37:57Z) - Generalization Properties of Optimal Transport GANs with Latent
Distribution Learning [52.25145141639159]
We study how the interplay between the latent distribution and the complexity of the pushforward map affects performance.
Motivated by our analysis, we advocate learning the latent distribution as well as the pushforward map within the GAN paradigm.
arXiv Detail & Related papers (2020-07-29T07:31:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.