Information-theoretic generalization bounds for black-box learning algorithms
- URL: http://arxiv.org/abs/2110.01584v2
- Date: Tue, 5 Oct 2021 06:40:39 GMT
- Title: Information-theoretic generalization bounds for black-box learning algorithms
- Authors: Hrayr Harutyunyan, Maxim Raginsky, Greg Ver Steeg, Aram Galstyan
- Abstract summary: We derive information-theoretic generalization bounds for supervised learning algorithms based on the information contained in predictions rather than in the output of the training algorithm.
We show experimentally that the proposed bounds closely follow the generalization gap in practical scenarios for deep learning.
- Score: 46.44597430985965
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We derive information-theoretic generalization bounds for supervised learning
algorithms based on the information contained in predictions rather than in the
output of the training algorithm. These bounds improve over the existing
information-theoretic bounds, are applicable to a wider range of algorithms,
and solve two key challenges: (a) they give meaningful results for
deterministic algorithms and (b) they are significantly easier to estimate. We
show experimentally that the proposed bounds closely follow the generalization
gap in practical scenarios for deep learning.
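For context, the classical input-output bound of Xu and Raginsky (2017) controls the expected generalization gap through the mutual information between the training sample S of size n and the learned hypothesis W, assuming a sigma-subgaussian loss. The bounds in this paper instead measure the information carried by the predictions. A schematic contrast (the second display is only a sketch of the paper's prediction-based f-CMI idea, not its precise statement):

```latex
% Input-output mutual-information bound (Xu & Raginsky, 2017):
\[
  \bigl|\mathbb{E}\,\mathrm{gen}(W,S)\bigr|
    \;\le\; \sqrt{\frac{2\sigma^{2}}{n}\, I(W;S)} .
\]
% Prediction-based idea (schematic): fix a supersample of n example
% pairs, let U_i \in \{0,1\} select which member of pair i is trained
% on, and measure information in the predictions \hat{y}_i on pair i
% rather than in the weights W:
\[
  \bigl|\mathbb{E}\,\mathrm{gen}\bigr|
    \;\le\; \frac{1}{n}\sum_{i=1}^{n}
      \mathbb{E}\sqrt{2\, I\bigl(\hat{y}_i ; U_i\bigr)} .
\]
```

Because predictions are low-dimensional (e.g. discrete labels) while weight vectors are not, quantities like I(ŷ_i; U_i) admit naive plug-in estimates. A minimal sketch of such an estimate on a toy problem; the model, sizes, and estimator here are illustrative assumptions, not the paper's experimental setup:

```python
# Hypothetical sketch: plug-in estimate of I(predictions on pair i; U_i)
# over repeated (U, training) trials, on a toy 2D binary problem.
import numpy as np

rng = np.random.default_rng(0)
n, d, T = 32, 2, 200  # supersample pairs, input dim, trials

# Fixed supersample: for each pair i, two labelled examples.
Y = rng.integers(0, 2, size=(n, 2))                      # labels in {0,1}
X = rng.normal(size=(n, 2, d)) + (2 * Y - 1)[..., None]  # class-shifted inputs

def train_logreg(x, y, steps=200, lr=0.5):
    """Plain gradient-descent logistic regression (illustrative model)."""
    w, b = np.zeros(d), 0.0
    for _ in range(steps):
        p = 1 / (1 + np.exp(-(x @ w + b)))
        g = p - y
        w -= lr * (x.T @ g) / len(y)
        b -= lr * g.mean()
    return w, b

def plugin_mi(a, b, ka, kb):
    """Naive plug-in mutual information of two discrete arrays (nats)."""
    joint = np.zeros((ka, kb))
    for ai, bi in zip(a, b):
        joint[ai, bi] += 1
    joint /= joint.sum()
    pa, pb = joint.sum(1, keepdims=True), joint.sum(0, keepdims=True)
    mask = joint > 0
    return float((joint[mask] * np.log(joint[mask] / (pa @ pb)[mask])).sum())

preds = np.zeros((T, n, 2), dtype=int)  # predicted labels on each pair
Us = np.zeros((T, n), dtype=int)

for t in range(T):
    U = rng.integers(0, 2, size=n)       # which member of each pair to train on
    x_tr, y_tr = X[np.arange(n), U], Y[np.arange(n), U]
    w, b = train_logreg(x_tr, y_tr)
    preds[t] = (X @ w + b > 0).astype(int)
    Us[t] = U

# Per-pair estimate of I(prediction pair; U_i), then the sqrt-style average.
mis = [plugin_mi(preds[:, i, 0] * 2 + preds[:, i, 1], Us[:, i], 4, 2)
       for i in range(n)]
print("bound-style quantity:", np.mean(np.sqrt(2 * np.array(mis))))
```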
Related papers
- Improved Graph-based semi-supervised learning Schemes [0.0]
In this work, we improve the accuracy of several known algorithms for classifying large datasets when few labels are available.
Our framework lies in the realm of graph-based semi-supervised learning.
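For background only, a minimal generic graph-based SSL baseline (label spreading in the style of Zhou et al., 2004); this illustrates the framework, not the improved schemes proposed in the paper:

```python
# Minimal label spreading: propagate a few seed labels over a similarity
# graph via the closed-form fixed point (I - alpha * S)^{-1} Y.
import numpy as np

def label_spreading(W, y, labeled, alpha=0.9):
    """W: (n, n) symmetric affinities; y: (n,) labels in {0..c-1}
    (ignored on unlabeled nodes); labeled: boolean mask."""
    n, c = len(y), int(y.max()) + 1
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d))   # D^{-1/2} W D^{-1/2}
    Y = np.zeros((n, c))
    Y[labeled, y[labeled]] = 1.0      # one-hot seed labels
    F = np.linalg.solve(np.eye(n) - alpha * S, Y)
    return F.argmax(axis=1)

# Tiny demo: two Gaussian clusters, one labeled point each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 0.5, (10, 2)), rng.normal(2, 0.5, (10, 2))])
W = np.exp(-np.sum((X[:, None] - X[None]) ** 2, axis=-1))  # RBF affinities
y = np.array([0] * 10 + [1] * 10)
labeled = np.zeros(20, dtype=bool)
labeled[[0, 10]] = True
print(label_spreading(W, y, labeled))  # recovers both clusters' labels
```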
arXiv Detail & Related papers (2024-06-30T16:50:08Z)
- Structured Prediction in Online Learning [66.36004256710824]
We study a theoretical and algorithmic framework for structured prediction in the online learning setting.
We show that our algorithm is a generalisation of optimal algorithms from the supervised learning setting.
We consider a second algorithm designed especially for non-stationary data distributions, including adversarial data.
arXiv Detail & Related papers (2024-06-18T07:45:02Z)
- Learning-Augmented Algorithms with Explicit Predictors [67.02156211760415]
Recent advances in algorithmic design show how to utilize predictions obtained by machine learning models from past and present data.
Prior research in this context was focused on a paradigm where the predictor is pre-trained on past data and then used as a black box.
In this work, we unpack the predictor and integrate the learning problem it gives rise to into the algorithmic challenge.
arXiv Detail & Related papers (2024-03-12T08:40:21Z)
- Generalization error bounds for iterative learning algorithms with bounded updates [41.87646434714452]
We introduce a novel bound for the generalization of these algorithms with bounded updates.
We employ a variance decomposition technique to decompose information across iterations.
To bridge the gap between theory and practice, we also examine the previously observed scaling behavior in large language models.
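As background for such decompositions, the chain rule of mutual information already splits the information in the iterates into per-step terms; the identity below is generic, not the paper's specific variance decomposition:

```latex
\[
  I(W_T; S) \;\le\; I(W_1,\dots,W_T; S)
  \;=\; \sum_{t=1}^{T} I\bigl(W_t; S \mid W_1,\dots,W_{t-1}\bigr),
\]
```

where W_t is the iterate after update t; when updates are bounded, each conditional term can be controlled individually.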
arXiv Detail & Related papers (2023-09-10T16:55:59Z)
- Representation Learning with Multi-Step Inverse Kinematics: An Efficient and Optimal Approach to Rich-Observation RL [106.82295532402335]
Existing reinforcement learning algorithms suffer from computational intractability, strong statistical assumptions, and suboptimal sample complexity.
We provide the first computationally efficient algorithm that attains rate-optimal sample complexity with respect to the desired accuracy level.
Our algorithm, MusIK, combines systematic exploration with representation learning based on multi-step inverse kinematics.
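Schematically, a multi-step inverse-kinematics objective trains the representation by predicting the first action from the current observation and an observation several steps ahead; a rough form, with notation that is illustrative rather than the paper's exact objective:

```latex
\[
  \max_{\phi,\; g}\;\; \sum_{k=1}^{K}
  \mathbb{E}\Bigl[\,\log g_k\bigl(a_t \,\big|\, \phi(x_t),\, \phi(x_{t+k})\bigr)\Bigr],
\]
```

where phi is the learned encoder and g_k predicts the action taken at time t given representations k steps apart.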
arXiv Detail & Related papers (2023-04-12T14:51:47Z)
- Information Theoretic Lower Bounds for Information Theoretic Upper Bounds [14.268363583731848]
We examine the relationship between the mutual information between the output model and the empirical sample, and the generalization of the algorithm, in the context of stochastic convex optimization.
Our study reveals that, for true risk minimization, dimension-dependent mutual information is necessary.
This indicates that existing information-theoretic generalization bounds fall short in capturing the capabilities of algorithms like SGD and regularized ERM, which have dimension-independent sample complexity.
arXiv Detail & Related papers (2023-02-09T20:42:36Z)
- On Leave-One-Out Conditional Mutual Information For Generalization [122.2734338600665]
We derive information-theoretic generalization bounds for supervised learning algorithms based on a new measure of leave-one-out conditional mutual information (loo-CMI).
Contrary to other CMI bounds, our loo-CMI bounds can be computed easily and can be interpreted in connection to other notions such as classical leave-one-out cross-validation.
We empirically validate the quality of the bound by evaluating its predicted generalization gap in scenarios for deep learning.
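For reference, the conditional-mutual-information (CMI) framework that loo-CMI modifies: with a supersample Z̃ of n example pairs and selectors U in {0,1}^n choosing one example per pair, the standard bound of Steinke and Zakynthinou (2020) for losses in [0,1] reads as below; loo-CMI instead uses an (n+1)-point leave-one-out construction (see the paper for the exact statement):

```latex
\[
  \bigl|\mathbb{E}\,\mathrm{gen}\bigr|
    \;\le\; \sqrt{\frac{2\, I\bigl(W; U \mid \widetilde{Z}\bigr)}{n}} .
\]
```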
arXiv Detail & Related papers (2022-07-01T17:58:29Z)
- A Brief Look at Generalization in Visual Meta-Reinforcement Learning [56.50123642237106]
We evaluate the generalization performance of meta-reinforcement learning algorithms.
We find that these algorithms can display strong overfitting when they are evaluated on challenging tasks.
arXiv Detail & Related papers (2020-06-12T15:17:17Z)
- Enhancing accuracy of deep learning algorithms by training with low-discrepancy sequences [15.2292571922932]
We propose a deep supervised learning algorithm based on low-discrepancy sequences as the training set.
We demonstrate that the proposed algorithm significantly outperforms standard deep learning algorithms for problems in moderately high dimensions.
arXiv Detail & Related papers (2020-05-26T08:14:00Z)
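A quick illustration of this entry's premise: low-discrepancy (Sobol) points cover the unit cube more evenly than i.i.d. uniform points, as measured by SciPy's discrepancy function. The training-set construction and model from the paper are not reproduced here:

```python
# Compare the centered discrepancy of a Sobol set vs. i.i.d. uniform
# points in [0,1)^2; lower discrepancy means more even coverage.
import numpy as np
from scipy.stats import qmc

rng = np.random.default_rng(0)
n, d = 128, 2  # n is a power of 2, as Sobol sampling prefers

sobol = qmc.Sobol(d=d, scramble=True, seed=0).random(n)  # low-discrepancy set
uniform = rng.random((n, d))                             # i.i.d. baseline

print("Sobol discrepancy:  ", qmc.discrepancy(sobol))
print("Uniform discrepancy:", qmc.discrepancy(uniform))
# The Sobol set typically scores markedly lower; the paper uses such
# sequences as training inputs for supervised deep learning.
```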
This list is automatically generated from the titles and abstracts of the papers on this site.