Generalization Bounds: Perspectives from Information Theory and PAC-Bayes
- URL: http://arxiv.org/abs/2309.04381v2
- Date: Wed, 27 Mar 2024 17:07:47 GMT
- Title: Generalization Bounds: Perspectives from Information Theory and PAC-Bayes
- Authors: Fredrik Hellström, Giuseppe Durisi, Benjamin Guedj, Maxim Raginsky
- Abstract summary: The PAC-Bayesian approach has been established as a flexible framework to address the generalization capabilities of machine learning algorithms.
An information-theoretic view of generalization has developed, wherein the relation between generalization and various information measures has been established.
We present techniques and results that the two perspectives have in common, and discuss the approaches and interpretations that differ.
- Score: 31.803107987439784
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: A fundamental question in theoretical machine learning is generalization. Over the past decades, the PAC-Bayesian approach has been established as a flexible framework to address the generalization capabilities of machine learning algorithms, and design new ones. Recently, it has garnered increased interest due to its potential applicability for a variety of learning algorithms, including deep neural networks. In parallel, an information-theoretic view of generalization has developed, wherein the relation between generalization and various information measures has been established. This framework is intimately connected to the PAC-Bayesian approach, and a number of results have been independently discovered in both strands. In this monograph, we highlight this strong connection and present a unified treatment of PAC-Bayesian and information-theoretic generalization bounds. We present techniques and results that the two perspectives have in common, and discuss the approaches and interpretations that differ. In particular, we demonstrate how many proofs in the area share a modular structure, through which the underlying ideas can be intuited. We pay special attention to the conditional mutual information (CMI) framework; analytical studies of the information complexity of learning algorithms; and the application of the proposed methods to deep learning. This monograph is intended to provide a comprehensive introduction to information-theoretic generalization bounds and their connection to PAC-Bayes, serving as a foundation from which the most recent developments are accessible. It is aimed broadly towards researchers with an interest in generalization and theoretical machine learning.
Related papers
- Coding for Intelligence from the Perspective of Category [66.14012258680992]
This work considers two fields: coding, which targets compressing and reconstructing data, and intelligence.
Recent trends demonstrate the potential homogeneity of these two fields.
We propose a novel problem of Coding for Intelligence from the category theory view.
arXiv Detail & Related papers (2024-07-01T07:05:44Z)
- Discovering Common Information in Multi-view Data [35.37807004353416]
We introduce an innovative and mathematically rigorous definition for computing common information from multi-view data.
We develop a novel supervised multi-view learning framework to capture both common and unique information.
arXiv Detail & Related papers (2024-06-21T10:47:06Z)
- A Unified and General Framework for Continual Learning [58.72671755989431]
Continual Learning (CL) focuses on learning from dynamic and changing data distributions while retaining previously acquired knowledge.
Various methods have been developed to address the challenge of catastrophic forgetting, including regularization-based, Bayesian-based, and memory-replay-based techniques.
These methods lack a unified framework; this research aims to bridge that gap by introducing a comprehensive framework that encompasses and reconciles the existing methodologies.
arXiv Detail & Related papers (2024-03-20T02:21:44Z)
- Exploring Machine Learning Models for Federated Learning: A Review of Approaches, Performance, and Limitations [1.1060425537315088]
Federated learning is a distributed learning framework enhanced to preserve the privacy of individuals' data.
In times of crisis, when real-time decision-making is critical, federated learning allows multiple entities to work collectively without sharing sensitive data.
This paper is a systematic review of the literature on privacy-preserving machine learning in the last few years.
arXiv Detail & Related papers (2023-11-17T19:23:21Z)
- Federated Learning for Generalization, Robustness, Fairness: A Survey and Benchmark [55.898771405172155]
Federated learning has emerged as a promising paradigm for privacy-preserving collaboration among different parties.
We provide a systematic overview of the important and recent developments of research on federated learning.
arXiv Detail & Related papers (2023-11-12T06:32:30Z)
- Bayesian Learning for Neural Networks: an algorithmic survey [95.42181254494287]
This self-contained survey engages and introduces readers to the principles and algorithms of Bayesian Learning for Neural Networks.
It provides an introduction to the topic from an accessible, practical-algorithmic perspective.
arXiv Detail & Related papers (2022-11-21T21:36:58Z)
- Knowledge Graph Augmented Network Towards Multiview Representation Learning for Aspect-based Sentiment Analysis [96.53859361560505]
We propose a knowledge graph augmented network (KGAN) to incorporate external knowledge with explicitly syntactic and contextual information.
KGAN captures the sentiment feature representations from multiple perspectives, i.e., context-, syntax- and knowledge-based.
Experiments on three popular ABSA benchmarks demonstrate the effectiveness and robustness of our KGAN.
arXiv Detail & Related papers (2022-01-13T08:25:53Z)
- A survey of Bayesian Network structure learning [8.411014222942168]
This paper provides a review of 61 algorithms proposed for learning BN structure from data.
The basic approach of each algorithm is described in consistent terms, and the similarities and differences between them highlighted.
Approaches for dealing with data noise in real-world datasets and incorporating expert knowledge into the learning process are also covered.
arXiv Detail & Related papers (2021-09-23T14:54:00Z)
- Investigating Bi-Level Optimization for Learning and Vision from a Unified Perspective: A Survey and Beyond [114.39616146985001]
In machine learning and computer vision, many complex problems, despite their different motivations and mechanisms, contain a series of closely related subproblems.
In this paper, we first uniformly express these complex learning and vision problems from the perspective of Bi-Level Optimization (BLO).
Then we construct a value-function-based single-level reformulation and establish a unified algorithmic framework to understand and formulate mainstream gradient-based BLO methodologies.
arXiv Detail & Related papers (2021-01-27T16:20:23Z)
- Reasoning About Generalization via Conditional Mutual Information [26.011933885798506]
We use Conditional Mutual Information (CMI) to quantify how well the input (the training data) can be recognized given the output of the learning algorithm.
We show that bounds on CMI can be obtained from VC dimension, compression schemes, differential privacy, and other methods.
We then show that bounded CMI implies various forms of generalization.
arXiv Detail & Related papers (2020-01-24T18:13:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.