Bridging Algorithmic Information Theory and Machine Learning: A New Approach to Kernel Learning
- URL: http://arxiv.org/abs/2311.12624v3
- Date: Wed, 10 Apr 2024 11:35:14 GMT
- Title: Bridging Algorithmic Information Theory and Machine Learning: A New Approach to Kernel Learning
- Authors: Boumediene Hamzi, Marcus Hutter, Houman Owhadi
- Abstract summary: We adopt an AIT perspective on the problem of learning kernels from data, through the method of Sparse Kernel Flows.
This approach opens the door to reformulating algorithms in machine learning using tools from AIT.
- Score: 12.848057726330714
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine Learning (ML) and Algorithmic Information Theory (AIT) look at complexity from different points of view. We explore the interface between AIT and kernel methods (which are prevalent in ML) by adopting an AIT perspective on the problem of learning kernels from data, in kernel ridge regression, through the method of Sparse Kernel Flows. In particular, by examining the differences and commonalities between Minimal Description Length (MDL) and Regularization in Machine Learning (RML), we prove that the method of Sparse Kernel Flows is the natural approach for learning kernels from data. This approach aligns naturally with the MDL principle, offering a more robust theoretical basis than the existing reliance on cross-validation. The study reveals that deriving Sparse Kernel Flows does not require a statistical approach; instead, one can directly engage with code lengths and complexities, concepts central to AIT. This approach thereby opens the door to reformulating machine learning algorithms using tools from AIT, with the aim of providing them with a more solid theoretical foundation.
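The central object here is the Kernel Flows loss, which the paper reinterprets as a code length in the MDL sense rather than a cross-validation score. As a rough illustration, below is a minimal NumPy sketch of the plain (non-sparse) Kernel Flows criterion rho for a single RBF bandwidth; the toy data, names, and grid search are illustrative assumptions, not the authors' implementation.

```python
# Illustrative Kernel Flows step (not the authors' code): rho measures how much
# of the RKHS norm of the interpolant is lost when half the data is discarded.
import numpy as np

def rbf_kernel(X, Y, gamma):
    """Gaussian (RBF) kernel matrix K[i, j] = exp(-gamma * ||X_i - Y_j||^2)."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kf_rho(X, y, gamma, rng, reg=1e-8):
    """Kernel Flows loss: rho = 1 - (y_c' Kc^-1 y_c) / (y' K^-1 y) on a random half-batch."""
    n = len(X)
    idx = rng.choice(n, n // 2, replace=False)              # random sub-batch
    K = rbf_kernel(X, X, gamma) + reg * np.eye(n)
    Kc = rbf_kernel(X[idx], X[idx], gamma) + reg * np.eye(n // 2)
    num = y[idx] @ np.linalg.solve(Kc, y[idx])
    den = y @ np.linalg.solve(K, y)
    return 1.0 - num / den

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 2))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=64)
# Coarse grid search over the bandwidth (a gradient step on rho in practice).
gammas = np.logspace(-2, 2, 9)
best = min(gammas, key=lambda g: np.mean([kf_rho(X, y, g, rng) for _ in range(10)]))
print("selected gamma:", best)
```

In Sparse Kernel Flows, the single kernel above is replaced by a sparse combination drawn from a dictionary of kernels, with the sparsity penalty playing the role of the model-complexity term in a two-part MDL code.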
Related papers
- Uncertainty quantification for Markov chains with application to temporal difference learning [63.49764856675643]
We develop novel high-dimensional concentration inequalities and Berry-Esseen bounds for vector- and matrix-valued functions of Markov chains.
We analyze the TD learning algorithm, a widely used method for policy evaluation in reinforcement learning.
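For context, the iteration under analysis is the classical TD(0) update; a self-contained toy version follows (the chain, rewards, and step size are made up for illustration, since the paper's contribution is the non-asymptotic analysis, not the algorithm itself).

```python
# Minimal TD(0) policy evaluation on a two-state Markov reward process.
import numpy as np

rng = np.random.default_rng(1)
P = np.array([[0.9, 0.1], [0.2, 0.8]])   # transition matrix
r = np.array([1.0, -1.0])                # per-state rewards
gamma, alpha = 0.9, 0.05

V, s = np.zeros(2), 0
for t in range(5000):
    s_next = rng.choice(2, p=P[s])
    # TD(0) update: V(s) <- V(s) + alpha * (r(s) + gamma * V(s') - V(s))
    V[s] += alpha * (r[s] + gamma * V[s_next] - V[s])
    s = s_next

V_true = np.linalg.solve(np.eye(2) - gamma * P, r)   # closed-form value
print(V, V_true)                                     # estimate vs. truth
```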
arXiv Detail & Related papers (2025-02-19T15:33:55Z)
- RESIST: Resilient Decentralized Learning Using Consensus Gradient Descent [11.22833419439317]
Empirical risk minimization (ERM) is a cornerstone of modern machine learning (ML).
This paper focuses on the man-in-the-middle (MITM) attack, which can cause models to deviate significantly from their intended ERM solutions.
We propose RESIST, an algorithm designed to be robust against adversarially compromised communication links.
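As a point of reference, here is a hedged sketch of the plain consensus gradient descent baseline that such an attack targets; RESIST's robust aggregation itself is not reproduced, and all names are illustrative.

```python
# Plain decentralized consensus gradient descent (the vulnerable baseline).
# W is a doubly stochastic mixing matrix over the communication graph.
import numpy as np

def consensus_gd(grads, W, x0, alpha=0.1, iters=200):
    """x_i <- sum_j W[i, j] * x_j - alpha * grad_i(x_i), run at every node."""
    X = np.array(x0, dtype=float)            # one row of parameters per node
    for _ in range(iters):
        X = W @ X - alpha * np.stack([g(x) for g, x in zip(grads, X)])
        # An MITM attacker corrupts rows of W @ X in transit; RESIST swaps
        # the plain average for a robust aggregation step.
    return X

# Toy quadratics f_i(x) = 0.5 * ||x - c_i||^2 on a 3-node network.
c = np.array([[0.0], [1.0], [2.0]])
grads = [lambda x, ci=ci: x - ci for ci in c]
W = np.array([[0.5, 0.25, 0.25], [0.25, 0.5, 0.25], [0.25, 0.25, 0.5]])
print(consensus_gd(grads, W, x0=np.zeros((3, 1))))   # rows near mean(c) = 1
```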
arXiv Detail & Related papers (2025-02-11T21:48:10Z)
- Learning for Cross-Layer Resource Allocation in MEC-Aided Cell-Free Networks [71.30914500714262]
Cross-layer resource allocation over mobile edge computing (MEC)-aided cell-free networks can fully exploit transmission and computing resources to improve the data rate.
Joint subcarrier allocation and beamforming optimization are investigated for the MEC-aided cell-free network from the perspective of deep learning.
arXiv Detail & Related papers (2024-12-21T10:18:55Z)
- A Unified Framework for Neural Computation and Learning Over Time [56.44910327178975]
Hamiltonian Learning is a novel unified framework for learning with neural networks "over time".
It is based on differential equations that: (i) can be integrated without the need for external software solvers; (ii) generalize the well-established notion of gradient-based learning in feed-forward and recurrent networks; (iii) open up novel perspectives.
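As a loose illustration of point (i), integrating learning dynamics without an external solver, here is a generic gradient-flow ODE discretized with a hand-rolled forward-Euler loop; this is an assumption-laden stand-in, not the paper's Hamiltonian equations.

```python
# Illustrative only: gradient flow dw/dt = -grad L(w) integrated by forward
# Euler, with no external ODE solver. NOT the paper's Hamiltonian formulation.
import numpy as np

def loss_grad(w, X, y):
    """Gradient of the squared loss 0.5 * ||X w - y||^2 of a linear model."""
    return X.T @ (X @ w - y)

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true

w, dt = np.zeros(3), 1e-3
for _ in range(20000):                  # Euler step: w <- w + dt * (dw/dt)
    w += dt * (-loss_grad(w, X, y))
print(w)                                # approaches w_true
```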
arXiv Detail & Related papers (2024-09-18T14:57:13Z)
- Blind Super-Resolution via Meta-learning and Markov Chain Monte Carlo Simulation [46.5310645609264]
We propose a meta-learning and Markov chain Monte Carlo (MCMC) based single image super-resolution (SISR) approach to learn kernel priors from organized randomness.
A lightweight network is adopted as the kernel generator and is optimized by learning from MCMC simulations on random Gaussian distributions.
A meta-learning-based alternating optimization procedure is proposed to optimize the kernel generator and image restorer.
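A hedged sketch of the kind of "organized randomness" involved: sampling random anisotropic Gaussian blur kernels of the sort a kernel generator could be trained against. Sizes, ranges, and names below are illustrative assumptions, not values from the paper.

```python
# Sample random anisotropic Gaussian blur kernels with random covariance.
import numpy as np

def random_gaussian_kernel(size, rng):
    """Draw one size x size Gaussian blur kernel with random shape/rotation."""
    sx, sy = rng.uniform(0.5, 3.0, size=2)           # random axis std-devs
    theta = rng.uniform(0.0, np.pi)                  # random rotation angle
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    cov = R @ np.diag([sx**2, sy**2]) @ R.T
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    pts = np.stack([xx, yy], axis=-1)                # (size, size, 2) grid
    k = np.exp(-0.5 * np.einsum('...i,ij,...j', pts, np.linalg.inv(cov), pts))
    return k / k.sum()                               # normalize to sum to 1

rng = np.random.default_rng(3)
kernel = random_gaussian_kernel(21, rng)
print(kernel.shape, kernel.sum())                    # (21, 21) 1.0
```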
arXiv Detail & Related papers (2024-06-13T07:50:15Z)
- Kernel Correlation-Dissimilarity for Multiple Kernel k-Means Clustering [21.685153346752124]
Current methods enhance information diversity and reduce redundancy by exploiting interdependencies among multiple kernels based on correlations or dissimilarities.
We introduce a novel method that systematically integrates both kernel correlation and dissimilarity.
By emphasizing the coherence between kernel correlation and dissimilarity, our method offers a more objective and transparent strategy for extracting non-linear information.
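For concreteness, here is a small sketch of the two ingredients being combined, using centered kernel alignment as the correlation and a normalized Frobenius distance as the dissimilarity; the paper's exact definitions may differ.

```python
# Correlation and dissimilarity between two kernel matrices (illustrative).
import numpy as np

def center(K):
    """Double-center a kernel matrix: H K H with H = I - 11^T / n."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def kernel_correlation(K1, K2):
    """Centered kernel alignment, in [0, 1] for PSD kernels."""
    K1c, K2c = center(K1), center(K2)
    return np.sum(K1c * K2c) / (np.linalg.norm(K1c) * np.linalg.norm(K2c))

def kernel_dissimilarity(K1, K2):
    """A simple dissimilarity: normalized Frobenius distance."""
    return np.linalg.norm(K1 - K2) / np.sqrt(K1.size)

rng = np.random.default_rng(4)
X = rng.normal(size=(50, 3))
K_lin = X @ X.T                                    # linear kernel
d2 = ((X[:, None] - X[None, :]) ** 2).sum(-1)
K_rbf = np.exp(-0.5 * d2)                          # RBF kernel
print(kernel_correlation(K_lin, K_rbf), kernel_dissimilarity(K_lin, K_rbf))
```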
arXiv Detail & Related papers (2024-03-06T04:24:43Z)
- On Leave-One-Out Conditional Mutual Information For Generalization [122.2734338600665]
We derive information-theoretic generalization bounds for supervised learning algorithms based on a new measure of leave-one-out conditional mutual information (loo-CMI).
Contrary to other CMI bounds, our loo-CMI bounds can be computed easily and can be interpreted in connection to other notions such as classical leave-one-out cross-validation.
We empirically validate the quality of the bound by evaluating its predicted generalization gap in deep-learning scenarios.
arXiv Detail & Related papers (2022-07-01T17:58:29Z)
- Clustered Federated Learning via Generalized Total Variation Minimization [83.26141667853057]
We study optimization methods to train local (or personalized) models for local datasets with a decentralized network structure.
Our main conceptual contribution is to formulate federated learning as generalized total variation (GTV) minimization.
Our main algorithmic contribution is a fully decentralized federated learning algorithm.
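A plausible generic form of such a GTV objective (an illustrative reading, not copied from the paper): node-local losses coupled by a total-variation penalty over the edges of a similarity graph.

```latex
% Illustrative GTV objective for clustered federated learning: local losses
% L_i at the nodes of a graph (V, E), coupled by a TV penalty with weights A_ij.
\min_{\{w_i\}_{i \in \mathcal{V}}} \;
  \sum_{i \in \mathcal{V}} L_i(w_i)
  \;+\; \lambda \sum_{(i,j) \in \mathcal{E}} A_{ij}\, \bigl\| w_i - w_j \bigr\|_2
```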
arXiv Detail & Related papers (2021-05-26T18:07:19Z)
- Self-organizing Democratized Learning: Towards Large-scale Distributed Learning Systems [71.14339738190202]
Democratized learning (Dem-AI) lays out a holistic philosophy with underlying principles for building large-scale distributed and democratized machine learning systems.
Inspired by Dem-AI philosophy, a novel distributed learning approach is proposed in this paper.
The proposed algorithms achieve better generalization performance of the agents' learning models than conventional federated learning (FL) algorithms.
arXiv Detail & Related papers (2020-07-07T08:34:48Z)
- Gradient tracking and variance reduction for decentralized optimization and machine learning [19.54092620537586]
Decentralized methods to solve finite-sum problems are important in many signal processing and machine learning tasks.
We provide a unified algorithmic framework that combines variance-reduction with gradient tracking to achieve robust performance.
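One common template for the gradient-tracking half of that combination, sketched on a toy quadratic problem; the variance-reduction component is omitted and all names are illustrative.

```python
# Decentralized gradient tracking: each node's y_i tracks the network-average
# gradient, so x_i converges to the minimizer of the global finite sum.
import numpy as np

def gradient_tracking(grads, W, x0, alpha=0.1, iters=300):
    """x <- W x - alpha * y;  y <- W y + g(x_new) - g(x_old)."""
    X = np.array(x0, dtype=float)
    G = np.stack([g(x) for g, x in zip(grads, X)])   # initial local gradients
    Y = G.copy()                                     # tracker starts at g(x0)
    for _ in range(iters):
        X_new = W @ X - alpha * Y
        G_new = np.stack([g(x) for g, x in zip(grads, X_new)])
        Y = W @ Y + G_new - G                        # gradient-tracking update
        X, G = X_new, G_new
    return X

c = np.array([[0.0], [1.0], [2.0]])                  # f_i(x) = 0.5*||x - c_i||^2
grads = [lambda x, ci=ci: x - ci for ci in c]
W = np.array([[0.5, 0.25, 0.25], [0.25, 0.5, 0.25], [0.25, 0.25, 0.5]])
print(gradient_tracking(grads, W, x0=np.zeros((3, 1))))  # rows -> mean(c) = 1
```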
arXiv Detail & Related papers (2020-02-13T07:17:07Z)