Information plane and compression-gnostic feedback in quantum machine learning
- URL: http://arxiv.org/abs/2411.02313v1
- Date: Mon, 04 Nov 2024 17:38:46 GMT
- Title: Information plane and compression-gnostic feedback in quantum machine learning
- Authors: Nathan Haboury, Mo Kordzanganeh, Alexey Melnikov, Pavel Sekatski
- Abstract summary: The information plane has been proposed as an analytical tool for studying the learning dynamics of neural networks.
We study how insight into how much the model compresses the input data can be used to improve a learning algorithm.
We benchmark the proposed learning algorithms on several classification and regression tasks using variational quantum circuits.
- Abstract: The information plane (Tishby et al. arXiv:physics/0004057, Shwartz-Ziv et al. arXiv:1703.00810) has been proposed as an analytical tool for studying the learning dynamics of neural networks. It provides quantitative insight into how the model approaches the learned state by approximating a minimal sufficient statistic. In this paper we extend this tool to the domain of quantum learning models. In a second step, we study how the insight into how much the model compresses the input data (provided by the information plane) can be used to improve a learning algorithm. Specifically, we consider two ways to do so: via a multiplicative regularization of the loss function, or with a compression-gnostic scheduler of the learning rate (for algorithms based on gradient descent). Both ways turn out to be equivalent in our implementation. Finally, we benchmark the proposed learning algorithms on several classification and regression tasks using variational quantum circuits. The results demonstrate an improvement in test accuracy and convergence speed for both synthetic and real-world datasets. Additionally, in one example we analyze the impact of the proposed modifications on the performance of neural networks in a classification task.
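The abstract fixes only the general scheme, so the following is a minimal Python sketch of how such compression-gnostic feedback could look: a crude binned mutual-information estimator for the information-plane coordinates (I(X;T), I(T;Y)), and a learning-rate rule driven by the compression term I(X;T). All function names, the estimator, and the linear scaling rule are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def binned_mutual_information(a, b, n_bins=30):
    """Crude histogram estimate of I(A; B) for two 1-D arrays;
    a stand-in for whatever mutual-information estimator one prefers."""
    joint, _, _ = np.histogram2d(a, b, bins=n_bins)
    joint = joint / joint.sum()
    p_a = joint.sum(axis=1, keepdims=True)  # marginal distribution of A
    p_b = joint.sum(axis=0, keepdims=True)  # marginal distribution of B
    nz = joint > 0                          # skip empty bins to avoid log(0)
    return float(np.sum(joint[nz] * np.log(joint[nz] / (p_a @ p_b)[nz])))

def information_plane_point(x, t, y, n_bins=30):
    """(I(X;T), I(T;Y)) for inputs x, labels y, and a scalar summary t
    of the model's internal (e.g. quantum-circuit) representation."""
    return (binned_mutual_information(x, t, n_bins),
            binned_mutual_information(t, y, n_bins))

def compression_gnostic_lr(base_lr, i_xt, i_xt_ref):
    """Hypothetical schedule: shrink the step size as the model compresses
    the input, i.e. as I(X;T) drops below a reference value such as
    I(X;T) at initialization. The linear rule is one plausible choice."""
    if i_xt_ref <= 0:
        return base_lr
    return base_lr * min(i_xt / i_xt_ref, 1.0)
```

If the compression-dependent factor is treated as a constant during backpropagation, multiplying the loss by it and scaling the learning rate by it produce the same gradient-descent update, consistent with the equivalence between the two schemes noted in the abstract.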
Related papers
- The Persian Rug: solving toy models of superposition using large-scale symmetries [0.0]
We present a complete mechanistic description of the algorithm learned by a minimal non-linear sparse data autoencoder in the limit of large input dimension.
Our work contributes to neural network interpretability by introducing techniques for understanding the structure of autoencoders.
arXiv Detail & Related papers (2024-10-15T22:52:45Z)
- Hypothesis Testing and Machine Learning: Interpreting Variable Effects in Deep Artificial Neural Networks using Cohen's f2 [0.0]
Deep artificial neural networks show high predictive performance in many fields.
But they do not afford statistical inference, and their black-box operations are too complicated for humans to comprehend.
This article extends current XAI methods and develops a model agnostic hypothesis testing framework for machine learning.
arXiv Detail & Related papers (2023-02-02T20:43:37Z)
- An Information-Theoretic Analysis of Compute-Optimal Neural Scaling Laws [24.356906682593532]
We study the compute-optimal trade-off between model and training data set sizes for large neural networks.
Our result suggests a linear relation similar to that supported by the empirical analysis of Chinchilla.
arXiv Detail & Related papers (2022-12-02T18:46:41Z)
- What learning algorithm is in-context learning? Investigations with linear models [87.91612418166464]
We investigate the hypothesis that transformer-based in-context learners implement standard learning algorithms implicitly.
We show that trained in-context learners closely match the predictors computed by gradient descent, ridge regression, and exact least-squares regression.
We present preliminary evidence that in-context learners share algorithmic features with these predictors.
arXiv Detail & Related papers (2022-11-28T18:59:51Z)
- A didactic approach to quantum machine learning with a single qubit [68.8204255655161]
We focus on the case of learning with a single qubit, using data re-uploading techniques.
We implement the different proposed formulations on toy and real-world datasets using the Qiskit quantum computing SDK.
arXiv Detail & Related papers (2022-11-23T18:25:32Z)
- Convolutional generative adversarial imputation networks for spatio-temporal missing data in storm surge simulations [86.5302150777089]
Generative Adversarial Imputation Nets (GAIN) and GAN-based techniques have attracted attention as unsupervised machine learning methods.
We name our proposed method Convolutional Generative Adversarial Imputation Nets (Conv-GAIN).
arXiv Detail & Related papers (2021-11-03T03:50:48Z)
- Towards Open-World Feature Extrapolation: An Inductive Graph Learning Approach [80.8446673089281]
We propose a new learning paradigm with graph representation and learning.
Our framework contains two modules: 1) a backbone network (e.g., feedforward neural nets) as a lower model takes features as input and outputs predicted labels; 2) a graph neural network as an upper model learns to extrapolate embeddings for new features via message passing over a feature-data graph built from observed data.
arXiv Detail & Related papers (2021-10-09T09:02:45Z)
- Learning Dynamics from Noisy Measurements using Deep Learning with a Runge-Kutta Constraint [9.36739413306697]
We discuss a methodology to learn differential equation(s) using noisy and sparsely sampled measurements.
In our methodology, the main innovation is the integration of deep neural networks with a classical numerical integration method.
arXiv Detail & Related papers (2021-09-23T15:43:45Z)
- Malicious Network Traffic Detection via Deep Learning: An Information Theoretic View [0.0]
We study how homeomorphism affects learned representation of a malware traffic dataset.
Our results suggest that although the details of learned representations and the specific coordinate system defined over the manifold of all parameters differ slightly, the functional approximations are the same.
arXiv Detail & Related papers (2020-09-16T15:37:44Z)
- Information Theoretic Meta Learning with Gaussian Processes [74.54485310507336]
We formulate meta learning using information theoretic concepts; namely, mutual information and the information bottleneck.
By making use of variational approximations to the mutual information, we derive a general and tractable framework for meta learning.
arXiv Detail & Related papers (2020-09-07T16:47:30Z)
- The data-driven physical-based equations discovery using evolutionary approach [77.34726150561087]
We describe an algorithm for discovering mathematical equations from given observational data.
The algorithm combines genetic programming with sparse regression.
It can be used to discover governing analytical equations as well as partial differential equations (PDEs).
arXiv Detail & Related papers (2020-04-03T17:21:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.