Less is More: Rethinking Few-Shot Learning and Recurrent Neural Nets
- URL: http://arxiv.org/abs/2209.14267v1
- Date: Wed, 28 Sep 2022 17:33:11 GMT
- Title: Less is More: Rethinking Few-Shot Learning and Recurrent Neural Nets
- Authors: Deborah Pereg, Martin Villiger, Brett Bouma, Polina Golland
- Abstract summary: We provide theoretical guarantees for reliable learning under the information-theoretic AEP.
We then focus on a highly efficient recurrent neural net (RNN) framework and propose a reduced-entropy algorithm for few-shot learning.
Our experimental results demonstrate significant potential for improving learning models' sample efficiency, generalization, and time complexity.
- Score: 2.824895388993495
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The statistical supervised learning framework assumes an input-output set
with a joint probability distribution that is reliably represented by the
training dataset. The learner is then required to output a prediction rule
learned from the training dataset's input-output pairs. In this work, we
provide meaningful insights into the asymptotic equipartition property (AEP)
(Shannon, 1948) in the context of machine learning, and illuminate some of
its potential ramifications for few-shot learning. We provide theoretical
guarantees for reliable learning under the information-theoretic AEP, and for
the generalization error with respect to the sample size. We then focus on a
highly efficient recurrent neural net (RNN) framework and propose a
reduced-entropy algorithm for few-shot learning. We also propose a mathematical
intuition for the RNN as an approximation of a sparse coding solver. We verify
the applicability, robustness, and computational efficiency of the proposed
approach with image deblurring and optical coherence tomography (OCT) speckle
suppression. Our experimental results demonstrate significant potential for
improving learning models' sample efficiency, generalization, and time
complexity, which can therefore be leveraged for practical real-time
applications.
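As a rough illustration of the asymptotic equipartition property the abstract refers to, the sketch below checks numerically that, for an i.i.d. source, -(1/n) log2 p(x_1, ..., x_n) approaches the entropy H(X) as n grows. It is not taken from the paper: the Bernoulli source, the parameter p = 0.2, and the helper names (binary_entropy, empirical_rate, draw_sample) are all assumptions chosen for the example.

```python
import math
import random


def binary_entropy(p: float) -> float:
    """Entropy H(X), in bits, of a Bernoulli(p) source."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def empirical_rate(sample: list[int], p: float) -> float:
    """Return -(1/n) * log2 p(x_1, ..., x_n) for an i.i.d. Bernoulli(p) sample.

    Assumes 0 < p < 1 so that both logarithms are finite.
    """
    ones = sum(sample)
    zeros = len(sample) - ones
    log_prob = ones * math.log2(p) + zeros * math.log2(1 - p)
    return -log_prob / len(sample)


def draw_sample(n: int, p: float, rng: random.Random) -> list[int]:
    """Draw n i.i.d. Bernoulli(p) symbols (hypothetical helper for this sketch)."""
    return [1 if rng.random() < p else 0 for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
    p = 0.2                 # illustrative source parameter, not from the paper
    print(f"H(X) = {binary_entropy(p):.4f} bits")
    for n in (10, 100, 1_000, 10_000, 100_000):
        rate = empirical_rate(draw_sample(n, p, rng), p)
        print(f"n={n:>6}: -(1/n) log2 p(x^n) = {rate:.4f}")
```

For long sequences the printed rate should settle near H(X), which is the sense in which almost all of the probability mass sits on a "typical set" of roughly 2^(nH) sequences.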
Related papers
- On Discriminative Probabilistic Modeling for Self-Supervised Representation Learning [85.75164588939185]
We study the discriminative probabilistic modeling problem on a continuous domain for (multimodal) self-supervised representation learning.
We conduct generalization error analysis to reveal the limitation of current InfoNCE-based contrastive loss for self-supervised representation learning.
arXiv Detail & Related papers (2024-10-11T18:02:46Z)
- Automatic debiasing of neural networks via moment-constrained learning [0.0]
Naively learning the regression function and taking a sample mean of the target functional results in biased estimators.
We propose moment-constrained learning as a new Riesz representer (RR) learning approach that addresses some shortcomings in automatic debiasing.
arXiv Detail & Related papers (2024-09-29T20:56:54Z)
- A Unified Framework for Neural Computation and Learning Over Time [56.44910327178975]
Hamiltonian Learning is a novel unified framework for learning with neural networks "over time".
It is based on differential equations that: (i) can be integrated without the need for external software solvers; (ii) generalize the well-established notion of gradient-based learning in feed-forward and recurrent networks; (iii) open up novel perspectives.
arXiv Detail & Related papers (2024-09-18T14:57:13Z)
- Learning Latent Graph Structures and their Uncertainty [63.95971478893842]
Graph Neural Networks (GNNs) use relational information as an inductive bias to enhance the model's accuracy.
As task-relevant relations might be unknown, graph structure learning approaches have been proposed to learn them while solving the downstream prediction task.
arXiv Detail & Related papers (2024-05-30T10:49:22Z)
- Surprisal Driven $k$-NN for Robust and Interpretable Nonparametric Learning [1.4293924404819704]
We shed new light on the traditional nearest neighbors algorithm from the perspective of information theory.
We propose a robust and interpretable framework for tasks such as classification, regression, density estimation, and anomaly detection using a single model.
Our work showcases the architecture's versatility by achieving state-of-the-art results in classification and anomaly detection.
arXiv Detail & Related papers (2023-11-17T00:35:38Z)
- Representation Learning with Multi-Step Inverse Kinematics: An Efficient and Optimal Approach to Rich-Observation RL [106.82295532402335]
Existing reinforcement learning algorithms suffer from computational intractability, strong statistical assumptions, and suboptimal sample complexity.
We provide the first computationally efficient algorithm that attains rate-optimal sample complexity with respect to the desired accuracy level.
Our algorithm, MusIK, combines systematic exploration with representation learning based on multi-step inverse kinematics.
arXiv Detail & Related papers (2023-04-12T14:51:47Z)
- Joint Edge-Model Sparse Learning is Provably Efficient for Graph Neural Networks [89.28881869440433]
This paper provides the first theoretical characterization of joint edge-model sparse learning for graph neural networks (GNNs).
It proves analytically that both sampling important nodes and pruning the lowest-magnitude neurons can reduce the sample complexity and improve convergence without compromising the test accuracy.
arXiv Detail & Related papers (2023-02-06T16:54:20Z)
- A Free Lunch with Influence Functions? Improving Neural Network Estimates with Concepts from Semiparametric Statistics [41.99023989695363]
We explore the potential for semiparametric theory to be used to improve neural networks and machine learning algorithms.
We propose a new neural network method MultiNet, which seeks the flexibility and diversity of an ensemble using a single architecture.
arXiv Detail & Related papers (2022-02-18T09:35:51Z)
- FF-NSL: Feed-Forward Neural-Symbolic Learner [70.978007919101]
This paper introduces a neural-symbolic learning framework called Feed-Forward Neural-Symbolic Learner (FF-NSL).
FF-NSL integrates state-of-the-art ILP systems based on the Answer Set semantics, with neural networks, in order to learn interpretable hypotheses from labelled unstructured data.
arXiv Detail & Related papers (2021-06-24T15:38:34Z)
- Physics Informed Deep Kernel Learning [24.033468062984458]
Physics Informed Deep Kernel Learning (PI-DKL) exploits physics knowledge represented by differential equations with latent sources.
For efficient and effective inference, we marginalize out the latent variables and derive a collapsed model evidence lower bound (ELBO).
arXiv Detail & Related papers (2020-06-08T22:43:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.