Towards the One Learning Algorithm Hypothesis: A System-theoretic
Approach
- URL: http://arxiv.org/abs/2112.02256v1
- Date: Sat, 4 Dec 2021 05:54:33 GMT
- Title: Towards the One Learning Algorithm Hypothesis: A System-theoretic
Approach
- Authors: Christos Mavridis, John Baras
- Abstract summary: The existence of a universal learning architecture in human cognition is a widespread conjecture supported by experimental findings from neuroscience.
We develop a closed-loop system with three main components: (i) a multi-resolution analysis pre-processor, (ii) a group-invariant feature extractor, and (iii) a progressive knowledge-based learning module.
We introduce a novel learning algorithm that constructs progressively growing knowledge representations in multiple resolutions.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The existence of a universal learning architecture in human cognition is a
widespread conjecture supported by experimental findings from neuroscience.
While no low-level implementation can be specified yet, an abstract outline of
human perception and learning is believed to entail three basic properties: (a)
hierarchical attention and processing, (b) memory-based knowledge
representation, and (c) progressive learning and knowledge compaction. We
approach the design of such a learning architecture from a system-theoretic
viewpoint, developing a closed-loop system with three main components: (i) a
multi-resolution analysis pre-processor, (ii) a group-invariant feature
extractor, and (iii) a progressive knowledge-based learning module.
Multi-resolution feedback loops are used for learning, i.e., for adapting the
system parameters to online observations. To design (i) and (ii), we build upon
the established theory of wavelet-based multi-resolution analysis and the
properties of group convolution operators. Regarding (iii), we introduce a
novel learning algorithm that constructs progressively growing knowledge
representations in multiple resolutions. The proposed algorithm is an extension
of the Online Deterministic Annealing (ODA) algorithm based on annealing
optimization, solved using gradient-free stochastic approximation. ODA has
inherent robustness and regularization properties and provides a means to
progressively increase the complexity of the learning model, i.e., the number of
neurons, as needed, through an intuitive bifurcation phenomenon. The
proposed multi-resolution approach is hierarchical, progressive,
knowledge-based, and interpretable. We illustrate the properties of the
proposed architecture in the context of state-of-the-art learning algorithms and
deep learning methods.
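The bifurcation phenomenon behind component (iii) can be illustrated with a minimal batch deterministic-annealing sketch. This is not the paper's ODA algorithm (which is online and uses gradient-free stochastic approximation); it is a toy batch variant on hypothetical two-cluster data, with an assumed cooling schedule and perturbation size. Above the critical temperature a duplicated codevector collapses back onto its twin; below it, the pair splits, which is how the model grows its number of neurons as needed:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy data: two well-separated Gaussian clusters (illustrative only).
X = np.vstack([rng.normal(-3.0, 0.5, (200, 2)),
               rng.normal(+3.0, 0.5, (200, 2))])

def da_step(X, mu, T):
    """One fixed-point iteration of deterministic annealing at temperature T."""
    d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(-1)   # squared distances
    p = np.exp(-(d2 - d2.min(axis=1, keepdims=True)) / T)  # Gibbs weights
    p /= p.sum(axis=1, keepdims=True)                      # soft assignments
    return (p.T @ X) / p.sum(axis=0)[:, None]              # weighted centroid update

# Start from a single codevector plus an exact duplicate; a small random
# perturbation at each temperature exposes the instability that drives splits.
mu = np.vstack([X.mean(axis=0), X.mean(axis=0)])
seps = []
for T in [50.0, 10.0, 2.0, 0.5]:                 # assumed cooling schedule
    mu = mu + 1e-3 * rng.standard_normal(mu.shape)
    for _ in range(60):
        mu = da_step(X, mu, T)
    seps.append(np.linalg.norm(mu[0] - mu[1]))

# Separation between the pair stays near zero at high temperature and jumps
# once T drops below the critical value set by the data covariance.
print(seps)
```

At high temperature the entropy term dominates and the duplicate is pulled back onto its twin; once the temperature falls below roughly twice the largest eigenvalue of the data covariance, the symmetric solution becomes unstable and the pair bifurcates into two distinct cluster representatives.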
Related papers
- Deep Learning Through A Telescoping Lens: A Simple Model Provides Empirical Insights On Grokking, Gradient Boosting & Beyond [61.18736646013446]
In pursuit of a deeper understanding of deep learning's surprising behaviors, we investigate the utility of a simple yet accurate model of a trained neural network.
Across three case studies, we illustrate how it can be applied to derive new empirical insights on a diverse range of prominent phenomena.
arXiv Detail & Related papers (2024-10-31T22:54:34Z)
- A Unified Framework for Neural Computation and Learning Over Time [56.44910327178975]
Hamiltonian Learning is a novel unified framework for learning with neural networks "over time".
It is based on differential equations that: (i) can be integrated without the need for external software solvers; (ii) generalize the well-established notion of gradient-based learning in feed-forward and recurrent networks; and (iii) open up novel perspectives.
arXiv Detail & Related papers (2024-09-18T14:57:13Z)
- Coding for Intelligence from the Perspective of Category [66.14012258680992]
Coding targets compressing and reconstructing data.
Recent trends demonstrate the potential homogeneity of the coding and intelligence fields.
We propose a novel problem of Coding for Intelligence from the category theory view.
arXiv Detail & Related papers (2024-07-01T07:05:44Z)
- Reasoning Algorithmically in Graph Neural Networks [1.8130068086063336]
We aim to integrate the structured, rule-based reasoning of algorithms with the adaptive learning capabilities of neural networks.
This dissertation provides theoretical and practical contributions to this area of research.
arXiv Detail & Related papers (2024-02-21T12:16:51Z)
- Multi-Resolution Online Deterministic Annealing: A Hierarchical and Progressive Learning Architecture [0.0]
We introduce a general-purpose hierarchical learning architecture that is based on the progressive partitioning of a possibly multi-resolution data space.
We show that the solution of each optimization problem can be estimated online using gradient-free approximation updates.
Asymptotic convergence analysis and experimental results are provided for supervised and unsupervised learning problems.
arXiv Detail & Related papers (2022-12-15T23:21:49Z)
- Stabilizing Q-learning with Linear Architectures for Provably Efficient Learning [53.17258888552998]
This work proposes an exploration variant of the basic $Q$-learning protocol with linear function approximation.
We show that the performance of the algorithm degrades very gracefully under a novel and more permissive notion of approximation error.
arXiv Detail & Related papers (2022-06-01T23:26:51Z)
- Adaptive Discretization in Online Reinforcement Learning [9.560980936110234]
Two major questions in designing discretization-based algorithms are how to create the discretization and when to refine it.
We provide a unified theoretical analysis of tree-based hierarchical partitioning methods for online reinforcement learning.
Our algorithms are easily adapted to operating constraints, and our theory provides explicit bounds across each of the three facets.
arXiv Detail & Related papers (2021-10-29T15:06:15Z)
- LENAS: Learning-based Neural Architecture Search and Ensemble for 3D Radiotherapy Dose Prediction [42.38793195337463]
We propose a novel learning-based ensemble approach named LENAS, which integrates neural architecture search with knowledge distillation for 3D radiotherapy dose prediction.
Our approach starts by exhaustively searching each block from an enormous architecture space to identify multiple architectures that exhibit promising performance.
To mitigate the complexity introduced by the model ensemble, we adopt the teacher-student paradigm, leveraging the diverse outputs from multiple learned networks as supervisory signals.
arXiv Detail & Related papers (2021-06-12T10:08:52Z)
- Investigating Bi-Level Optimization for Learning and Vision from a Unified Perspective: A Survey and Beyond [114.39616146985001]
In machine learning and computer vision, despite different motivations and mechanisms, many complex problems contain a series of closely related subproblems.
In this paper, we first uniformly express these complex learning and vision problems from the perspective of Bi-Level Optimization (BLO).
Then we construct a value-function-based single-level reformulation and establish a unified algorithmic framework to understand and formulate mainstream gradient-based BLO methodologies.
arXiv Detail & Related papers (2021-01-27T16:20:23Z)
- Self-organizing Democratized Learning: Towards Large-scale Distributed Learning Systems [71.14339738190202]
Democratized learning (Dem-AI) lays out a holistic philosophy with underlying principles for building large-scale distributed and democratized machine learning systems.
Inspired by Dem-AI philosophy, a novel distributed learning approach is proposed in this paper.
The proposed algorithms achieve better generalization performance for the agents' learning models than conventional federated learning (FL) algorithms.
arXiv Detail & Related papers (2020-07-07T08:34:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.