The Lattice Overparametrization Paradigm for the Machine Learning of
Lattice Operators
- URL: http://arxiv.org/abs/2310.06639v2
- Date: Fri, 26 Jan 2024 21:31:05 GMT
- Authors: Diego Marcondes and Junior Barrera
- Abstract summary: We discuss a learning paradigm in which, by overparametrizing a class via elements in a lattice, an algorithm for minimizing functions in a lattice is applied to learn.
This learning paradigm has three properties that modern methods based on neural networks lack: control, transparency and interpretability.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The machine learning of lattice operators has three possible bottlenecks.
From a statistical standpoint, it is necessary to design a constrained class of
operators based on prior information with low bias, and low complexity relative
to the sample size. From a computational perspective, there should be an
efficient algorithm to minimize an empirical error over the class. From an
understanding point of view, the properties of the learned operator need to be
derived, so its behavior can be theoretically understood. The statistical
bottleneck can be overcome due to the rich literature about the representation
of lattice operators, but there is no general learning algorithm for them. In
this paper, we discuss a learning paradigm in which, by overparametrizing a
class via elements in a lattice, an algorithm for minimizing functions in a
lattice is applied to learn. We present the stochastic lattice descent
algorithm as a general algorithm for learning on a constrained class of
operators once a lattice overparametrization of the class is fixed, and we
discuss previous works that serve as proofs of concept. Moreover, if there are algorithms
to compute the basis of an operator from its overparametrization, then its
properties can be deduced and the understanding bottleneck is also overcome.
This learning paradigm has three properties that modern methods based on neural
networks lack: control, transparency and interpretability. Nowadays, there is
an increasing demand for methods with these characteristics, and we believe
that mathematical morphology is in a unique position to supply them. The
lattice overparametrization paradigm could be a missing piece for it to achieve
its full potential within modern machine learning.
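The paradigm described in the abstract, minimizing an empirical error over a class of operators by descending through a lattice that overparametrizes it, can be illustrated with a minimal sketch. This is a hypothetical toy, not the authors' implementation: it models the lattice as the Boolean lattice of subsets of a finite universe and performs stochastic descent by greedily moving to the best of a few randomly sampled lattice neighbors (sets differing by one element). All names (`stochastic_lattice_descent`, `universe`, `loss`) are illustrative assumptions.

```python
import random

def stochastic_lattice_descent(universe, loss, n_epochs=50, n_neighbors=8, seed=0):
    """Toy stochastic lattice descent on the Boolean lattice 2^universe.

    At each epoch, sample a few lattice neighbors of the current point
    (flip membership of one element) and move to the one with the
    smallest empirical loss, tracking the best point seen so far.
    """
    rng = random.Random(seed)
    current = frozenset()               # start at the bottom of the lattice
    best, best_loss = current, loss(current)
    elements = list(universe)
    for _ in range(n_epochs):
        # sample neighbors: each differs from `current` in one element
        candidates = []
        for _ in range(n_neighbors):
            x = rng.choice(elements)
            nb = current - {x} if x in current else current | {x}
            candidates.append(nb)
        # greedy move to the sampled neighbor with the smallest loss
        current = min(candidates, key=loss)
        cur_loss = loss(current)
        if cur_loss < best_loss:
            best, best_loss = current, cur_loss
    return best, best_loss

# toy usage: recover a target subset via the symmetric-difference loss
target = frozenset({1, 3})
U = frozenset(range(5))
sol, final_loss = stochastic_lattice_descent(U, lambda s: len(s ^ target))
```

In the paper's setting, the lattice points would overparametrize operators (e.g. via their structuring elements or basis representations) and the loss would be an empirical error on a sample; the subset flip here stands in for a generic lattice neighborhood move.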
Related papers
- A Unified Framework for Neural Computation and Learning Over Time [56.44910327178975]
Hamiltonian Learning is a novel unified framework for learning with neural networks "over time".
It is based on differential equations that: (i) can be integrated without the need of external software solvers; (ii) generalize the well-established notion of gradient-based learning in feed-forward and recurrent networks; (iii) open to novel perspectives.
arXiv Detail & Related papers (2024-09-18T14:57:13Z) - Limits and Powers of Koopman Learning [0.0]
Dynamical systems provide a comprehensive way to study complex and changing behaviors across various sciences.
Koopman operators have emerged as a dominant approach because they allow the study of nonlinear dynamics using linear techniques.
This paper addresses a fundamental open question: When can we robustly learn the spectral properties of Koopman operators from trajectory data of dynamical systems, and when can we not?
arXiv Detail & Related papers (2024-07-08T18:24:48Z) - Operator Learning of Lipschitz Operators: An Information-Theoretic Perspective [2.375038919274297]
This work addresses the complexity of neural operator approximations for the general class of Lipschitz continuous operators.
Our main contribution establishes lower bounds on the metric entropy of Lipschitz operators in two approximation settings.
It is shown that, regardless of the activation function used, neural operator architectures attaining an approximation accuracy $\epsilon$ must have a size that is exponentially large in $\epsilon^{-1}$.
arXiv Detail & Related papers (2024-06-26T23:36:46Z) - A General Framework for Learning from Weak Supervision [93.89870459388185]
This paper introduces a general framework for learning from weak supervision (GLWS) with a novel algorithm.
Central to GLWS is an Expectation-Maximization (EM) formulation, adeptly accommodating various weak supervision sources.
We also present an advanced algorithm that significantly simplifies the EM computational demands.
arXiv Detail & Related papers (2024-02-02T21:48:50Z) - Provably Efficient Representation Learning with Tractable Planning in
Low-Rank POMDP [81.00800920928621]
We study representation learning in partially observable Markov decision processes (POMDPs).
We first present an algorithm for decodable POMDPs that combines maximum likelihood estimation (MLE) and optimism in the face of uncertainty (OFU).
We then show how to adapt this algorithm to the broader class of $\gamma$-observable POMDPs.
arXiv Detail & Related papers (2023-06-21T16:04:03Z) - An Introduction to Kernel and Operator Learning Methods for
Homogenization by Self-consistent Clustering Analysis [0.48747801442240574]
The article presents a thorough analysis of the mathematical underpinnings of the operator learning paradigm.
The proposed kernel operator learning method uses graph kernel networks to come up with a mechanistic reduced order method for multiscale homogenization.
arXiv Detail & Related papers (2022-12-01T02:36:16Z) - Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end.
We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferable to a new task in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z) - Evolving Reinforcement Learning Algorithms [186.62294652057062]
We propose a method for meta-learning reinforcement learning algorithms.
The learned algorithms are domain-agnostic and can generalize to new environments not seen during training.
We highlight two learned algorithms which obtain good generalization performance over other classical control tasks, gridworld type tasks, and Atari games.
arXiv Detail & Related papers (2021-01-08T18:55:07Z) - Demystifying Deep Neural Networks Through Interpretation: A Survey [3.566184392528658]
Modern deep learning algorithms tend to optimize an objective metric, such as minimizing a cross-entropy loss on a training dataset, in order to learn.
The problem is that a single metric is an incomplete description of the real-world task.
There are works that tackle the problem of interpretability to provide insights into a neural network's behavior and thought process.
arXiv Detail & Related papers (2020-12-13T17:56:41Z) - Learning outside the Black-Box: The pursuit of interpretable models [78.32475359554395]
This paper proposes an algorithm that produces a continuous global interpretation of any given continuous black-box function.
Our interpretation represents a leap forward from the previous state of the art.
arXiv Detail & Related papers (2020-11-17T12:39:44Z) - An Advance on Variable Elimination with Applications to Tensor-Based
Computation [11.358487655918676]
We present new results on the classical algorithm of variable elimination, which underlies many algorithms including for probabilistic inference.
The results relate to exploiting functional dependencies, allowing one to perform inference and learning efficiently on models that have very large treewidth.
arXiv Detail & Related papers (2020-02-21T14:17:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.