The Lattice Overparametrization Paradigm for the Machine Learning of
Lattice Operators
- URL: http://arxiv.org/abs/2310.06639v2
- Date: Fri, 26 Jan 2024 21:31:05 GMT
- Authors: Diego Marcondes and Junior Barrera
- Abstract summary: We discuss a learning paradigm in which, by overparametrizing a class via elements in a lattice, an algorithm for minimizing functions in a lattice is applied to learn.
This learning paradigm has three properties that modern methods based on neural networks lack: control, transparency and interpretability.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The machine learning of lattice operators has three possible bottlenecks.
From a statistical standpoint, it is necessary to design a constrained class of
operators based on prior information with low bias, and low complexity relative
to the sample size. From a computational perspective, there should be an
efficient algorithm to minimize an empirical error over the class. From an
understanding point of view, the properties of the learned operator need to be
derived, so its behavior can be theoretically understood. The statistical
bottleneck can be overcome due to the rich literature about the representation
of lattice operators, but there is no general learning algorithm for them. In
this paper, we discuss a learning paradigm in which, by overparametrizing a
class via elements in a lattice, an algorithm for minimizing functions in a
lattice is applied to learn. We present the stochastic lattice descent
algorithm as a general algorithm for learning on a constrained class of
operators once a lattice overparametrization of the class is fixed, and we
discuss previous works that serve as proofs of concept. Moreover, if there are algorithms
to compute the basis of an operator from its overparametrization, then its
properties can be deduced and the understanding bottleneck is also overcome.
This learning paradigm has three properties that modern methods based on neural
networks lack: control, transparency and interpretability. Nowadays, there is
an increasing demand for methods with these characteristics, and we believe
that mathematical morphology is in a unique position to supply them. The
lattice overparametrization paradigm could be a missing piece for it to achieve
its full potential within modern machine learning.
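The paradigm described in the abstract, minimizing an empirical error over a class of operators by descending through a lattice that overparametrizes it, can be illustrated with a minimal sketch. This is a hypothetical toy, not the authors' implementation: it models the lattice as the Boolean lattice of subsets of a finite universe and performs stochastic descent by greedily moving to the best of a few randomly sampled lattice neighbors (sets differing by one element). All names (`stochastic_lattice_descent`, `universe`, `loss`) are illustrative assumptions.

```python
import random

def stochastic_lattice_descent(universe, loss, n_epochs=50, n_neighbors=8, seed=0):
    """Toy stochastic lattice descent on the Boolean lattice 2^universe.

    At each epoch, sample a few lattice neighbors of the current point
    (flip membership of one element) and move to the one with the
    smallest empirical loss, tracking the best point seen so far.
    """
    rng = random.Random(seed)
    current = frozenset()               # start at the bottom of the lattice
    best, best_loss = current, loss(current)
    elements = list(universe)
    for _ in range(n_epochs):
        # sample neighbors: each differs from `current` in one element
        candidates = []
        for _ in range(n_neighbors):
            x = rng.choice(elements)
            nb = current - {x} if x in current else current | {x}
            candidates.append(nb)
        # greedy move to the sampled neighbor with the smallest loss
        current = min(candidates, key=loss)
        cur_loss = loss(current)
        if cur_loss < best_loss:
            best, best_loss = current, cur_loss
    return best, best_loss

# toy usage: recover a target subset via the symmetric-difference loss
target = frozenset({1, 3})
U = frozenset(range(5))
sol, final_loss = stochastic_lattice_descent(U, lambda s: len(s ^ target))
```

In the paper's setting, the lattice points would overparametrize operators (e.g. via their structuring elements or basis representations) and the loss would be an empirical error on a sample; the subset flip here stands in for a generic lattice neighborhood move.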
Related papers
- A Unified Framework for Neural Computation and Learning Over Time [56.44910327178975]
Hamiltonian Learning is a novel unified framework for learning with neural networks "over time".
It is based on differential equations that: (i) can be integrated without the need of external software solvers; (ii) generalize the well-established notion of gradient-based learning in feed-forward and recurrent networks; (iii) open to novel perspectives.
arXiv Detail & Related papers (2024-09-18T14:57:13Z) - Limits and Powers of Koopman Learning [0.0]
Dynamical systems provide a comprehensive way to study complex and changing behaviors across various sciences.
Koopman operators have emerged as a dominant approach because they allow the study of nonlinear dynamics using linear techniques.
This paper addresses a fundamental open question: When can we robustly learn the spectral properties of Koopman operators from trajectory data of dynamical systems, and when can we not?
arXiv Detail & Related papers (2024-07-08T18:24:48Z) - Operator Learning of Lipschitz Operators: An Information-Theoretic Perspective [2.375038919274297]
This work addresses the complexity of neural operator approximations for the general class of Lipschitz continuous operators.
Our main contribution establishes lower bounds on the metric entropy of Lipschitz operators in two approximation settings.
It is shown that, regardless of the activation function used, neural operator architectures attaining an approximation accuracy $\epsilon$ must have a size that is exponentially large in $\epsilon^{-1}$.
arXiv Detail & Related papers (2024-06-26T23:36:46Z) - A General Framework for Learning from Weak Supervision [93.89870459388185]
This paper introduces a general framework for learning from weak supervision (GLWS) with a novel algorithm.
Central to GLWS is an Expectation-Maximization (EM) formulation, adeptly accommodating various weak supervision sources.
We also present an advanced algorithm that significantly simplifies the EM computational demands.
arXiv Detail & Related papers (2024-02-02T21:48:50Z) - Provably Efficient Representation Learning with Tractable Planning in
Low-Rank POMDP [81.00800920928621]
We study representation learning in partially observable Markov decision processes (POMDPs).
We first present an algorithm for decodable POMDPs that combines maximum likelihood estimation (MLE) and optimism in the face of uncertainty (OFU).
We then show how to adapt this algorithm to the broader class of $\gamma$-observable POMDPs.
arXiv Detail & Related papers (2023-06-21T16:04:03Z) - An Introduction to Kernel and Operator Learning Methods for
Homogenization by Self-consistent Clustering Analysis [0.48747801442240574]
The article presents a thorough analysis of the mathematical underpinnings of the operator learning paradigm.
The proposed kernel operator learning method uses graph kernel networks to come up with a mechanistic reduced order method for multiscale homogenization.
arXiv Detail & Related papers (2022-12-01T02:36:16Z) - Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end.
We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferable to a new task in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z) - Evolving Reinforcement Learning Algorithms [186.62294652057062]
We propose a method for meta-learning reinforcement learning algorithms.
The learned algorithms are domain-agnostic and can generalize to new environments not seen during training.
We highlight two learned algorithms which obtain good generalization performance over other classical control tasks, gridworld type tasks, and Atari games.
arXiv Detail & Related papers (2021-01-08T18:55:07Z) - Demystifying Deep Neural Networks Through Interpretation: A Survey [3.566184392528658]
Modern deep learning algorithms tend to optimize an objective metric, such as minimizing a cross-entropy loss on a training dataset, in order to learn.
The problem is that a single metric is an incomplete description of the real-world task.
There are works that tackle the problem of interpretability to provide insights into a neural network's behavior and thought process.
arXiv Detail & Related papers (2020-12-13T17:56:41Z) - Learning outside the Black-Box: The pursuit of interpretable models [78.32475359554395]
This paper proposes an algorithm that produces a continuous global interpretation of any given continuous black-box function.
Our interpretation represents a leap forward from the previous state of the art.
arXiv Detail & Related papers (2020-11-17T12:39:44Z) - An Advance on Variable Elimination with Applications to Tensor-Based
Computation [11.358487655918676]
We present new results on the classical algorithm of variable elimination, which underlies many algorithms including for probabilistic inference.
The results relate to exploiting functional dependencies, allowing one to perform inference and learning efficiently on models that have very large treewidth.
arXiv Detail & Related papers (2020-02-21T14:17:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.