On the Convergence of Tsetlin Machines for the AND and the OR Operators
- URL: http://arxiv.org/abs/2109.09488v1
- Date: Fri, 17 Sep 2021 13:53:26 GMT
- Title: On the Convergence of Tsetlin Machines for the AND and the OR Operators
- Authors: Lei Jiao, Xuan Zhang, Ole-Christoffer Granmo
- Abstract summary: The Tsetlin Machine (TM) is a novel machine-learning algorithm based on propositional logic.
In this article, we analyze the convergence when input training samples follow AND and OR operators respectively.
- Score: 18.937937844147644
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Tsetlin Machine (TM) is a novel machine-learning algorithm based on
propositional logic, which has obtained state-of-the-art performance on several
pattern recognition problems. In previous studies, the convergence properties
of TM for 1-bit operation and XOR operation have been analyzed. To make the
analyses for the basic digital operations complete, in this article, we analyze
the convergence when input training samples follow AND and OR operators
respectively. Our analyses reveal that the TM can converge almost surely to
reproduce the AND and OR operators learnt from training data over an
infinite time horizon. The analyses of the AND and OR operators, together with the
previously analysed 1-bit and XOR operations, complete the convergence analyses
on basic operators in Boolean algebra.
Related papers
- Learning Operators with Stochastic Gradient Descent in General Hilbert
Spaces [6.690174942973101]
This study investigates leveraging stochastic gradient descent (SGD) to learn operators between general Hilbert spaces.
We establish upper bounds for convergence rates of the SGD algorithm and conduct a minimax lower bound analysis.
Applying our analysis to operator learning problems based on vector-valued and real-valued kernel Hilbert spaces yields new convergence results.
arXiv Detail & Related papers (2024-02-07T09:31:01Z) - In-Context Convergence of Transformers [63.04956160537308]
We study the learning dynamics of a one-layer transformer with softmax attention trained via gradient descent.
For data with imbalanced features, we show that the learning dynamics take a stage-wise convergence process.
arXiv Detail & Related papers (2023-10-08T17:55:33Z) - A Mechanistic Interpretation of Arithmetic Reasoning in Language Models
using Causal Mediation Analysis [128.0532113800092]
We present a mechanistic interpretation of Transformer-based LMs on arithmetic questions.
This provides insights into how information related to arithmetic is processed by LMs.
arXiv Detail & Related papers (2023-05-24T11:43:47Z) - Rethinking Reinforcement Learning based Logic Synthesis [13.18408482571087]
We develop a new RL-based method that can automatically recognize critical operators and generate common operator sequences generalizable to unseen circuits.
Our algorithm is verified on the EPFL benchmark, a private dataset, and a circuit at industrial scale.
arXiv Detail & Related papers (2022-05-16T12:15:32Z) - Eigen Analysis of Self-Attention and its Reconstruction from Partial
Computation [58.80806716024701]
We study the global structure of attention scores computed using dot-product based self-attention.
We find that most of the variation among attention scores lies in a low-dimensional eigenspace.
We propose to compute scores only for a partial subset of token pairs, and use them to estimate scores for the remaining pairs.
arXiv Detail & Related papers (2021-06-16T14:38:42Z) - Fractal Structure and Generalization Properties of Stochastic
Optimization Algorithms [71.62575565990502]
We prove that the generalization error of an optimization algorithm can be bounded based on the 'complexity' of the fractal structure that underlies its invariant measure.
We further specialize our results to specific problems (e.g., linear/logistic regression, one-hidden-layer neural networks) and algorithms.
arXiv Detail & Related papers (2021-06-09T08:05:36Z) - A Simple and General Debiased Machine Learning Theorem with Finite
Sample Guarantees [4.55274575362193]
We provide a nonasymptotic debiased machine learning theorem that encompasses any global or local functional of any machine learning algorithm.
Our results culminate in a simple set of conditions that an analyst can use to translate modern learning theory rates into traditional statistical inference.
arXiv Detail & Related papers (2021-05-31T17:57:02Z) - Quantum Algorithms for Data Representation and Analysis [68.754953879193]
We provide quantum procedures that speed-up the solution of eigenproblems for data representation in machine learning.
The power and practical use of these subroutines is shown through new quantum algorithms, sublinear in the input matrix's size, for principal component analysis, correspondence analysis, and latent semantic analysis.
Results show that the run-time parameters that do not depend on the input's size are reasonable and that the error on the computed model is small, allowing for competitive classification performances.
arXiv Detail & Related papers (2021-04-19T00:41:43Z) - On the Convergence of Tsetlin Machines for the XOR Operator [16.85020149902772]
The Tsetlin Machine (TM) is a novel machine learning algorithm with several distinct properties.
We analyze the convergence of the TM when input is non-linearly related to output by the XOR-operator.
arXiv Detail & Related papers (2021-01-07T14:13:41Z) - On the Convergence of Tsetlin Machines for the IDENTITY- and NOT
Operators [14.186377732379045]
We analyze the convergence of the Tsetlin Machine with only one clause involved for classification.
Our analysis reveals that the TM, with just one clause, can converge correctly to the intended logical operator.
The analysis of the convergence of the two basic operators lays the foundation for analyzing other logical operators.
arXiv Detail & Related papers (2020-07-28T14:31:16Z) - A Constraint-Based Algorithm for the Structural Learning of
Continuous-Time Bayesian Networks [70.88503833248159]
We propose the first constraint-based algorithm for learning the structure of continuous-time Bayesian networks.
We discuss the different statistical tests and the underlying hypotheses used by our proposal to establish conditional independence.
arXiv Detail & Related papers (2020-07-07T07:34:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.