Boolformer: Symbolic Regression of Logic Functions with Transformers
- URL: http://arxiv.org/abs/2309.12207v1
- Date: Thu, 21 Sep 2023 16:11:38 GMT
- Title: Boolformer: Symbolic Regression of Logic Functions with Transformers
- Authors: Stéphane d'Ascoli, Samy Bengio, Josh Susskind, Emmanuel Abbé
- Abstract summary: We introduce Boolformer, the first Transformer architecture trained to perform end-to-end symbolic regression of Boolean functions.
We show that it can predict compact formulas for complex functions which were not seen during training, when provided a clean truth table.
We evaluate the Boolformer on a broad set of real-world binary classification datasets, demonstrating its potential as an interpretable alternative to classic machine learning methods.
- Score: 26.946376237404994
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we introduce Boolformer, the first Transformer architecture
trained to perform end-to-end symbolic regression of Boolean functions. First,
we show that it can predict compact formulas for complex functions which were
not seen during training, when provided a clean truth table. Then, we
demonstrate its ability to find approximate expressions when provided
incomplete and noisy observations. We evaluate the Boolformer on a broad set of
real-world binary classification datasets, demonstrating its potential as an
interpretable alternative to classic machine learning methods. Finally, we
apply it to the widespread task of modelling the dynamics of gene regulatory
networks. Using a recent benchmark, we show that Boolformer is competitive with
state-of-the-art genetic algorithms with a speedup of several orders of
magnitude. Our code and models are available publicly.
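To make the task in the abstract concrete, the following is a minimal sketch of symbolic regression of a Boolean function: given a (possibly incomplete) truth table, return a compact formula that matches it. It is not Boolformer's method. It swaps in a plain brute-force enumeration over AND/OR/NOT formulas, which only scales to a handful of variables and requires exact agreement with the observations (no noise tolerance). The names `smallest_formula`, `observations`, `n_vars` and `max_ops` are hypothetical and chosen for illustration.

```python
from itertools import product


def smallest_formula(observations, n_vars, max_ops=4):
    """Return a formula with the fewest AND/OR/NOT operators that agrees with
    every observation, or None if none exists within max_ops operators.

    observations: dict mapping input tuples of 0/1 to an output 0/1.
    The table may be incomplete; unobserved rows are unconstrained.
    """
    inputs = list(product((0, 1), repeat=n_vars))
    row = {x: i for i, x in enumerate(inputs)}
    checks = [(row[tuple(x)], y) for x, y in observations.items()]

    # seen: every truth vector reached so far (one formula kept per function).
    # by_ops[k]: truth vector -> formula string using exactly k operators.
    seen = set()
    by_ops = [{}]
    for i in range(n_vars):
        vec = tuple(x[i] for x in inputs)
        seen.add(vec)
        by_ops[0][vec] = f"x{i}"

    for k in range(max_ops + 1):
        if k > 0:
            candidates = []
            # Negate every formula that uses k - 1 operators.
            for vec, f in by_ops[k - 1].items():
                candidates.append((tuple(1 - v for v in vec), f"(not {f})"))
            # Combine two formulas whose operator counts sum to k - 1.
            for a in range(k):
                left, right = by_ops[a], by_ops[k - 1 - a]
                for (va, fa), (vb, fb) in product(left.items(), right.items()):
                    candidates.append((tuple(map(min, va, vb)), f"({fa} and {fb})"))
                    candidates.append((tuple(map(max, va, vb)), f"({fa} or {fb})"))
            level = {}
            for vec, f in candidates:
                if vec not in seen:  # keep only the first formula per function
                    seen.add(vec)
                    level[vec] = f
            by_ops.append(level)
        # Return the first formula at this size that matches all observations.
        for vec, f in by_ops[k].items():
            if all(vec[i] == y for i, y in checks):
                return f
    return None


# Usage: recover a formula for XOR from its full truth table.
xor_table = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
print(smallest_formula(xor_table, n_vars=2))
```

The returned string is a valid Python expression over `x0`, `x1`, and so on, so it can be checked against the table with `eval`. The search is exponential in the number of variables, which is the cost a trained model that predicts the formula directly is meant to avoid.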
Related papers
- Learning Linear Attention in Polynomial Time [115.68795790532289]
We provide the first results on learnability of single-layer Transformers with linear attention.
We show that linear attention may be viewed as a linear predictor in a suitably defined RKHS.
We show how to efficiently identify training datasets for which every empirical risk minimizer is equivalent to the linear Transformer.
arXiv Detail & Related papers (2024-10-14T02:41:01Z)
- Learning on Transformers is Provable Low-Rank and Sparse: A One-layer Analysis [63.66763657191476]
We show that efficient training and inference algorithms such as low-rank computation have impressive performance for learning Transformer-based models.
We analyze how magnitude-based pruning affects generalization while improving efficiency.
We conclude that proper magnitude-based pruning has a slight effect on the testing performance.
arXiv Detail & Related papers (2024-06-24T23:00:58Z)
- In-Context Convergence of Transformers [63.04956160537308]
We study the learning dynamics of a one-layer transformer with softmax attention trained via gradient descent.
For data with imbalanced features, we show that the learning dynamics take a stage-wise convergence process.
arXiv Detail & Related papers (2023-10-08T17:55:33Z)
- Trained Transformers Learn Linear Models In-Context [39.56636898650966]
Attention-based neural networks such as transformers have demonstrated a remarkable ability to exhibit in-context learning (ICL).
We show that when transformers are trained over random instances of linear regression problems, these models' predictions mimic those of ordinary least squares.
arXiv Detail & Related papers (2023-06-16T15:50:03Z)
- Transformers as Statisticians: Provable In-Context Learning with In-Context Algorithm Selection [88.23337313766353]
This work first provides a comprehensive statistical theory for transformers to perform ICL.
We show that transformers can implement a broad class of standard machine learning algorithms in context.
A single transformer can adaptively select different base ICL algorithms.
arXiv Detail & Related papers (2023-06-07T17:59:31Z)
- All Roads Lead to Rome? Exploring the Invariance of Transformers' Representations [69.3461199976959]
We propose a model based on invertible neural networks, BERT-INN, to learn the Bijection Hypothesis.
We show the advantage of BERT-INN both theoretically and through extensive experiments.
arXiv Detail & Related papers (2023-05-23T22:30:43Z)
- Generalization on the Unseen, Logic Reasoning and Degree Curriculum [25.7378861650474]
This paper considers the learning of logical (Boolean) functions with a focus on the generalization on the unseen (GOTU) setting.
We study how different network architectures trained by (S)GD perform under GOTU.
We provide evidence that several of these models learn a min-degree interpolator on the unseen, that is, an interpolator of the training data that has minimal Fourier mass on the higher-degree basis elements.
arXiv Detail & Related papers (2023-01-30T17:44:05Z)
- Transformers as Algorithms: Generalization and Implicit Model Selection in In-context Learning [23.677503557659705]
In-context learning (ICL) is a type of prompting where a transformer model operates on a sequence of examples and performs inference on-the-fly.
We treat the transformer model as a learning algorithm that can be specialized via training to implement, at inference time, another target algorithm.
We show that transformers can act as an adaptive learning algorithm and perform model selection across different hypothesis classes.
arXiv Detail & Related papers (2023-01-17T18:31:12Z)
- Pre-Training a Graph Recurrent Network for Language Representation [34.4554387894105]
We consider a graph recurrent network for language model pre-training, which builds a graph structure for each sequence with local token-level communications.
We find that our model can generate more diverse outputs with less contextualized feature redundancy than existing attention-based models.
arXiv Detail & Related papers (2022-09-08T14:12:15Z)
- Category-Learning with Context-Augmented Autoencoder [63.05016513788047]
Finding an interpretable non-redundant representation of real-world data is one of the key problems in Machine Learning.
We propose a novel method of using data augmentations when training autoencoders.
We train a Variational Autoencoder in such a way that the transformation outcome is predictable by an auxiliary network.
arXiv Detail & Related papers (2020-10-10T14:04:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.