Arbitrarily Large Labelled Random Satisfiability Formulas for Machine
Learning Training
- URL: http://arxiv.org/abs/2211.15368v2
- Date: Sun, 4 Jun 2023 19:01:10 GMT
- Title: Arbitrarily Large Labelled Random Satisfiability Formulas for Machine
Learning Training
- Authors: Dimitris Achlioptas, Amrit Daswaney, Periklis A. Papakonstantinou
- Abstract summary: We show how to generate correctly labeled random formulas of any desired size without having to solve the underlying decision problem.
We train existing state-of-the-art models for the task of predicting satisfiability on formulas with 10,000 variables.
We find that they do no better than random guessing on the same datasets.
- Score: 5.414308305392762
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Applying deep learning to solve real-life instances of hard combinatorial
problems has tremendous potential. Research in this direction has focused on
the Boolean satisfiability (SAT) problem, both because of its theoretical
centrality and practical importance. A major roadblock faced, though, is that
training sets are restricted to random formulas of size several orders of
magnitude smaller than formulas of practical interest, raising serious concerns
about generalization. This is because labeling random formulas of increasing
size rapidly becomes intractable. By exploiting the probabilistic method in a
fundamental way, we remove this roadblock entirely: we show how to generate
correctly labeled random formulas of any desired size, without having to solve
the underlying decision problem. Moreover, the difficulty of the classification
task for the formulas produced by our generator is tunable by varying a simple
scalar parameter. This opens up an entirely new level of sophistication for the
machine learning methods that can be brought to bear on Satisfiability. Using
our generator, we train existing state-of-the-art models for the task of
predicting satisfiability on formulas with 10,000 variables. We find that they
do no better than random guessing. As a first indication of what can be
achieved with the new generator, we present a novel classifier that performs
significantly better than random guessing 99% of the time on the same datasets,
for most difficulty levels. Crucially, unlike past approaches that learn based on
syntactic features of a formula, our classifier performs its learning on a
short prefix of a solver's computation, an approach that we expect to be of
independent interest.
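The paper's generator relies on the probabilistic method and is not reproduced here, but the basic idea of obtaining a correct label without running a solver can be illustrated with the classic planted-solution construction (a sketch only; the function names are illustrative, and planted distributions are known to be statistically biased toward easy instances, which is precisely the limitation the paper's approach avoids):

```python
import random

def planted_sat_formula(n_vars, n_clauses, k=3, seed=0):
    """Generate a random k-CNF formula that is satisfiable by construction.

    A hidden assignment is fixed first; each clause is then resampled until
    the hidden assignment satisfies it, so the label "SAT" is known without
    ever invoking a SAT solver. Literals use DIMACS-style signed integers.
    """
    rng = random.Random(seed)
    planted = [rng.choice([True, False]) for _ in range(n_vars)]
    formula = []
    for _ in range(n_clauses):
        while True:
            chosen = rng.sample(range(n_vars), k)
            clause = [(v + 1) if rng.random() < 0.5 else -(v + 1) for v in chosen]
            # keep the clause only if the planted assignment satisfies it
            if any((lit > 0) == planted[abs(lit) - 1] for lit in clause):
                break
        formula.append(clause)
    return formula, planted

def satisfies(formula, assignment):
    """Check that every clause has at least one satisfied literal."""
    return all(any((lit > 0) == assignment[abs(lit) - 1] for lit in clause)
               for clause in formula)
```

Because each clause is conditioned on the hidden assignment, the construction scales to any number of variables at essentially the cost of sampling alone.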
Related papers
- Learning the symmetric group: large from small [44.99833362998488]
We show that a transformer neural network trained on predicting permutations can generalize to the symmetric group $S_{25}$ with near 100% accuracy.
We employ identity augmentation as a key tool to manage variable word lengths, and partitioned windows for training on adjacent transpositions.
arXiv Detail & Related papers (2025-02-18T10:28:25Z)
- Simple and Provable Scaling Laws for the Test-Time Compute of Large Language Models [70.07661254213181]
We propose two principled algorithms for the test-time compute of large language models.
We prove theoretically that the failure probability of one algorithm decays to zero exponentially as its test-time compute grows.
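One generic way such exponential decay arises (a best-of-n sampling argument, not necessarily the paper's specific algorithm) is that independent attempts fail together only with exponentially small probability:

```python
import math

def failure_prob(p_success, n_samples):
    """Failure probability of best-of-n sampling with independent draws.

    If a single draw solves the task with probability p, the chance that
    all n independent draws fail is (1 - p)**n, which decays to zero
    exponentially as test-time compute (n) grows.
    """
    return (1.0 - p_success) ** n_samples

def samples_needed(p_success, target_failure):
    """Smallest n with (1 - p)**n <= target_failure."""
    return math.ceil(math.log(target_failure) / math.log(1.0 - p_success))
```

For example, with a 20% per-draw success rate, roughly twenty draws already push the failure probability below one percent.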
arXiv Detail & Related papers (2024-11-29T05:29:47Z)
- A Fast Convoluted Story: Scaling Probabilistic Inference for Integer Arithmetic [4.7223923266180785]
We formulate linear arithmetic over integer-valued random variables as tensor manipulations.
We obtain a differentiable data structure, which unlocks, virtually for free, gradient-based learning.
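The core observation can be sketched in a few lines (a minimal example assuming independent integer-valued variables with finite support; `np.convolve` stands in here for the paper's tensor formulation):

```python
import numpy as np

def add_random_vars(pmf_a, pmf_b):
    """PMF of X + Y for independent integer-valued X and Y.

    pmf_a[i] = P(X = i) and pmf_b[j] = P(Y = j), with supports starting
    at 0. The distribution of the sum is the discrete convolution of the
    two PMFs, a tensor operation autodiff frameworks differentiate through.
    """
    return np.convolve(pmf_a, pmf_b)

# two fair four-sided dice with faces 0..3; the sum has support 0..6
die = np.full(4, 0.25)
total = add_random_vars(die, die)
```

Because convolution is just a linear tensor operation, stacking such steps yields the differentiable data structure mentioned above, at least for this restricted addition case.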
arXiv Detail & Related papers (2024-10-16T09:16:10Z)
- Scaling and renormalization in high-dimensional regression [72.59731158970894]
This paper presents a succinct derivation of the training and generalization performance of a variety of high-dimensional ridge regression models.
We provide an introduction and review of recent results on these topics, aimed at readers with backgrounds in physics and deep learning.
arXiv Detail & Related papers (2024-05-01T15:59:00Z)
- Probabilistic Invariant Learning with Randomized Linear Classifiers [24.485477981244593]
We show how to leverage randomness and design models that are both expressive and invariant but use less resources.
Inspired by randomized algorithms, we propose a class of binary classification models called Randomized Linear Classifiers (RLCs).
arXiv Detail & Related papers (2023-08-08T17:18:04Z)
- Rough Randomness and its Application [0.0]
This research aims to capture a variety of rough processes, construct related models, and explore the validity of other machine learning algorithms.
A class of rough random functions, termed large-minded reasoners, plays a central role in these models.
arXiv Detail & Related papers (2023-03-21T12:22:33Z)
- A Note on High-Probability versus In-Expectation Guarantees of Generalization Bounds in Machine Learning [95.48744259567837]
Statistical machine learning theory often tries to give generalization guarantees of machine learning models.
Statements made about the performance of machine learning models have to take the sampling process into account.
We show how one may transform one statement to another.
arXiv Detail & Related papers (2020-10-06T09:41:35Z)
- Optimization for Supervised Machine Learning: Randomized Algorithms for Data and Parameters [10.279748604797911]
Key problems in machine learning and data science are routinely modeled as optimization problems and solved via optimization algorithms.
As data volumes grow and the statistical models used to formulate these often ill-conditioned optimization tasks become larger and more complex, there is a need for new efficient algorithms able to cope with these challenges.
In this thesis, we deal with each of these sources of difficulty in a different way. To efficiently address the big data issue, we develop new methods which in each iteration examine a small random subset of the training data only.
To handle the big model issue, we develop methods which in each iteration update
arXiv Detail & Related papers (2020-08-26T21:15:18Z)
- Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)
- Continual Learning using a Bayesian Nonparametric Dictionary of Weight Factors [75.58555462743585]
Naively trained neural networks tend to experience catastrophic forgetting in sequential task settings.
We propose a principled nonparametric approach based on the Indian Buffet Process (IBP) prior, letting the data determine how much to expand the model complexity.
We demonstrate the effectiveness of our method on a number of continual learning benchmarks and analyze how weight factors are allocated and reused throughout the training.
arXiv Detail & Related papers (2020-04-21T15:20:19Z)
- The data-driven physical-based equations discovery using evolutionary approach [77.34726150561087]
We describe an algorithm for discovering mathematical equations from observational data.
The algorithm combines genetic programming with sparse regression.
It can be used to discover governing analytical equations as well as partial differential equations (PDEs).
arXiv Detail & Related papers (2020-04-03T17:21:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.