Arbitrarily Large Labelled Random Satisfiability Formulas for Machine
Learning Training
- URL: http://arxiv.org/abs/2211.15368v2
- Date: Sun, 4 Jun 2023 19:01:10 GMT
- Title: Arbitrarily Large Labelled Random Satisfiability Formulas for Machine
Learning Training
- Authors: Dimitris Achlioptas, Amrit Daswaney, Periklis A. Papakonstantinou
- Abstract summary: We show how to generate correctly labeled random formulas of any desired size without having to solve the underlying decision problem.
We train existing state-of-the-art models for the task of predicting satisfiability on formulas with 10,000 variables.
We find that they do no better than random guessing on the same datasets.
- Score: 5.414308305392762
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Applying deep learning to solve real-life instances of hard combinatorial
problems has tremendous potential. Research in this direction has focused on
the Boolean satisfiability (SAT) problem, both because of its theoretical
centrality and practical importance. A major roadblock faced, though, is that
training sets are restricted to random formulas of size several orders of
magnitude smaller than formulas of practical interest, raising serious concerns
about generalization. This is because labeling random formulas of increasing
size rapidly becomes intractable. By exploiting the probabilistic method in a
fundamental way, we remove this roadblock entirely: we show how to generate
correctly labeled random formulas of any desired size, without having to solve
the underlying decision problem. Moreover, the difficulty of the classification
task for the formulas produced by our generator is tunable by varying a simple
scalar parameter. This opens up an entirely new level of sophistication for the
machine learning methods that can be brought to bear on Satisfiability. Using
our generator, we train existing state-of-the-art models for the task of
predicting satisfiability on formulas with 10,000 variables. We find that they
do no better than random guessing. As a first indication of what can be
achieved with the new generator, we present a novel classifier that performs
significantly better than random guessing 99% of the time on the same datasets,
for most difficulty levels. Crucially, unlike past approaches that learn based on
syntactic features of a formula, our classifier performs its learning on a
short prefix of a solver's computation, an approach that we expect to be of
independent interest.
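The paper's generator relies on the probabilistic method and is not reproduced here, but the basic idea of obtaining a correct label without running a solver can be illustrated with the classic planted-solution construction (a sketch only; the function names are illustrative, and planted distributions are known to be statistically biased toward easy instances, which is precisely the limitation the paper's approach avoids):

```python
import random

def planted_sat_formula(n_vars, n_clauses, k=3, seed=0):
    """Generate a random k-CNF formula that is satisfiable by construction.

    A hidden assignment is fixed first; each clause is then resampled until
    the hidden assignment satisfies it, so the label "SAT" is known without
    ever invoking a SAT solver. Literals use DIMACS-style signed integers.
    """
    rng = random.Random(seed)
    planted = [rng.choice([True, False]) for _ in range(n_vars)]
    formula = []
    for _ in range(n_clauses):
        while True:
            chosen = rng.sample(range(n_vars), k)
            clause = [(v + 1) if rng.random() < 0.5 else -(v + 1) for v in chosen]
            # keep the clause only if the planted assignment satisfies it
            if any((lit > 0) == planted[abs(lit) - 1] for lit in clause):
                break
        formula.append(clause)
    return formula, planted

def satisfies(formula, assignment):
    """Check that every clause has at least one satisfied literal."""
    return all(any((lit > 0) == assignment[abs(lit) - 1] for lit in clause)
               for clause in formula)
```

Because each clause is conditioned on the hidden assignment, the construction scales to any number of variables at essentially the cost of sampling alone.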
Related papers
- Learning the symmetric group: large from small [44.99833362998488]
We show that a transformer neural network trained on predicting permutations can generalize to the symmetric group $S_{25}$ with near 100% accuracy.
We employ identity augmentation as a key tool to manage variable word lengths, and partitioned windows for training on adjacent transpositions.
arXiv Detail & Related papers (2025-02-18T10:28:25Z)
- Simple and Provable Scaling Laws for the Test-Time Compute of Large Language Models [70.07661254213181]
We propose two principled algorithms for the test-time compute of large language models.
We prove theoretically that the failure probability of one algorithm decays to zero exponentially as its test-time compute grows.
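One generic way such exponential decay arises (a best-of-n sampling argument, not necessarily the paper's specific algorithm) is that independent attempts fail together only with exponentially small probability:

```python
import math

def failure_prob(p_success, n_samples):
    """Failure probability of best-of-n sampling with independent draws.

    If a single draw solves the task with probability p, the chance that
    all n independent draws fail is (1 - p)**n, which decays to zero
    exponentially as test-time compute (n) grows.
    """
    return (1.0 - p_success) ** n_samples

def samples_needed(p_success, target_failure):
    """Smallest n with (1 - p)**n <= target_failure."""
    return math.ceil(math.log(target_failure) / math.log(1.0 - p_success))
```

For example, with a 20% per-draw success rate, roughly twenty draws already push the failure probability below one percent.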
arXiv Detail & Related papers (2024-11-29T05:29:47Z)
- A Fast Convoluted Story: Scaling Probabilistic Inference for Integer Arithmetic [4.7223923266180785]
We formulate linear arithmetic over integer-valued random variables as tensor manipulations.
We obtain a differentiable data structure, which unlocks, virtually for free, gradient-based learning.
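The core observation can be sketched in a few lines (a minimal example assuming independent integer-valued variables with finite support; `np.convolve` stands in here for the paper's tensor formulation):

```python
import numpy as np

def add_random_vars(pmf_a, pmf_b):
    """PMF of X + Y for independent integer-valued X and Y.

    pmf_a[i] = P(X = i) and pmf_b[j] = P(Y = j), with supports starting
    at 0. The distribution of the sum is the discrete convolution of the
    two PMFs, a tensor operation autodiff frameworks differentiate through.
    """
    return np.convolve(pmf_a, pmf_b)

# two fair four-sided dice with faces 0..3; the sum has support 0..6
die = np.full(4, 0.25)
total = add_random_vars(die, die)
```

Because convolution is just a linear tensor operation, stacking such steps yields the differentiable data structure mentioned above, at least for this restricted addition case.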
arXiv Detail & Related papers (2024-10-16T09:16:10Z)
- Scaling and renormalization in high-dimensional regression [72.59731158970894]
This paper presents a succinct derivation of the training and generalization performance of a variety of high-dimensional ridge regression models.
We provide an introduction and review of recent results on these topics, aimed at readers with backgrounds in physics and deep learning.
arXiv Detail & Related papers (2024-05-01T15:59:00Z)
- Probabilistic Invariant Learning with Randomized Linear Classifiers [24.485477981244593]
We show how to leverage randomness and design models that are both expressive and invariant but use less resources.
Inspired by randomized algorithms, we propose a class of binary classification models called Randomized Linear Classifiers (RLCs).
arXiv Detail & Related papers (2023-08-08T17:18:04Z)
- Rough Randomness and its Application [0.0]
This research aims to capture a variety of rough processes, construct related models, and explore the validity of other machine learning algorithms.
A class of rough random functions, termed large-minded reasoners, plays a central role in these models.
arXiv Detail & Related papers (2023-03-21T12:22:33Z)
- A Note on High-Probability versus In-Expectation Guarantees of Generalization Bounds in Machine Learning [95.48744259567837]
Statistical machine learning theory often tries to give generalization guarantees of machine learning models.
Statements made about the performance of machine learning models have to take the sampling process into account.
We show how one may transform one statement to another.
arXiv Detail & Related papers (2020-10-06T09:41:35Z)
- Optimization for Supervised Machine Learning: Randomized Algorithms for Data and Parameters [10.279748604797911]
Key problems in machine learning and data science are routinely modeled as optimization problems and solved via optimization algorithms.
As data volumes grow and the statistical models used to formulate these often ill-conditioned optimization tasks become larger and more complex, there is a need for new efficient algorithms able to cope with these challenges.
In this thesis, we deal with each of these sources of difficulty in a different way. To efficiently address the big data issue, we develop new methods which in each iteration examine a small random subset of the training data only.
To handle the big model issue, we develop methods which in each iteration update
arXiv Detail & Related papers (2020-08-26T21:15:18Z)
- Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)
- Continual Learning using a Bayesian Nonparametric Dictionary of Weight Factors [75.58555462743585]
Naively trained neural networks tend to experience catastrophic forgetting in sequential task settings.
We propose a principled nonparametric approach based on the Indian Buffet Process (IBP) prior, letting the data determine how much to expand the model complexity.
We demonstrate the effectiveness of our method on a number of continual learning benchmarks and analyze how weight factors are allocated and reused throughout the training.
arXiv Detail & Related papers (2020-04-21T15:20:19Z)
- The data-driven physical-based equations discovery using evolutionary approach [77.34726150561087]
We describe an algorithm for discovering mathematical equations from observational data.
The algorithm combines genetic programming with sparse regression.
It can be used to discover governing analytical equations as well as partial differential equations (PDEs).
arXiv Detail & Related papers (2020-04-03T17:21:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.