Can Transformers Reason Logically? A Study in SAT Solving
- URL: http://arxiv.org/abs/2410.07432v1
- Date: Wed, 9 Oct 2024 21:01:52 GMT
- Title: Can Transformers Reason Logically? A Study in SAT Solving
- Authors: Leyan Pan, Vijay Ganesh, Jacob Abernethy, Chris Esposo, Wenke Lee
- Abstract summary: We study the logical reasoning capabilities of LLMs in the context of the Boolean satisfiability (SAT) problem.
First, we construct a decoder-only Transformer that can solve SAT using backtracking and deduction via Chain-of-Thought (CoT).
We prove its correctness by showing trace equivalence to the well-known DPLL SAT-solving algorithm.
- Score: 17.15701291424892
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We theoretically and empirically study the logical reasoning capabilities of LLMs in the context of the Boolean satisfiability (SAT) problem. First, we construct a decoder-only Transformer that can solve SAT using backtracking and deduction via Chain-of-Thought (CoT). We prove its correctness by showing trace equivalence to the well-known DPLL SAT-solving algorithm. Second, to support the implementation of this abstract construction, we design a compiler $\texttt{PARAT}$ that takes as input a procedural specification and outputs a transformer model implementing this specification. Third, rather than $\textit{programming}$ a transformer to reason, we evaluate empirically whether it can be $\textit{trained}$ to do so by learning directly from algorithmic traces ("reasoning paths") of the DPLL algorithm.
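For readers unfamiliar with DPLL, the procedure to which the paper's construction is proven trace-equivalent, it amounts to backtracking search interleaved with unit-clause deduction. The sketch below is a minimal plain-Python rendering of that loop; the function names and the clause encoding are illustrative and are not taken from the paper or its PARAT compiler.

```python
# Minimal DPLL sketch: clauses are frozensets of signed ints (e.g. -2 means "not x2").
# Illustrative only; not the paper's PARAT-compiled transformer construction.

def unit_propagate(clauses, assignment):
    """Repeatedly assign literals forced by unit clauses (the 'deduction' step)."""
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(l in assignment for l in clause):
                continue  # clause already satisfied
            unassigned = [l for l in clause if l not in assignment and -l not in assignment]
            if not unassigned:
                return None  # conflict: clause falsified under the current assignment
            if len(unassigned) == 1:
                assignment = assignment | {unassigned[0]}  # forced literal
                changed = True
    return assignment

def dpll(clauses, assignment=frozenset()):
    """Backtracking search: propagate, then branch on an unassigned variable."""
    assignment = unit_propagate(clauses, assignment)
    if assignment is None:
        return None  # UNSAT under this partial assignment
    free = {abs(l) for c in clauses for l in c} - {abs(l) for l in assignment}
    if not free:
        return assignment  # all variables assigned: satisfying assignment found
    var = min(free)
    for lit in (var, -var):  # try var=True, then backtrack to var=False
        result = dpll(clauses, assignment | {lit})
        if result is not None:
            return result
    return None

# Example: (x1 or x2) and (not x1 or x2) and (not x2 or x3)
print(dpll([frozenset({1, 2}), frozenset({-1, 2}), frozenset({-2, 3})]))
```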
Related papers
- Transformers Provably Learn Chain-of-Thought Reasoning with Length Generalization [53.89723291716722]
A crucial question about AI reasoning is whether models can extrapolate learned reasoning patterns to solve harder tasks with longer chain-of-thought (CoT).
We mathematically prove how the algebraic structure of state-tracking problems governs the degree of extrapolation of the learned CoT.
We provide the first optimization guarantee that constant-depth transformers provably learn $\mathsf{NC}^1$-complete problems with CoT.
arXiv Detail & Related papers (2025-11-10T18:40:24Z) - The Kinetics of Reasoning: How Chain-of-Thought Shapes Learning in Transformers? [25.29458951592086]
Chain-of-thought (CoT) supervision can substantially improve transformer performance.
We investigate these learning dynamics through the lens of grokking by pretraining transformers on symbolic reasoning tasks.
arXiv Detail & Related papers (2025-10-28T20:14:26Z) - Why Can't Transformers Learn Multiplication? Reverse-Engineering Reveals Long-Range Dependency Pitfalls [54.57326125204404]
Language models are increasingly capable, yet they still fail at the seemingly simple task of multi-digit multiplication.
We study why, by reverse-engineering a model that successfully learns multiplication via "implicit chain-of-thought".
arXiv Detail & Related papers (2025-09-30T19:03:26Z) - Probing for Arithmetic Errors in Language Models [86.8227317662622]
Internal activations in language models can be used to detect arithmetic errors.
We show that simple probes can accurately decode both the model's predicted output and the correct answer from hidden states.
We train lightweight error detectors that predict model correctness with over 90% accuracy.
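The "simple probes" referenced above are typically linear classifiers fit on frozen hidden states. A hedged sketch of that recipe, using random arrays in place of real model activations (the data, dimensions, and scikit-learn probe here are illustrative assumptions, not the paper's setup):

```python
# Sketch of a lightweight correctness probe on frozen hidden states.
# Synthetic data stands in for real model activations; not the paper's setup.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_examples, hidden_dim = 2000, 256

# Pretend these are hidden states at the answer position, plus 0/1 correctness labels.
hidden_states = rng.normal(size=(n_examples, hidden_dim))
is_correct = rng.integers(0, 2, size=n_examples)
# Inject a weak linear signal so the probe has something to find.
hidden_states[:, 0] += 2.0 * is_correct

X_train, X_test, y_train, y_test = train_test_split(
    hidden_states, is_correct, test_size=0.2, random_state=0)

probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("probe accuracy:", probe.score(X_test, y_test))
```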
arXiv Detail & Related papers (2025-07-16T16:27:50Z) - On the Existence of Universal Simulators of Attention [17.01811978811789]
We present solutions to identically replicate attention outputs and the underlying elementary matrix and activation operations via RASP.
Our proofs, for the first time, show the existence of an algorithmically achievable data-agnostic solution, previously known to be approximated only by learning.
arXiv Detail & Related papers (2025-06-23T15:15:25Z) - Finite State Automata Inside Transformers with Chain-of-Thought: A Mechanistic Study on State Tracking [41.3496135369579]
Chain-of-Thought (CoT) significantly enhances the performance of large language models (LLMs) across a wide range of tasks.
In this work, we evaluate the state tracking capabilities of Transformer+CoT and its variants, confirming the effectiveness of CoT.
We propose two metrics, compression and distinction, and show that the neuron sets for each state achieve nearly 100% accuracy.
arXiv Detail & Related papers (2025-02-27T14:24:51Z) - Self-Satisfied: An end-to-end framework for SAT generation and prediction [0.7340017786387768]
We introduce hardware accelerated algorithms for fast SAT problem generation and a geometric SAT encoding.
These advances allow us to scale our approach to SAT problems with thousands of variables and tens of thousands of clauses.
A fundamental aspect of our work concerns the very nature of SAT data and its suitability for training machine learning models.
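For context on SAT data generation, the simplest baseline is uniform random $k$-SAT written in DIMACS CNF format; the sketch below shows that plain baseline (it is not the paper's hardware-accelerated generator or its geometric encoding):

```python
# Uniform random k-SAT generator in DIMACS CNF format.
# A plain baseline for illustration; not the paper's accelerated pipeline.
import random

def random_ksat(n_vars: int, n_clauses: int, k: int = 3, seed: int = 0) -> str:
    rng = random.Random(seed)
    lines = [f"p cnf {n_vars} {n_clauses}"]
    for _ in range(n_clauses):
        variables = rng.sample(range(1, n_vars + 1), k)  # k distinct variables
        literals = [v if rng.random() < 0.5 else -v for v in variables]  # random polarity
        lines.append(" ".join(map(str, literals)) + " 0")
    return "\n".join(lines)

# Near the empirical satisfiability threshold for 3-SAT (clause/variable ratio ~4.27).
print(random_ksat(n_vars=20, n_clauses=85))
```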
arXiv Detail & Related papers (2024-10-18T22:25:54Z) - Transformers Provably Solve Parity Efficiently with Chain of Thought [40.78854925996]
This work provides the first theoretical analysis of training transformers to solve complex problems.
We consider training a one-layer transformer to solve the fundamental $k$-parity problem.
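The $k$-parity problem asks for the XOR of $k$ hidden coordinates of a Boolean input string. A small data-generation sketch (illustrative only; not the paper's training setup):

```python
# k-parity: the label is the sum mod 2 of k fixed, hidden coordinates of a Boolean string.
# Illustrative data generator only; not the paper's training setup.
import numpy as np

def sample_k_parity(n_samples: int, dim: int, k: int, seed: int = 0):
    rng = np.random.default_rng(seed)
    support = rng.choice(dim, size=k, replace=False)  # hidden relevant coordinates
    x = rng.integers(0, 2, size=(n_samples, dim))     # uniform Boolean inputs
    y = x[:, support].sum(axis=1) % 2                 # parity over the support bits
    return x, y, support

x, y, support = sample_k_parity(n_samples=5, dim=10, k=3)
print(support, y)
```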
arXiv Detail & Related papers (2024-10-11T08:55:17Z) - Can Transformers Learn $n$-gram Language Models? [77.35809823602307]
We study transformers' ability to learn random $n$-gram LMs of two kinds.
We find that classic estimation techniques for $n$-gram LMs such as add-$\lambda$ smoothing outperform transformers.
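Add-$\lambda$ smoothing estimates $p(w \mid h) = (c(h,w) + \lambda) / (c(h) + \lambda |V|)$ for a context $h$ and word $w$; a compact sketch of that estimator (illustrative, not the paper's exact experimental code):

```python
# Add-lambda (Lidstone) smoothing for an n-gram LM:
# p(w | h) = (c(h, w) + lam) / (c(h) + lam * |V|). Illustrative estimator only.
from collections import Counter

def addlambda_ngram(tokens, n=2, lam=0.1):
    vocab = sorted(set(tokens))
    ctx_counts, ngram_counts = Counter(), Counter()
    for i in range(len(tokens) - n + 1):
        ctx, word = tuple(tokens[i:i + n - 1]), tokens[i + n - 1]
        ctx_counts[ctx] += 1
        ngram_counts[(ctx, word)] += 1

    def prob(ctx, word):
        ctx = tuple(ctx)
        return (ngram_counts[(ctx, word)] + lam) / (ctx_counts[ctx] + lam * len(vocab))
    return prob

prob = addlambda_ngram("a b a b a c".split())
print(prob(["a"], "b"), prob(["a"], "c"))  # frequent vs. rarer continuation
```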
arXiv Detail & Related papers (2024-10-03T21:21:02Z) - Training Nonlinear Transformers for Chain-of-Thought Inference: A Theoretical Generalization Analysis [82.51626700527837]
Chain-of-Thought (CoT) is an efficient method that enables the reasoning ability of large language models by augmenting the query using examples with multiple intermediate steps.
We show that in-context learning without CoT can fail to provide accurate generalization in settings where CoT succeeds.
arXiv Detail & Related papers (2024-10-03T03:12:51Z) - Unveiling the Statistical Foundations of Chain-of-Thought Prompting Methods [59.779795063072655]
Chain-of-Thought (CoT) prompting and its variants have gained popularity as effective methods for solving multi-step reasoning problems.
We analyze CoT prompting from a statistical estimation perspective, providing a comprehensive characterization of its sample complexity.
arXiv Detail & Related papers (2024-08-25T04:07:18Z) - On the Representational Capacity of Neural Language Models with Chain-of-Thought Reasoning [87.73401758641089]
Chain-of-thought (CoT) reasoning has improved the performance of modern language models (LMs).
We show that LMs can represent the same family of distributions over strings as probabilistic Turing machines.
arXiv Detail & Related papers (2024-06-20T10:59:02Z) - Chain of Thought Empowers Transformers to Solve Inherently Serial Problems [57.58801785642868]
Chain of thought (CoT) is a highly effective method to improve the accuracy of large language models (LLMs) on arithmetic and symbolic reasoning tasks.
This work provides a theoretical understanding of the power of CoT for decoder-only transformers through the lens of expressiveness.
arXiv Detail & Related papers (2024-02-20T10:11:03Z) - Decomposing Hard SAT Instances with Metaheuristic Optimization [52.03315747221343]
We introduce the notion of decomposition hardness (d-hardness).
We show that the d-hardness expresses an estimate of the hardness of a formula $C$ with respect to a given decomposition set.
arXiv Detail & Related papers (2023-12-16T12:44:36Z) - Learning Reliable Logical Rules with SATNet [7.951021955925275]
We build on SATNet, a differentiable MaxSAT solver that learns the underlying rules from input-output examples.
We introduce several effective verification techniques to validate it against the ground truth rules.
Experiments on stream transformations and Sudoku problems show that our decoded rules are highly reliable.
arXiv Detail & Related papers (2023-10-03T15:14:28Z) - Transformers as Statisticians: Provable In-Context Learning with In-Context Algorithm Selection [88.23337313766353]
This work first provides a comprehensive statistical theory for transformers to perform ICL.
We show that transformers can implement a broad class of standard machine learning algorithms in context.
A single transformer can adaptively select different base ICL algorithms.
arXiv Detail & Related papers (2023-06-07T17:59:31Z) - Towards Revealing the Mystery behind Chain of Thought: A Theoretical Perspective [39.47116013338394]
Chain-of-Thought prompting (CoT) can dramatically improve the performance of Large Language Models (LLMs).
We show that CoT can handle a general class of decision-making problems solvable by Dynamic Programming.
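The Dynamic Programming connection can be made concrete with a task like edit distance, where the chain of thought corresponds to emitting the DP table cell by cell. A toy sketch of that framing (illustrative only; not the paper's construction):

```python
# Edit distance solved by dynamic programming, with each filled cell printed in order,
# mimicking a chain-of-thought trace of intermediate DP states. Illustrative only.
def edit_distance_with_trace(a: str, b: str) -> int:
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        for j in range(len(b) + 1):
            if i == 0 or j == 0:
                dp[i][j] = i + j  # base cases: insert/delete everything
            else:
                dp[i][j] = min(
                    dp[i - 1][j] + 1,                            # delete a[i-1]
                    dp[i][j - 1] + 1,                            # insert b[j-1]
                    dp[i - 1][j - 1] + (a[i - 1] != b[j - 1]),   # match / substitute
                )
            print(f"dp[{i}][{j}] = {dp[i][j]}")  # the 'reasoning step' for this subproblem
    return dp[len(a)][len(b)]

print("distance:", edit_distance_with_trace("cat", "cart"))
```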
arXiv Detail & Related papers (2023-05-24T17:59:21Z) - Tracr: Compiled Transformers as a Laboratory for Interpretability [15.76027393879609]
We show how to "compile" human-readable programs into decoder-only transformer models.
Our compiler, Tracr, generates models with known structure.
We use it to study "superposition" in transformers that execute multi-step algorithms.
arXiv Detail & Related papers (2023-01-12T14:59:19Z) - Characterizing Intrinsic Compositionality in Transformers with Tree
Projections [72.45375959893218]
Neural models like transformers can route information arbitrarily between different parts of their input.
We show that transformers for three different tasks become more treelike over the course of training.
These trees are predictive of model behavior, with more tree-like models generalizing better on tests of compositional generalization.
arXiv Detail & Related papers (2022-11-02T17:10:07Z) - Transformers Learn Shortcuts to Automata [52.015990420075944]
We find that a low-depth Transformer can represent the computations of any finite-state automaton.
We show that a Transformer with $O(\log T)$ layers can exactly replicate the computation of an automaton on an input sequence of length $T$.
We further investigate the brittleness of these solutions and propose potential mitigations.
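The intuition behind the $O(\log T)$ bound is that per-symbol state updates are function compositions, which are associative and can therefore be combined in a balanced tree of logarithmic depth. A small sketch of that prefix-composition view (illustrative; not the paper's transformer construction):

```python
# The log-depth 'shortcut' intuition: each input symbol induces a transition function
# on states; combining them pairwise needs only O(log T) rounds instead of T steps.
# Illustrative sketch, not the paper's transformer construction.

def step(state: int, symbol: str) -> int:
    """Toy 2-state automaton: flip state on 'b', stay on 'a' (tracks parity of b's)."""
    return state ^ 1 if symbol == "b" else state

def compose(f, g):
    return lambda s: g(f(s))  # apply f first, then g

def run_log_depth(symbols, init_state=0):
    funcs = [lambda s, c=c: step(s, c) for c in symbols]
    rounds = 0
    while len(funcs) > 1:  # balanced pairwise reduction over transition functions
        funcs = [compose(funcs[i], funcs[i + 1]) if i + 1 < len(funcs) else funcs[i]
                 for i in range(0, len(funcs), 2)]
        rounds += 1
    return funcs[0](init_state), rounds

print(run_log_depth("abbab"))  # final state after reading the string, and number of rounds
```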
arXiv Detail & Related papers (2022-10-19T17:45:48Z) - Estimating the hardness of SAT encodings for Logical Equivalence Checking of Boolean circuits [58.83758257568434]
We show that the hardness of SAT encodings for LEC instances can be estimated w.r.t. some SAT partitioning.
The paper proposes several methods for constructing partitionings, which, when used in practice, allow one to estimate the hardness of SAT encodings for LEC with good accuracy.
arXiv Detail & Related papers (2022-10-04T09:19:13Z) - Transformer-based Machine Learning for Fast SAT Solvers and Logic Synthesis [63.53283025435107]
CNF-based SAT and MaxSAT solvers are central to logic synthesis and verification systems.
In this work, we propose a one-shot model derived from the Transformer architecture to solve the MaxSAT problem.
arXiv Detail & Related papers (2021-07-15T04:47:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.