Neural Approaches to SAT Solving: Design Choices and Interpretability
- URL: http://arxiv.org/abs/2504.01173v1
- Date: Tue, 01 Apr 2025 20:31:01 GMT
- Title: Neural Approaches to SAT Solving: Design Choices and Interpretability
- Authors: David Mojžíšek, Jan Hůla, Ziwei Li, Ziyu Zhou, Mikoláš Janota,
- Abstract summary: We provide a comprehensive evaluation of graph neural networks applied to Boolean satisfiability problems.<n>We introduce several training improvements, particularly a novel closest assignment supervision method.<n>Our experiments demonstrate the suitability of variable-clause graph representations with recurrent neural network updates.
- Score: 3.876394880838347
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this contribution, we provide a comprehensive evaluation of graph neural networks applied to Boolean satisfiability problems, accompanied by an intuitive explanation of the mechanisms enabling the model to generalize to different instances. We introduce several training improvements, particularly a novel closest assignment supervision method that dynamically adapts to the model's current state, significantly enhancing performance on problems with larger solution spaces. Our experiments demonstrate the suitability of variable-clause graph representations with recurrent neural network updates, which achieve good accuracy on SAT assignment prediction while reducing computational demands. We extend the base graph neural network into a diffusion model that facilitates incremental sampling and can be effectively combined with classical techniques like unit propagation. Through analysis of embedding space patterns and optimization trajectories, we show how these networks implicitly perform a process very similar to continuous relaxations of MaxSAT, offering an interpretable view of their reasoning process. This understanding guides our design choices and explains the ability of recurrent architectures to scale effectively at inference time beyond their training distribution, which we demonstrate with test-time scaling experiments.
Related papers
- Predicting Steady-State Behavior in Complex Networks with Graph Neural Networks [0.0]
In complex systems, information propagation can be defined as diffused or delocalized, weakly localized, and strongly localized.<n>This study investigates the application of graph neural network models to learn the behavior of a linear dynamical system on networks.
arXiv Detail & Related papers (2025-02-02T17:29:10Z) - Deep Learning Through A Telescoping Lens: A Simple Model Provides Empirical Insights On Grokking, Gradient Boosting & Beyond [61.18736646013446]
In pursuit of a deeper understanding of its surprising behaviors, we investigate the utility of a simple yet accurate model of a trained neural network.
Across three case studies, we illustrate how it can be applied to derive new empirical insights on a diverse range of prominent phenomena.
arXiv Detail & Related papers (2024-10-31T22:54:34Z) - Improving Equivariant Model Training via Constraint Relaxation [31.507956579770088]
We propose a novel framework for improving the optimization of such models by relaxing the hard equivariance constraint during training.<n>We provide experimental results on different state-of-the-art network architectures, demonstrating how this training framework can result in equivariant models with improved generalization performance.
arXiv Detail & Related papers (2024-08-23T17:35:08Z) - The Unreasonable Effectiveness of Solving Inverse Problems with Neural Networks [24.766470360665647]
We show that neural networks trained to learn solutions to inverse problems can find better solutions than classicals even on their training set.
Our findings suggest an alternative use for neural networks: rather than generalizing to new data for fast inference, they can also be used to find better solutions on known data.
arXiv Detail & Related papers (2024-08-15T12:38:10Z) - Improving Network Interpretability via Explanation Consistency Evaluation [56.14036428778861]
We propose a framework that acquires more explainable activation heatmaps and simultaneously increase the model performance.
Specifically, our framework introduces a new metric, i.e., explanation consistency, to reweight the training samples adaptively in model learning.
Our framework then promotes the model learning by paying closer attention to those training samples with a high difference in explanations.
arXiv Detail & Related papers (2024-08-08T17:20:08Z) - Neural Network-Based Score Estimation in Diffusion Models: Optimization
and Generalization [12.812942188697326]
Diffusion models have emerged as a powerful tool rivaling GANs in generating high-quality samples with improved fidelity, flexibility, and robustness.
A key component of these models is to learn the score function through score matching.
Despite empirical success on various tasks, it remains unclear whether gradient-based algorithms can learn the score function with a provable accuracy.
arXiv Detail & Related papers (2024-01-28T08:13:56Z) - Towards a Better Theoretical Understanding of Independent Subnetwork Training [56.24689348875711]
We take a closer theoretical look at Independent Subnetwork Training (IST)
IST is a recently proposed and highly effective technique for solving the aforementioned problems.
We identify fundamental differences between IST and alternative approaches, such as distributed methods with compressed communication.
arXiv Detail & Related papers (2023-06-28T18:14:22Z) - Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z) - Ada-SISE: Adaptive Semantic Input Sampling for Efficient Explanation of
Convolutional Neural Networks [26.434705114982584]
We propose an efficient interpretation method for convolutional neural networks.
Experimental results show that the proposed method can reduce the execution time up to 30%.
arXiv Detail & Related papers (2021-02-15T19:10:00Z) - Network Diffusions via Neural Mean-Field Dynamics [52.091487866968286]
We propose a novel learning framework for inference and estimation problems of diffusion on networks.
Our framework is derived from the Mori-Zwanzig formalism to obtain an exact evolution of the node infection probabilities.
Our approach is versatile and robust to variations of the underlying diffusion network models.
arXiv Detail & Related papers (2020-06-16T18:45:20Z) - Parallelization Techniques for Verifying Neural Networks [52.917845265248744]
We introduce an algorithm based on the verification problem in an iterative manner and explore two partitioning strategies.
We also introduce a highly parallelizable pre-processing algorithm that uses the neuron activation phases to simplify the neural network verification problems.
arXiv Detail & Related papers (2020-04-17T20:21:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.