Towards Comprehensive Sampling of SMT Solutions
- URL: http://arxiv.org/abs/2511.10326v1
- Date: Fri, 14 Nov 2025 01:45:36 GMT
- Title: Towards Comprehensive Sampling of SMT Solutions
- Authors: Shuangyu Lyu, Chuan Luo, Ruizhi Shi, Wei Wu, Chanjuan Liu, Chunming Hu,
- Abstract summary: PanSampler is a novel SMT sampler that achieves high coverage with a small number of solutions.<n>It incorporates three novel techniques, i.e., diversity-aware SMT algorithm, abstract syntax tree (AST)-guided scoring function and post-sampling optimization technology.<n>PanSampler exhibits a significantly stronger capability to reach high target coverage, while requiring fewer solutions than current samplers to achieve the same coverage level.
- Score: 17.952635077005144
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work focuses on effectively generating diverse solutions for satisfiability modulo theories (SMT) formulas, targeting the theories of bit-vectors, arrays, and uninterpreted functions, which is a critical task in software and hardware testing. Generating diverse SMT solutions helps uncover faults and detect safety violations during the verification and testing process, resulting in the SMT sampling problem, i.e., constructing a small number of solutions while achieving comprehensive coverage of the constraint space. While high coverage is crucial for exploring system behaviors, reducing the number of solutions is of great importance, as excessive solutions increase testing time and resource usage, undermining efficiency. In this work, we introduce PanSampler, a novel SMT sampler that achieves high coverage with a small number of solutions. It incorporates three novel techniques, i.e., diversity-aware SMT algorithm, abstract syntax tree (AST)-guided scoring function and post-sampling optimization technology, enhancing its practical performance. It iteratively samples solutions, evaluates candidates, and employs local search to refine solutions, ensuring high coverage with a small number of samples. Extensive experiments on practical benchmarks demonstrate that PanSampler exhibits a significantly stronger capability to reach high target coverage, while requiring fewer solutions than current samplers to achieve the same coverage level. Furthermore, our empirical evaluation on practical subjects, which are collected from real-world software systems, shows that PanSampler achieves higher fault detection capability and reduces the number of required test cases from 32.6\% to 76.4\% to reach the same fault detection effectiveness, leading to a substantial improvement in testing efficiency. PanSampler advances SMT sampling, reducing the cost of software testing and hardware verification.
Related papers
- DetectAnyLLM: Towards Generalizable and Robust Detection of Machine-Generated Text Across Domains and Models [60.713908578319256]
We propose Direct Discrepancy Learning (DDL) to optimize the detector with task-oriented knowledge.<n>Built upon this, we introduce DetectAnyLLM, a unified detection framework that achieves state-of-the-art MGTD performance.<n>MIRAGE samples human-written texts from 10 corpora across 5 text-domains, which are then re-generated or revised using 17 cutting-edge LLMs.
arXiv Detail & Related papers (2025-09-15T10:59:57Z) - Requirements Coverage-Guided Minimization for Natural Language Test Cases [7.947774587906927]
Test suites tend to grow in size and often contain redundant test cases.<n>Test suite minimization aims to eliminate such redundancy while preserving key properties such as requirement coverage and fault detection capability.<n>We propose RTM (Requirement coverage-guided Test suite Minimization), a novel TSM approach designed for requirement-based testing.
arXiv Detail & Related papers (2025-05-26T13:55:33Z) - Add-One-In: Incremental Sample Selection for Large Language Models via a Choice-Based Greedy Paradigm [50.492124556982674]
This paper introduces a novel choice-based sample selection framework.<n>It shifts the focus from evaluating individual sample quality to comparing the contribution value of different samples.<n>We validate our approach on a larger medical dataset, highlighting its practical applicability in real-world applications.
arXiv Detail & Related papers (2025-03-04T07:32:41Z) - SMT(LIA) Sampling with High Diversity [12.198700284977962]
We have developed the first sampling framework that integrates local search with CDCL(T) techniques.<n>HighDiv is capable of generating a highly diverse set of solutions for constraints under linear integer theory.
arXiv Detail & Related papers (2025-02-25T07:53:46Z) - FastMCTS: A Simple Sampling Strategy for Data Synthesis [67.60823802317141]
We introduce FastMCTS, an innovative data synthesis strategy inspired by Monte Carlo Tree Search.<n>FastMCTS provides a more efficient sampling method for multi-step reasoning data, offering step-level evaluation signals.<n>Experiments on both English and Chinese reasoning datasets demonstrate that FastMCTS generates over 30% more correct reasoning paths.
arXiv Detail & Related papers (2025-02-17T06:27:57Z) - How Low Can We Go? Minimizing Interaction Samples for Configurable Systems [2.569213261297365]
We present a framework for combining near-optimal solutions with provable lower bounds on the required sample size.<n>Our algorithm SampLNS can reliably find samples of smaller size than previous methods in 85% of the cases.<n>This makes it possible to avoid cumbersome efforts of minimizing samples by researchers as well as practitioners.
arXiv Detail & Related papers (2025-01-12T12:02:26Z) - Faster Stochastic Variance Reduction Methods for Compositional MiniMax
Optimization [50.10952609321302]
compositional minimax optimization is a pivotal challenge across various machine learning domains.
Current methods of compositional minimax optimization are plagued by sub-optimal complexities or heavy reliance on sizable batch sizes.
This paper introduces a novel method, called Nested STOchastic Recursive Momentum (NSTORM), which can achieve the optimal sample complexity of $O(kappa3 /epsilon3 )$.
arXiv Detail & Related papers (2023-08-18T14:57:21Z) - Towards Automated Imbalanced Learning with Deep Hierarchical
Reinforcement Learning [57.163525407022966]
Imbalanced learning is a fundamental challenge in data mining, where there is a disproportionate ratio of training samples in each class.
Over-sampling is an effective technique to tackle imbalanced learning through generating synthetic samples for the minority class.
We propose AutoSMOTE, an automated over-sampling algorithm that can jointly optimize different levels of decisions.
arXiv Detail & Related papers (2022-08-26T04:28:01Z) - Adaptive Sampling for Best Policy Identification in Markov Decision
Processes [79.4957965474334]
We investigate the problem of best-policy identification in discounted Markov Decision (MDPs) when the learner has access to a generative model.
The advantages of state-of-the-art algorithms are discussed and illustrated.
arXiv Detail & Related papers (2020-09-28T15:22:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.