Related papers: Preguss: It Analyzes, It Specifies, It Verifies

Preguss: It Analyzes, It Specifies, It Verifies

URL: http://arxiv.org/abs/2508.14532v1
Date: Wed, 20 Aug 2025 08:40:22 GMT
Title: Preguss: It Analyzes, It Specifies, It Verifies
Authors: Zhongyi Wang, Tengjie Lin, Mingshuai Chen, Mingqi Yang, Haokun Li, Xiao Yi, Shengchao Qin, Jianwei Yin,
Abstract summary: Large language models (LLMs) have recently demonstrated their potential for enhancing the degree of automation in formal verification.<n>This paper outlines Preguss - a modular, fine-grained framework for automating the generation and refinement of formal specifications.
Score: 14.717270519465218
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Fully automated verification of large-scale software and hardware systems is arguably the holy grail of formal methods. Large language models (LLMs) have recently demonstrated their potential for enhancing the degree of automation in formal verification by, e.g., generating formal specifications as essential to deductive verification, yet exhibit poor scalability due to context-length limitations and, more importantly, the difficulty of inferring complex, interprocedural specifications. This paper outlines Preguss - a modular, fine-grained framework for automating the generation and refinement of formal specifications. Preguss synergizes between static analysis and deductive verification by orchestrating two components: (i) potential runtime error (RTE)-guided construction and prioritization of verification units, and (ii) LLM-aided synthesis of interprocedural specifications at the unit level. We envisage that Preguss paves a compelling path towards the automated verification of large-scale programs.

Related papers

Veri-Sure: A Contract-Aware Multi-Agent Framework with Temporal Tracing and Formal Verification for Correct RTL Code Generation [4.723302382132762]
silicon-grade correctness remains bottlenecked by: (i) limited test coverage and reliability of simulation-centric evaluation, (ii) regressions and repair hallucinations, and (iii) semantic drift as intent is reinterpreted across agent handoffs.<n>We propose Veri-Sure, a multi-agent framework that establishes a design contract to align agents' intent and uses a patching mechanism guided by static dependency slicing to perform precise, localized repairs.
arXiv Detail & Related papers (2026-01-27T16:10:23Z)
A Tale of 1001 LoC: Potential Runtime Error-Guided Specification Synthesis for Verifying Large-Scale Programs [34.387390697713556]
This paper presents Preguss -- a modular, fine-grained framework for automating the generation and refinement of formal specifications.<n>We show that Preguss substantially outperforms state-of-the-art LLM-based approaches.<n>It enables highly automated RTE-freeness verification for real-world programs with over a thousand LoC, with a reduction of 80.6%88.9% human verification effort.
arXiv Detail & Related papers (2025-12-31T03:31:51Z)
Neural Proofs for Sound Verification and Control of Complex Systems [64.97409769546992]
This informal contribution presents an ongoing line of research that is pursuing a new approach to the construction of sound proofs.<n>It is possible to generate provably-correct policies/strategies/controllers that, in conjunction with neural certificates, formally attain the given specifications for the models of interest.
arXiv Detail & Related papers (2025-12-20T15:01:41Z)
Autoformalizer with Tool Feedback [52.334957386319864]
Autoformalization addresses the scarcity of data for Automated Theorem Proving (ATP) by translating mathematical problems from natural language into formal statements.<n>Existing formalizer still struggles to consistently generate valid statements that meet syntactic validity and semantic consistency.<n>We propose the Autoformalizer with Tool Feedback (ATF), a novel approach that incorporates syntactic and consistency information as tools into the formalization process.
arXiv Detail & Related papers (2025-10-08T10:25:12Z)
PAT-Agent: Autoformalization for Model Checking [17.082027022913998]
PAT-Agent is an end-to-end framework for natural language autoformalization and formal model repair.<n>It combines the generative capabilities of large language models with the rigor of formal verification.
arXiv Detail & Related papers (2025-09-28T06:32:14Z)
Re:Form -- Reducing Human Priors in Scalable Formal Software Verification with RL in LLMs: A Preliminary Study on Dafny [68.00108157244952]
Large Language Models (LLMs) trained with Reinforcement Learning (RL) face a significant challenge: their verification processes are neither reliable nor scalable.<n>A promising yet largely uncharted alternative is formal language-based reasoning.<n>Grounding LLMs in rigorous formal systems where generative models operate in formal language spaces (e.g., Dafny) enables the automatic and mathematically provable verification of their reasoning processes and outcomes.
arXiv Detail & Related papers (2025-07-22T08:13:01Z)
Leveraging LLMs for Formal Software Requirements -- Challenges and Prospects [0.0]
VERIFAI1 aims to investigate automated and semi-automated approaches to bridge this gap.<n>This position paper presents a preliminary synthesis of relevant literature to identify recurring challenges and prospective research directions.
arXiv Detail & Related papers (2025-07-18T19:15:50Z)
Supporting Software Formal Verification with Large Language Models: An Experimental Study [9.688989142858954]
SpecVerify integrates large language models with formal verification tools.<n>This framework combines Claude 3.5 Sonnet with the ESBMC verifier to form an automated workflow.<n> Evaluated on nine cyber-physical systems from Lockheed Martin, SpecVerify achieves 46.5% verification accuracy, comparable to NASA's CoCoSim.
arXiv Detail & Related papers (2025-07-07T10:30:05Z)
Logic Mining from Process Logs: Towards Automated Specification and Verification [0.0]
This article presents an approach for generating logical specifications from process models discovered via workflow.<n>The study examines the impact of quality data, particularly noise, on the structure and testability of generated specifications.
arXiv Detail & Related papers (2025-06-10T09:44:19Z)
Thinking Longer, Not Larger: Enhancing Software Engineering Agents via Scaling Test-Time Compute [61.00662702026523]
We propose a unified Test-Time Compute scaling framework that leverages increased inference-time instead of larger models.<n>Our framework incorporates two complementary strategies: internal TTC and external TTC.<n>We demonstrate our textbf32B model achieves a 46% issue resolution rate, surpassing significantly larger models such as DeepSeek R1 671B and OpenAI o1.
arXiv Detail & Related papers (2025-03-31T07:31:32Z)
A General Framework for Verification and Control of Dynamical Models via Certificate Synthesis [54.959571890098786]
We provide a framework to encode system specifications and define corresponding certificates. We present an automated approach to formally synthesise controllers and certificates. Our approach contributes to the broad field of safe learning for control, exploiting the flexibility of neural networks.
arXiv Detail & Related papers (2023-09-12T09:37:26Z)
Representing Timed Automata and Timing Anomalies of Cyber-Physical Production Systems in Knowledge Graphs [51.98400002538092]
This paper aims to improve model-based anomaly detection in CPPS by combining the learned timed automaton with a formal knowledge graph about the system. Both the model and the detected anomalies are described in the knowledge graph in order to allow operators an easier interpretation of the model and the detected anomalies.
arXiv Detail & Related papers (2023-08-25T15:25:57Z)
Automatic Program Instrumentation for Automatic Verification (Extended Technical Report) [0.0]
In deductive verification and software model checking, dealing with certain specification language constructs can be problematic. We propose instrumentation as a unifying verification paradigm that subsumes various existing ad-hoc approaches. We implement our approach in the MonoCera tool, which is tailored to the verification of programs with aggregation.
arXiv Detail & Related papers (2023-05-26T14:55:35Z)

This list is automatically generated from the titles and abstracts of the papers in this site.