Leveraging LLMs for Formal Software Requirements -- Challenges and Prospects
- URL: http://arxiv.org/abs/2507.14330v2
- Date: Sun, 10 Aug 2025 01:28:54 GMT
- Title: Leveraging LLMs for Formal Software Requirements -- Challenges and Prospects
- Authors: Arshad Beg, Diarmuid O'Donoghue, Rosemary Monahan,
- Abstract summary: VERIFAI1 aims to investigate automated and semi-automated approaches to bridge this gap.<n>This position paper presents a preliminary synthesis of relevant literature to identify recurring challenges and prospective research directions.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Software correctness is ensured mathematically through formal verification, which involves the resources of generating formal requirement specifications and having an implementation that must be verified. Tools such as model-checkers and theorem provers ensure software correctness by verifying the implementation against the specification. Formal methods deployment is regularly enforced in the development of safety-critical systems e.g. aerospace, medical devices and autonomous systems. Generating these specifications from informal and ambiguous natural language requirements remains the key challenge. Our project, VERIFAI^{1}, aims to investigate automated and semi-automated approaches to bridge this gap, using techniques from Natural Language Processing (NLP), ontology-based domain modelling, artefact reuse, and large language models (LLMs). This position paper presents a preliminary synthesis of relevant literature to identify recurring challenges and prospective research directions in the generation of verifiable specifications from informal requirements.
Related papers
- REprompt: Prompt Generation for Intelligent Software Development Guided by Requirements Engineering [43.0808976544794]
Large language models increasingly function as foundation models within coding agents.<n> prompts play a central role in agent-based intelligent software development.<n>We propose REprompt, a multi-agent prompt optimization framework guided by requirements engineering.
arXiv Detail & Related papers (2026-01-23T07:14:34Z) - Generation of Programmatic Rules for Document Forgery Detection Using Large Language Models [10.32461766065764]
Document forgery poses a growing threat to legal, economic, and governmental processes.<n>Existing plausibility checks are manually implemented by software engineers, which is time-consuming.<n>Recent advances in code generation with large language models (LLMs) offer new potential for automating and scaling the generation of these checks.
arXiv Detail & Related papers (2025-12-22T10:08:25Z) - Automated Formalization of Probabilistic Requirements from Structured Natural Language [2.8065951726067726]
We extend NASA's Formal Requirement Elicitation Tool (FRET) with support for the specification of unambiguous and correct probabilistic requirements.<n>We propose and develop a formal, compositional, and automated approach for translating structured natural-language requirements into formulas in probabilistic temporal logic.
arXiv Detail & Related papers (2025-12-15T20:20:27Z) - Safe and Certifiable AI Systems: Concepts, Challenges, and Lessons Learned [45.44933002008943]
This white paper presents the T"UV AUSTRIA Trusted AI framework.<n>It is an end-to-end audit catalog and methodology for assessing and certifying machine learning systems.<n>Building on three pillars - Secure Software Development, Functional Requirements, and Ethics & Data Privacy - it translates the high-level obligations of the EU AI Act into specific, testable criteria.
arXiv Detail & Related papers (2025-09-08T17:52:08Z) - Requirements Development and Formalization for Reliable Code Generation: A Multi-Agent Vision [45.59678433715798]
We envision the first multi-agent framework for reliable code generation based on textscrequirements textscdevelopment and textscformalization, named textscReDeFo.<n>The core of textscReDeFo is the use of formal specifications to bridge the gap between potentially ambiguous natural language requirements and precise executable code.
arXiv Detail & Related papers (2025-08-26T04:45:04Z) - Re:Form -- Reducing Human Priors in Scalable Formal Software Verification with RL in LLMs: A Preliminary Study on Dafny [68.00108157244952]
Large Language Models (LLMs) trained with Reinforcement Learning (RL) face a significant challenge: their verification processes are neither reliable nor scalable.<n>A promising yet largely uncharted alternative is formal language-based reasoning.<n>Grounding LLMs in rigorous formal systems where generative models operate in formal language spaces (e.g., Dafny) enables the automatic and mathematically provable verification of their reasoning processes and outcomes.
arXiv Detail & Related papers (2025-07-22T08:13:01Z) - Specification-Guided Repair of Arithmetic Errors in Dafny Programs using LLMs [84.30534714651093]
We present an innovative APR tool for Dafny, a verification-aware programming language.<n>We localize faults through a series of steps, which include using Hoare Logic to determine the state of each statement within the program.<n>We evaluate our approach using DafnyBench, a benchmark of real-world Dafny programs.
arXiv Detail & Related papers (2025-07-04T15:36:12Z) - Formalising Software Requirements using Large Language Models [0.0]
The project addresses the challenges in the traceability and verification of formal specifications.<n>It provides support for the automatic generation of the formal specifications and the traceability of the requirements from the initial software design stage through the systems implementation and verification.
arXiv Detail & Related papers (2025-06-12T13:55:01Z) - Self-Steering Language Models [113.96916935955842]
DisCIPL is a method for "self-steering" language models.<n>DisCIPL uses a Planner model to generate a task-specific inference program.<n>Our work opens up a design space of highly-parallelized Monte Carlo inference strategies.
arXiv Detail & Related papers (2025-04-09T17:54:22Z) - Requirements-Driven Automated Software Testing: A Systematic Review [12.953746641112518]
This systematic literature review critically examines the current state of requirements input formats, transformation techniques, generated test artifacts, evaluation methods, and prevailing limitations.<n>Our findings highlight the predominance of functional requirements, model-based specifications, and natural language formats.<n>Most existing frameworks are sequential and dependent on singular intermediate representations, and while test cases, structured textual formats, and requirements coverage are common, full automation remains rare.
arXiv Detail & Related papers (2025-02-25T23:13:09Z) - Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering [51.31836988300326]
Verifier engineering is a novel post-training paradigm specifically designed for the era of foundation models.
We systematically categorize the verifier engineering process into three essential stages: search, verify, and feedback.
arXiv Detail & Related papers (2024-11-18T12:04:52Z) - Benchmarking Uncertainty Quantification Methods for Large Language Models with LM-Polygraph [83.90988015005934]
Uncertainty quantification is a key element of machine learning applications.<n>We introduce a novel benchmark that implements a collection of state-of-the-art UQ baselines.<n>We conduct a large-scale empirical investigation of UQ and normalization techniques across eleven tasks, identifying the most effective approaches.
arXiv Detail & Related papers (2024-06-21T20:06:31Z) - Enchanting Program Specification Synthesis by Large Language Models using Static Analysis and Program Verification [15.686651364655958]
AutoSpec is an automated approach to synthesize specifications for automated program verification.
It overcomes the shortcomings of existing work in specification versatility, synthesizing satisfiable and adequate specifications for full proof.
It can be successfully applied to verify the programs in a real-world X509-parser project.
arXiv Detail & Related papers (2024-03-31T18:15:49Z) - Validation of Rigorous Requirements Specifications and Document
Automation with the ITLingo RSL Language [0.0]
ITLingo initiative has introduced a requirements specification language named RSL to enhance the rigor and consistency of technical documentation.
This paper reviews existing research and tools in the fields of requirements validation and document automation.
We propose to extend RSL with validation of specifications based on customized checks, and on linguistic rules dynamically defined in the RSL itself.
arXiv Detail & Related papers (2023-12-17T21:39:26Z) - Natural Language Processing for Requirements Formalization: How to
Derive New Approaches? [0.32885740436059047]
We present and discuss principal ideas and state-of-the-art methodologies from the field of NLP.
We discuss two different approaches in detail and highlight the iterative development of rule sets.
The presented methods are demonstrated on two industrial use cases from the automotive and railway domains.
arXiv Detail & Related papers (2023-09-23T05:45:19Z) - A General Framework for Verification and Control of Dynamical Models via Certificate Synthesis [54.959571890098786]
We provide a framework to encode system specifications and define corresponding certificates.
We present an automated approach to formally synthesise controllers and certificates.
Our approach contributes to the broad field of safe learning for control, exploiting the flexibility of neural networks.
arXiv Detail & Related papers (2023-09-12T09:37:26Z) - Technical Report on Neural Language Models and Few-Shot Learning for
Systematic Requirements Processing in MDSE [1.6286277560322266]
This paper is based on the analysis of an open-source set of automotive requirements.
We derive domain-specific language constructs helping us to avoid ambiguities in requirements and increase the level of formality.
arXiv Detail & Related papers (2022-11-16T18:06:25Z) - Certified Reinforcement Learning with Logic Guidance [78.2286146954051]
We propose a model-free RL algorithm that enables the use of Linear Temporal Logic (LTL) to formulate a goal for unknown continuous-state/action Markov Decision Processes (MDPs)
The algorithm is guaranteed to synthesise a control policy whose traces satisfy the specification with maximal probability.
arXiv Detail & Related papers (2019-02-02T20:09:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.