Related papers: Finding Inductive Loop Invariants using Large Language Models

Finding Inductive Loop Invariants using Large Language Models

URL: http://arxiv.org/abs/2311.07948v1
Date: Tue, 14 Nov 2023 06:58:09 GMT
Title: Finding Inductive Loop Invariants using Large Language Models
Authors: Adharsh Kamath, Aditya Senthilnathan, Saikat Chakraborty, Pantazis Deligiannis, Shuvendu K. Lahiri, Akash Lal, Aseem Rastogi, Subhajit Roy, Rahul Sharma
Abstract summary: Finding inductive loop invariants is an undecidable problem. Despite a long history of research towards practical solutions, it remains far from a solved problem. This paper investigates the capabilities of the Large Language Models in offering a new solution.
Score: 14.846222005558666
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Loop invariants are fundamental to reasoning about programs with loops. They establish properties about a given loop's behavior. When they additionally are inductive, they become useful for the task of formal verification that seeks to establish strong mathematical guarantees about program's runtime behavior. The inductiveness ensures that the invariants can be checked locally without consulting the entire program, thus are indispensable artifacts in a formal proof of correctness. Finding inductive loop invariants is an undecidable problem, and despite a long history of research towards practical solutions, it remains far from a solved problem. This paper investigates the capabilities of the Large Language Models (LLMs) in offering a new solution towards this old, yet important problem. To that end, we first curate a dataset of verification problems on programs with loops. Next, we design a prompt for exploiting LLMs, obtaining inductive loop invariants, that are checked for correctness using sound symbolic tools. Finally, we explore the effectiveness of using an efficient combination of a symbolic tool and an LLM on our dataset and compare it against a purely symbolic baseline. Our results demonstrate that LLMs can help improve the state-of-the-art in automated program verification.

Related papers

Enhancing Automated Loop Invariant Generation for Complex Programs with Large Language Models [2.243213786359577]
ACInv is an Automated Complex program loop Invariant generation tool. It combines static analysis with Large Language Models (LLMs) to generate the proper loop invariants. We conducted experiments on ACInv, which showed that ACInv outperformed previous tools on data sets with data structures.
arXiv Detail & Related papers (2024-12-13T10:36:18Z)
Improving LLM Reasoning through Scaling Inference Computation with Collaborative Verification [52.095460362197336]
Large language models (LLMs) struggle with consistent and accurate reasoning. LLMs are trained primarily on correct solutions, reducing their ability to detect and learn from errors. We propose a novel collaborative method integrating Chain-of-Thought (CoT) and Program-of-Thought (PoT) solutions for verification.
arXiv Detail & Related papers (2024-10-05T05:21:48Z)
Prover-Verifier Games improve legibility of LLM outputs [12.532113917099885]
We study legibility in the context of solving grade-school math problems. We propose a training algorithm inspired by Prover-Verifier Game from Anil et al. We show that legibility training transfers to time-constrained humans tasked with verifying solution correctness.
arXiv Detail & Related papers (2024-07-18T16:58:18Z)
Case2Code: Learning Inductive Reasoning with Synthetic Data [105.89741089673575]
We propose a textbfCase2Code task by exploiting the expressiveness and correctness of programs. We first evaluate representative LLMs on the synthesized Case2Code task and demonstrate that the Case-to-code induction is challenging for LLMs. Experimental results show that such induction training benefits not only in distribution Case2Code performance but also enhances various coding abilities of trained LLMs.
arXiv Detail & Related papers (2024-07-17T11:35:00Z)
Inductive Predicate Synthesis Modulo Programs (Extended) [1.7372615815088566]
A growing trend in program analysis is to encode verification conditions within the language of the input program. We propose Inductive Predicate Synthesis Modulo Programs (IPS-MP) which extends high-level languages with minimal synthesis features.
arXiv Detail & Related papers (2024-07-11T12:51:08Z)
LLM Critics Help Catch Bugs in Mathematics: Towards a Better Mathematical Verifier with Natural Language Feedback [71.95402654982095]
We propose Math-Minos, a natural language feedback-enhanced verifier. Our experiments reveal that a small set of natural language feedback can significantly boost the performance of the verifier.
arXiv Detail & Related papers (2024-06-20T06:42:27Z)
Distilling Algorithmic Reasoning from LLMs via Explaining Solution Programs [2.3020018305241337]
Distilling explicit chain-of-thought reasoning paths has emerged as an effective method for improving the reasoning abilities of large language models. We propose a novel approach to distill reasoning abilities from LLMs by leveraging their capacity to explain solutions. Our experiments demonstrate that learning from explanations enables the Reasoner to more effectively guide program implementation by a Coder.
arXiv Detail & Related papers (2024-04-11T22:19:50Z)
Evaluating and Improving Tool-Augmented Computation-Intensive Math Reasoning [75.74103236299477]
Chain-of-thought prompting(CoT) and tool augmentation have been validated as effective practices for improving large language models. We propose a new approach that can deliberate the reasoning steps with tool interfaces, namely textbfDELI. Experimental results on CARP and six other datasets show that the proposed DELI mostly outperforms competitive baselines.
arXiv Detail & Related papers (2023-06-04T17:02:59Z)
SatLM: Satisfiability-Aided Language Models Using Declarative Prompting [68.40726892904286]
We propose a new satisfiability-aided language modeling (SatLM) approach for improving the reasoning capabilities of large language models (LLMs) We use an LLM to generate a declarative task specification rather than an imperative program and leverage an off-the-shelf automated theorem prover to derive the final answer. We evaluate SATLM on 8 different datasets and show that it consistently outperforms program-aided LMs in the imperative paradigm.
arXiv Detail & Related papers (2023-05-16T17:55:51Z)
Enhancing Chain-of-Thoughts Prompting with Iterative Bootstrapping in Large Language Models [81.01397924280612]
Large language models (LLMs) can achieve highly effective performance on various reasoning tasks by incorporating step-by-step chain-of-thought (CoT) prompting as demonstrations. We introduce Iter-CoT (Iterative bootstrapping in Chain-of-Thoughts Prompting), an iterative bootstrapping approach for selecting exemplars and generating reasoning chains.
arXiv Detail & Related papers (2023-04-23T13:54:39Z)
Will Bilevel Optimizers Benefit from Loops [63.22466953441521]
Two current popular bilevelimats AID-BiO and ITD-BiO naturally involve solving one or two sub-problems. We first establish unified convergence analysis for both AID-BiO and ITD-BiO that are applicable to all implementation choices of loops.
arXiv Detail & Related papers (2022-05-27T20:28:52Z)

This list is automatically generated from the titles and abstracts of the papers in this site.