CoqPyt: Proof Navigation in Python in the Era of LLMs
- URL: http://arxiv.org/abs/2405.04282v1
- Date: Tue, 7 May 2024 12:50:28 GMT
- Title: CoqPyt: Proof Navigation in Python in the Era of LLMs
- Authors: Pedro Carrott, Nuno Saavedra, Kyle Thompson, Sorin Lerner, João F. Ferreira, Emily First
- Abstract summary: This paper presents CoqPyt, a Python tool for interacting with the Coq proof assistant.
CoqPyt improves on other Coq-related tools by providing novel features, such as the extraction of rich premise data.
- Score: 5.029445580644576
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Proof assistants enable users to develop machine-checked proofs regarding software-related properties. Unfortunately, the interactive nature of these proof assistants imposes most of the proof burden on the user, making formal verification a complex and time-consuming endeavor. Recent automation techniques based on neural methods address this issue, but require good programmatic support for collecting data and interacting with proof assistants. This paper presents CoqPyt, a Python tool for interacting with the Coq proof assistant. CoqPyt improves on other Coq-related tools by providing novel features, such as the extraction of rich premise data. We expect our work to aid the development of tools and techniques, especially LLM-based, designed for proof synthesis and repair. A video describing and demonstrating CoqPyt is available at: https://youtu.be/fk74o0rePM8.
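The "rich premise data" mentioned in the abstract can be pictured with a minimal sketch. The record fields and names below (Premise, premises_to_context) are hypothetical illustrations of the kind of data such a tool might extract, not CoqPyt's actual API or schema:

```python
from dataclasses import dataclass

@dataclass
class Premise:
    """One premise record; the fields are illustrative, not CoqPyt's schema."""
    name: str       # fully qualified lemma name
    statement: str  # the lemma's statement as Coq source
    source: str     # file in which the lemma is defined

def premises_to_context(premises: list[Premise]) -> str:
    """Render premise records as a plain-text context block for an LLM prompt."""
    blocks = [f"(* from {p.source} *)\n{p.name} : {p.statement}" for p in premises]
    return "\n\n".join(blocks)

premises = [
    Premise("Nat.add_comm", "forall n m : nat, n + m = m + n", "Arith/PeanoNat.v"),
]
print(premises_to_context(premises))
```

Feeding such premise context to an LLM alongside an open goal is one common recipe for proof synthesis, which is why extracting it programmatically matters.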
Related papers
- Cobblestone: Iterative Automation for Formal Verification [11.445689801392657]
Formal verification using proof assistants, such as Coq, is an effective way of improving software quality, but it is expensive.
Recent research has used machine learning to automatically synthesize proofs, reducing verification effort, but these tools are able to prove only a fraction of the desired software properties.
We introduce Cobblestone, a new proof-synthesis approach that improves on the state of the art by taking advantage of partial progress in proof synthesis attempts.
arXiv Detail & Related papers (2024-10-25T19:25:00Z)
- CoqPilot, a plugin for LLM-based generation of proofs [0.0]
CoqPilot is a VS Code extension designed to help automate the writing of Coq proofs.
The plugin collects the parts of proofs marked with the admit tactic in a Coq file.
It combines LLMs along with non-machine-learning methods to generate proof candidates for the holes.
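The hole-collection step described above can be sketched roughly as a scan for `admit` occurrences. This is only an illustration of the idea; CoqPilot itself works on the parsed proof structure rather than a textual scan, and the function name below is an assumption:

```python
import re

def find_admit_holes(coq_source: str) -> list[int]:
    """Return 1-based line numbers containing the `admit` tactic.

    A rough textual illustration of hole collection (case-sensitive, so
    `Admitted.` is not matched); real tools use the parsed proof tree.
    """
    holes = []
    for lineno, line in enumerate(coq_source.splitlines(), start=1):
        if re.search(r"\badmit\b", line):
            holes.append(lineno)
    return holes

example = """Lemma add_0_r : forall n, n + 0 = n.
Proof.
  admit.
Admitted."""
print(find_admit_holes(example))  # → [3]
```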
arXiv Detail & Related papers (2024-10-25T14:57:29Z)
- Enhancing Formal Theorem Proving: A Comprehensive Dataset for Training AI Models on Coq Code [0.0]
The Coq proof assistant stands out for its rigorous approach to verifying mathematical assertions and software correctness.
Despite the advances in artificial intelligence and machine learning, the specialized nature of Coq syntax and semantics poses unique challenges for Large Language Models (LLMs).
This dataset, derived from a collection of over 10,000 Coq source files, encompasses a wide array of propositions, proofs, and definitions.
arXiv Detail & Related papers (2024-03-19T10:53:40Z)
- MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data [85.50740598523818]
MUSTARD is a framework that masters uniform synthesis of theorem and proof data of high quality and diversity.
We present a theorem-and-proof benchmark MUSTARDSAUCE with 5,866 valid data points.
We perform extensive analysis and demonstrate that MUSTARD generates validated high-quality step-by-step data.
arXiv Detail & Related papers (2024-02-14T05:57:58Z)
- Towards Automatic Transformations of Coq Proof Scripts [0.0]
We present a framework to carry out a posteriori script transformations.
These transformations are meant to be applied as an automated post-processing step, once the proof has been completed.
We apply our tool to various Coq proof scripts, including some from the GeoCoq library.
arXiv Detail & Related papers (2024-01-22T12:48:34Z)
- Fact-Checking Complex Claims with Program-Guided Reasoning [99.7212240712869]
Program-Guided Fact-Checking (ProgramFC) is a novel fact-checking model that decomposes complex claims into simpler sub-tasks.
We first leverage the in-context learning ability of large language models to generate reasoning programs.
We execute the program by delegating each sub-task to the corresponding sub-task handler.
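The decompose-then-delegate control flow described above can be sketched in a few lines. The program format and handler names here are illustrative assumptions about the general technique, not ProgramFC's actual interface, and the stand-in handlers use hard-coded toy data in place of real models:

```python
# Sketch of program-guided fact-checking: a complex claim is decomposed
# into simpler sub-tasks, each delegated to a dedicated handler.

def question_handler(q: str) -> str:
    """Stand-in for a QA model; answers from a tiny hard-coded table."""
    table = {"Who wrote Hamlet?": "William Shakespeare"}
    return table.get(q, "unknown")

def verify_handler(statement: str) -> bool:
    """Stand-in for an entailment model over a toy fact base."""
    facts = {"William Shakespeare was English."}
    return statement in facts

def run_program(program):
    """Execute (handler, argument) steps in order; intermediate answers
    are substituted into later arguments, and the last result is the verdict."""
    env, result = {}, None
    for handler, arg in program:
        result = handler(arg.format(**env))
        if handler is question_handler:
            env["answer"] = result
    return result

# "The author of Hamlet was English." decomposed into two simpler sub-tasks.
program = [
    (question_handler, "Who wrote Hamlet?"),
    (verify_handler, "{answer} was English."),
]
print(run_program(program))  # → True
```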
arXiv Detail & Related papers (2023-05-22T06:11:15Z)
- Open-domain Question Answering via Chain of Reasoning over Heterogeneous Knowledge [82.5582220249183]
We propose a novel open-domain question answering (ODQA) framework for answering single/multi-hop questions across heterogeneous knowledge sources.
Unlike previous methods that solely rely on the retriever for gathering all evidence in isolation, our intermediary performs a chain of reasoning over the retrieved set.
Our system achieves competitive performance on two ODQA datasets, OTT-QA and NQ, against tables and passages from Wikipedia.
arXiv Detail & Related papers (2022-10-22T03:21:32Z)
- Effidit: Your AI Writing Assistant [60.588370965898534]
Effidit is a digital writing assistant that helps users write higher-quality text more efficiently by using artificial intelligence (AI) technologies.
In Effidit, we significantly expand the capacities of a writing assistant by providing functions in five categories: text completion, error checking, text polishing, keywords to sentences (K2S), and cloud input methods (cloud IME).
arXiv Detail & Related papers (2022-08-03T02:24:45Z)
- Zero-shot Fact Verification by Claim Generation [85.27523983027471]
We develop QACG, a framework for training a robust fact verification model.
We use automatically generated claims that can be supported, refuted, or left unverifiable by evidence from Wikipedia.
In a zero-shot scenario, QACG improves a RoBERTa model's F1 from 50% to 77%, equivalent in performance to 2K+ manually-curated examples.
arXiv Detail & Related papers (2021-05-31T03:13:52Z)
- The Coq Proof Script Visualiser (coq-psv) [0.0]
This tool supports both education and review processes as all proof steps can be visualised.
In contrast to the usual approach of visualising proofs as hypertext or markdown documents, the generated files can be easily printed.
arXiv Detail & Related papers (2021-01-15T08:52:31Z)
- Understanding Unnatural Questions Improves Reasoning over Text [54.235828149899625]
Complex question answering (CQA) over raw text is a challenging task.
Learning an effective CQA model requires large amounts of human-annotated data.
We address the challenge of learning a high-quality programmer (parser) by projecting natural human-generated questions into unnatural machine-generated questions.
arXiv Detail & Related papers (2020-10-19T10:22:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.