Natural Language Specifications in Proof Assistants
- URL: http://arxiv.org/abs/2205.07811v1
- Date: Mon, 16 May 2022 17:05:45 GMT
- Title: Natural Language Specifications in Proof Assistants
- Authors: Colin S. Gordon, Sergey Matskevich
- Abstract summary: This paper argues that it is possible to build support for natural language specifications within existing proof assistants.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Interactive proof assistants are computer programs carefully constructed to
check a human-designed proof of a mathematical claim with high confidence in
the implementation. However, this only validates the truth of a formal claim, which
may have been mistranslated from a claim made in natural language. This is
especially problematic when using proof assistants to formally verify the
correctness of software with respect to a natural language specification. The
translation from informal to formal remains a challenging, time-consuming
process that is difficult to audit for correctness. This paper argues that it
is possible to build support for natural language specifications within
existing proof assistants, in a way that complements the principles used to
establish trust and auditability in proof assistants themselves.
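To make the gap concrete, here is an illustration of ours (not an example from the paper), written in Lean 4. The informal claim "sorting preserves the elements of the list" is easily mistranslated into a weaker formal claim that only preserves length, and the proof assistant has no way to notice the mismatch:

    -- Intended meaning: the output is a permutation of the input
    -- (List.Perm: "same elements, possibly reordered").
    def preservesElements (sort : List Nat → List Nat) : Prop :=
      ∀ l, List.Perm (sort l) l

    -- A plausible mistranslation: only the length is preserved.
    def preservesLength (sort : List Nat → List Nat) : Prop :=
      ∀ l, (sort l).length = l.length

    -- The constant-zero "sort" satisfies the weak spec while
    -- clearly violating the informal claim.
    example : preservesLength (fun l => List.replicate l.length 0) := by
      intro l
      simp

Auditing a development for exactly this kind of mistranslation is the difficult, time-consuming step the paper targets.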
Related papers
- Lean-STaR: Learning to Interleave Thinking and Proving
We present Lean-STaR, a framework for training language models to produce informal thoughts prior to each step of a proof.
Lean-STaR achieves state-of-the-art results on the miniF2F-test benchmark within the Lean theorem proving environment.
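As a rough illustration of the format (our sketch, not Lean-STaR training data), an informal thought precedes each Lean tactic:

    example (n : Nat) : 0 + n = n + 0 := by
      -- Thought: normalize the left-hand side; Nat.zero_add rewrites 0 + n to n.
      rw [Nat.zero_add]
      -- Thought: the goal is now n = n + 0, which Nat.add_zero closes.
      rw [Nat.add_zero]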
arXiv Detail & Related papers (2024-07-14T01:43:07Z)
- LLM Critics Help Catch Bugs in Mathematics: Towards a Better Mathematical Verifier with Natural Language Feedback
We propose Math-Minos, a natural language feedback-enhanced verifier.
Our experiments reveal that a small amount of natural language feedback can significantly boost the performance of the verifier.
arXiv Detail & Related papers (2024-06-20T06:42:27Z)
- Evaluating LLM-driven User-Intent Formalization for Verification-Aware Languages
We propose a metric for evaluating the quality of specifications for verification-aware languages.
We show that our metric agrees closely with the human labels on a labeled dataset of Dafny specifications for the popular MBPP code-generation benchmark.
We also outline formal verification challenges that need to be addressed to apply the technique more widely.
arXiv Detail & Related papers (2024-06-14T06:52:08Z)
- Trustworthy Formal Natural Language Specifications
This paper shows it is possible to build support for specifications written in expressive subsets of natural language.
We implement a means to provide specifications in a modularly extensible formal subset of English, and have them automatically translated into formal claims.
We produce proof certificates explaining how each word was interpreted and how the sentence's structure was used to compute the meaning.
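For a flavor of the idea (our illustration; the paper defines its own grammar and certificate format), a sentence in a controlled English subset could be translated to a quantified formal claim, shown here in Lean:

    -- English input:   "every natural number is less than its successor"
    -- Computed meaning, as a Lean proposition, with its proof:
    example : ∀ n : Nat, n < n + 1 :=
      fun n => Nat.lt_succ_self n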
arXiv Detail & Related papers (2023-10-05T20:41:47Z)
- ProofBuddy: A Proof Assistant for Learning and Monitoring
Proof competence, i.e. the ability to write and check (mathematical) proofs, is an important skill in Computer Science.
The main challenges are the correct use of formal language and determining whether proofs, especially students' own, are complete and correct.
Many authors have suggested using proof assistants to assist in teaching proof competence, but the efficacy of the approach is unclear.
We have performed a preliminary usability study of ProofBuddy at the Technical University of Denmark.
arXiv Detail & Related papers (2023-08-14T07:08:55Z)
- STL: Surprisingly Tricky Logic (for System Validation)
Ground-truth validity of a specification, subjects' familiarity with formal methods, and subjects' level of education were found to be significant factors in determining validation correctness.
Participants exhibited an affirmation bias, causing significantly increased accuracy on valid specifications, but significantly decreased accuracy on invalid specifications.
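For a sense of why such validation is hard, consider an illustrative Signal Temporal Logic requirement of ours (not taken from the study): G[0,10]( e > 0.5 -> F[0,2]( e < 0.1 ) ), read as "at every time in the first 10 seconds, if the error e exceeds 0.5, then within 2 further seconds it drops below 0.1". Swapping the G and F operators, or misreading the implication as a conjunction, yields a superficially similar formula with a very different meaning, exactly the kind of subtlety an affirmation bias can hide.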
arXiv Detail & Related papers (2023-05-26T21:01:26Z)
- Towards Autoformalization of Mathematics and Code Correctness: Experiments with Elementary Proofs
Autoformalization seeks to address the high cost of manual formalization by translating proofs written in natural language into a formal representation that is computer-verifiable via interactive theorem provers.
We introduce a semantic parsing approach, based on the Universal Transformer architecture, that translates elementary mathematical proofs into an equivalent formalization in the language of the Coq interactive theorem prover.
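An input-output pair of the kind described might look as follows (our illustration; the paper targets Coq, but we render the formal side in Lean here):

    -- Informal statement: "for any natural number n, n plus zero equals n"
    -- Target formalization, with a term-level proof:
    theorem addZeroIllustration : ∀ n : Nat, n + 0 = n :=
      fun n => Nat.add_zero n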
arXiv Detail & Related papers (2023-01-05T17:56:00Z)
- Language Models as Inductive Reasoners
We propose a new task paradigm for inductive reasoning: inducing natural language rules from natural language facts.
We create a dataset termed DEER containing 1.2k rule-fact pairs for the task, where rules and facts are written in natural language.
We provide the first comprehensive analysis of how well pretrained language models can induce natural language rules from natural language facts.
arXiv Detail & Related papers (2022-12-21T11:12:14Z)
- NaturalProver: Grounded Mathematical Proof Generation with Language Models
Theorem proving in natural mathematical language plays a central role in mathematical advances and education.
We develop NaturalProver, a language model that generates proofs by conditioning on background references.
NaturalProver is capable of proving some theorems that require short (2-6 step) proofs, and providing next-step suggestions that are rated as correct and useful over 40% of the time.
arXiv Detail & Related papers (2022-05-25T17:01:18Z)
- Generating Natural Language Proofs with Verifier-Guided Search
We present NLProofS (Natural Language Proof Search), a novel stepwise method.
NLProofS learns to generate relevant proof steps conditioned on the hypothesis.
It achieves state-of-the-art performance on EntailmentBank and RuleTaker.
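From the abstract, the essential loop is: propose candidate proof steps with a generator, score each candidate with a verifier, and keep the best-scoring partial proofs. A hypothetical Python sketch of such a loop (names and signatures are ours, not the NLProofS API):

    from typing import Callable, List, Tuple

    def verifier_guided_search(
        premises: List[str],
        hypothesis: str,
        propose: Callable[[List[str], str], List[str]],  # generator model (stub)
        verify: Callable[[List[str], str], float],       # step validity score in [0, 1]
        beam_width: int = 3,
        max_depth: int = 5,
    ) -> List[str]:
        """Beam search over proof steps, ranked by verifier scores."""
        beam: List[Tuple[float, List[str]]] = [(1.0, [])]
        for _ in range(max_depth):
            candidates: List[Tuple[float, List[str]]] = []
            for score, steps in beam:
                if steps and steps[-1] == hypothesis:
                    return steps  # the partial proof already reaches the goal
                for step in propose(premises + steps, hypothesis):
                    new_score = score * verify(premises + steps, step)
                    candidates.append((new_score, steps + [step]))
            if not candidates:
                break
            candidates.sort(key=lambda c: c[0], reverse=True)
            beam = candidates[:beam_width]
        return beam[0][1]  # best-effort partial proof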
arXiv Detail & Related papers (2022-05-25T02:22:30Z)
- Data Annealing for Informal Language Understanding Tasks
We propose a data annealing transfer learning procedure to bridge the performance gap on informal language tasks.
It enables a pre-trained model such as BERT to be used effectively on informal language.
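The abstract does not spell out the schedule, but data annealing is typically realized by gradually shifting each training batch's mix from source (formal) to target (informal) data; a hypothetical sketch:

    import math

    def informal_fraction(step: int, total_steps: int, rate: float = 5.0) -> float:
        """Fraction of informal-domain examples in the batch at `step`:
        near 0 early (mostly formal data, matching the pre-trained model),
        approaching 1 as training ends (mostly informal target data)."""
        return 1.0 - math.exp(-rate * step / total_steps)

    # Example schedule over 1000 training steps:
    for step in (0, 250, 500, 750, 1000):
        print(f"step {step:4d}: {informal_fraction(step, 1000):.2f} informal")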
arXiv Detail & Related papers (2020-04-24T09:27:09Z)