Related papers: Who Finds the Short Proof? An Exploration of variants of Boolos' Curious Inference using Higher-order Automated Theorem Provers

Who Finds the Short Proof? An Exploration of variants of Boolos' Curious Inference using Higher-order Automated Theorem Provers

URL: http://arxiv.org/abs/2208.06879v1
Date: Sun, 14 Aug 2022 16:31:04 GMT
Title: Who Finds the Short Proof? An Exploration of variants of Boolos' Curious Inference using Higher-order Automated Theorem Provers
Authors: Christoph Benzm\"uller, David Fuenmayor, Alexander Steen, Geoff Sutcliffe
Abstract summary: We report on an exploration of variants of Boolos' curious inference using higher-order automated theorem provers. Surprisingly, only a single shorthand notation still had to be provided by hand. All higher-order lemmas required for obtaining short proof are automatically discovered by the computer.
Score: 62.997667081978825
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We report on an exploration of variants of Boolos' curious inference using higher-order automated theorem provers (ATPs). Surprisingly, only a single shorthand notation still had to be provided by hand. All higher-order lemmas required for obtaining short proof are automatically discovered by the computer. Given the observations and suggestions in this paper, full proof automation of Boolos' example on the speedup of proof lengths, and related examples, now seems to be within reach for higher-order ATPs.

Related papers

How To Discover Short, Shorter, and the Shortest Proofs of Unsatisfiability: A Branch-and-Bound Approach for Resolution Proof Length Minimization [1.4796543791607086]
This paper proposes a novel branch-and-bound algorithm for finding the shortest resolution proofs. We show that this representation breaks all permutational symmetries, thereby improving upon the state-of-the-art symmetry-breaking. Our experiments suggest that the proofs from state-of-the-art solvers could be shortened by 30-60% on the instances from SAT Competition 2002 and by 25-50% on small synthetic formulas.
arXiv Detail & Related papers (2024-11-12T17:31:35Z)
miniCodeProps: a Minimal Benchmark for Proving Code Properties [22.548472305010005]
We present miniCodeProps, a benchmark of 201 program specifications in the Lean proof assistant. Despite its simplicity, miniCodeProps is sufficient to break current LLM-based provers.
arXiv Detail & Related papers (2024-06-16T21:11:23Z)
Proving Theorems Recursively [80.42431358105482]
We propose POETRY, which proves theorems in a level-by-level manner. Unlike previous step-by-step methods, POETRY searches for a sketch of the proof at each level. We observe a substantial increase in the maximum proof length found by POETRY, from 10 to 26.
arXiv Detail & Related papers (2024-05-23T10:35:08Z)
Baldur: Whole-Proof Generation and Repair with Large Language Models [8.100054850290507]
We use large language models, trained on natural language text and code and fine-tuned on proofs, to generate whole proofs for theorems at once. We combine this proof generation model with a fine-tuned repair model to repair generated proofs, further increasing proving power. We evaluate our method in a prototype, Baldur, and evaluate it on a benchmark of 6,336 Isabelle/HOL theorems and their proofs.
arXiv Detail & Related papers (2023-03-08T22:00:15Z)
Generating Natural Language Proofs with Verifier-Guided Search [74.9614610172561]
We present a novel stepwise method NLProofS (Natural Language Proof Search) NLProofS learns to generate relevant steps conditioning on the hypothesis. It achieves state-of-the-art performance on EntailmentBank and RuleTaker.
arXiv Detail & Related papers (2022-05-25T02:22:30Z)
Conjectures, Tests and Proofs: An Overview of Theory Exploration [0.0]
We give a brief overview of a theory exploration system called QuickSpec. QuickSpec is able to automatically discover interesting conjectures about a given set of functions. It has been successfully applied to generate lemmas for automated inductive theorem proving.
arXiv Detail & Related papers (2021-09-07T01:57:07Z)
multiPRover: Generating Multiple Proofs for Improved Interpretability in Rule Reasoning [73.09791959325204]
We focus on a type of linguistic formal reasoning where the goal is to reason over explicit knowledge in the form of natural language facts and rules. A recent work, named PRover, performs such reasoning by answering a question and also generating a proof graph that explains the answer. In our work, we address a new and challenging problem of generating multiple proof graphs for reasoning over natural language rule-bases.
arXiv Detail & Related papers (2021-06-02T17:58:35Z)
Superposition for Lambda-Free Higher-Order Logic [62.997667081978825]
We introduce refutationally complete superposition calculi for intentional and extensional clausal $lambda$-free higher-order logic. The calculi are parameterized by a term order that need not be fully monotonic, making it possible to employ the $lambda$-free higher-order lexicographic path and Knuth-Bendix orders.
arXiv Detail & Related papers (2020-05-05T12:10:21Z)
Learning to Prove Theorems by Learning to Generate Theorems [71.46963489866596]
We learn a neural generator that automatically synthesizes theorems and proofs for the purpose of training a theorem prover. Experiments on real-world tasks demonstrate that synthetic data from our approach improves the theorem prover.
arXiv Detail & Related papers (2020-02-17T16:06:02Z)

This list is automatically generated from the titles and abstracts of the papers in this site.