MLFMF: Data Sets for Machine Learning for Mathematical Formalization
- URL: http://arxiv.org/abs/2310.16005v1
- Date: Tue, 24 Oct 2023 17:00:00 GMT
- Title: MLFMF: Data Sets for Machine Learning for Mathematical Formalization
- Authors: Andrej Bauer, Matej Petković, Ljupčo Todorovski
- Abstract summary: MLFMF is a collection of data sets for benchmarking systems used to support formalization of mathematics with proof assistants.
Each data set is derived from a library of formalized mathematics written in the proof assistants Agda or Lean.
With more than 250,000 entries in total, this is currently the largest collection of formalized mathematical knowledge in a machine-learnable format.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce MLFMF, a collection of data sets for benchmarking recommendation
systems used to support formalization of mathematics with proof assistants.
These systems help humans identify which previous entries (theorems,
constructions, datatypes, and postulates) are relevant in proving a new theorem
or carrying out a new construction. Each data set is derived from a library of
formalized mathematics written in the proof assistants Agda or Lean. The
collection includes the largest Lean 4 library, Mathlib, and some of the largest Agda
libraries: the standard library, the library of univalent mathematics
Agda-unimath, and the TypeTopology library. Each data set represents the
corresponding library in two ways: as a heterogeneous network, and as a list of
s-expressions representing the syntax trees of all the entries in the library.
The network contains the (modular) structure of the library and the references
between entries, while the s-expressions give complete and easily parsed
information about every entry. We report baseline results using standard graph
and word embeddings, tree ensembles, and instance-based learning algorithms.
The MLFMF data sets provide solid benchmarking support for further
investigation of the numerous machine learning approaches to formalized
mathematics. The methodology used to extract the networks and the s-expressions
readily applies to other libraries and to other proof assistants. With more than
250,000 entries in total, this is currently the largest collection of formalized
mathematical knowledge in a machine-learnable format.
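The abstract above describes each library entry as an s-expression encoding its syntax tree. As a rough illustration of how such data can be consumed (this is a generic sketch, not the official MLFMF tooling, and the sample entry with its `:name`/`:references` keywords is hypothetical), a minimal s-expression parser might look like:

```python
def tokenize(text):
    """Split an s-expression string into parentheses and atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()

def parse(tokens):
    """Recursively build a nested-list syntax tree from a token stream."""
    token = tokens.pop(0)
    if token == "(":
        node = []
        while tokens[0] != ")":
            node.append(parse(tokens))
        tokens.pop(0)  # discard the closing ")"
        return node
    return token  # an atom: identifier, keyword, or literal

# Hypothetical entry, loosely in the spirit of a formalized-library export.
entry = "(:entry (:name nat.add_comm) (:references (nat.add nat.succ)))"
tree = parse(tokenize(entry))
print(tree)
# → [':entry', [':name', 'nat.add_comm'], [':references', ['nat.add', 'nat.succ']]]
```

Because the whole syntax tree of each entry is available in this easily parsed form, downstream models (tree ensembles, embeddings, etc.) can be fed structural features directly rather than raw source text.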
Related papers
- Library Learning Doesn't: The Curious Case of the Single-Use "Library" [20.25809428140996]
We study two library learning systems for mathematics which both reported increased accuracy: LEGO-Prover and TroVE.
We find that function reuse is extremely infrequent on miniF2F and MATH.
Our followup experiments suggest that, rather than reuse, self-correction and self-consistency are the primary drivers of the observed performance gains.
arXiv Detail & Related papers (2024-10-26T21:05:08Z)
- DiscoveryBench: Towards Data-Driven Discovery with Large Language Models [50.36636396660163]
We present DiscoveryBench, the first comprehensive benchmark that formalizes the multi-step process of data-driven discovery.
Our benchmark contains 264 tasks collected across 6 diverse domains, such as sociology and engineering.
Our benchmark, thus, illustrates the challenges in autonomous data-driven discovery and serves as a valuable resource for the community to make progress.
arXiv Detail & Related papers (2024-07-01T18:58:22Z)
- Autoformalizing Euclidean Geometry [74.72212706513318]
We introduce a neuro-symbolic framework for autoformalizing Euclidean geometry.
One challenge is that informal proofs rely on diagrams, leaving gaps in texts that are hard to formalize.
We provide automatic semantic evaluation for autoformalized theorem statements.
arXiv Detail & Related papers (2024-05-27T14:35:10Z)
- MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data [85.50740598523818]
MUSTARD is a framework that masters uniform synthesis of theorem and proof data of high quality and diversity.
We present a theorem-and-proof benchmark MUSTARDSAUCE with 5,866 valid data points.
We perform extensive analysis and demonstrate that MUSTARD generates validated high-quality step-by-step data.
arXiv Detail & Related papers (2024-02-14T05:57:58Z)
- A New Approach Towards Autoformalization [7.275550401145199]
Autoformalization is the task of translating natural language mathematics into a formal language that can be verified by a program.
Research paper mathematics requires large amounts of background and context.
We propose an avenue towards tackling autoformalization for research-level mathematics, by breaking the task into easier and more approachable subtasks.
arXiv Detail & Related papers (2023-10-12T00:50:24Z)
- LeanDojo: Theorem Proving with Retrieval-Augmented Language Models [72.54339382005732]
Large language models (LLMs) have shown promise in proving formal theorems using proof assistants such as Lean.
Existing methods are difficult to reproduce or build on, due to private code, data, and compute requirements.
This paper introduces LeanDojo: an open-source Lean playground consisting of toolkits, data, models, and benchmarks.
We develop ReProver: an LLM-based prover augmented with retrieval for selecting premises from a vast math library.
arXiv Detail & Related papers (2023-06-27T17:05:32Z)
- FuzzyLogic.jl: a Flexible Library for Efficient and Productive Fuzzy Inference [5.584060970507507]
This paper introduces FuzzyLogic.jl, a Julia library to perform fuzzy inference.
The library is fully open-source and released under a permissive license.
arXiv Detail & Related papers (2023-06-17T10:43:09Z)
- Deep Learning Symmetries and Their Lie Groups, Algebras, and Subalgebras from First Principles [55.41644538483948]
We design a deep-learning algorithm for the discovery and identification of the continuous group of symmetries present in a labeled dataset.
We use fully connected neural networks to model the symmetry transformations and the corresponding generators.
Our study also opens the door for using a machine learning approach in the mathematical study of Lie groups and their properties.
arXiv Detail & Related papers (2023-01-13T16:25:25Z)
- GRAD-MATCH: A Gradient Matching Based Data Subset Selection for Efficient Learning [23.75284126177203]
We propose a general framework, GRAD-MATCH, which finds subsets that closely match the gradient of the training or validation set.
We show that GRAD-MATCH significantly and consistently outperforms several recent data-selection algorithms.
arXiv Detail & Related papers (2021-02-27T04:09:32Z)
- Learning Aggregation Functions [78.47770735205134]
We introduce LAF (Learning Aggregation Functions), a learnable aggregator for sets of arbitrary cardinality.
We report experiments on semi-synthetic and real data showing that LAF outperforms state-of-the-art sum- (max-) decomposition architectures.
arXiv Detail & Related papers (2020-12-15T18:28:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.