MLFMF: Data Sets for Machine Learning for Mathematical Formalization
- URL: http://arxiv.org/abs/2310.16005v1
- Date: Tue, 24 Oct 2023 17:00:00 GMT
- Title: MLFMF: Data Sets for Machine Learning for Mathematical Formalization
- Authors: Andrej Bauer, Matej Petković, Ljupčo Todorovski
- Abstract summary: MLFMF is a collection of data sets for benchmarking systems used to support formalization of mathematics with proof assistants.
Each data set is derived from a library of formalized mathematics written in the proof assistants Agda or Lean.
With more than 250,000 entries in total, this is currently the largest collection of formalized mathematical knowledge in a machine-learnable format.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce MLFMF, a collection of data sets for benchmarking recommendation
systems used to support formalization of mathematics with proof assistants.
These systems help humans identify which previous entries (theorems,
constructions, datatypes, and postulates) are relevant in proving a new theorem
or carrying out a new construction. Each data set is derived from a library of
formalized mathematics written in the proof assistants Agda or Lean. The
collection includes the largest Lean 4 library, Mathlib, and some of the largest Agda
libraries: the standard library, the library of univalent mathematics
Agda-unimath, and the TypeTopology library. Each data set represents the
corresponding library in two ways: as a heterogeneous network, and as a list of
s-expressions representing the syntax trees of all the entries in the library.
The network contains the (modular) structure of the library and the references
between entries, while the s-expressions give complete and easily parsed
information about every entry. We report baseline results using standard graph
and word embeddings, tree ensembles, and instance-based learning algorithms.
The MLFMF data sets provide solid benchmarking support for further
investigation of the numerous machine learning approaches to formalized
mathematics. The methodology used to extract the networks and the s-expressions
readily applies to other libraries and to other proof assistants. With more than
250,000 entries in total, this is currently the largest collection of formalized
mathematical knowledge in a machine-learnable format.
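The abstract above describes each library entry as an s-expression encoding its syntax tree. As a rough illustration of how such data can be consumed (this is a generic sketch, not the official MLFMF tooling, and the sample entry with its `:name`/`:references` keywords is hypothetical), a minimal s-expression parser might look like:

```python
def tokenize(text):
    """Split an s-expression string into parentheses and atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()

def parse(tokens):
    """Recursively build a nested-list syntax tree from a token stream."""
    token = tokens.pop(0)
    if token == "(":
        node = []
        while tokens[0] != ")":
            node.append(parse(tokens))
        tokens.pop(0)  # discard the closing ")"
        return node
    return token  # an atom: identifier, keyword, or literal

# Hypothetical entry, loosely in the spirit of a formalized-library export.
entry = "(:entry (:name nat.add_comm) (:references (nat.add nat.succ)))"
tree = parse(tokenize(entry))
print(tree)
# → [':entry', [':name', 'nat.add_comm'], [':references', ['nat.add', 'nat.succ']]]
```

Because the whole syntax tree of each entry is available in this easily parsed form, downstream models (tree ensembles, embeddings, etc.) can be fed structural features directly rather than raw source text.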
Related papers
- Library Learning Doesn't: The Curious Case of the Single-Use "Library" [20.25809428140996]
We study two library learning systems for mathematics which both reported increased accuracy: LEGO-Prover and TroVE.
We find that function reuse is extremely infrequent on miniF2F and MATH.
Our followup experiments suggest that, rather than reuse, self-correction and self-consistency are the primary drivers of the observed performance gains.
arXiv Detail & Related papers (2024-10-26T21:05:08Z)
- DiscoveryBench: Towards Data-Driven Discovery with Large Language Models [50.36636396660163]
We present DiscoveryBench, the first comprehensive benchmark that formalizes the multi-step process of data-driven discovery.
Our benchmark contains 264 tasks collected across 6 diverse domains, such as sociology and engineering.
Our benchmark, thus, illustrates the challenges in autonomous data-driven discovery and serves as a valuable resource for the community to make progress.
arXiv Detail & Related papers (2024-07-01T18:58:22Z)
- Autoformalizing Euclidean Geometry [74.72212706513318]
We introduce a neuro-symbolic framework for autoformalizing Euclidean geometry.
One challenge is that informal proofs rely on diagrams, leaving gaps in texts that are hard to formalize.
We provide automatic semantic evaluation for autoformalized theorem statements.
arXiv Detail & Related papers (2024-05-27T14:35:10Z)
- MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data [85.50740598523818]
MUSTARD is a framework that masters uniform synthesis of theorem and proof data of high quality and diversity.
We present a theorem-and-proof benchmark MUSTARDSAUCE with 5,866 valid data points.
We perform extensive analysis and demonstrate that MUSTARD generates validated high-quality step-by-step data.
arXiv Detail & Related papers (2024-02-14T05:57:58Z)
- A New Approach Towards Autoformalization [7.275550401145199]
Autoformalization is the task of translating natural language mathematics into a formal language that can be verified by a program.
Research paper mathematics requires large amounts of background and context.
We propose an avenue towards tackling autoformalization for research-level mathematics, by breaking the task into easier and more approachable subtasks.
arXiv Detail & Related papers (2023-10-12T00:50:24Z)
- LeanDojo: Theorem Proving with Retrieval-Augmented Language Models [72.54339382005732]
Large language models (LLMs) have shown promise in proving formal theorems using proof assistants such as Lean.
Existing methods are difficult to reproduce or build on, due to private code, data, and compute requirements.
This paper introduces LeanDojo: an open-source Lean playground consisting of toolkits, data, models, and benchmarks.
We develop ReProver: an LLM-based prover augmented with retrieval for selecting premises from a vast math library.
arXiv Detail & Related papers (2023-06-27T17:05:32Z)
- FuzzyLogic.jl: a Flexible Library for Efficient and Productive Fuzzy Inference [5.584060970507507]
This paper introduces FuzzyLogic.jl, a Julia library to perform fuzzy inference.
The library is fully open-source and released under a permissive license.
arXiv Detail & Related papers (2023-06-17T10:43:09Z)
- Deep Learning Symmetries and Their Lie Groups, Algebras, and Subalgebras from First Principles [55.41644538483948]
We design a deep-learning algorithm for the discovery and identification of the continuous group of symmetries present in a labeled dataset.
We use fully connected neural networks to model the symmetry transformations and the corresponding generators.
Our study also opens the door for using a machine learning approach in the mathematical study of Lie groups and their properties.
arXiv Detail & Related papers (2023-01-13T16:25:25Z)
- GRAD-MATCH: A Gradient Matching Based Data Subset Selection for Efficient Learning [23.75284126177203]
We propose a general framework, GRAD-MATCH, which finds subsets that closely match the gradient of the training or validation set.
We show that GRAD-MATCH significantly and consistently outperforms several recent data-selection algorithms.
arXiv Detail & Related papers (2021-02-27T04:09:32Z)
- Learning Aggregation Functions [78.47770735205134]
We introduce LAF (Learning Aggregation Functions), a learnable aggregator for sets of arbitrary cardinality.
We report experiments on semi-synthetic and real data showing that LAF outperforms state-of-the-art sum- (max-) decomposition architectures.
arXiv Detail & Related papers (2020-12-15T18:28:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.