Learning to Match Mathematical Statements with Proofs
- URL: http://arxiv.org/abs/2102.02110v1
- Date: Wed, 3 Feb 2021 15:38:54 GMT
- Title: Learning to Match Mathematical Statements with Proofs
- Authors: Maximin Coavoux, Shay B. Cohen
- Abstract summary: The task is designed to improve the processing of research-level mathematical texts.
We release a dataset for the task, consisting of over 180k statement-proof pairs.
We show that considering the assignment problem globally and using weighted bipartite matching algorithms substantially improves results on the task.
- Score: 37.38969121408295
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a novel task consisting in assigning a proof to a given
mathematical statement. The task is designed to improve the processing of
research-level mathematical texts. Applying Natural Language Processing (NLP)
tools to research-level mathematical articles is challenging, since it is a
highly specialized domain which mixes natural language and mathematical
formulae; it is also an important requirement for developing tools for
mathematical information retrieval and computer-assisted theorem proving. We
release a dataset for the task, consisting of over 180k statement-proof pairs
extracted from mathematical research articles. We carry out preliminary
experiments to assess the difficulty of the task. We first experiment with two
bag-of-words baselines. We show that considering the assignment problem
globally and using weighted bipartite matching algorithms helps a lot in
tackling the task. Finally, we introduce a self-attention-based model that can
be trained either locally or globally and outperforms baselines by a wide
margin.
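The global decoding idea mentioned in the abstract can be made concrete with a small sketch. The snippet below illustrates the general technique rather than the authors' implementation: it assumes a hypothetical placeholder scorer (here, toy word overlap, standing in for a bag-of-words or self-attention model) and uses SciPy's Hungarian-algorithm solver to pick a one-to-one assignment of proofs to statements that maximizes the total score.

```python
# Minimal sketch of global statement-proof matching via weighted bipartite
# matching. Illustrative only: `score_pair` is a hypothetical placeholder for
# any similarity model (bag-of-words overlap, a learned self-attention
# scorer, etc.).
import numpy as np
from scipy.optimize import linear_sum_assignment


def score_pair(statement: str, proof: str) -> float:
    """Toy similarity: Jaccard word overlap between statement and proof."""
    s, p = set(statement.lower().split()), set(proof.lower().split())
    return len(s & p) / (len(s | p) or 1)


def match_globally(statements, proofs):
    """Return, for each statement, the index of its assigned proof.

    Instead of picking the best proof per statement independently (local
    decoding), we maximize the total score of a one-to-one assignment
    (global decoding), i.e. we solve a weighted bipartite matching problem.
    """
    scores = np.array([[score_pair(s, p) for p in proofs] for s in statements])
    # linear_sum_assignment minimizes cost, so negate the similarity scores.
    row_ind, col_ind = linear_sum_assignment(-scores)
    assignment = [-1] * len(statements)
    for r, c in zip(row_ind, col_ind):
        assignment[r] = c
    return assignment


if __name__ == "__main__":
    statements = ["every finite group of prime order is cyclic",
                  "the sum of two even integers is even"]
    proofs = ["let a and b be even integers, then a + b is even ...",
              "let G have prime order p, so any non-identity element generates G ..."]
    print(match_globally(statements, proofs))  # expected: [1, 0]
```

With a learned scorer in place of the toy overlap, the same decoding step applies unchanged; the contrast with local decoding (an independent argmax per statement) is what the abstract means by considering the assignment problem globally.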
Related papers
- Mathematical Entities: Corpora and Benchmarks [0.8766411351797883]
There has been relatively little research on natural language processing for mathematical texts.
We provide annotated corpora that can be used to study the language of mathematics in different contexts.
arXiv Detail & Related papers (2024-06-17T14:11:00Z) - BERT is not The Count: Learning to Match Mathematical Statements with Proofs [34.61792250254876]
The task fits well within current research on Mathematical Information Retrieval and, more generally, mathematical article analysis.
We present a dataset consisting of over 180k statement-proof pairs extracted from modern mathematical research articles.
We propose a bilinear similarity model and two decoding methods to match statements to proofs effectively.
arXiv Detail & Related papers (2023-02-18T14:48:20Z) - Towards a Holistic Understanding of Mathematical Questions with Contrastive Pre-training [65.10741459705739]
We propose a novel contrastive pre-training approach for mathematical question representations, namely QuesCo.
We first design two-level question augmentations, including content-level and structure-level, which generate question pairs that are diverse in surface form but share similar purposes.
Then, to fully exploit hierarchical information of knowledge concepts, we propose a knowledge hierarchy-aware rank strategy.
arXiv Detail & Related papers (2023-01-18T14:23:29Z) - A Survey of Deep Learning for Mathematical Reasoning [71.88150173381153]
We review the key tasks, datasets, and methods at the intersection of mathematical reasoning and deep learning over the past decade.
Recent advances in large-scale neural language models have opened up new benchmarks and opportunities to use deep learning for mathematical reasoning.
arXiv Detail & Related papers (2022-12-20T18:46:16Z) - Lila: A Unified Benchmark for Mathematical Reasoning [59.97570380432861]
LILA is a unified mathematical reasoning benchmark consisting of 23 diverse tasks along four dimensions.
We construct our benchmark by extending 20 existing datasets, collecting task instructions and solutions in the form of Python programs.
We introduce BHASKARA, a general-purpose mathematical reasoning model trained on LILA.
arXiv Detail & Related papers (2022-10-31T17:41:26Z) - ConvFinQA: Exploring the Chain of Numerical Reasoning in Conversational Finance Question Answering [70.6359636116848]
We propose a new large-scale dataset, ConvFinQA, to study the chain of numerical reasoning in conversational question answering.
Our dataset poses great challenges in modeling long-range, complex numerical reasoning paths in real-world conversations.
arXiv Detail & Related papers (2022-10-07T23:48:50Z) - JiuZhang: A Chinese Pre-trained Language Model for Mathematical Problem Understanding [74.12405417718054]
This paper aims to advance the mathematical intelligence of machines by presenting the first Chinese mathematical pre-trained language model (PLM).
Unlike other standard NLP tasks, mathematical texts are difficult to understand, since they involve mathematical terminology, symbols and formulas in the problem statement.
We design a novel curriculum pre-training approach for improving the learning of mathematical PLMs, consisting of both basic and advanced courses.
arXiv Detail & Related papers (2022-06-13T17:03:52Z) - Tackling Math Word Problems with Fine-to-Coarse Abstracting and Reasoning [22.127301797950572]
We propose to model a math word problem in a fine-to-coarse manner to capture both its local fine-grained information and its global logical structure.
Our model is naturally sensitive to local variations and can better generalize to unseen problem types.
arXiv Detail & Related papers (2022-05-17T12:14:44Z) - Natural Language Premise Selection: Finding Supporting Statements for Mathematical Text [3.42658286826597]
We propose a new NLP task, natural language premise selection, which aims to retrieve supporting definitions and propositions for a given mathematical statement.
We also make available a dataset, NL-PS, which can be used to evaluate different approaches for the natural premise selection task.
arXiv Detail & Related papers (2020-04-30T17:08:03Z)