Parmesan: mathematical concept extraction for education
- URL: http://arxiv.org/abs/2307.06699v2
- Date: Mon, 17 Jul 2023 12:21:55 GMT
- Title: Parmesan: mathematical concept extraction for education
- Authors: Jacob Collard, Valeria de Paiva, Eswaran Subrahmanian
- Abstract summary: We develop a prototype system for searching for and defining mathematical concepts in context, focusing on the field of category theory.
This system depends on natural language processing components including concept extraction, relation extraction, definition extraction, and entity linking.
We also provide two cleaned mathematical corpora that power the prototype system, which are based on journal articles and wiki pages.
- Score: 0.5520082338220947
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Mathematics is a highly specialized domain with its own unique set of
challenges that has seen limited study in natural language processing. However,
mathematics is used in a wide variety of fields and multidisciplinary research
in many different domains often relies on an understanding of mathematical
concepts. To aid researchers coming from other fields, we develop a prototype
system for searching for and defining mathematical concepts in context,
focusing on the field of category theory. This system, Parmesan, depends on
natural language processing components including concept extraction, relation
extraction, definition extraction, and entity linking. In developing this
system, we show that existing techniques cannot be applied directly to the
category theory domain, and suggest hybrid techniques that do perform well,
though we expect the system to evolve over time. We also provide two cleaned
mathematical corpora that power the prototype system, which are based on
journal articles and wiki pages, respectively. The corpora have been annotated
with dependency trees, lemmas, and part-of-speech tags.
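The abstract describes corpora annotated with dependency trees, lemmas, and part-of-speech tags, feeding components such as concept extraction. The paper's own implementation is not reproduced here; the following is a minimal sketch, assuming spaCy and its small English model, of how such annotations and rough noun-phrase concept candidates could be produced.

```python
# Minimal sketch, not the authors' pipeline: annotate sentences with lemmas,
# POS tags, and dependency relations, and take noun chunks as rough candidate
# mathematical concepts. Assumes spaCy and its small English model are installed
# (pip install spacy && python -m spacy download en_core_web_sm).
import spacy

nlp = spacy.load("en_core_web_sm")

def annotate(sentence: str):
    """Per-token (text, lemma, POS tag, dependency label, head) tuples."""
    doc = nlp(sentence)
    return [(t.text, t.lemma_, t.pos_, t.dep_, t.head.text) for t in doc]

def candidate_concepts(sentence: str):
    """Noun chunks as rough candidates for mathematical concepts."""
    doc = nlp(sentence)
    return [chunk.text for chunk in doc.noun_chunks]

if __name__ == "__main__":
    s = "A monoidal category is a category equipped with a tensor product."
    for row in annotate(s):
        print(row)
    print(candidate_concepts(s))
```

On a sentence like the one above, the noun chunks ("A monoidal category", "a category", "a tensor product") give a crude starting point that a hybrid system such as Parmesan would refine with relation extraction, definition extraction, and entity linking.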
Related papers
- Towards a Categorical Foundation of Deep Learning: A Survey [0.0]
This thesis is a survey that covers some recent work attempting to study machine learning categorically.
Acting as a lingua franca of mathematics and science, category theory might be able to give a unifying structure to the field of machine learning.
arXiv Detail & Related papers (2024-10-07T13:11:16Z)
- Mathematical Entities: Corpora and Benchmarks [0.8766411351797883]
There has been relatively little research on natural language processing for mathematical texts.
We provide annotated corpora that can be used to study the language of mathematics in different contexts.
arXiv Detail & Related papers (2024-06-17T14:11:00Z)
- Domain Embeddings for Generating Complex Descriptions of Concepts in Italian Language [65.268245109828]
We propose a Distributional Semantic resource enriched with linguistic and lexical information extracted from electronic dictionaries.
The resource comprises 21 domain-specific matrices, one comprehensive matrix, and a Graphical User Interface.
Our model facilitates the generation of reasoned semantic descriptions of concepts by selecting matrices directly associated with concrete conceptual knowledge.
arXiv Detail & Related papers (2024-02-26T15:04:35Z)
- math-PVS: A Large Language Model Framework to Map Scientific Publications to PVS Theories [10.416375584563728]
This work investigates the applicability of large language models (LLMs) in formalizing advanced mathematical concepts.
We envision an automated process, called math-PVS, to extract and formalize mathematical theorems from research papers.
arXiv Detail & Related papers (2023-10-25T23:54:04Z)
- OntoMath${}^{\mathbf{PRO}}$ 2.0 Ontology: Updates of the Formal Model [68.8204255655161]
The main focus is the development of a formal model for representing mathematical statements in the Linked Open Data cloud.
The proposed model is intended for applications that extract mathematical facts from natural language mathematical texts and represent these facts as Linked Open Data.
The model is used in the development of a new version of the OntoMath${}^{\mathrm{PRO}}$ ontology of professional mathematics.
arXiv Detail & Related papers (2023-03-17T20:29:17Z)
- Tree-Based Representation and Generation of Natural and Mathematical Language [77.34726150561087]
Mathematical language in scientific communications and educational scenarios is important yet relatively understudied.
Recent works on mathematical language focus either on representing stand-alone mathematical expressions, or mathematical reasoning in pre-trained natural language models.
We propose a series of modifications to existing language models to jointly represent and generate text and math.
arXiv Detail & Related papers (2023-02-15T22:38:34Z)
- A Survey of Deep Learning for Mathematical Reasoning [71.88150173381153]
We review the key tasks, datasets, and methods at the intersection of mathematical reasoning and deep learning over the past decade.
Recent advances in large-scale neural language models have opened up new benchmarks and opportunities to use deep learning for mathematical reasoning.
arXiv Detail & Related papers (2022-12-20T18:46:16Z)
- Artificial Cognitively-inspired Generation of the Notion of Topological Group in the Context of Artificial Mathematical Intelligence [0.0]
We provide the explicit artificial generation (or conceptual computation) for the fundamental mathematical notion of topological groups.
The concept of topological groups is explicitly generated through three different artificial specifications.
arXiv Detail & Related papers (2021-12-05T01:39:34Z)
- NaturalProofs: Mathematical Theorem Proving in Natural Language [132.99913141409968]
We develop NaturalProofs, a multi-domain corpus of mathematical statements and their proofs.
NaturalProofs unifies broad coverage, deep coverage, and low-resource mathematical sources.
We benchmark strong neural methods on mathematical reference retrieval and generation tasks.
arXiv Detail & Related papers (2021-03-24T03:14:48Z)
- Formalising Concepts as Grounded Abstractions [68.24080871981869]
This report shows how representation learning can be used to induce concepts from raw data.
The main technical goal of this report is to show how techniques from representation learning can be married with a lattice-theoretic formulation of conceptual spaces.
arXiv Detail & Related papers (2021-01-13T15:22:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
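How the site computes its relevance scores is not documented; purely as an illustration, one common way to rank related papers is cosine similarity over TF-IDF vectors of titles and abstracts. The sketch below assumes scikit-learn; the function name and example strings are hypothetical.

```python
# Hypothetical illustration only: the site's actual scoring method is unknown.
# Ranks candidate papers against a query paper by TF-IDF cosine similarity
# of their title-plus-abstract text (assumes scikit-learn is installed).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def related_scores(query_text: str, candidate_texts: list[str]):
    """Return one similarity score per candidate; higher means more related."""
    vectorizer = TfidfVectorizer(stop_words="english")
    matrix = vectorizer.fit_transform([query_text] + candidate_texts)
    return cosine_similarity(matrix[0], matrix[1:]).ravel()

if __name__ == "__main__":
    query = "Parmesan: mathematical concept extraction for education ..."
    candidates = [
        "Mathematical Entities: Corpora and Benchmarks ...",
        "A Survey of Deep Learning for Mathematical Reasoning ...",
    ]
    print(related_scores(query, candidates))
```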