Multilingual Mathematical Autoformalization
- URL: http://arxiv.org/abs/2311.03755v2
- Date: Thu, 9 Nov 2023 09:58:15 GMT
- Title: Multilingual Mathematical Autoformalization
- Authors: Albert Q. Jiang, Wenda Li, Mateja Jamnik
- Abstract summary: Autoformalization is the task of translating natural language materials into machine-verifiable formalisations.
Existing methods tend to circumvent this challenge by manually curating small corpora.
In this work, we create $\texttt{MMA}$, a large, flexible, multilingual, and multi-domain dataset of informal-formal pairs.
- Score: 14.433478397963123
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Autoformalization is the task of translating natural language materials into
machine-verifiable formalisations. Progress in autoformalization research is
hindered by the lack of a sizeable dataset consisting of informal-formal pairs
expressing the same essence. Existing methods tend to circumvent this challenge
by manually curating small corpora or using few-shot learning with large
language models. But these methods suffer from data scarcity and formal
language acquisition difficulty. In this work, we create $\texttt{MMA}$, a
large, flexible, multilingual, and multi-domain dataset of informal-formal
pairs, by using a language model to translate in the reverse direction, that
is, from formal mathematical statements into corresponding informal ones.
Experiments show that language models fine-tuned on $\texttt{MMA}$ produce
$16-18\%$ of statements acceptable with minimal corrections on the
$\texttt{miniF2F}$ and $\texttt{ProofNet}$ benchmarks, up from $0\%$ with the
base model. We demonstrate that fine-tuning on multilingual formal data results
in more capable autoformalization models even when deployed on monolingual
tasks.
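The reverse-direction generation described above can be sketched in a few lines. This is a minimal illustrative mock, not the authors' pipeline: `informalize` is a hypothetical stand-in for an actual language-model call that restates a formal theorem in natural language.

```python
# Sketch of MMA-style data generation: informal-formal pairs are produced
# by translating in the *reverse* direction, from formal statements to
# informal ones. `informalize` is a hypothetical placeholder for a
# language-model call; all names here are illustrative.

def informalize(formal_statement: str) -> str:
    """Stand-in for a language model prompted to restate a formal
    theorem in natural language."""
    return f"Informal restatement of: {formal_statement}"

def build_pairs(formal_statements):
    """Create (informal, formal) training pairs in the reverse
    (formal -> informal) direction."""
    return [(informalize(s), s) for s in formal_statements]

formal_corpus = [
    "theorem add_comm (a b : Nat) : a + b = b + a",
    "theorem mul_one (a : Nat) : a * 1 = a",
]
pairs = build_pairs(formal_corpus)
for informal, formal in pairs:
    print(informal, "|", formal)
```

The resulting (informal, formal) pairs would then serve as supervised fine-tuning data for the forward autoformalization direction.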
Related papers
- A Fundamental Trade-off in Aligned Language Models and its Relation to Sampling Adaptors [50.046717886067555]
Given a general language model and its aligned version, there exists a trade-off between the average reward and average log-likelihood of the strings under the general language model.
We provide a formal treatment of this issue and demonstrate how a choice of sampling adaptor allows for a selection of how much likelihood we exchange for the reward.
arXiv Detail & Related papers (2024-06-14T17:38:21Z)
- Split and Rephrase with Large Language Models [2.499907423888049]
The Split and Rephrase (SPRP) task consists of splitting complex sentences into a sequence of shorter grammatical sentences.
We evaluate large language models on the task, showing that they can provide large improvements over the state of the art on the main metrics.
arXiv Detail & Related papers (2023-12-18T10:16:37Z)
- Pre-trained Language Models Do Not Help Auto-regressive Text-to-Image Generation [86.65991476980648]
We adapt a pre-trained language model for auto-regressive text-to-image generation.
We find that pre-trained language models offer limited help.
arXiv Detail & Related papers (2023-11-27T07:19:26Z)
- Machine Translation to Control Formality Features in the Target Language [0.9208007322096532]
This research explores how machine learning methods can be used to translate from English into languages with formality distinctions.
This was done by training a bilingual model in a formality-controlled setting and comparing its performance with a pre-trained multilingual model.
We evaluate the official formality accuracy (ACC) by comparing the predicted masked tokens with the ground truth.
arXiv Detail & Related papers (2023-11-22T15:42:51Z)
- In What Languages are Generative Language Models the Most Formal? Analyzing Formality Distribution across Languages [2.457872341625575]
In this work, we focus on one language property highly influenced by culture: formality.
We analyze the formality distributions of predictions from XGLM and BLOOM, two popular generative multilingual language models, in 5 languages.
We classify 1,200 generations per language as formal, informal, or incohesive and measure the impact of the prompt formality on the predictions.
arXiv Detail & Related papers (2023-02-23T19:39:52Z)
- Controlling Translation Formality Using Pre-trained Multilingual Language Models [19.465727478912072]
This paper describes the University of Maryland's submission to the Special Task on Formality Control for Spoken Language Translation at IWSLT.
We investigate to what extent this problem can be addressed with a single multilingual model.
Results show that this strategy can approach the translation quality and formality control achieved by dedicated translation models.
arXiv Detail & Related papers (2022-05-13T13:47:28Z)
- Constrained Language Models Yield Few-Shot Semantic Parsers [73.50960967598654]
We explore the use of large pretrained language models as few-shot semantic parsers.
The goal in semantic parsing is to generate a structured meaning representation given a natural language input.
We use language models to paraphrase inputs into a controlled sublanguage resembling English that can be automatically mapped to a target meaning representation.
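The controlled-sublanguage idea above can be sketched as a deterministic mapping from canonical sentences to meaning representations. This is a toy illustration, not the paper's system: the grammar, patterns, and meaning-representation format are all invented for the example, and the language model's paraphrasing step is assumed to happen upstream.

```python
# Toy sketch of a controlled sublanguage that maps deterministically to a
# meaning representation. In the paraphrase-then-map approach, a language
# model would first rewrite free-form input into one of these canonical
# forms; here we only show the deterministic mapping step. All patterns
# and names are hypothetical.
import re

# Canonical sentence pattern -> meaning-representation template
GRAMMAR = [
    (re.compile(r"^list flights from (\w+) to (\w+)$"),
     "flights(origin={0}, dest={1})"),
    (re.compile(r"^show the capital of (\w+)$"),
     "capital(country={0})"),
]

def map_to_mr(canonical: str) -> str:
    """Map a canonical sentence to its meaning representation; raise if
    the sentence falls outside the controlled sublanguage."""
    for pattern, template in GRAMMAR:
        m = pattern.match(canonical)
        if m:
            return template.format(*m.groups())
    raise ValueError(f"not in the controlled sublanguage: {canonical!r}")

print(map_to_mr("list flights from boston to denver"))
```

Because the sublanguage-to-representation step is deterministic, all of the model's few-shot capacity is spent on paraphrasing into English-like canonical forms rather than on emitting the formal representation directly.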
arXiv Detail & Related papers (2021-04-18T08:13:06Z)
- Comparison of Interactive Knowledge Base Spelling Correction Models for Low-Resource Languages [81.90356787324481]
Spelling normalization for low resource languages is a challenging task because the patterns are hard to predict.
This work compares a neural model and character language models trained with varying amounts of target-language data.
Our usage scenario is interactive correction starting from nearly zero training examples, with models improving as more data is collected.
arXiv Detail & Related papers (2020-10-20T17:31:07Z)
- Data Annealing for Informal Language Understanding Tasks [66.2988222278475]
We propose a data annealing transfer learning procedure to bridge the performance gap on informal language tasks.
It successfully applies a pre-trained model such as BERT to informal language.
arXiv Detail & Related papers (2020-04-24T09:27:09Z)
- A Simple Joint Model for Improved Contextual Neural Lemmatization [60.802451210656805]
We present a simple joint neural model for lemmatization and morphological tagging that achieves state-of-the-art results on 20 languages.
Our paper describes the model in addition to training and decoding procedures.
arXiv Detail & Related papers (2019-04-04T02:03:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.