Formal Specifications from Natural Language
- URL: http://arxiv.org/abs/2206.01962v1
- Date: Sat, 4 Jun 2022 10:49:30 GMT
- Title: Formal Specifications from Natural Language
- Authors: Christopher Hahn, Frederik Schmitt, Julia J. Tillman, Niklas Metzger,
Julian Siber, Bernd Finkbeiner
- Abstract summary: We study the ability of language models to translate natural language into formal specifications with complex semantics.
In particular, we fine-tune off-the-shelf language models on three datasets consisting of structured English sentences.
- Score: 3.1806743741013657
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the ability of language models to translate natural language into
formal specifications with complex semantics. In particular, we fine-tune
off-the-shelf language models on three datasets consisting of structured
English sentences and their corresponding formal representation: 1) First-order
logic (FOL), commonly used in software verification and theorem proving; 2)
linear-time temporal logic (LTL), which forms the basis for industrial hardware
specification languages; and 3) regular expressions (regex), frequently used in
programming and search. Our experiments show that, in these diverse domains,
the language models achieve performance competitive with the respective
state-of-the-art, with the benefits of being easy to access, cheap to fine-tune,
and without a particular need for domain-specific reasoning. Additionally, we
show that the language models have a unique selling point: they benefit from
their generalization capabilities from pre-trained knowledge on natural
language, e.g., to generalize to unseen variable names.
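The three target formalisms can be made concrete with hypothetical input/output pairs (the sentences and target specifications below are our own illustrations, not samples from the paper's datasets); a minimal Python sketch:

```python
import re

# Illustrative natural-language -> formal-specification pairs for the three
# domains named in the abstract (FOL, LTL, regex). These are hand-written
# examples, not drawn from the fine-tuning data used in the paper.
examples = {
    # 1) First-order logic (FOL)
    "every human is mortal":
        "forall x. (Human(x) -> Mortal(x))",
    # 2) Linear-time temporal logic (LTL)
    "whenever the request signal is high, the grant signal is eventually high":
        "G (request -> F grant)",
    # 3) Regular expressions (regex)
    "lines containing the word 'error' followed later by a number":
        r"error.*\d+",
}

# The regex target is directly executable, so we can sanity-check it.
pattern = re.compile(
    examples["lines containing the word 'error' followed later by a number"]
)
assert pattern.search("error: code 404") is not None
assert pattern.search("all systems nominal") is None
```

Only the regex target can be checked mechanically here; validating the FOL and LTL outputs requires a theorem prover or model checker, respectively.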
Related papers
- Improving Arithmetic Reasoning Ability of Large Language Models through Relation Tuples, Verification and Dynamic Feedback [14.938401898546553]
We propose to use a semi-structured form to represent reasoning steps of large language models.
Specifically, we use relations, which are not only human-friendly but also machine-friendly, and are easier to verify than natural language.
arXiv Detail & Related papers (2024-06-25T18:21:00Z)
- How Proficient Are Large Language Models in Formal Languages? An In-Depth Insight for Knowledge Base Question Answering [52.86931192259096]
Knowledge Base Question Answering (KBQA) aims to answer natural language questions based on facts in knowledge bases.
Recent works leverage the capabilities of large language models (LLMs) for logical form generation to improve performance.
arXiv Detail & Related papers (2024-01-11T09:27:50Z)
- Using Large Language Models for (De-)Formalization and Natural Argumentation Exercises for Beginner's Students [0.0]
We describe two systems currently being developed that use large language models for the automated correction of exercises in translating back and forth between natural language and the languages of propositional logic and first-order predicate logic.
arXiv Detail & Related papers (2023-04-12T23:05:02Z)
- nl2spec: Interactively Translating Unstructured Natural Language to Temporal Logics with Large Language Models [3.1143846686797314]
We present nl2spec, a framework for applying Large Language Models (LLMs) to derive formal specifications from unstructured natural language.
We introduce a new methodology to detect and resolve the inherent ambiguity of system requirements in natural language.
Users iteratively add, delete, and edit these sub-translations to amend erroneous formalizations, which is easier than manually redrafting the entire formalization.
arXiv Detail & Related papers (2023-03-08T20:08:53Z)
- Language Embeddings Sometimes Contain Typological Generalizations [0.0]
We train neural models for a range of natural language processing tasks on a massively multilingual dataset of Bible translations in 1295 languages.
The learned language representations are then compared to existing typological databases as well as to a novel set of quantitative syntactic and morphological features.
We conclude that some generalizations are surprisingly close to traditional features from linguistic typology, but that most models, as well as those of previous work, do not appear to have made linguistically meaningful generalizations.
arXiv Detail & Related papers (2023-01-19T15:09:59Z)
- Language Models as Inductive Reasoners [125.99461874008703]
We propose a new paradigm (task) for inductive reasoning, which is to induce natural language rules from natural language facts.
We create a dataset termed DEER containing 1.2k rule-fact pairs for the task, where rules and facts are written in natural language.
We provide the first and comprehensive analysis of how well pretrained language models can induce natural language rules from natural language facts.
arXiv Detail & Related papers (2022-12-21T11:12:14Z)
- CLSE: Corpus of Linguistically Significant Entities [58.29901964387952]
We release a Corpus of Linguistically Significant Entities (CLSE) annotated by experts.
CLSE covers 74 different semantic types to support various applications from airline ticketing to video games.
We create a linguistically representative NLG evaluation benchmark in three languages: French, Marathi, and Russian.
arXiv Detail & Related papers (2022-11-04T12:56:12Z)
- Benchmarking Language Models for Code Syntax Understanding [79.11525961219591]
Pre-trained language models have demonstrated impressive performance in both natural language processing and program understanding.
In this work, we perform the first thorough benchmarking of the state-of-the-art pre-trained models for identifying the syntactic structures of programs.
Our findings point out key limitations of existing pre-training methods for programming languages, and suggest the importance of modeling code syntactic structures.
arXiv Detail & Related papers (2022-10-26T04:47:18Z)
- Transparency Helps Reveal When Language Models Learn Meaning [71.96920839263457]
Our systematic experiments with synthetic data reveal that, with languages where all expressions have context-independent denotations, both autoregressive and masked language models learn to emulate semantic relations between expressions.
Turning to natural language, our experiments with a specific phenomenon -- referential opacity -- add to the growing body of evidence that current language models do not well-represent natural language semantics.
arXiv Detail & Related papers (2022-10-14T02:35:19Z)
- Constrained Language Models Yield Few-Shot Semantic Parsers [73.50960967598654]
We explore the use of large pretrained language models as few-shot semantic parsers.
The goal in semantic parsing is to generate a structured meaning representation given a natural language input.
We use language models to paraphrase inputs into a controlled sublanguage resembling English that can be automatically mapped to a target meaning representation.
arXiv Detail & Related papers (2021-04-18T08:13:06Z)
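The second stage of the approach summarized above, mapping a controlled sublanguage to a meaning representation, can be deterministic once the paraphrase step has done its work. A toy sketch (our own illustration, not the paper's system, with an invented utterance pattern and meaning representation):

```python
import re

# Pattern for one utterance type in a hypothetical controlled sublanguage.
CANONICAL = re.compile(r"list flights from (\w+) to (\w+)")

def canonical_to_mr(utterance: str) -> str:
    """Deterministically map a canonical utterance to a meaning representation.

    An LLM would first paraphrase free-form input (e.g. "what can I fly
    Boston->Denver?") into this controlled form; only then does this rule fire.
    """
    m = CANONICAL.fullmatch(utterance)
    if m is None:
        raise ValueError("utterance is not in the controlled sublanguage")
    return f"(flights :from {m.group(1)} :to {m.group(2)})"

print(canonical_to_mr("list flights from Boston to Denver"))
# -> (flights :from Boston :to Denver)
```

The design point is that all the hard, ambiguous work lands on the paraphrase step, while the mapping to the target representation stays trivially verifiable.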
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences arising from its use.