Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning
- URL: http://arxiv.org/abs/2309.10814v2
- Date: Fri, 29 Mar 2024 01:18:18 GMT
- Title: Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning
- Authors: Tianhua Zhang, Jiaxin Ge, Hongyin Luo, Yung-Sung Chuang, Mingye Gao, Yuan Gong, Xixin Wu, Yoon Kim, Helen Meng, James Glass
- Abstract summary: We propose natural language embedded programs (NLEP) as a unifying framework for addressing math/symbolic reasoning, natural language understanding, and instruction following tasks.
Our approach prompts a language model to generate full Python programs that define functions over data structures which contain natural language representations of structured knowledge.
A Python interpreter then executes the generated code and prints the output.
- Score: 84.12154024070024
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: How can we perform computations over natural language representations to solve tasks that require symbolic and numeric reasoning? We propose natural language embedded programs (NLEP) as a unifying framework for addressing math/symbolic reasoning, natural language understanding, and instruction following tasks. Our approach prompts a language model to generate full Python programs that define functions over data structures which contain natural language representations of structured knowledge. A Python interpreter then executes the generated code and prints the output. Despite using a task-general prompt, we find that this approach can improve upon strong baselines across a range of different tasks including math and symbolic reasoning, text classification, question answering, and instruction following. We also find that the generated programs are interpretable, since they outline the exact reasoning process followed by the program interpreter.
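To make the NLEP pattern concrete, below is a minimal, hypothetical sketch of the kind of Python program the framework might generate: a data structure holds natural-language statements of the relevant knowledge, a function operates over it, and running the program prints the answer. The specific table and function here are illustrative assumptions, not the authors' released prompts or code.

```python
# Hypothetical example of an NLEP-style generated program (illustrative only).

# Step 1: a data structure containing natural-language representations
# of the structured knowledge needed for the task.
capital_facts = {
    "France": "Paris is the capital of France.",
    "Japan": "Tokyo is the capital of Japan.",
    "Brazil": "Brasilia is the capital of Brazil.",
}

# Step 2: functions defined over that data structure.
def answer_capital_question(country: str) -> str:
    """Return the stored natural-language fact for the given country."""
    return capital_facts.get(country, f"No stored knowledge about {country}.")

# Step 3: the Python interpreter executes the program and prints the output,
# which serves as the model's final answer.
if __name__ == "__main__":
    print(answer_capital_question("Japan"))
```

In the framework described above, such a program would be produced by the language model from a task-general prompt and then run by a standard Python interpreter, with the printed output taken as the prediction.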
Related papers
- Which Programming Language and What Features at Pre-training Stage Affect Downstream Logical Inference Performance? [26.91104188917787]
Large language models (LLMs) have demonstrated remarkable generalization abilities in mathematics and logical reasoning tasks.
Our research aims to verify which programming languages and features during pre-training affect logical inference performance.
arXiv Detail & Related papers (2024-10-09T10:13:13Z) - CRUXEval-X: A Benchmark for Multilingual Code Reasoning, Understanding and Execution [50.7413285637879]
The CRUXEVAL-X code reasoning benchmark contains 19 programming languages.
It comprises at least 600 subjects for each language, along with 19K content-consistent tests in total.
Even a model trained solely on Python can achieve at most 34.4% Pass@1 in other languages.
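For reference, Pass@1 figures like the one above are typically computed with the unbiased pass@k estimator of Chen et al. (2021); whether CRUXEval-X uses exactly this estimator is an assumption. A minimal sketch:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: n sampled programs, c of which pass all tests."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 10 samples per problem, 3 passing -> pass@1 estimate of 0.3.
print(pass_at_k(n=10, c=3, k=1))
```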
arXiv Detail & Related papers (2024-08-23T11:43:00Z) - AIOS Compiler: LLM as Interpreter for Natural Language Programming and Flow Programming of AI Agents [38.580779075892636]
We develop a novel system for Code Representation and Execution (CoRE).
The proposed system unifies natural language programming, pseudo-code programming, and flow programming under the same representation for constructing language agents.
During the execution, we incorporate external memory to minimize redundancy.
arXiv Detail & Related papers (2024-05-11T04:29:03Z) - Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models [17.76252625790628]
This paper presents Think-and-Execute, a framework that decomposes the reasoning process of language models into two steps.
With extensive experiments on seven algorithmic reasoning tasks, we demonstrate the effectiveness of Think-and-Execute.
arXiv Detail & Related papers (2024-04-03T08:49:11Z) - Understanding Programs by Exploiting (Fuzzing) Test Cases [26.8259045248779]
We propose to incorporate the relationship between inputs and possible outputs/behaviors into learning, for achieving a deeper semantic understanding of programs.
To obtain inputs that are representative enough to trigger the execution of most part of the code, we resort to fuzz testing and propose fuzz tuning.
The effectiveness of the proposed method is verified on two program understanding tasks, code clone detection and code classification, where it outperforms current state-of-the-art methods by large margins.
arXiv Detail & Related papers (2023-05-23T01:51:46Z) - Language Models as Inductive Reasoners [125.99461874008703]
We propose a new paradigm (task) for inductive reasoning, which is to induce natural language rules from natural language facts.
We create a dataset termed DEER containing 1.2k rule-fact pairs for the task, where rules and facts are written in natural language.
We provide the first comprehensive analysis of how well pretrained language models can induce natural language rules from natural language facts.
arXiv Detail & Related papers (2022-12-21T11:12:14Z) - Python Code Generation by Asking Clarification Questions [57.63906360576212]
In this work, we introduce a novel and more realistic setup for this task.
We hypothesize that the under-specification of a natural language description can be resolved by asking clarification questions.
We collect and introduce a new dataset named CodeClarQA containing pairs of natural language descriptions and code, together with synthetically created clarification questions and answers.
arXiv Detail & Related papers (2022-12-19T22:08:36Z) - Categorical Tools for Natural Language Processing [0.0]
This thesis develops the translation between category theory and computational linguistics.
The three chapters deal with syntax, semantics and pragmatics.
The resulting functorial models can be composed to form games where equilibria are the solutions of language processing tasks.
arXiv Detail & Related papers (2022-12-13T15:12:37Z) - Benchmarking Language Models for Code Syntax Understanding [79.11525961219591]
Pre-trained language models have demonstrated impressive performance in both natural language processing and program understanding.
In this work, we perform the first thorough benchmarking of the state-of-the-art pre-trained models for identifying the syntactic structures of programs.
Our findings point out key limitations of existing pre-training methods for programming languages, and suggest the importance of modeling code syntactic structures.
arXiv Detail & Related papers (2022-10-26T04:47:18Z) - Natural Language to Code Translation with Execution [82.52142893010563]
We propose execution result-based minimum Bayes risk decoding for program selection (a minimal sketch of this selection rule appears after this entry).
We show that it improves the few-shot performance of pretrained code models on natural-language-to-code tasks.
arXiv Detail & Related papers (2022-04-25T06:06:08Z)
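The execution-based selection idea from the last entry can be sketched as follows: sample several candidate programs, execute each, and return the candidate whose execution result agrees with the most other candidates (minimum Bayes risk under an exact-match loss on outputs). This is an illustrative sketch under stated assumptions, not the paper's released implementation; the candidate programs, the `result` variable protocol, and the test input are hypothetical.

```python
from collections import Counter

def run_candidate(program: str, test_input: int):
    """Execute one candidate and return its `result` value (hypothetical protocol)."""
    scope = {"x": test_input}
    try:
        exec(program, scope)      # run the generated code
        return scope.get("result")
    except Exception:
        return None               # a crashing candidate never wins the vote

def mbr_exec_select(candidates, test_input: int) -> str:
    """Return the candidate whose execution output is most common among all candidates."""
    outputs = [run_candidate(p, test_input) for p in candidates]
    counts = Counter(o for o in outputs if o is not None)
    majority_output, _ = counts.most_common(1)[0]   # assumes at least one candidate runs
    return next(p for p, o in zip(candidates, outputs) if o == majority_output)

# Hypothetical candidates for "square the input":
candidates = ["result = x * x", "result = x ** 2", "result = x + x"]
print(mbr_exec_select(candidates, test_input=3))  # selects "result = x * x"
```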
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences.