Investigating the (De)Composition Capabilities of Large Language Models in Natural-to-Formal Language Conversion
- URL: http://arxiv.org/abs/2501.14649v1
- Date: Fri, 24 Jan 2025 17:15:09 GMT
- Title: Investigating the (De)Composition Capabilities of Large Language Models in Natural-to-Formal Language Conversion
- Authors: Ziyao Xu, Houfeng Wang
- Abstract summary: Large language models (LLMs) need strong decomposition and composition capabilities to achieve generalized and robust natural-to-formal language conversion (N2F). We propose the DEDC framework, which semi-automatically performs sample and task construction, allowing decoupled evaluation of the decomposition and composition capabilities of LLMs in N2F. Our work provides a new perspective for investigating these basic capabilities of LLMs in N2F.
- Score: 21.68354181391989
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To achieve generalized and robust natural-to-formal language conversion (N2F), large language models (LLMs) need strong decomposition and composition capabilities when faced with an unfamiliar formal language, and must be able to cope with compositional gaps and counter-intuitive symbolic names. To investigate whether LLMs have this set of basic capabilities in N2F, we propose the DEDC framework. This framework semi-automatically performs sample and task construction, allowing decoupled evaluation of the decomposition and composition capabilities of LLMs in N2F. Based on this framework, we evaluate and analyze the most advanced LLMs. The main findings are that: (1) LLMs are deficient in both decomposition and composition; (2) LLMs show a wide range of error types attributable to deficiencies in natural language understanding and in the learning and use of symbolic systems; (3) compositional gaps and counter-intuitive symbolic names affect both the decomposition and the composition of LLMs. Our work provides a new perspective for investigating these basic capabilities, and the detailed analysis of deficiencies and their attributions can inform subsequent improvements of LLMs.
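To make the setting concrete, the following is a minimal sketch of what a DEDC-style probe could look like. The toy formal language, the counter-intuitive operator names, and the hard-coded decomposition are illustrative assumptions, not the paper's actual evaluation materials.

```python
# Hypothetical DEDC-style probe (illustrative only, not the paper's data).
# Counter-intuitive symbolic names: in this toy formal language, "MUL"
# denotes addition and "ADD" denotes multiplication, so a model cannot
# rely on surface intuition about operator names.

def decompose(utterance: str):
    """Decomposition target: split the utterance into atomic steps.
    Hard-coded for one example here; in evaluation the LLM produces it."""
    assert utterance == "two plus three, then times four"
    return [("MUL", ["2", "3"]),   # "plus"  maps to MUL (counter-intuitive)
            ("ADD", ["4"])]        # "times" maps to ADD (counter-intuitive)

def compose(steps):
    """Composition target: nest already-decomposed steps into a single
    formal expression, testing composition in isolation."""
    op, args = steps[0]
    expr = f"{op}({', '.join(args)})"
    for op, args in steps[1:]:
        expr = f"{op}({', '.join([expr] + args)})"
    return expr

print(compose(decompose("two plus three, then times four")))
# -> ADD(MUL(2, 3), 4)
```

Evaluating `decompose` and `compose` as separate tasks is the kind of decoupling the framework aims for: a model can be probed on splitting an utterance into steps independently of its ability to nest given steps into a well-formed expression.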
Related papers
- Faithful and Robust LLM-Driven Theorem Proving for NLI Explanations [13.485604499678262]
Natural language explanations play a fundamental role in Natural Language Inference (NLI). Recent work has shown that the interaction of large language models (LLMs) with theorem provers (TPs) can help verify and improve the validity of NLI explanations. This paper investigates strategies to alleviate semantic loss during autoformalisation.
arXiv Detail & Related papers (2025-05-30T06:38:39Z)
- Feasibility with Language Models for Open-World Compositional Zero-Shot Learning [96.6544564242316]
In Open-World Compositional Zero-Shot Learning, all possible state-object combinations are considered as unseen classes. Our work focuses on using external auxiliary knowledge to determine the feasibility of state-object combinations.
arXiv Detail & Related papers (2025-05-16T12:37:08Z)
- SR-LLM: Rethinking the Structured Representation in Large Language Model [25.876300810298797]
We propose SR-LLM to explore a superior way of integrating structured representation with Large Language Models.
Performance improvements were observed on widely used downstream datasets, with particularly notable gains of 3.17% and 12.38% on PAWS.
arXiv Detail & Related papers (2025-02-20T08:17:56Z)
- Enhancing LLM Character-Level Manipulation via Divide and Conquer [74.55804812450164]
Large Language Models (LLMs) have demonstrated strong generalization capabilities across a wide range of natural language processing (NLP) tasks.
However, they exhibit notable weaknesses in character-level string manipulation, struggling with fundamental operations such as character deletion, insertion, and substitution.
We propose Character-Level Manipulation via Divide and Conquer, a novel approach designed to bridge the gap between token-level processing and character-level manipulation.
arXiv Detail & Related papers (2025-02-12T07:37:39Z)
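The divide-and-conquer idea in the entry above can be illustrated with a small sketch; the function names and the per-character decomposition are assumptions for illustration, not the paper's exact method.

```python
# Illustrative sketch (not the paper's exact algorithm): divide a token
# into single characters so each edit becomes trivial, apply the edit
# character by character, then combine the characters back into a token.
# In an LLM pipeline, each phase would be carried out via prompting.

def delete_char(word: str, target: str) -> str:
    chars = list(word)                                # divide
    kept = [c for c in chars if c != target]          # conquer: per-char edit
    return "".join(kept)                              # combine

def substitute_char(word: str, old: str, new: str) -> str:
    chars = list(word)
    edited = [new if c == old else c for c in chars]
    return "".join(edited)

print(delete_char("strawberry", "r"))       # -> stawbey
print(substitute_char("kitten", "k", "s"))  # -> sitten
```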
- CryptoX: Compositional Reasoning Evaluation of Large Language Models [18.927129952741904]
We introduce CryptoX, an evaluation framework that combines existing benchmarks with cryptography to measure the compositional reasoning of LLMs.
We conduct detailed experiments on widely used open-source and closed-source LLMs using CryptoBench.
We highlight the value of independently studying compositional reasoning and emphasize the need to enhance the compositional reasoning capabilities of LLMs.
arXiv Detail & Related papers (2025-02-08T17:19:43Z)
- KcMF: A Knowledge-compliant Framework for Schema and Entity Matching with Fine-tuning-free LLMs [14.376057807754668]
Large language models (LLMs) suffer from hallucinations and confusion about task instructions.
We present the Knowledge-Compliant Matching Framework (KcMF) that addresses these issues without the need for domain-specific fine-tuning.
KcMF employs a pseudo-code-based task decomposition strategy, adopting task-specific natural language statements that guide LLM reasoning.
arXiv Detail & Related papers (2024-10-16T11:50:02Z)
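As a rough illustration of pseudo-code-based task decomposition, a matching task can be broken into explicit steps that are then rendered as natural language instructions. The step wording and prompt template below are assumptions, not KcMF's actual templates.

```python
# Hypothetical sketch of pseudo-code-based task decomposition for
# schema matching; the steps and wording are illustrative assumptions.

PSEUDO_CODE_STEPS = [
    "step_1: read the source attribute name and its description",
    "step_2: read the candidate target attribute name and description",
    "step_3: compare their meanings, ignoring naming conventions",
    "step_4: answer 'match' or 'no match' with a one-line reason",
]

def build_prompt(source: str, target: str) -> str:
    """Render the decomposed steps as task-specific NL statements."""
    instructions = "\n".join(PSEUDO_CODE_STEPS)
    return (f"Follow these steps to decide whether two schema "
            f"attributes match:\n{instructions}\n"
            f"Source: {source}\nTarget: {target}")

print(build_prompt("cust_id: customer identifier",
                   "client_no: unique number of a client"))
```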
- Enhancing LLM's Cognition via Structurization [41.13997892843677]
Large language models (LLMs) process input contexts through a causal and sequential perspective.
This paper presents a novel concept of context structurization.
Specifically, we transform the plain, unordered contextual sentences into well-ordered and hierarchically structurized elements.
arXiv Detail & Related papers (2024-07-23T12:33:58Z)
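A minimal sketch of what such structurization could look like follows; the scope/aspects schema and the rule-based grouping are assumptions for illustration, not the paper's actual method.

```python
# Hypothetical sketch of context structurization (the schema is an
# illustrative assumption): plain, unordered sentences are reorganized
# into well-ordered, hierarchical elements before being fed to the LLM.

flat_context = [
    "The ranker scores candidate trees.",
    "The system has two components.",
    "The parser converts text to trees.",
]

def structurize(sentences):
    """Toy rule-based grouping; real structurization would itself be
    performed by an LLM or a trained model."""
    scope = next(s for s in sentences if "system" in s)
    aspects = {s.split()[1]: s for s in sentences if s != scope}
    return {"scope": scope, "aspects": aspects}

structured = structurize(flat_context)
print(structured["scope"])                       # high-level scope
for name, detail in structured["aspects"].items():
    print(" -", name, ":", detail)               # hierarchical details
```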
- Benchmarking Complex Instruction-Following with Multiple Constraints Composition [72.82640456309821]
How to evaluate the complex instruction-following ability of large language models (LLMs) has become a critical research problem.
Existing benchmarks mainly focus on modeling different types of constraints in human instructions while neglecting the composition of different constraints.
We propose ComplexBench, a benchmark for comprehensively evaluating the ability of LLMs to follow complex instructions composed of multiple constraints.
arXiv Detail & Related papers (2024-07-04T14:50:45Z)
- Large Language Models are Interpretable Learners [53.56735770834617]
In this paper, we show that a combination of Large Language Models (LLMs) and symbolic programs can bridge the gap between expressiveness and interpretability.
The pretrained LLM with natural language prompts provides a massive set of interpretable modules that can transform raw input into natural language concepts.
As the knowledge learned by an LSP (LLM-based Symbolic Program) is a combination of natural language descriptions and symbolic rules, it is easily transferable to humans (interpretable) and other LLMs.
arXiv Detail & Related papers (2024-06-25T02:18:15Z)
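The combination of LLM modules and symbolic rules can be sketched roughly as follows; the concept names, the decision rule, and the `llm_concept` stub are illustrative assumptions, not the paper's actual programs.

```python
# Hypothetical sketch of combining LLM modules with a symbolic program.
# llm_concept stands in for a prompted LLM that maps raw input to a
# natural-language concept; here it is stubbed with keyword rules so
# the example runs without a model.

def llm_concept(prompt: str, text: str) -> bool:
    """Stub for an LLM call such as 'Does this review mention price?'"""
    keywords = {"mentions price": ["price", "cost", "$"],
                "is positive": ["great", "love", "excellent"]}
    return any(k in text.lower() for k in keywords[prompt])

def symbolic_program(text: str) -> str:
    """Symbolic rule composed over interpretable, LLM-derived concepts."""
    if llm_concept("is positive", text) and not llm_concept("mentions price", text):
        return "recommend"
    return "review manually"

print(symbolic_program("Great product, I love it."))   # -> recommend
print(symbolic_program("Great value for the price."))  # -> review manually
```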
- Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing [56.75702900542643]
We introduce AlphaLLM for the self-improvement of Large Language Models.
It integrates Monte Carlo Tree Search (MCTS) with LLMs to establish a self-improving loop.
Our experimental results show that AlphaLLM significantly enhances the performance of LLMs without additional annotations.
arXiv Detail & Related papers (2024-04-18T15:21:34Z)
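The MCTS-plus-LLM loop can be sketched at a very high level; the function names, the two-step rollout, and the critic stub are assumptions, not AlphaLLM's actual components.

```python
# Hypothetical skeleton of an MCTS-based self-improvement loop; all
# components are stubs standing in for an LLM policy, a critic/value
# model, and a fine-tuning step.
import random

def llm_propose(state):
    """Stub for an LLM policy proposing the next reasoning step."""
    return state + [random.random()]

def critic_score(trajectory):
    """Stub for a critic/value model scoring a full trajectory."""
    return sum(trajectory)

def mcts_search(root, simulations=16):
    """Heavily simplified stand-in for MCTS selection, expansion,
    evaluation, and backpropagation: keep the best-scoring rollout."""
    best, best_score = None, float("-inf")
    for _ in range(simulations):
        trajectory = llm_propose(llm_propose(root))  # two-step rollout
        score = critic_score(trajectory)
        if score > best_score:
            best, best_score = trajectory, score
    return best

def self_improve(rounds=3):
    """Self-improving loop: search yields better trajectories, which
    would then be used to fine-tune the LLM (fine-tuning stubbed out)."""
    collected = []
    for _ in range(rounds):
        collected.append(mcts_search(root=[]))
        # stub: fine-tune the policy on `collected` before the next round
    return collected

print(len(self_improve()))  # -> 3
```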
- CausalBench: A Comprehensive Benchmark for Causal Learning Capability of LLMs [27.362012903540492]
The ability to understand causality significantly impacts the competence of large language models (LLMs) in output explanation and counterfactual reasoning.
arXiv Detail & Related papers (2024-04-09T14:40:08Z)
- FAC$^2$E: Better Understanding Large Language Model Capabilities by Dissociating Language and Cognition [56.76951887823882]
Large language models (LLMs) are primarily evaluated by overall performance on various text understanding and generation tasks.
We present FAC$^2$E, a framework for Fine-grAined and Cognition-grounded LLMs' Capability Evaluation.
arXiv Detail & Related papers (2024-02-29T21:05:37Z)
- FaithLM: Towards Faithful Explanations for Large Language Models [67.29893340289779]
Large Language Models (LLMs) have become proficient in addressing complex tasks by leveraging their internal knowledge and reasoning capabilities.
The black-box nature of these models complicates the task of explaining their decision-making processes.
We introduce FaithLM to explain the decisions of LLMs with natural language (NL) explanations.
arXiv Detail & Related papers (2024-02-07T09:09:14Z)
- Improving Open Information Extraction with Large Language Models: A Study on Demonstration Uncertainty [52.72790059506241]
The Open Information Extraction (OIE) task aims at extracting structured facts from unstructured text.
Despite the potential of large language models (LLMs) like ChatGPT as a general task solver, they lag behind state-of-the-art (supervised) methods in OIE tasks.
arXiv Detail & Related papers (2023-09-07T01:35:24Z)