Related papers: Self-Correction Makes LLMs Better Parsers

URL: http://arxiv.org/abs/2504.14165v1
Date: Sat, 19 Apr 2025 03:50:59 GMT
Title: Self-Correction Makes LLMs Better Parsers
Authors: Ziyan Zhang, Yang Hou, Chen Gong, Zhenghua Li
Abstract summary: Large language models (LLMs) have achieved remarkable success across various natural language processing (NLP) tasks. Recent studies suggest that they still face challenges in performing fundamental NLP tasks essential for deep language understanding. We propose a self-correction method that leverages grammar rules from existing treebanks to guide LLMs in correcting previous errors.
Score: 19.20952673157709
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large language models (LLMs) have achieved remarkable success across various natural language processing (NLP) tasks. However, recent studies suggest that they still face challenges in performing fundamental NLP tasks essential for deep language understanding, particularly syntactic parsing. In this paper, we conduct an in-depth analysis of LLM parsing capabilities, delving into the specific shortcomings of their parsing results. We find that these shortcomings may stem from LLMs' limited ability to fully leverage grammar rules in existing treebanks, which restricts their capability to generate valid syntactic structures. To help LLMs acquire knowledge without additional training, we propose a self-correction method that leverages grammar rules from existing treebanks to guide LLMs in correcting previous errors. Specifically, we automatically detect potential errors and dynamically search for relevant rules, offering hints and examples to guide LLMs in making corrections themselves. Experimental results on three datasets with various LLMs demonstrate that our method significantly improves performance in both in-domain and cross-domain settings on the English and Chinese datasets.

Related papers

Prompt and circumstance: A word-by-word LLM prompting approach to interlinear glossing for low-resource languages [6.4977738682502295]
We investigate the effectiveness of a retrieval-based LLM prompting approach to glossing, applied to the seven languages from the SIGMORPHON 2023 shared task. Our system beats the BERT-based shared task baseline for every language in the morpheme-level score category. In a case study on Tsez, we ask the LLM to automatically create and follow linguistic instructions, reducing errors on a confusing grammatical feature.
arXiv Detail & Related papers (2025-02-13T21:23:16Z)
Can LLMs Help Create Grammar?: Automating Grammar Creation for Endangered Languages with In-Context Learning [0.0]
This paper explores how Large Language Models (LLMs) can assist in generating grammatical information for low-resource languages with a limited amount of data. Our methodology involves organising the existing linguistic data and prompting LLMs to efficiently generate formal XLE grammar. This study highlights the potential of LLMs to enhance language documentation efforts, providing a cost-effective solution for generating linguistic data and contributing to the preservation of endangered languages.
arXiv Detail & Related papers (2024-12-14T20:43:12Z)
Tokenization Matters! Degrading Large Language Models through Challenging Their Tokenization [12.885866125783618]
Large Language Models (LLMs) tend to produce inaccurate responses to specific queries. We construct an adversarial dataset, named ADT (Adversarial Dataset for Tokenizer), to challenge LLMs' tokenization. Our empirical results reveal that our ADT is highly effective at challenging the tokenization of leading LLMs, including GPT-4o, Llama-3, Qwen2.5-max and so on.
arXiv Detail & Related papers (2024-05-27T11:39:59Z)
Potential and Limitations of LLMs in Capturing Structured Semantics: A Case Study on SRL [78.80673954827773]
Large Language Models (LLMs) play a crucial role in capturing structured semantics to enhance language understanding, improve interpretability, and reduce bias. We propose using Semantic Role Labeling (SRL) as a fundamental task to explore LLMs' ability to extract structured semantics. We find interesting potential: LLMs can indeed capture semantic structures, and scaling-up doesn't always mirror potential. We are surprised to discover that significant overlap in the errors is made by both LLMs and untrained humans, accounting for almost 30% of all errors.
arXiv Detail & Related papers (2024-05-10T11:44:05Z)
Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing [56.75702900542643]
We introduce AlphaLLM for the self-improvement of Large Language Models. It integrates Monte Carlo Tree Search (MCTS) with LLMs to establish a self-improving loop. Our experimental results show that AlphaLLM significantly enhances the performance of LLMs without additional annotations.
arXiv Detail & Related papers (2024-04-18T15:21:34Z)
FAC$^2$E: Better Understanding Large Language Model Capabilities by Dissociating Language and Cognition [56.76951887823882]
Large language models (LLMs) are primarily evaluated by overall performance on various text understanding and generation tasks. We present FAC$^2$E, a framework for Fine-grAined and Cognition-grounded LLMs' Capability Evaluation.
arXiv Detail & Related papers (2024-02-29T21:05:37Z)
Rethinking the Roles of Large Language Models in Chinese Grammatical Error Correction [62.409807640887834]
Chinese Grammatical Error Correction (CGEC) aims to correct all potential grammatical errors in the input sentences. LLMs' performance as correctors on CGEC remains unsatisfactory due to its challenging task focus. We rethink the roles of LLMs in the CGEC task so that they can be better utilized and explored in CGEC.
arXiv Detail & Related papers (2024-02-18T01:40:34Z)
Self-Augmented In-Context Learning for Unsupervised Word Translation [23.495503962839337]
Large language models (LLMs) demonstrate strong word translation or bilingual lexicon induction (BLI) capabilities in few-shot setups. We propose self-augmented in-context learning (SAIL) for unsupervised BLI. Our method shows substantial gains over zero-shot prompting of LLMs on two established BLI benchmarks.
arXiv Detail & Related papers (2024-02-15T15:43:05Z)
Constituency Parsing using LLMs [47.17239291933248]
Constituency parsing is a fundamental yet unsolved challenge in natural language processing. We evaluate the performance of recent large language models (LLMs) under zero-shot, few-shot, and supervised fine-tuning learning paradigms. Motivated by this observation, we propose two strategies to guide LLMs to generate more accurate constituent trees.
arXiv Detail & Related papers (2023-10-30T11:39:11Z)
Survey on Factuality in Large Language Models: Knowledge, Retrieval and Domain-Specificity [61.54815512469125]
This survey addresses the crucial issue of factuality in Large Language Models (LLMs). As LLMs find applications across diverse domains, the reliability and accuracy of their outputs become vital.
arXiv Detail & Related papers (2023-10-11T14:18:03Z)
TRACE: A Comprehensive Benchmark for Continual Learning in Large Language Models [52.734140807634624]
Aligned large language models (LLMs) demonstrate exceptional capabilities in task-solving, following instructions, and ensuring safety. Existing continual learning benchmarks lack sufficient challenge for leading aligned LLMs. We introduce TRACE, a novel benchmark designed to evaluate continual learning in LLMs.
arXiv Detail & Related papers (2023-10-10T16:38:49Z)

This list is automatically generated from the titles and abstracts of the papers on this site.