Fine-tuning of Large Language Models for Constituency Parsing Using a Sequence to Sequence Approach
- URL: http://arxiv.org/abs/2510.16604v1
- Date: Sat, 18 Oct 2025 18:00:20 GMT
- Title: Fine-tuning of Large Language Models for Constituency Parsing Using a Sequence to Sequence Approach
- Authors: Francisco Jose Cortes Delgado, Eduardo Martinez Gracia, Rafael Valencia Garcia
- Abstract summary: This work explores a novel approach to phrase-structure analysis by fine-tuning large language models. The main objective is to extend the capabilities of MiSintaxis, a tool designed for teaching Spanish syntax. The results demonstrate high accuracy in phrase-structure analysis and highlight the potential of this methodology.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Recent advances in natural language processing with large neural models have opened new possibilities for syntactic analysis based on machine learning. This work explores a novel approach to phrase-structure analysis by fine-tuning large language models (LLMs) to translate an input sentence into its corresponding syntactic structure. The main objective is to extend the capabilities of MiSintaxis, a tool designed for teaching Spanish syntax. Several models from the Hugging Face repository were fine-tuned using training data generated from the AnCora-ES corpus, and their performance was evaluated using the F1 score. The results demonstrate high accuracy in phrase-structure analysis and highlight the potential of this methodology.
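The abstract describes the method only at a high level: a sentence goes in, a bracketed phrase-structure string comes out, learned by fine-tuning a seq2seq model on pairs derived from AnCora-ES and scored with F1. Below is a minimal sketch of that kind of fine-tuning loop with Hugging Face APIs; the model choice, the example (sentence, tree) pair, and all hyperparameters are illustrative assumptions, not details from the paper.

```python
# Minimal sketch: fine-tune a seq2seq model to emit bracketed constituency trees.
# Model choice, example data, and hyperparameters are illustrative assumptions.
from transformers import (AutoTokenizer, AutoModelForSeq2SeqLM,
                          Seq2SeqTrainer, Seq2SeqTrainingArguments,
                          DataCollatorForSeq2Seq)
from datasets import Dataset

# Hypothetical (sentence, bracketed tree) pairs derived from AnCora-ES.
pairs = [{"text": "El gato duerme .",
          "tree": "(S (SN (Det El) (N gato)) (SV (V duerme)) (. .))"}]
ds = Dataset.from_list(pairs)

name = "google/mt5-small"  # assumption: any Spanish-capable seq2seq model
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForSeq2SeqLM.from_pretrained(name)

def encode(batch):
    enc = tok(batch["text"], truncation=True, max_length=128)
    enc["labels"] = tok(text_target=batch["tree"],
                        truncation=True, max_length=256)["input_ids"]
    return enc

ds = ds.map(encode, batched=True, remove_columns=["text", "tree"])
args = Seq2SeqTrainingArguments(output_dir="parser", num_train_epochs=3,
                                per_device_train_batch_size=8,
                                predict_with_generate=True)
Seq2SeqTrainer(model=model, args=args, train_dataset=ds,
               data_collator=DataCollatorForSeq2Seq(tok, model=model)).train()
```

At evaluation time, the generated bracketings would be compared against gold trees using F1 over constituents, as the abstract indicates.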
Related papers
- Align-SLM: Textless Spoken Language Models with Reinforcement Learning from AI Feedback [50.84142264245052]
This work introduces the Align-SLM framework to enhance the semantic understanding of textless Spoken Language Models (SLMs). Our approach generates multiple speech continuations from a given prompt and uses semantic metrics to create preference data for Direct Preference Optimization (DPO). We evaluate the framework using ZeroSpeech 2021 benchmarks for lexical and syntactic modeling, the spoken version of the StoryCloze dataset for semantic coherence, and other speech generation metrics, including the GPT4-o score and human evaluation.
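As a rough illustration of the preference-data step described above (the real pipeline operates on speech units), the following sketch samples several continuations per prompt, ranks them with a semantic metric, and keeps a best/worst pair for DPO; `generate_continuation` and `semantic_score` are hypothetical stand-ins.

```python
# Sketch of DPO preference-pair construction; helper functions are hypothetical.
def build_dpo_pairs(prompts, n_samples=4):
    pairs = []
    for p in prompts:
        cands = [generate_continuation(p) for _ in range(n_samples)]
        ranked = sorted(cands, key=semantic_score, reverse=True)
        pairs.append({"prompt": p,
                      "chosen": ranked[0],     # highest semantic score
                      "rejected": ranked[-1]}) # lowest semantic score
    return pairs
```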
arXiv Detail & Related papers (2024-11-04T06:07:53Z)
- Split and Rephrase with Large Language Models [2.499907423888049]
The Split and Rephrase (SPRP) task consists of splitting complex sentences into a sequence of shorter grammatical sentences.
We evaluate large language models on the task, showing that they can provide large improvements over the state of the art on the main metrics.
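As an illustration of the task format only (the prompt wording below is an assumption, not the paper's):

```python
# Hypothetical prompt illustrating the SPRP task format.
prompt = (
    "Split the following complex sentence into a sequence of shorter, "
    "grammatical sentences that preserve its meaning.\n\n"
    "Sentence: After the storm passed, the crew, who had been sheltering "
    "below deck, resumed repairs.\n"
    "Rewritten:"
)
# Expected style of output:
#   "The storm passed. The crew had been sheltering below deck. "
#   "The crew resumed repairs."
```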
arXiv Detail & Related papers (2023-12-18T10:16:37Z)
- RAVEN: In-Context Learning with Retrieval-Augmented Encoder-Decoder Language Models [57.12888828853409]
RAVEN is a model that combines retrieval-augmented masked language modeling and prefix language modeling.
Fusion-in-Context Learning enables the model to leverage more in-context examples without requiring additional training.
Our work underscores the potential of retrieval-augmented encoder-decoder language models for in-context learning.
arXiv Detail & Related papers (2023-08-15T17:59:18Z)
- Syntax-Aware Complex-Valued Neural Machine Translation [14.772317918560548]
We propose a method to incorporate syntax information into a complex-valued Encoder-Decoder architecture.
The proposed model jointly learns word-level and syntax-level attention scores from the source side to the target side using an attention mechanism.
The experimental results demonstrate that the proposed method can bring significant improvements in BLEU scores on two datasets.
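A toy illustration of attention over complex-valued representations, far simpler than the paper's architecture: scores here are taken as the real part of a Hermitian inner product, which is one common way to get real attention weights from complex vectors.

```python
# Toy complex-valued attention; a simplification, not the paper's implementation.
import torch

d = 8
q = torch.randn(5, d, dtype=torch.cfloat)   # queries (target side)
k = torch.randn(7, d, dtype=torch.cfloat)   # keys (source side)
v = torch.randn(7, d, dtype=torch.cfloat)   # values

scores = torch.matmul(q, k.conj().t()).real / d ** 0.5  # real-valued scores
weights = torch.softmax(scores, dim=-1)                 # attention weights
context = torch.matmul(weights.to(torch.cfloat), v)     # complex context vectors
```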
arXiv Detail & Related papers (2023-07-17T15:58:05Z)
- Scalable Learning of Latent Language Structure With Logical Offline Cycle Consistency [71.42261918225773]
Conceptually, LOCCO can be viewed as a form of self-learning where the semantic parser being trained is used to generate annotations for unlabeled text.
As an added bonus, the annotations produced by LOCCO can be trivially repurposed to train a neural text generation model.
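A generic self-learning loop in the spirit of this description might look as follows; all names are hypothetical, and the actual method additionally enforces the logical cycle consistency named in the title.

```python
# Generic self-learning loop: the parser annotates unlabeled text, and
# confident annotations are folded back into training. Names are hypothetical.
def self_learning_round(parser, labeled, unlabeled, threshold=0.9):
    parser.train_on(labeled)
    pseudo = []
    for sentence in unlabeled:
        annotation, confidence = parser.parse_with_score(sentence)
        if confidence >= threshold:          # keep only confident annotations
            pseudo.append((sentence, annotation))
    return labeled + pseudo                  # enlarged training set
```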
arXiv Detail & Related papers (2023-05-31T16:47:20Z)
- Constructing Word-Context-Coupled Space Aligned with Associative Knowledge Relations for Interpretable Language Modeling [0.0]
The black-box structure of the deep neural network in pre-trained language models seriously limits the interpretability of the language modeling process.
A Word-Context-Coupled Space (W2CSpace) is proposed, introducing an alignment process between uninterpretable neural representations and interpretable statistical logic.
Our language model achieves better performance and highly credible interpretability compared to related state-of-the-art methods.
arXiv Detail & Related papers (2023-05-19T09:26:02Z)
- Towards Linguistically Informed Multi-Objective Pre-Training for Natural Language Inference [0.38233569758620045]
We introduce a linguistically enhanced combination of pre-training methods for transformers.
The pre-training objectives include POS-tagging, synset prediction based on semantic knowledge graphs, and parent prediction based on dependency parse trees.
Our approach achieves competitive results on the Natural Language Inference task, compared to the state of the art.
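A compact sketch of what such multi-objective pre-training can look like: one shared encoder with one head per linguistic objective, whose losses would be summed during training. Dimensions, head designs, and names are assumptions, not the paper's exact setup.

```python
# Sketch: shared encoder + one head per objective (POS, synsets, parents).
import torch.nn as nn

class MultiObjectiveModel(nn.Module):
    def __init__(self, encoder, hidden=768, n_pos=20, n_synsets=1000):
        super().__init__()
        self.encoder = encoder                           # shared transformer
        self.pos_head = nn.Linear(hidden, n_pos)         # POS tagging
        self.synset_head = nn.Linear(hidden, n_synsets)  # synset prediction
        self.parent_head = nn.Linear(hidden, hidden)     # dependency parents

    def forward(self, x):
        h = self.encoder(x)                              # (batch, seq, hidden)
        # Each token scores every token as its candidate parent.
        parent_scores = h @ self.parent_head(h).transpose(1, 2)
        return self.pos_head(h), self.synset_head(h), parent_scores
```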
arXiv Detail & Related papers (2022-12-14T10:50:13Z)
- Few-shot Subgoal Planning with Language Models [58.11102061150875]
We show that language priors encoded in pre-trained language models allow us to infer fine-grained subgoal sequences.
In contrast to recent methods which make strong assumptions about subgoal supervision, our experiments show that language models can infer detailed subgoal sequences without any fine-tuning.
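An illustration of inferring subgoals from a prompt alone, with no fine-tuning; the prompt wording and the `complete` helper are hypothetical.

```python
# Hypothetical few-shot prompt; `complete` stands in for any LM completion call.
prompt = """Task: make a cup of tea
Subgoals: 1. boil water 2. put a tea bag in a cup 3. pour water 4. steep

Task: wash a mug
Subgoals:"""

print(complete(prompt))  # e.g. "1. pick up mug 2. go to sink 3. rinse ..."
```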
arXiv Detail & Related papers (2022-05-28T01:03:30Z)
- TAGPRIME: A Unified Framework for Relational Structure Extraction [71.88926365652034]
TAGPRIME is a sequence tagging model that appends priming words about the information of the given condition to the input text.
With the self-attention mechanism in pre-trained language models, the priming words enrich the output contextualized representations with information about the given condition.
Extensive experiments and analyses on three different tasks that cover ten datasets across five different languages demonstrate the generality and effectiveness of TAGPRIME.
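A minimal sketch of this priming idea for input construction; the template and separator token are assumptions.

```python
# TAGPRIME-style input construction: priming words describing the condition
# (e.g. a relation type and head entity) are appended before sequence tagging.
def prime_input(tokens, condition_words):
    # e.g. tokens = ["Alice", "founded", "Acme"],
    #      condition_words = ["founder", "Alice"]
    return tokens + ["[SEP]"] + condition_words  # tag only the original span
```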
arXiv Detail & Related papers (2022-05-25T08:57:46Z)
- Exploiting Language Model for Efficient Linguistic Steganalysis: An Empirical Study [23.311007481830647]
We present two methods for efficient linguistic steganalysis.
One is to pre-train an RNN-based language model, and the other is to pre-train a sequence autoencoder.
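A skeleton of the first method: an RNN language model is pre-trained on cover text, and its encoder is then reused by a steganalysis classifier. Sizes and names are illustrative assumptions.

```python
# Skeleton of an RNN language model for pre-training; sizes are illustrative.
import torch.nn as nn

class RNNLM(nn.Module):
    def __init__(self, vocab=10000, dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab)     # next-token prediction

    def forward(self, x):
        h, _ = self.rnn(self.embed(x))
        return self.out(h)

# After pre-training, `embed` + `rnn` can initialize a binary classifier
# that decides whether a text carries hidden information.
```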
arXiv Detail & Related papers (2021-07-26T12:37:18Z)
- SLM: Learning a Discourse Language Representation with Sentence Unshuffling [53.42814722621715]
We introduce Sentence-level Language Modeling, a new pre-training objective for learning a discourse language representation.
We show that this feature of our model improves the performance of the original BERT by large margins.
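A sketch of how one such training example could be built, assuming the target is recovery of the original sentence order; the exact objective formulation is the paper's, not shown here.

```python
# Build a sentence-unshuffling example: shuffle a document's sentences and
# train the model to predict the permutation that restores the original order.
import random

def make_example(sentences):
    order = list(range(len(sentences)))
    random.shuffle(order)
    shuffled = [sentences[i] for i in order]
    target = sorted(range(len(order)), key=order.__getitem__)  # inverse perm
    return shuffled, target  # [shuffled[t] for t in target] == sentences
```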
arXiv Detail & Related papers (2020-10-30T13:33:41Z)
- Exploring Software Naturalness through Neural Language Models [56.1315223210742]
The Software Naturalness hypothesis argues that programming languages can be understood through the same techniques used in natural language processing.
We explore this hypothesis through the use of a pre-trained transformer-based language model to perform code analysis tasks.
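As a small illustration of treating code like natural language, a pre-trained masked language model can fill in a masked code token; the model name below is an assumption, not necessarily the one used in the paper.

```python
# Fill-mask over source code with a code-pretrained MLM (model is an assumption).
from transformers import pipeline

fill = pipeline("fill-mask", model="microsoft/codebert-base-mlm")
print(fill("def add(a, b): return a <mask> b")[0]["token_str"])
```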
arXiv Detail & Related papers (2020-06-22T21:56:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.