Align to Structure: Aligning Large Language Models with Structural Information
- URL: http://arxiv.org/abs/2504.03622v1
- Date: Fri, 04 Apr 2025 17:40:04 GMT
- Title: Align to Structure: Aligning Large Language Models with Structural Information
- Authors: Zae Myung Kim, Anand Ramachandran, Farideh Tavazoee, Joo-Kyung Kim, Oleg Rokhlenko, Dongyeop Kang
- Abstract summary: We introduce Structural Alignment, a novel method that aligns large language models with human-like discourse structures to enhance long-form text generation. We employ a dense reward scheme within a Proximal Policy Optimization framework, assigning fine-grained, token-level rewards based on each token's discourse distinctiveness relative to human writing.
- Score: 26.960069076925386
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Generating long, coherent text remains a challenge for large language models (LLMs), as they lack hierarchical planning and structured organization in discourse generation. We introduce Structural Alignment, a novel method that aligns LLMs with human-like discourse structures to enhance long-form text generation. By integrating linguistically grounded discourse frameworks into reinforcement learning, our approach guides models to produce coherent and well-organized outputs. We employ a dense reward scheme within a Proximal Policy Optimization framework, assigning fine-grained, token-level rewards based on each token's discourse distinctiveness relative to human writing. Two complementary reward models are evaluated: the first improves readability by scoring surface-level textual features to provide explicit structuring, while the second reinforces deeper coherence and rhetorical sophistication by analyzing global discourse patterns through hierarchical discourse motifs. Models trained with these rewards outperform both standard and RLHF-enhanced models in tasks such as essay generation and long-document summarization. All training data and code will be publicly shared at https://github.com/minnesotanlp/struct_align.
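The dense reward scheme lends itself to a short illustration. The sketch below is a hedged reconstruction, not the authors' code: `dense_token_rewards`, the toy numbers, and the per-token KL penalty against a frozen reference model are assumptions drawn from the common PPO-based RLHF recipe, with `discourse_scores` standing in for the per-token output of one of the paper's discourse reward models.

```python
import numpy as np

def dense_token_rewards(discourse_scores, logprobs_policy, logprobs_ref, kl_coef=0.1):
    """Combine per-token discourse scores with a per-token KL penalty.

    Minimal sketch only: a full PPO pipeline would additionally compute
    advantages, value targets, and clipped policy updates from these rewards.
    """
    scores = np.asarray(discourse_scores, dtype=float)
    kl = np.asarray(logprobs_policy, dtype=float) - np.asarray(logprobs_ref, dtype=float)
    return scores - kl_coef * kl  # one reward per generated token

# Toy usage over five generated tokens (all numbers hypothetical).
rewards = dense_token_rewards(
    discourse_scores=[0.2, 0.1, 0.4, 0.0, 0.3],
    logprobs_policy=[-1.2, -0.8, -2.0, -0.5, -1.0],
    logprobs_ref=[-1.0, -0.9, -1.8, -0.6, -1.1],
)
print(rewards)  # dense per-token rewards fed into the PPO update
```

The point of the dense scheme is that every generated token receives its own reward signal, so structural feedback reaches the exact positions where discourse organization succeeds or fails instead of being diluted into a single sequence-level score.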
Related papers
- Detecting Document-level Paraphrased Machine Generated Content: Mimicking Human Writing Style and Involving Discourse Features [57.34477506004105]
Machine-generated content poses challenges such as academic plagiarism and the spread of misinformation.
We introduce novel methodologies and datasets to overcome these challenges.
We propose MhBART, an encoder-decoder model designed to emulate human writing style.
We also propose DTransformer, a model that integrates discourse analysis through PDTB preprocessing to encode structural features.
arXiv Detail & Related papers (2024-12-17T08:47:41Z) - Annotating FrameNet via Structure-Conditioned Language Generation [15.877232416259805]
We propose a framework to produce novel frame-semantically annotated sentences following an overgenerate-and-filter approach.
Our results show that conditioning on rich, explicit semantic information tends to produce generations with high human acceptance.
arXiv Detail & Related papers (2023-12-19T16:20:49Z) - Instruct-SCTG: Guiding Sequential Controlled Text Generation through Instructions [42.67608830386934]
Instruct-SCTG is a sequential framework that harnesses instruction-tuned language models to generate structurally coherent text.
Our framework generates articles in a section-by-section manner, aligned with the desired human structure using natural language instructions.
arXiv Detail & Related papers (2023-12-19T16:20:49Z) - Learning Hierarchical Prompt with Structured Linguistic Knowledge for Vision-Language Models [43.56153167864033]
We propose a novel approach to harnessing structured knowledge in large language models (LLMs).
We introduce a relationship-guided attention module to capture pair-wise associations among entities and attributes for low-level prompt learning.
In addition, by incorporating high-level and global-level prompts, the proposed hierarchical structure forges cross-level interlinks and empowers the model to handle more complex and long-term relationships.
arXiv Detail & Related papers (2023-12-11T12:14:06Z) - Revisiting Conversation Discourse for Dialogue Disentanglement [88.3386821205896]
We propose enhancing dialogue disentanglement by taking full advantage of the dialogue discourse characteristics.
We develop a structure-aware framework that integrates the rich structural features to better model the conversational semantic context.
Our work has great potential to facilitate broader multi-party multi-thread dialogue applications.
arXiv Detail & Related papers (2023-06-06T19:17:47Z) - Autoregressive Structured Prediction with Language Models [73.11519625765301]
We describe an approach that models structures as sequences of actions, generated autoregressively with pretrained language models (PLMs).
Our approach achieves a new state of the art on all the structured prediction tasks we evaluated.
arXiv Detail & Related papers (2022-10-26T13:27:26Z) - Model Criticism for Long-Form Text Generation [113.13900836015122]
We apply a statistical tool, model criticism in latent space, to evaluate the high-level structure of generated text.
We perform experiments on three representative aspects of high-level discourse -- coherence, coreference, and topicality.
We find that transformer-based language models are able to capture topical structures but have a harder time maintaining structural coherence or modeling coreference.
arXiv Detail & Related papers (2022-10-16T04:35:58Z) - Long Text Generation by Modeling Sentence-Level and Discourse-Level Coherence [59.51720326054546]
We propose a long text generation model that represents the prefix sentences at both the sentence and discourse levels during decoding.
Our model can generate more coherent texts than state-of-the-art baselines.
arXiv Detail & Related papers (2021-05-19T07:29:08Z) - An End-to-End Document-Level Neural Discourse Parser Exploiting Multi-Granularity Representations [24.986030179701405]
We exploit robust representations derived from multiple levels of granularity across syntax and semantics.
We incorporate such representations in an end-to-end encoder-decoder neural architecture for more resourceful discourse processing.
arXiv Detail & Related papers (2020-12-21T08:01:04Z) - SLM: Learning a Discourse Language Representation with Sentence Unshuffling [53.42814722621715]
We introduce Sentence-level Language Modeling, a new pre-training objective for learning a discourse language representation; a minimal sketch of the unshuffling setup follows this list.
We show that this objective improves the performance of the original BERT by large margins.
arXiv Detail & Related papers (2020-10-30T13:33:41Z)
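As noted in the SLM entry above, here is a minimal sketch of how a sentence-unshuffling training example might be built; the function name and the permutation-as-target encoding are illustrative assumptions, not the paper's implementation.

```python
import random

def make_unshuffling_example(sentences, seed=0):
    """Shuffle a document's sentences; the target is, for each shuffled
    position, the original index of the sentence placed there."""
    rng = random.Random(seed)
    order = list(range(len(sentences)))
    rng.shuffle(order)
    shuffled = [sentences[i] for i in order]
    return shuffled, order  # model input, prediction target

doc = ["The sky darkened.", "Rain began to fall.", "Everyone ran inside."]
shuffled, target = make_unshuffling_example(doc, seed=3)
print(shuffled)  # sentences in scrambled order
print(target)    # original index of each shuffled sentence
```

A model trained to recover `target` from `shuffled` must reason about inter-sentence discourse relations, which is the intuition behind using unshuffling as a pre-training objective.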