Align to Structure: Aligning Large Language Models with Structural Information
- URL: http://arxiv.org/abs/2504.03622v1
- Date: Fri, 04 Apr 2025 17:40:04 GMT
- Title: Align to Structure: Aligning Large Language Models with Structural Information
- Authors: Zae Myung Kim, Anand Ramachandran, Farideh Tavazoee, Joo-Kyung Kim, Oleg Rokhlenko, Dongyeop Kang
- Abstract summary: We introduce Structural Alignment, a novel method that aligns large language models with human-like discourse structures to enhance long-form text generation. We employ a dense reward scheme within a Proximal Policy Optimization framework, assigning fine-grained, token-level rewards based on each token's discourse distinctiveness relative to human writing.
- Score: 26.960069076925386
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Generating long, coherent text remains a challenge for large language models (LLMs), as they lack hierarchical planning and structured organization in discourse generation. We introduce Structural Alignment, a novel method that aligns LLMs with human-like discourse structures to enhance long-form text generation. By integrating linguistically grounded discourse frameworks into reinforcement learning, our approach guides models to produce coherent and well-organized outputs. We employ a dense reward scheme within a Proximal Policy Optimization framework, assigning fine-grained, token-level rewards based on each token's discourse distinctiveness relative to human writing. Two complementary reward models are evaluated: the first improves readability by scoring surface-level textual features to provide explicit structuring, while the second reinforces deeper coherence and rhetorical sophistication by analyzing global discourse patterns through hierarchical discourse motifs. Models trained with these rewards outperform both standard and RLHF-enhanced models in tasks such as essay generation and long-document summarization. All training data and code will be publicly shared at https://github.com/minnesotanlp/struct_align.
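The dense reward scheme lends itself to a short illustration. The sketch below is a hedged reconstruction, not the authors' code: `dense_token_rewards`, the toy numbers, and the per-token KL penalty against a frozen reference model are assumptions drawn from the common PPO-based RLHF recipe, with `discourse_scores` standing in for the per-token output of one of the paper's discourse reward models.

```python
import numpy as np

def dense_token_rewards(discourse_scores, logprobs_policy, logprobs_ref, kl_coef=0.1):
    """Combine per-token discourse scores with a per-token KL penalty.

    Minimal sketch only: a full PPO pipeline would additionally compute
    advantages, value targets, and clipped policy updates from these rewards.
    """
    scores = np.asarray(discourse_scores, dtype=float)
    kl = np.asarray(logprobs_policy, dtype=float) - np.asarray(logprobs_ref, dtype=float)
    return scores - kl_coef * kl  # one reward per generated token

# Toy usage over five generated tokens (all numbers hypothetical).
rewards = dense_token_rewards(
    discourse_scores=[0.2, 0.1, 0.4, 0.0, 0.3],
    logprobs_policy=[-1.2, -0.8, -2.0, -0.5, -1.0],
    logprobs_ref=[-1.0, -0.9, -1.8, -0.6, -1.1],
)
print(rewards)  # dense per-token rewards fed into the PPO update
```

The point of the dense scheme is that every generated token receives its own reward signal, so structural feedback reaches the exact positions where discourse organization succeeds or fails instead of being diluted into a single sequence-level score.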
Related papers
- Detecting Document-level Paraphrased Machine Generated Content: Mimicking Human Writing Style and Involving Discourse Features [57.34477506004105]
Machine-generated content poses challenges such as academic plagiarism and the spread of misinformation.
We introduce novel methodologies and datasets to overcome these challenges.
We propose MhBART, an encoder-decoder model designed to emulate human writing style.
We also propose DTransformer, a model that integrates discourse analysis through PDTB preprocessing to encode structural features.
arXiv Detail & Related papers (2024-12-17T08:47:41Z) - Annotating FrameNet via Structure-Conditioned Language Generation [15.877232416259805]
We propose a framework to produce novel frame-semantically annotated sentences following an overgenerate-and-filter approach.
Our results show that conditioning on rich, explicit semantic information tends to produce generations with high human acceptance.
arXiv Detail & Related papers (2023-12-19T16:20:49Z) - Instruct-SCTG: Guiding Sequential Controlled Text Generation through Instructions [42.67608830386934]
Instruct-SCTG is a sequential framework that harnesses instruction-tuned language models to generate structurally coherent text.
Our framework generates articles in a section-by-section manner, aligned with the desired human structure using natural language instructions.
arXiv Detail & Related papers (2023-12-19T16:20:49Z) - Learning Hierarchical Prompt with Structured Linguistic Knowledge for Vision-Language Models [43.56153167864033]
We propose a novel approach to harnessing structured knowledge in large language models (LLMs).
We introduce a relationship-guided attention module to capture pair-wise associations among entities and attributes for low-level prompt learning.
In addition, by incorporating high-level and global-level prompts, the proposed hierarchical structure forges cross-level interlinks and empowers the model to handle more complex and long-term relationships.
arXiv Detail & Related papers (2023-12-11T12:14:06Z) - Revisiting Conversation Discourse for Dialogue Disentanglement [88.3386821205896]
We propose enhancing dialogue disentanglement by taking full advantage of the dialogue discourse characteristics.
We develop a structure-aware framework that integrates the rich structural features to better model the conversational semantic context.
Our work has great potential to facilitate broader multi-party multi-thread dialogue applications.
arXiv Detail & Related papers (2023-06-06T19:17:47Z) - Autoregressive Structured Prediction with Language Models [73.11519625765301]
We describe an approach that models structures as sequences of actions, generated autoregressively with pretrained language models (PLMs).
Our approach achieves a new state of the art on all the structured prediction tasks we evaluated.
arXiv Detail & Related papers (2022-10-26T13:27:26Z) - Model Criticism for Long-Form Text Generation [113.13900836015122]
We apply a statistical tool, model criticism in latent space, to evaluate the high-level structure of generated text.
We perform experiments on three representative aspects of high-level discourse -- coherence, coreference, and topicality.
We find that transformer-based language models are able to capture topical structures but have a harder time maintaining structural coherence or modeling coreference.
arXiv Detail & Related papers (2022-10-16T04:35:58Z) - Long Text Generation by Modeling Sentence-Level and Discourse-Level Coherence [59.51720326054546]
We propose a long text generation model that represents the prefix sentences at both the sentence and discourse levels during decoding.
Our model can generate more coherent texts than state-of-the-art baselines.
arXiv Detail & Related papers (2021-05-19T07:29:08Z) - An End-to-End Document-Level Neural Discourse Parser Exploiting Multi-Granularity Representations [24.986030179701405]
We exploit robust representations derived from multiple levels of granularity across syntax and semantics.
We incorporate such representations in an end-to-end encoder-decoder neural architecture for more resourceful discourse processing.
arXiv Detail & Related papers (2020-12-21T08:01:04Z) - SLM: Learning a Discourse Language Representation with Sentence Unshuffling [53.42814722621715]
We introduce Sentence-level Language Modeling, a new pre-training objective for learning a discourse language representation; a minimal sketch of the unshuffling setup follows this list.
We show that this objective improves the performance of the original BERT by large margins.
arXiv Detail & Related papers (2020-10-30T13:33:41Z)
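As noted in the SLM entry above, here is a minimal sketch of how a sentence-unshuffling training example might be built; the function name and the permutation-as-target encoding are illustrative assumptions, not the paper's implementation.

```python
import random

def make_unshuffling_example(sentences, seed=0):
    """Shuffle a document's sentences; the target is, for each shuffled
    position, the original index of the sentence placed there."""
    rng = random.Random(seed)
    order = list(range(len(sentences)))
    rng.shuffle(order)
    shuffled = [sentences[i] for i in order]
    return shuffled, order  # model input, prediction target

doc = ["The sky darkened.", "Rain began to fall.", "Everyone ran inside."]
shuffled, target = make_unshuffling_example(doc, seed=3)
print(shuffled)  # sentences in scrambled order
print(target)    # original index of each shuffled sentence
```

A model trained to recover `target` from `shuffled` must reason about inter-sentence discourse relations, which is the intuition behind using unshuffling as a pre-training objective.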