OUTLINEFORGE: Hierarchical Reinforcement Learning with Explicit States for Scientific Writing
- URL: http://arxiv.org/abs/2601.09858v1
- Date: Wed, 14 Jan 2026 20:37:26 GMT
- Title: OUTLINEFORGE: Hierarchical Reinforcement Learning with Explicit States for Scientific Writing
- Authors: Yilin Bao, Ziyao He, Zayden Yang,
- Abstract summary: We present a reinforcement learning framework that casts scientific outline construction as a long-horizon planning problem. We also introduce a benchmark for scientific paper generation that evaluates document planning, input utilization, reference faithfulness, outline organization, and content-level factual accuracy.
- Score: 5.930754928033565
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Scientific paper generation requires document-level planning and factual grounding, but current large language models, despite their strong local fluency, often fail in global structure, input coverage, and citation consistency. We present a reinforcement learning framework that casts scientific outline construction as a long-horizon planning problem over hierarchical document structures. Our approach models the editing of evolving outlines through structured actions, enabling the system to incrementally build a complete scientific manuscript. To support effective and stable learning, we introduce a two-stage optimization procedure consisting of (i) backward outline reconstruction from partial plans to enforce global structural consistency, and (ii) forward value-guided reinforcement learning with rewards explicitly modeling scientific correctness, discourse coherence, and citation fidelity. In addition, we introduce a benchmark for scientific paper generation that evaluates document planning, input utilization, reference faithfulness, outline organization, and content-level factual accuracy. Our results show consistent improvements over strong neural and LLM baselines, particularly in long-range structural coherence and citation reliability.
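The abstract's idea of editing a hierarchical outline through structured actions can be illustrated with a small toy sketch. This is not the authors' implementation: the `Node` class, the three action kinds (`add_section`, `rename`, `cite`), and the `toy_reward` function with its 0.5 penalty weight are all hypothetical names and choices introduced here for illustration, and the reward only loosely stands in for the citation-fidelity term the abstract mentions.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One outline node: a title, its citation keys, and child sections."""
    title: str
    citations: list[str] = field(default_factory=list)
    children: list["Node"] = field(default_factory=list)


def find(node, path):
    """Follow a tuple of child indices from the root to a node."""
    for i in path:
        node = node.children[i]
    return node


def apply_action(root, action):
    """Apply one structured edit (kind, path, argument) in place."""
    kind, path, arg = action
    target = find(root, path)
    if kind == "add_section":
        target.children.append(Node(arg))
    elif kind == "rename":
        target.title = arg
    elif kind == "cite":
        target.citations.append(arg)
    else:
        raise ValueError(f"unknown action: {kind}")
    return root


def cited_keys(node):
    """Collect every citation key used anywhere in the outline."""
    keys = set(node.citations)
    for child in node.children:
        keys |= cited_keys(child)
    return keys


def toy_reward(root, available_refs):
    """Hypothetical reward: share of provided references used,
    minus 0.5 for each cited key that is not among the inputs."""
    used = cited_keys(root)
    coverage = len(used & available_refs) / len(available_refs)
    unknown = len(used - available_refs)
    return coverage - 0.5 * unknown


refs = {"smith2020", "lee2021"}  # made-up reference keys
outline = Node("Paper")
for act in [
    ("add_section", (), "Introduction"),
    ("add_section", (), "Method"),
    ("cite", (0,), "smith2020"),
    ("cite", (1,), "madeup2099"),  # a key not in the provided references
]:
    apply_action(outline, act)

reward = toy_reward(outline, refs)
print(reward)
```

Keeping the outline as an explicit tree and each edit as a small tuple means every intermediate state can be inspected and scored, which is the property a long-horizon planner would need; the actual state and action spaces used in the paper are not specified in this abstract.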
Related papers
- DeepSynth-Eval: Objectively Evaluating Information Consolidation in Deep Survey Writing [53.85037373860246]
We introduce DeepSynth-Eval, a benchmark designed to objectively evaluate information consolidation capabilities. We propose a fine-grained evaluation protocol using General Checklists (for factual coverage) and Constraint Checklists (for structural organization). Our results demonstrate that agentic plan-and-write significantly outperforms single-turn generation.
arXiv Detail & Related papers (2026-01-07T03:07:52Z)
- Paper2SysArch: Structure-Constrained System Architecture Generation from Scientific Papers [10.395280181257737]
We introduce a novel benchmark to quantitatively evaluate the automated generation of diagrams from text. It consists of 3,000 research papers paired with their corresponding high-quality ground-truth diagrams and is accompanied by a three-tiered evaluation metric. We propose Paper2SysArch, an end-to-end system that leverages multi-agent collaboration to convert papers into structured, editable diagrams.
arXiv Detail & Related papers (2025-11-22T12:24:30Z)
- Structure-R1: Dynamically Leveraging Structural Knowledge in LLM Reasoning through Reinforcement Learning [29.722512436773638]
We propose Structure-R1, a framework that transforms retrieved content into structured representations optimized for reasoning. We show that Structure-R1 consistently achieves competitive performance with a 7B-scale backbone model. Our theoretical analysis demonstrates how structured representations enhance reasoning by improving information density and contextual clarity.
arXiv Detail & Related papers (2025-10-16T23:19:28Z)
- Context-Aware Hierarchical Taxonomy Generation for Scientific Papers via LLM-Guided Multi-Aspect Clustering [59.54662810933882]
Existing taxonomy construction methods, leveraging unsupervised clustering or direct prompting of large language models, often lack coherence and granularity. We propose a novel context-aware hierarchical taxonomy generation framework that integrates LLM-guided multi-aspect encoding with dynamic clustering.
arXiv Detail & Related papers (2025-09-23T15:12:58Z)
- Beyond Chunking: Discourse-Aware Hierarchical Retrieval for Long Document Question Answering [51.7493726399073]
We present a discourse-aware hierarchical framework to enhance long document question answering. The framework involves three key innovations: specialized discourse parsing for lengthy documents, LLM-based enhancement of discourse relation nodes, and structure-guided hierarchical retrieval.
arXiv Detail & Related papers (2025-05-26T14:45:12Z)
- XtraGPT: Context-Aware and Controllable Academic Paper Revision [43.263488839387584]
We propose a human-AI collaboration framework for academic paper revision centered on criteria-guided intent alignment and context-aware modeling. We instantiate the framework in XtraGPT, the first suite of open-source LLMs for context-aware, instruction-guided writing assistance.
arXiv Detail & Related papers (2025-05-16T15:02:19Z)
- Align to Structure: Aligning Large Language Models with Structural Information [26.960069076925386]
We introduce Structural Alignment, a novel method that aligns large language models with human-like discourse structures to enhance long-form text generation. We employ a dense reward scheme within a Proximal Policy Optimization framework, assigning fine-grained, token-level rewards based on the discourse distinctiveness relative to human writing.
arXiv Detail & Related papers (2025-04-04T17:40:04Z)
- StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization [94.31508613367296]
Retrieval-augmented generation (RAG) is a key means to effectively enhance large language models (LLMs).
We propose StructRAG, which can identify the optimal structure type for the task at hand, reconstruct original documents into this structured format, and infer answers based on the resulting structure.
Experiments show that StructRAG achieves state-of-the-art performance, particularly excelling in challenging scenarios.
arXiv Detail & Related papers (2024-10-11T13:52:44Z)
- HPT++: Hierarchically Prompting Vision-Language Models with Multi-Granularity Knowledge Generation and Improved Structure Modeling [39.14392943549792]
We propose a novel approach called Hierarchical Prompt Tuning (HPT), enabling simultaneous modeling of both structured and conventional linguistic knowledge.
We introduce a relationship-guided attention module to capture pair-wise associations among entities and attributes for low-level prompt learning.
By incorporating high-level and global-level prompts modeling overall semantics, the proposed hierarchical structure forges cross-level interlinks and empowers the model to handle more complex and long-term relationships.
arXiv Detail & Related papers (2024-08-27T06:50:28Z)
- Structure-aware Domain Knowledge Injection for Large Language Models [38.08691252042949]
StructTuning is a methodology to transform Large Language Models (LLMs) into domain specialists. It significantly reduces training corpus requirements to a mere 5% while achieving an impressive 100% of traditional knowledge injection performance.
arXiv Detail & Related papers (2024-07-23T12:38:48Z)
- Document Structure in Long Document Transformers [64.76981299465885]
Long documents often exhibit structure with hierarchically organized elements of different functions, such as section headers and paragraphs.
Despite the omnipresence of document structure, its role in natural language processing (NLP) remains opaque.
Do long-document Transformer models acquire an internal representation of document structure during pre-training?
How can structural information be communicated to a model after pre-training, and how does it influence downstream performance?
arXiv Detail & Related papers (2024-01-31T08:28:06Z)
- Towards Verifiable Generation: A Benchmark for Knowledge-aware Language Model Attribution [48.86322922826514]
This paper defines a new task of Knowledge-aware Language Model Attribution (KaLMA).
First, we extend the attribution source from unstructured texts to a Knowledge Graph (KG), whose rich structures benefit both the attribution performance and working scenarios.
Second, we propose a new "Conscious Incompetence" setting considering the incomplete knowledge repository.
Third, we propose a comprehensive automatic evaluation metric encompassing text quality, citation quality, and text-citation alignment.
arXiv Detail & Related papers (2023-10-09T11:45:59Z)
- Physics of Language Models: Part 1, Learning Hierarchical Language Structures [51.68385617116854]
Transformer-based language models are effective but complex, and understanding their inner workings and reasoning mechanisms is a significant challenge. We introduce a family of synthetic CFGs that produce hierarchical rules, capable of generating lengthy sentences. We demonstrate that generative models like GPT can accurately learn and reason over CFG-defined hierarchies and generate sentences based on them.
arXiv Detail & Related papers (2023-05-23T04:28:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.