SuperWriter: Reflection-Driven Long-Form Generation with Large Language Models
- URL: http://arxiv.org/abs/2506.04180v1
- Date: Wed, 04 Jun 2025 17:27:42 GMT
- Title: SuperWriter: Reflection-Driven Long-Form Generation with Large Language Models
- Authors: Yuhao Wu, Yushi Bai, Zhiqiang Hu, Juanzi Li, Roy Ka-Wei Lee
- Abstract summary: SuperWriter-Agent is a framework designed to enhance the quality and consistency of long-form text generation. Based on this framework, we construct a supervised fine-tuning dataset to train a 7B SuperWriter-LM. Empirical results across diverse benchmarks demonstrate that SuperWriter-LM achieves state-of-the-art performance.
- Score: 34.723917246316205
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Long-form text generation remains a significant challenge for large language models (LLMs), particularly in maintaining coherence, ensuring logical consistency, and preserving text quality as sequence length increases. To address these limitations, we propose SuperWriter-Agent, an agent-based framework designed to enhance the quality and consistency of long-form text generation. SuperWriter-Agent introduces explicit structured thinking, via planning and refinement stages, into the generation pipeline, guiding the model to follow a more deliberate and cognitively grounded process akin to that of a professional writer. Based on this framework, we construct a supervised fine-tuning dataset to train a 7B SuperWriter-LM. We further develop a hierarchical Direct Preference Optimization (DPO) procedure that uses Monte Carlo Tree Search (MCTS) to propagate final quality assessments and optimize each generation step accordingly. Empirical results across diverse benchmarks demonstrate that SuperWriter-LM achieves state-of-the-art performance, surpassing even larger-scale baseline models in both automatic evaluation and human evaluation. Furthermore, comprehensive ablation studies demonstrate the effectiveness of hierarchical DPO and underscore the value of incorporating structured thinking steps to improve the quality of long-form text generation.
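To make the staged pipeline concrete, the following is a minimal sketch of a plan-write-refine loop in Python. It is not SuperWriter's released code: `call_llm` is a hypothetical stand-in for any chat-completion client, and the prompts and number of refinement rounds are illustrative assumptions.

```python
# Minimal sketch of a plan -> write -> refine loop (illustrative only).
# `call_llm` is a hypothetical stand-in for any chat-completion client; the
# prompts and refinement schedule are assumptions, not SuperWriter's own.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in an LLM client here")

def staged_long_form_generate(task: str, refine_rounds: int = 2) -> str:
    # Stage 1: explicit planning before any prose is written.
    plan = call_llm(f"Produce a section-by-section outline for: {task}")
    # Stage 2: draft the full text while following the plan.
    draft = call_llm(f"Task: {task}\nOutline:\n{plan}\nWrite the complete text following the outline.")
    # Stage 3: reflect on the draft and refine it.
    for _ in range(refine_rounds):
        critique = call_llm(f"Critique this draft for coherence, consistency, and quality:\n{draft}")
        draft = call_llm(f"Revise the draft to address the critique.\nCritique:\n{critique}\nDraft:\n{draft}")
    return draft
```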
Related papers
- LongWriter-Zero: Mastering Ultra-Long Text Generation via Reinforcement Learning [34.723917246316205]
We propose an incentivization-based approach that leverages reinforcement learning (RL) to foster the emergence of ultra-long, high-quality text generation capabilities. Our LongWriter-Zero model, trained from Qwen2.5-32B, consistently outperforms traditional SFT methods on long-form writing tasks.
arXiv Detail & Related papers (2025-06-23T16:59:02Z) - GenerationPrograms: Fine-grained Attribution with Executable Programs [72.23792263905372]
We introduce a modular generation framework, GenerationPrograms, inspired by recent advancements in "code agent" architectures. GenerationPrograms decomposes the process into two distinct stages: first, creating an executable program plan composed of modular text operations explicitly tailored to the query, and second, executing these operations following the program's specified instructions to produce the final response. Empirical evaluations demonstrate that GenerationPrograms significantly improves attribution quality at both the document level and sentence level.
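As a rough illustration of the two-stage idea, the sketch below represents a "program plan" as an ordered list of named text operations that are executed in sequence. The operation names, their toy implementations, and the dispatch scheme are assumptions for illustration, not GenerationPrograms' actual module set.

```python
# Illustrative sketch: a "program" is an ordered list of modular text
# operations, produced by a planner and then executed step by step.
from typing import Callable, Dict, List, Tuple

def extract(text: str, query: str) -> str:
    # Toy extractive module: keep sentences mentioning the query term.
    return ". ".join(s for s in text.split(". ") if query.lower() in s.lower())

def compress(text: str, _query: str) -> str:
    # Toy compression module: truncate to a rough character budget.
    return text[:500]

OPERATIONS: Dict[str, Callable[[str, str], str]] = {"extract": extract, "compress": compress}

def execute_program(program: List[Tuple[str, str]], document: str) -> str:
    # Each step names an operation and carries its argument; the intermediate
    # outputs make the final response attributable to explicit operations.
    result = document
    for op_name, arg in program:
        result = OPERATIONS[op_name](result, arg)
    return result

# Usage: a planner model would emit the program; here it is hard-coded.
# answer = execute_program([("extract", "safety"), ("compress", "")], document)
```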
arXiv Detail & Related papers (2025-06-17T14:37:09Z) - WritingBench: A Comprehensive Benchmark for Generative Writing [87.48445972563631]
We present WritingBench, a benchmark designed to evaluate large language models (LLMs) across 6 core writing domains and 100 subdomains, encompassing creative, persuasive, informative, and technical writing. We propose a query-dependent evaluation framework that empowers LLMs to dynamically generate instance-specific assessment criteria. This framework is complemented by a fine-tuned critic model for criteria-aware scoring, enabling evaluations in style, format, and length.
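The query-dependent evaluation can be sketched as two model calls: one that derives instance-specific criteria from the query, and one (the critic) that scores the response against each criterion. The prompts, the 1-10 scale, and the `call_llm` helper below are assumptions made for illustration.

```python
# Rough sketch of query-dependent evaluation: derive criteria from the query,
# then have a critic score the response per criterion.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in an LLM or fine-tuned critic model here")

def evaluate_response(query: str, response: str, n_criteria: int = 5) -> float:
    # Step 1: generate assessment criteria tailored to this specific query.
    criteria = call_llm(
        f"List {n_criteria} assessment criteria, one per line, for judging a response to: {query}"
    ).splitlines()
    # Step 2: score the response against each criterion and average.
    scores = [
        float(call_llm(f"Criterion: {c}\nQuery: {query}\nResponse: {response}\nScore 1-10, digits only:"))
        for c in criteria if c.strip()
    ]
    return sum(scores) / len(scores)
```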
arXiv Detail & Related papers (2025-03-07T08:56:20Z) - LongEval: A Comprehensive Analysis of Long-Text Generation Through a Plan-based Paradigm [21.661578831520963]
Large Language Models (LLMs) have achieved remarkable success in various natural language processing tasks. Our analysis reveals that current LLMs struggle with length requirements and information density in long-text generation. We present LongEval, a benchmark that evaluates long-text generation through both direct and plan-based generation paradigms.
arXiv Detail & Related papers (2025-02-26T12:46:36Z) - Enhancing RWKV-based Language Models for Long-Sequence Text Generation [0.0]
This paper introduces an enhanced RWKV architecture with adaptive temporal gating mechanisms for improved long-context language modeling. We propose two principal innovations: (1) a position-aware convolutional shift operator that captures local syntactic patterns while preserving global coherence, and (2) a neurally-gated information routing mechanism that dynamically regulates inter-token information flow.
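A rough PyTorch reconstruction of what a position-aware convolutional shift combined with gated information routing could look like is sketched below; the kernel size, gating form, and module layout are assumptions rather than the paper's released architecture.

```python
import torch
import torch.nn as nn

class GatedTokenShift(nn.Module):
    """Reconstruction sketch of a convolutional token shift with a learned
    gate routing information between neighboring tokens (assumptions only)."""

    def __init__(self, d_model: int, kernel_size: int = 3):
        super().__init__()
        # Depthwise convolution over the sequence captures local patterns.
        self.shift = nn.Conv1d(d_model, d_model, kernel_size,
                               padding=kernel_size - 1, groups=d_model)
        # Per-position gate deciding how much shifted context to mix in.
        self.gate = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model); slice keeps the convolution causal.
        local = self.shift(x.transpose(1, 2))[..., : x.size(1)].transpose(1, 2)
        g = torch.sigmoid(self.gate(x))
        return g * local + (1.0 - g) * x
```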
arXiv Detail & Related papers (2025-02-21T14:18:18Z) - LongDPO: Unlock Better Long-form Generation Abilities for LLMs via Critique-augmented Stepwise Information [76.26257306813899]
Long-form generation is crucial for academic paper writing and repo-level code generation. Existing methods that utilize preference learning with outcome supervision often fail to provide detailed feedback for extended contexts. We propose enhancing long-form generation by incorporating process supervision.
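For context, the sketch below computes the standard DPO objective on a single preference pair; under process supervision such pairs would compare two candidate continuations of the same intermediate writing step rather than two complete outputs. The scalar sequence log-probabilities and the `beta` value are illustrative assumptions.

```python
import math

def dpo_loss(logp_chosen: float, logp_rejected: float,
             ref_logp_chosen: float, ref_logp_rejected: float,
             beta: float = 0.1) -> float:
    # Standard DPO objective on one preference pair:
    # -log sigmoid(beta * (policy log-ratio margin - reference log-ratio margin)).
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    return math.log1p(math.exp(-margin))  # equals -log(sigmoid(margin))
```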
arXiv Detail & Related papers (2025-02-04T08:25:17Z) - Detecting Document-level Paraphrased Machine Generated Content: Mimicking Human Writing Style and Involving Discourse Features [57.34477506004105]
Machine-generated content poses challenges such as academic plagiarism and the spread of misinformation. We introduce novel methodologies and datasets to overcome these challenges. We propose MhBART, an encoder-decoder model designed to emulate human writing style. We also propose DTransformer, a model that integrates discourse analysis through PDTB preprocessing to encode structural features.
arXiv Detail & Related papers (2024-12-17T08:47:41Z) - Summarizing long regulatory documents with a multi-step pipeline [2.2591852560804675]
We show that the effectiveness of a two-step architecture for summarizing long regulatory texts varies depending on the model used.
For abstractive encoder-decoder models with short context lengths, the effectiveness of an extractive step varies, whereas for long-context encoder-decoder models, the extractive step worsens their performance.
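The two-step architecture can be sketched as an extractive pass that selects salient sentences followed by an abstractive pass over the reduced input; `rank_sentences` and `abstractive_summarize` below are hypothetical stand-ins for an extractive scorer and an encoder-decoder summarizer.

```python
# Illustrative extract-then-abstract pipeline for long documents.
from typing import Callable, List

def two_step_summarize(document: str,
                       rank_sentences: Callable[[List[str]], List[str]],
                       abstractive_summarize: Callable[[str], str],
                       sentence_budget: int = 40) -> str:
    sentences = document.split(". ")
    # Extractive step: keep the highest-ranked sentences so the reduced input
    # fits the abstractive model's context window.
    salient = rank_sentences(sentences)[:sentence_budget]
    # Abstractive step: summarize the reduced text.
    return abstractive_summarize(". ".join(salient))
```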
arXiv Detail & Related papers (2024-08-19T08:07:25Z) - PLANET: Dynamic Content Planning in Autoregressive Transformers for Long-form Text Generation [47.97523895218194]
We propose a novel generation framework leveraging autoregressive self-attention mechanism to conduct content planning and surface realization dynamically.
Our framework enriches the Transformer decoder with latent representations to maintain sentence-level semantic plans grounded by bag-of-words.
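One way to read "grounded by bag-of-words" is an auxiliary head in which each latent sentence-level plan vector predicts the word set of its target sentence; the PyTorch sketch below shows such an auxiliary loss and is a reconstruction under that reading, not the paper's code.

```python
import torch
import torch.nn as nn

class SentencePlanBoWHead(nn.Module):
    """Sketch (an assumption, not the paper's code) of grounding latent
    sentence-level plans in bags of words via a multi-label auxiliary loss."""

    def __init__(self, d_model: int, vocab_size: int):
        super().__init__()
        self.to_bow = nn.Linear(d_model, vocab_size)

    def bow_loss(self, plan_states: torch.Tensor, bow_targets: torch.Tensor) -> torch.Tensor:
        # plan_states: (batch, n_sentences, d_model) latent plan vectors
        # bow_targets: (batch, n_sentences, vocab_size) float multi-hot labels
        logits = self.to_bow(plan_states)
        return nn.functional.binary_cross_entropy_with_logits(logits, bow_targets)
```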
arXiv Detail & Related papers (2022-03-17T05:52:35Z) - Data-to-text Generation with Variational Sequential Planning [74.3955521225497]
We consider the task of data-to-text generation, which aims to create textual output from non-linguistic input.
We propose a neural model enhanced with a planning component responsible for organizing high-level information in a coherent and meaningful way.
We infer latent plans sequentially with a structured variational model, while interleaving the steps of planning and generation.
arXiv Detail & Related papers (2022-02-28T13:17:59Z) - Progressive Generation of Long Text with Pretrained Language Models [83.62523163717448]
Large-scale language models (LMs) pretrained on massive corpora of text, such as GPT-2, are powerful open-domain text generators.
It is still challenging for such models to generate coherent long passages of text, especially when the models are fine-tuned to the target domain on a small corpus.
We propose a simple but effective method of generating text in a progressive manner, inspired by generating images from low to high resolution.
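The progressive, low-to-high "resolution" idea can be sketched as repeatedly expanding a draft through increasingly detailed stages; the stage descriptions and the `call_llm` stub below are illustrative assumptions rather than the paper's actual multi-stage vocabulary scheme.

```python
# Illustrative coarse-to-fine generation: each stage expands the previous
# draft into a more detailed rendering of the same content.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in a (fine-tuned) language model here")

STAGES = (
    "a short list of the most important content words",
    "a skeleton draft built around those words and key phrases",
    "the full, fluent passage",
)

def progressive_generate(prompt: str) -> str:
    draft = prompt
    for stage in STAGES:
        draft = call_llm(f"Expand the following into {stage}:\n{draft}")
    return draft
```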
arXiv Detail & Related papers (2020-06-28T21:23:05Z)