Meow: End-to-End Outline Writing for Automatic Academic Survey
- URL: http://arxiv.org/abs/2509.19370v1
- Date: Fri, 19 Sep 2025 07:20:53 GMT
- Title: Meow: End-to-End Outline Writing for Automatic Academic Survey
- Authors: Zhaoyu Ma, Yuan Shan, Jiahao Zhao, Nan Xu, Lei Wang
- Abstract summary: We propose Meow, a framework that produces organized and faithful outlines efficiently. We first formulate outline writing as an end-to-end task that generates hierarchically structured outlines from paper metadata. We then curate a high-quality dataset of surveys from arXiv, bioRxiv, and medRxiv, and establish systematic evaluation metrics for outline quality assessment.
- Score: 24.749855249116802
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As the number of academic publications grows exponentially, automatically conducting in-depth surveys with LLMs has become an inevitable trend. Outline writing, which aims to systematically organize related works, is critical for automated survey generation. Yet existing automatic survey methods treat outline writing as a mere workflow step in the overall pipeline. Such template-based workflows produce outlines that lack an in-depth understanding of the survey topic and fine-grained style. To address these limitations, we propose Meow, the first metadata-driven outline writing framework, which produces organized and faithful outlines efficiently. Specifically, we first formulate outline writing as an end-to-end task that generates hierarchically structured outlines from paper metadata. We then curate a high-quality dataset of surveys from arXiv, bioRxiv, and medRxiv, and establish systematic evaluation metrics for outline quality assessment. Finally, we employ a two-stage training approach combining supervised fine-tuning and reinforcement learning. Our 8B reasoning model demonstrates strong performance with high structural fidelity and stylistic coherence.
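The end-to-end formulation in the abstract (paper metadata in, hierarchically structured outline out) can be sketched minimally as follows. This is an illustrative sketch only: the names `PaperMeta`, `OutlineNode`, and `write_outline` are assumptions, not identifiers from the Meow paper, and the learned model is replaced by a trivial stub.

```python
from dataclasses import dataclass, field

@dataclass
class PaperMeta:
    """Minimal paper metadata, as a survey system might collect from arXiv."""
    title: str
    abstract: str

@dataclass
class OutlineNode:
    """One node of a hierarchical survey outline."""
    heading: str
    children: list["OutlineNode"] = field(default_factory=list)

    def render(self, depth: int = 1) -> str:
        """Render the outline as markdown-style headings, one per line."""
        lines = [f"{'#' * depth} {self.heading}"]
        for child in self.children:
            lines.append(child.render(depth + 1))
        return "\n".join(lines)

def write_outline(papers: list[PaperMeta]) -> OutlineNode:
    """Stub for the end-to-end task: metadata -> hierarchical outline.
    A real system would invoke a fine-tuned LLM here."""
    root = OutlineNode("Survey")
    root.children.append(OutlineNode("Introduction"))
    root.children.append(OutlineNode(f"Related Work ({len(papers)} papers)"))
    return root

papers = [PaperMeta("A", "..."), PaperMeta("B", "...")]
print(write_outline(papers).render())
# prints:
# # Survey
# ## Introduction
# ## Related Work (2 papers)
```

The point of the sketch is only the task signature: a flat list of metadata records maps to a tree of headings, which is what the evaluation metrics for structural fidelity would then score.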
Related papers
- SciFig: Towards Automating Scientific Figure Generation [41.73701976318102]
SciFig is an end-to-end AI agent system that generates publication-ready pipeline figures directly from research paper texts. We introduce a rubric-based evaluation framework that analyzes 2,219 real scientific figures to extract evaluation rubrics. SciFig demonstrates remarkable performance, achieving 70.1% overall quality on dataset-level evaluation and 66.2% on paper-specific evaluation.
arXiv Detail & Related papers (2026-01-07T20:56:58Z) - DeepSynth-Eval: Objectively Evaluating Information Consolidation in Deep Survey Writing [53.85037373860246]
We introduce DeepSynth-Eval, a benchmark designed to objectively evaluate information consolidation capabilities. We propose a fine-grained evaluation protocol using General Checklists (for factual coverage) and Constraint Checklists (for structural organization). Our results demonstrate that agentic plan-and-write approaches significantly outperform single-turn generation.
arXiv Detail & Related papers (2026-01-07T03:07:52Z) - AutoSurvey2: Empowering Researchers with Next Level Automated Literature Surveys [10.50820843303237]
This paper introduces AutoSurvey2, a multi-stage pipeline that automates survey generation through retrieval-augmented synthesis and structured evaluation. The system integrates parallel section generation, iterative refinement, and real-time retrieval of recent publications to ensure both topical completeness and factual accuracy. Experimental results demonstrate that AutoSurvey2 consistently outperforms existing retrieval-based and automated baselines.
arXiv Detail & Related papers (2025-10-29T22:57:03Z) - Deep Literature Survey Automation with an Iterative Workflow [30.923568155892184]
Ours is a framework based on recurrent outline generation that ensures both exploration and coherence. To provide faithful paper-level grounding, we design paper cards that distill each paper into its contributions, methods, and findings. Experiments on both established and emerging topics show that our framework substantially outperforms state-of-the-art baselines in content coverage, structural coherence, and citation quality.
arXiv Detail & Related papers (2025-10-24T14:41:26Z) - SurveyG: A Multi-Agent LLM Framework with Hierarchical Citation Graph for Automated Survey Generation [4.512335376984058]
Large language models (LLMs) are increasingly adopted for automating survey paper generation. We propose SurveyG, an LLM-based agent framework that integrates a hierarchical citation graph. The graph is organized into three layers, Foundation, Development, and Frontier, to capture the evolution of research from seminal works to incremental advances and emerging directions.
arXiv Detail & Related papers (2025-10-09T03:14:20Z) - Benchmarking Computer Science Survey Generation [18.844790013427282]
SurGE (Survey Generation Evaluation) is a new benchmark for evaluating scientific survey generation in the computer science domain. SurGE consists of (1) a collection of test instances, each including a topic description, an expert-written survey, and its full set of cited references, and (2) a large-scale academic corpus of over one million papers that serves as the retrieval pool. In addition, we propose an automated evaluation framework that measures generated surveys across four dimensions: information coverage, referencing accuracy, structural organization, and content quality.
arXiv Detail & Related papers (2025-08-21T15:45:10Z) - Let's Use ChatGPT To Write Our Paper! Benchmarking LLMs To Write the Introduction of a Research Paper [64.50822834679101]
SciIG is a task that evaluates LLMs' ability to produce coherent introductions from titles, abstracts, and related works. We assess five state-of-the-art models, including the open-source DeepSeek-v3, Gemma-3-12B, LLaMA-4 Maverick, and MistralAI Small 3.1, as well as the closed-source GPT-4o. Results demonstrate LLaMA-4 Maverick's superior performance on most metrics, particularly in semantic similarity and faithfulness.
arXiv Detail & Related papers (2025-08-19T21:11:11Z) - AutoRev: Automatic Peer Review System for Academic Research Papers [9.269282930029856]
AutoRev is an Automatic Peer Review System for Academic Research Papers. Our framework represents an academic document as a graph, enabling the extraction of the most critical passages. When applied to review generation, our method outperforms SOTA baselines by an average of 58.72%.
arXiv Detail & Related papers (2025-05-20T13:59:58Z) - Graphy'our Data: Towards End-to-End Modeling, Exploring and Generating Report from Raw Data [5.752510084651565]
Graphy is an end-to-end platform that automates data modeling, exploration, and high-quality report generation. We showcase a pre-scraped graph of over 50,000 papers, complete with their references, demonstrating how Graphy facilitates the literature-survey scenario.
arXiv Detail & Related papers (2025-02-24T06:10:49Z) - Integrating Planning into Single-Turn Long-Form Text Generation [66.08871753377055]
We propose to use planning to generate long-form content.
Our main novelty lies in a single auxiliary task that does not require multiple rounds of prompting or planning.
Our experiments on two datasets from different domains demonstrate that LLMs fine-tuned with the auxiliary task generate higher-quality documents.
arXiv Detail & Related papers (2024-10-08T17:02:40Z) - Taxonomy Tree Generation from Citation Graph [15.188580557890942]
HiGTL is a novel end-to-end framework guided by human-provided instructions or preferred topics. We develop a novel taxonomy node verbalization strategy that iteratively generates central concepts for each cluster. Experiments demonstrate that HiGTL effectively produces coherent, high-quality concept taxonomies.
arXiv Detail & Related papers (2024-10-02T13:02:03Z) - The Power of Summary-Source Alignments [62.76959473193149]
Multi-document summarization (MDS) is a challenging task, often decomposed to subtasks of salience and redundancy detection.
Alignment of corresponding sentences between a reference summary and its source documents has been leveraged to generate training data.
This paper proposes extending the summary-source alignment framework by applying it at the more fine-grained proposition span level.
arXiv Detail & Related papers (2024-06-02T19:35:19Z) - Decoding the End-to-end Writing Trajectory in Scholarly Manuscripts [7.294418916091011]
We introduce a novel taxonomy that categorizes scholarly writing behaviors according to intention, writer actions, and the information types of the written data.
Motivated by cognitive writing theory, our taxonomy for scientific papers includes three levels of categorization in order to trace the general writing flow.
ManuScript intends to provide a complete picture of the scholarly writing process by capturing the linearity and non-linearity of the writing trajectory.
arXiv Detail & Related papers (2023-03-31T20:33:03Z) - Automated Concatenation of Embeddings for Structured Prediction [75.44925576268052]
We propose Automated Concatenation of Embeddings (ACE) to automate the process of finding better concatenations of embeddings for structured prediction tasks.
We follow strategies in reinforcement learning to optimize the parameters of the controller and compute the reward based on the accuracy of a task model.
arXiv Detail & Related papers (2020-10-10T14:03:20Z) - Summary-Source Proposition-level Alignment: Task, Datasets and Supervised Baseline [94.0601799665342]
Aligning sentences in a reference summary with their counterparts in source documents was shown as a useful auxiliary summarization task.
We propose establishing summary-source alignment as an explicit task, while introducing two major novelties.
We create a novel training dataset for proposition-level alignment, derived automatically from available summarization evaluation data.
We present a supervised proposition alignment baseline model, showing improved alignment-quality over the unsupervised approach.
arXiv Detail & Related papers (2020-09-01T17:27:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.