Paper2Web: Let's Make Your Paper Alive!
- URL: http://arxiv.org/abs/2510.15842v1
- Date: Fri, 17 Oct 2025 17:35:58 GMT
- Title: Paper2Web: Let's Make Your Paper Alive!
- Authors: Yuhang Chen, Tianpeng Lv, Siyi Zhang, Yixiang Yin, Yao Wan, Philip S. Yu, Dongping Chen
- Abstract summary: We introduce Paper2Web, a benchmark dataset and framework for assessing academic webpage generation. We present PWAgent, an autonomous pipeline that converts scientific papers into interactive and multimedia-rich academic homepages.
- Score: 51.75896846964824
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Academic project websites can more effectively disseminate research when they clearly present core content and enable intuitive navigation and interaction. However, current approaches such as direct Large Language Model (LLM) generation, templates, or direct HTML conversion struggle to produce layout-aware, interactive sites, and a comprehensive evaluation suite for this task has been lacking. In this paper, we introduce Paper2Web, a benchmark dataset and multi-dimensional evaluation framework for assessing academic webpage generation. It incorporates rule-based metrics like Connectivity, Completeness and human-verified LLM-as-a-Judge (covering interactivity, aesthetics, and informativeness), and PaperQuiz, which measures paper-level knowledge retention. We further present PWAgent, an autonomous pipeline that converts scientific papers into interactive and multimedia-rich academic homepages. The agent iteratively refines both content and layout through MCP tools that enhance emphasis, balance, and presentation quality. Our experiments show that PWAgent consistently outperforms end-to-end baselines like template-based webpages and arXiv/alphaXiv versions by a large margin while maintaining low cost, achieving the Pareto-front in academic webpage generation.
Related papers
- Multimodal Peer Review Simulation with Actionable To-Do Recommendations for Community-Aware Manuscript Revisions [16.556181117253473]
We present an interactive web-based system for multimodal, community-aware peer review simulation to enable effective manuscript revisions before paper submission.
Our framework integrates textual and visual information through multimodal LLMs and enhances review quality via retrieval-augmented generation (RAG) grounded in web-scale OpenReview data.
The system integrates seamlessly into existing academic writing platforms, providing interactive interfaces for real-time feedback and revision tracking.
arXiv Detail & Related papers (2025-11-14T02:29:23Z)
- Human-Agent Collaborative Paper-to-Page Crafting for Under $0.1 [27.277038925857173]
AutoPage deconstructs paper-to-page creation into a coarse-to-fine pipeline from narrative planning to multimodal content generation and interactive rendering.
Tests show AutoPage not only generates high-quality, visually appealing pages but does so with remarkable efficiency in under 15 minutes for less than $0.1.
arXiv Detail & Related papers (2025-10-22T13:53:57Z)
- WebRenderBench: Enhancing Web Interface Generation through Layout-Style Consistency and Reinforcement Learning [24.178675410636135]
We present a large-scale benchmark of 45.1k webpages collected from real-world portal sites.
We also propose a novel evaluation metric that measures layout and style consistency from the final rendered pages.
arXiv Detail & Related papers (2025-10-05T08:47:39Z)
- Let's Use ChatGPT To Write Our Paper! Benchmarking LLMs To Write the Introduction of a Research Paper [64.50822834679101]
SciIG is a task that evaluates LLMs' ability to produce coherent introductions from titles, abstracts, and related works.
We assess five state-of-the-art models, including open-source (DeepSeek-v3, Gemma-3-12B, LLaMA 4-Maverick, MistralAI Small 3.1) and closed-source GPT-4o systems.
Results demonstrate LLaMA-4 Maverick's superior performance on most metrics, particularly in semantic similarity and faithfulness.
arXiv Detail & Related papers (2025-08-19T21:11:11Z)
- DocR1: Evidence Page-Guided GRPO for Multi-Page Document Understanding [100.29587871213624]
We introduce DocR1, an MLLM trained with a novel RL framework, Evidence Page-Guided GRPO.
EviGRPO incorporates an evidence-aware reward mechanism that promotes a coarse-to-fine reasoning strategy.
We show that DocR1 achieves state-of-the-art performance on multi-page tasks, while consistently maintaining strong results on single-page benchmarks.
arXiv Detail & Related papers (2025-08-10T12:03:45Z)
- Navigating Through Paper Flood: Advancing LLM-based Paper Evaluation through Domain-Aware Retrieval and Latent Reasoning [30.92327406304362]
We present PaperEval, a novel framework for automated paper evaluation using Large Language Models (LLMs).
PaperEval has two key components: 1) a domain-aware paper retrieval module that retrieves relevant concurrent work to support contextualized assessments of novelty and contributions, and 2) a latent reasoning mechanism that enables deep understanding of complex motivations and methodologies.
Experiments on two datasets demonstrate that PaperEval consistently outperforms existing methods in both academic impact and paper quality evaluation.
arXiv Detail & Related papers (2025-08-07T08:08:13Z)
- P2P: Automated Paper-to-Poster Generation and Fine-Grained Benchmark [27.57464219790922]
We introduce P2P, the first flexible, LLM-based multi-agent framework that generates high-quality, HTML-rendered academic posters.
P2P employs three specialized agents (for visual element processing, content generation, and final poster assembly), each integrated with dedicated checker modules.
We establish P2PEval, a comprehensive benchmark featuring 121 paper-poster pairs and a dual evaluation methodology.
arXiv Detail & Related papers (2025-05-21T09:06:05Z)
- Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning [70.04746094652653]
We introduce PaperCoder, a framework that transforms machine learning papers into functional code repositories.
PaperCoder operates in three stages; in the planning stage, it designs the system architecture with diagrams, identifies file dependencies, and generates configuration files.
We then evaluate PaperCoder on generating code implementations from machine learning papers based on both model-based and human evaluations.
arXiv Detail & Related papers (2025-04-24T01:57:01Z)
- MRWeb: An Exploration of Generating Multi-Page Resource-Aware Web Code from UI Designs [50.274447094978996]
The Multi-Page Resource-Aware Webpage (MRWeb) generation task transforms UI designs into multi-page, functional web UIs with internal/external navigation, image loading, and backend routing.
Our study applies existing methods to the MRWeb problem using a newly curated dataset of 500 websites (300 synthetic, 200 real-world). Specifically, we identify the best metric to evaluate the similarity of the web UI, assess the impact of the resource list on MRWeb generation, analyze MLLM limitations, and evaluate the effectiveness of the MRWeb tool in real-world settings.
arXiv Detail & Related papers (2024-12-19T15:02:33Z)
- Interaction2Code: Benchmarking MLLM-based Interactive Webpage Code Generation from Interactive Prototyping [57.024913536420264]
Multimodal Large Language Models (MLLMs) have demonstrated remarkable performance on the design-to-code task.
We present the first systematic investigation of MLLMs in generating interactive webpages.
arXiv Detail & Related papers (2024-11-05T17:40:03Z)
- Integrating Planning into Single-Turn Long-Form Text Generation [66.08871753377055]
We propose to use planning to generate long form content.
Our main novelty lies in a single auxiliary task that does not require multiple rounds of prompting or planning.
Our experiments on two datasets from different domains demonstrate that LLMs fine-tuned with the auxiliary task generate higher-quality documents.
arXiv Detail & Related papers (2024-10-08T17:02:40Z)
- Peer Review as A Multi-Turn and Long-Context Dialogue with Role-Based Interactions [62.0123588983514]
Large Language Models (LLMs) have demonstrated wide-ranging applications across various fields.
We reformulate the peer-review process as a multi-turn, long-context dialogue, incorporating distinct roles for authors, reviewers, and decision makers.
We construct a comprehensive dataset containing over 26,841 papers with 92,017 reviews collected from multiple sources.
arXiv Detail & Related papers (2024-06-09T08:24:17Z)
- AllTogether: Investigating the Efficacy of Spliced Prompt for Web Navigation using Large Language Models [2.234037966956278]
We introduce AllTogether, a standardized prompt template that enhances task context representation.
We evaluate the efficacy of this approach through prompt learning and instruction finetuning based on open-source Llama-2 and API-accessible GPT models.
arXiv Detail & Related papers (2023-10-20T11:10:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.