P2P: Automated Paper-to-Poster Generation and Fine-Grained Benchmark
- URL: http://arxiv.org/abs/2505.17104v1
- Date: Wed, 21 May 2025 09:06:05 GMT
- Title: P2P: Automated Paper-to-Poster Generation and Fine-Grained Benchmark
- Authors: Tao Sun, Enhao Pan, Zhengkai Yang, Kaixin Sui, Jiajun Shi, Xianfu Cheng, Tongliang Li, Wenhao Huang, Ge Zhang, Jian Yang, Zhoujun Li
- Abstract summary: We introduce P2P, the first flexible, LLM-based multi-agent framework that generates high-quality, HTML-rendered academic posters. P2P employs three specialized agents (for visual element processing, content generation, and final poster assembly), each integrated with dedicated checker modules. We establish P2PEval, a comprehensive benchmark featuring 121 paper-poster pairs and a dual evaluation methodology.
- Score: 27.57464219790922
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Academic posters are vital for scholarly communication, yet their manual creation is time-consuming. Automated academic poster generation, however, faces significant challenges in preserving intricate scientific details and achieving effective visual-textual integration. Existing approaches often struggle with semantic richness and structural nuances, and lack standardized benchmarks for evaluating generated academic posters comprehensively. To address these limitations, we introduce P2P, the first flexible, LLM-based multi-agent framework that generates high-quality, HTML-rendered academic posters directly from research papers, demonstrating strong potential for practical applications. P2P employs three specialized agents (for visual element processing, content generation, and final poster assembly), each integrated with dedicated checker modules to enable iterative refinement and ensure output quality. To foster advancements and rigorous evaluation in this domain, we construct and release P2PInstruct, the first large-scale instruction dataset comprising over 30,000 high-quality examples tailored for the academic paper-to-poster generation task. Furthermore, we establish P2PEval, a comprehensive benchmark featuring 121 paper-poster pairs and a dual evaluation methodology (Universal and Fine-Grained) that leverages LLM-as-a-Judge and detailed, human-annotated checklists. Our contributions aim to streamline research dissemination and provide the community with robust tools for developing and evaluating next-generation poster generation systems.
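The generate-check-refine loop the abstract describes can be sketched in a few lines. This is a minimal illustrative sketch, not the paper's implementation: all names (`run_stage`, `checker`, `paper_to_poster`, the placeholder stage outputs) are hypothetical, and the checker here only enforces non-empty output where the paper's checker modules perform real quality assessment.

```python
# Hypothetical sketch of a P2P-style pipeline: three specialized stages,
# each paired with a checker module that can request regeneration.
from dataclasses import dataclass


@dataclass
class CheckResult:
    ok: bool
    feedback: str


def checker(stage: str, output: str) -> CheckResult:
    # Stand-in for a dedicated checker module; here it only verifies
    # that the stage produced non-empty output.
    if output.strip():
        return CheckResult(True, "")
    return CheckResult(False, f"{stage}: empty output, please regenerate")


def run_stage(stage: str, generate, max_rounds: int = 3) -> str:
    """Generate, check, and iteratively refine one stage's output."""
    feedback, output = "", ""
    for _ in range(max_rounds):
        output = generate(feedback)
        result = checker(stage, output)
        if result.ok:
            return output
        feedback = result.feedback  # feed critique back into regeneration
    return output  # fall back to the last attempt


def paper_to_poster(paper_text: str) -> str:
    # Stage 1: visual element processing (figures, tables).
    visuals = run_stage("visuals", lambda fb: "[figures extracted from paper]")
    # Stage 2: content generation (section summaries, key results).
    content = run_stage("content", lambda fb: "[condensed paper content]")
    # Stage 3: final assembly into an HTML-rendered poster.
    return run_stage(
        "assembly",
        lambda fb: f"<html><body>{visuals}{content}</body></html>",
    )


print(paper_to_poster("example paper text"))
```

The design point the abstract emphasizes is that each stage carries its own checker, so refinement happens locally per stage rather than only on the finished poster.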
Related papers
- PreGenie: An Agentic Framework for High-quality Visual Presentation Generation [25.673526096069548]
PreGenie is an agentic and modular framework powered by multimodal large language models (MLLMs) for generating high-quality visual presentations. It operates in two stages: (1) Analysis and Initial Generation, which summarizes multimodal input and generates initial code, and (2) Review and Re-generation, which iteratively reviews intermediate code and rendered slides to produce final, high-quality presentations.
arXiv Detail & Related papers (2025-05-27T18:36:19Z)
- Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers [11.186078920251754]
Poster generation is a crucial yet challenging task in scientific communication. We introduce the first benchmark and metric suite for poster generation. PosterAgent is a top-down, visual-in-the-loop multi-agent pipeline.
arXiv Detail & Related papers (2025-05-27T17:58:49Z)
- XtraGPT: LLMs for Human-AI Collaboration on Controllable Academic Paper Revision [41.44785777328187]
XtraGPT is the first suite of open-source large language models (LLMs) designed to provide context-aware, instruction-guided writing assistance. We introduce a dataset of 7,040 research papers from top-tier venues annotated with over 140,000 instruction-response pairs. Experiments validate that XtraGPT significantly outperforms same-scale baselines and approaches the quality of proprietary systems.
arXiv Detail & Related papers (2025-05-16T15:02:19Z)
- Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning [57.09163579304332]
We introduce PaperCoder, a framework that transforms machine learning papers into functional code repositories. PaperCoder operates in three stages; in the planning stage, it designs the system architecture with diagrams, identifies file dependencies, and generates configuration files. We then evaluate PaperCoder on generating code implementations from machine learning papers based on both model-based and human evaluations.
arXiv Detail & Related papers (2025-04-24T01:57:01Z)
- Compose Your Aesthetics: Empowering Text-to-Image Models with the Principles of Art [61.28133495240179]
We propose a novel task of aesthetics alignment, which seeks to align user-specified aesthetics with the T2I generation output. Inspired by how artworks provide an invaluable perspective for approaching aesthetics, we codify visual aesthetics using the compositional framework artists employ. We demonstrate that T2I DMs can effectively offer 10 compositional controls through user-specified PoA conditions.
arXiv Detail & Related papers (2025-03-15T06:58:09Z)
- PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides [51.88536367177796]
We propose a two-stage, edit-based approach inspired by human drafts for automatically generating presentations. PPTAgent first analyzes references to extract slide-level functional types and content schemas, then generates editing actions based on selected reference slides. PPTAgent significantly outperforms existing automatic presentation generation methods across all three dimensions.
arXiv Detail & Related papers (2025-01-07T16:53:01Z)
- AgentTrek: Agent Trajectory Synthesis via Guiding Replay with Web Tutorials [53.376263056033046]
Existing approaches rely on expensive human annotation, making them unsustainable at scale. We propose AgentTrek, a scalable data synthesis pipeline that generates web agent trajectories by leveraging publicly available tutorials. Our fully automated approach significantly reduces data collection costs, achieving a cost of just $0.55 per high-quality trajectory without human annotators.
arXiv Detail & Related papers (2024-12-12T18:59:27Z)
- Automating Intervention Discovery from Scientific Literature: A Progressive Ontology Prompting and Dual-LLM Framework [56.858564736806414]
This paper proposes a novel framework leveraging large language models (LLMs) to identify interventions in scientific literature. Our approach successfully identified 2,421 interventions from a corpus of 64,177 research articles in the speech-language pathology domain.
arXiv Detail & Related papers (2024-08-20T16:42:23Z)
- GlyphDraw2: Automatic Generation of Complex Glyph Posters with Diffusion Models and Large Language Models [7.152732507491591]
We propose an automatic poster generation framework with text rendering capabilities, leveraging LLMs. This framework aims to create precise poster text within a detailed contextual background. We introduce a high-resolution font dataset and a poster dataset with resolutions exceeding 1024 pixels.
arXiv Detail & Related papers (2024-07-02T13:17:49Z)
- Navigating the Path of Writing: Outline-guided Text Generation with Large Language Models [8.920436030483872]
Large Language Models (LLMs) have impacted the writing process, enhancing productivity by collaborating with humans in content creation platforms. We propose WritingPath, a framework that uses explicit outlines to guide LLMs in generating goal-oriented, high-quality text.
arXiv Detail & Related papers (2024-04-22T06:57:43Z)
- What's New? Summarizing Contributions in Scientific Literature [85.95906677964815]
We introduce a new task of disentangled paper summarization, which seeks to generate separate summaries for the paper contributions and the context of the work.
We extend the S2ORC corpus of academic articles by adding disentangled "contribution" and "context" reference labels.
We propose a comprehensive automatic evaluation protocol which reports the relevance, novelty, and disentanglement of generated outputs.
arXiv Detail & Related papers (2020-11-06T02:23:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.