Related papers: Reflective Paper-to-Code Reproduction Enabled by Fine-Grained Verification

URL: http://arxiv.org/abs/2508.16671v1
Date: Thu, 21 Aug 2025 06:57:44 GMT
Title: Reflective Paper-to-Code Reproduction Enabled by Fine-Grained Verification
Authors: Mingyang Zhou, Quanming Yao, Lun Du, Lanning Wei, Da Zheng
Abstract summary: Motivated by how humans use systematic checklists to efficiently debug complex code, we propose RePro, a Reflective Paper-to-Code Reproduction framework. It automatically extracts a paper's fingerprint, a comprehensive set of accurate and atomic criteria that serve as high-quality supervisory signals. It achieves a 13.0% performance gap over baselines and correctly revises complex logical and mathematical criteria during reflection.
Score: 46.845133190560375
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Reproducing machine learning papers is essential for scientific progress but remains challenging for both humans and automated agents. Existing agent-based methods often struggle to fully and accurately reproduce implementation details such as mathematical formulas and algorithmic logic. Previous studies show that reflection with explicit feedback improves agent performance. However, current paper reproduction methods fail to adopt this strategy effectively. This gap mainly arises from the diverse paper patterns, complex method modules, and varied configurations encountered in research papers. Motivated by how humans use systematic checklists to efficiently debug complex code, we propose RePro, a Reflective Paper-to-Code Reproduction framework that automatically extracts a paper's fingerprint, a comprehensive set of accurate and atomic criteria that serve as high-quality supervisory signals. The framework first generates code based on the extracted information, and then leverages the fingerprint within an iterative verification and refinement loop. This approach systematically detects discrepancies and produces targeted revisions to align the generated code with the paper's implementation details. In extensive experiments on the PaperBench Code-Dev benchmark, RePro achieves a 13.0% performance gap over baselines and correctly revises complex logical and mathematical criteria during reflection, where its effectiveness is most evident.
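
The generate, verify, and refine loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Criterion` type, the `verify` and `reflect_and_refine` functions, the `max_rounds` parameter, and the toy criteria are all hypothetical names introduced here, and plain string checks stand in for the LLM-based verification and revision that a real system would use.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Criterion:
    """One atomic, checkable statement about the implementation (hypothetical type)."""
    description: str
    check: Callable[[str], bool]  # True when the code satisfies the criterion
    fix: Callable[[str], str]     # stand-in for an LLM-produced targeted revision


def verify(code: str, fingerprint: List[Criterion]) -> List[Criterion]:
    """Return the criteria that the current code fails."""
    return [c for c in fingerprint if not c.check(code)]


def reflect_and_refine(
    code: str, fingerprint: List[Criterion], max_rounds: int = 3
) -> Tuple[str, List[Criterion]]:
    """Iteratively verify the code against the fingerprint and revise failures."""
    for _ in range(max_rounds):
        failed = verify(code, fingerprint)
        if not failed:
            break  # every criterion is satisfied
        for criterion in failed:
            code = criterion.fix(code)  # one targeted revision per failed criterion
    return code, verify(code, fingerprint)


# Toy fingerprint: these two criteria are invented purely for illustration.
fingerprint = [
    Criterion(
        "learning rate is 1e-3",
        lambda s: "lr=1e-3" in s,
        lambda s: s.replace("lr=1e-2", "lr=1e-3"),
    ),
    Criterion(
        "optimizer is AdamW",
        lambda s: "AdamW" in s,
        lambda s: s.replace("SGD", "AdamW"),
    ),
]

draft = "opt = SGD(params, lr=1e-2)"
final_code, remaining = reflect_and_refine(draft, fingerprint)
print(final_code)
print([c.description for c in remaining])
```

Keeping each criterion atomic means a failed check points at a single discrepancy, so the revision step can stay narrowly targeted instead of regenerating the whole file.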

Related papers

What Papers Don't Tell You: Recovering Tacit Knowledge for Automated Paper Reproduction [57.86097956633207]
method is a graph-based agent framework for generating executable code from academic papers. On an extended ReproduceBench spanning 3 domains, 10 tasks, and 40 recent papers, method achieves an average performance gap of 10.04% against official implementations.
arXiv Detail & Related papers (2026-03-02T12:33:31Z)
Enhancing Automated Paper Reproduction via Prompt-Free Collaborative Agents [8.185402940269794]
We propose a prompt-free collaborative agent framework that automatically enhances the quality of paper-to-code generation. Our approach employs two collaborative agents: a verification agent that examines whether the outputs at each step satisfy the requirements specified in the corresponding system prompt, and a refinement agent that revises the outputs based on the identified issues.
arXiv Detail & Related papers (2025-12-02T14:24:23Z)
AutoReproduce: Automatic AI Experiment Reproduction with Paper Lineage [62.049868205196425]
AutoReproduce is a framework capable of automatically reproducing experiments described in research papers in an end-to-end manner. Results show that AutoReproduce achieves an average performance gap of 22.1% on 89.74% of the executable experiment runs.
arXiv Detail & Related papers (2025-05-27T03:15:21Z)
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning [57.09163579304332]
We introduce PaperCoder, a framework that transforms machine learning papers into functional code repositories. PaperCoder operates in three stages: planning, designs the system architecture with diagrams, identifies file dependencies, and generates configuration files. We then evaluate PaperCoder on generating code implementations from machine learning papers based on both model-based and human evaluations.
arXiv Detail & Related papers (2025-04-24T01:57:01Z)
Learning Refined Document Representations for Dense Retrieval via Deliberate Thinking [58.69615583599489]
Deliberate Thinking based Retriever (Debater) is a novel approach that enhances document representations by incorporating a step-by-step thinking process. Debater significantly outperforms existing methods across several retrieval benchmarks.
arXiv Detail & Related papers (2025-02-18T15:56:34Z)
Enhancing Code Consistency in AI Research with Large Language Models and Retrieval-Augmented Generation [0.0]
This paper presents a novel system designed to verify code implementations against the algorithms and methodologies outlined in corresponding research papers. Our system employs Retrieval-Augmented Generation to extract relevant details from both the research papers and code bases, followed by a structured comparison using Large Language Models.
arXiv Detail & Related papers (2025-02-02T00:35:42Z)
DOCE: Finding the Sweet Spot for Execution-Based Code Generation [69.5305729627198]
We propose a comprehensive framework that includes candidate generation, n-best reranking, minimum Bayes risk (MBR) decoding, and self-debugging as the core components. Our findings highlight the importance of execution-based methods and the gap between execution-based and execution-free methods.
arXiv Detail & Related papers (2024-08-25T07:10:36Z)
CodeRefine: A Pipeline for Enhancing LLM-Generated Code Implementations of Research Papers [0.0]
CodeRefine is a framework for transforming research paper methodologies into functional code using Large Language Models. Our multi-step approach first extracts and summarizes key text chunks from papers, analyzes their code relevance, and creates a knowledge graph. Code is then generated from this structured representation and enhanced through a proposed retrospective retrieval-augmented generation approach.
arXiv Detail & Related papers (2024-08-23T20:51:04Z)

This list is automatically generated from the titles and abstracts of the papers on this site.