Fugu-MT paper translation (abstract): Containing the Reproducibility Gap: Automated Repository-Level Containerization for Scholarly Jupyter Notebooks

Paper summary: Containing the Reproducibility Gap: Automated Repository-Level Containerization for Scholarly Jupyter Notebooks

arxiv url: http://arxiv.org/abs/2604.01072v1
Date: Wed, 01 Apr 2026 16:07:54 GMT
Status: translation complete
Last updated in system: 2026-04-02 16:44:32.076649
Title: Containing the Reproducibility Gap: Automated Repository-Level Containerization for Scholarly Jupyter Notebooks
Authors: Sheeba Samuel, Daniel Mietchen, Hemanta Lo, Martin Gaedke
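Abstract summary: Environment drift, undocumented dependencies, and implicit execution assumptions prevent independent re-execution of published research. The authors present an automated, web-oriented reproducibility engineering pipeline that reconstructs and evaluates repository-level execution environments for scholarly notebooks. The system performs dependency inference, automated container generation, and isolated execution to approximate the notebook's original computational context.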
Reference score (independently calculated attention level): 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Computational reproducibility is fundamental to trustworthy science, yet remains difficult to achieve in practice across various research workflows, including Jupyter notebooks published alongside scholarly articles. Environment drift, undocumented dependencies and implicit execution assumptions frequently prevent independent re-execution of published research. Despite existing reproducibility guidelines, scalable and systematic infrastructure for automated assessment remains limited. We present an automated, web-oriented reproducibility engineering pipeline that reconstructs and evaluates repository-level execution environments for scholarly notebooks. The system performs dependency inference, automated container generation, and isolated execution to approximate the notebook's original computational context. We evaluate the approach on 443 notebooks from 116 GitHub repositories referenced by publications in PubMed Central. Execution outcomes are classified into four categories: resolved environment failures, persistent logic or data errors, reproducibility drift, and container-induced regressions. Our results show that containerization resolves 66.7% of prior dependency-related failures and substantially improves execution robustness. However, a significant reproducibility gap remains: 53.7% of notebooks exhibit low output fidelity, largely due to persistent runtime failures and stochastic non-determinism. These findings indicate that standardized containerization is essential for computational stability but insufficient for full bit-wise reproducibility. The framework offers a scalable solution for researchers, editors, and archivists seeking systematic, automated assessment of computational artifacts.
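The abstract names three pipeline stages: dependency inference, automated container generation, and isolated execution. The following is a minimal sketch of the first two stages only, not the authors' implementation. It assumes that dependencies are inferred from `import` statements in code cells; the function names, the `IMPORT_TO_PACKAGE` mapping, the `python:3.11-slim` base image, and the `notebook.ipynb` path are all hypothetical choices made for illustration.

```python
import ast
import json
import sys

# Hypothetical import-name -> PyPI-name mapping (illustrative, not from the paper).
IMPORT_TO_PACKAGE = {"sklearn": "scikit-learn", "cv2": "opencv-python", "PIL": "Pillow"}


def load_notebook(path):
    """Read a notebook file (.ipynb files are JSON documents)."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def infer_imports(notebook):
    """Collect top-level module names imported in the notebook's code cells."""
    modules = set()
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):  # nbformat stores source as a string or list of lines
            source = "".join(source)
        # Drop IPython magics and shell escapes, which are not valid Python.
        lines = [ln for ln in source.splitlines()
                 if not ln.lstrip().startswith(("%", "!"))]
        try:
            tree = ast.parse("\n".join(lines))
        except SyntaxError:
            continue  # skip cells that cannot be parsed statically
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                modules.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
                modules.add(node.module.split(".")[0])
    return modules


def third_party_packages(modules):
    """Filter out standard-library modules and map import names to package names."""
    stdlib = getattr(sys, "stdlib_module_names", frozenset())  # available in Python 3.10+
    return sorted(IMPORT_TO_PACKAGE.get(m, m) for m in modules if m not in stdlib)


def render_dockerfile(packages, notebook_path="notebook.ipynb", python_version="3.11"):
    """Render a Dockerfile that installs the packages and executes the notebook."""
    install = " ".join(["nbconvert", "ipykernel", *packages])
    return "\n".join([
        f"FROM python:{python_version}-slim",  # assumed base image
        "WORKDIR /repo",
        "COPY . /repo",
        f"RUN pip install --no-cache-dir {install}",
        f'CMD ["jupyter", "nbconvert", "--to", "notebook", "--execute", "{notebook_path}"]',
    ]) + "\n"
```

A call such as `render_dockerfile(third_party_packages(infer_imports(load_notebook(path))))` would yield the text of a Dockerfile for one notebook. Static import scanning of this kind cannot recover pinned versions or system-level libraries, which is consistent with the abstract's observation that containerization resolves only part of the dependency-related failures.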
