ReproduceMeGit: A Visualization Tool for Analyzing Reproducibility of
Jupyter Notebooks
- URL: http://arxiv.org/abs/2006.12110v1
- Date: Mon, 22 Jun 2020 10:05:52 GMT
- Title: ReproduceMeGit: A Visualization Tool for Analyzing Reproducibility of
Jupyter Notebooks
- Authors: Sheeba Samuel and Birgitta König-Ries
- Abstract summary: We present ReproduceMeGit, a visualization tool for analyzing the reproducibility of Jupyter Notebooks hosted on GitHub.
The tool provides information on the number of notebooks that were successfully reproducible, those that resulted in exceptions, those with different results from the original notebooks, etc.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Computational notebooks have gained widespread adoption among researchers
from academia and industry as they support reproducible science. These
notebooks allow users to combine code, text, and visualizations for easy
sharing of experiments and results. They are widely shared on GitHub, which
currently hosts more than 100 million repositories, making it the largest host
of source code in the world. Recent reproducibility studies have indicated that
there are good and bad practices in writing these notebooks that can affect
their overall reproducibility. We present ReproduceMeGit, a visualization tool
for analyzing the reproducibility of Jupyter Notebooks. It helps repository
users and owners reproduce notebooks and directly analyze and assess the
reproducibility of any GitHub repository containing Jupyter Notebooks. The tool
provides information on the number of notebooks that were successfully
reproducible, those that resulted in exceptions, those with different results
from the original notebooks, etc. Each notebook in the repository along with
the provenance information of its execution can also be exported in RDF with
the integration of the ProvBook tool.
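To make the core check concrete, the following is a minimal, hypothetical Python sketch of what "reproducing" a notebook involves: re-executing it with nbconvert and comparing the fresh outputs against the outputs stored in the file. This is not ReproduceMeGit's actual implementation, it omits the ProvBook/RDF provenance export, and the notebook filename is a placeholder.

```python
# Hypothetical sketch (not the tool's real code): re-execute one notebook and
# classify the outcome as "same results", "different results", or "exception".
import copy
import nbformat
from nbconvert.preprocessors import CellExecutionError, ExecutePreprocessor


def cell_outputs(nb):
    """Collect the textual outputs of every code cell."""
    collected = []
    for cell in nb.cells:
        if cell.cell_type != "code":
            continue
        texts = []
        for out in cell.get("outputs", []):
            if out.output_type == "stream":
                texts.append(out.text)
            elif out.output_type in ("execute_result", "display_data"):
                texts.append(out.get("data", {}).get("text/plain", ""))
        collected.append("".join(texts))
    return collected


def reproduce(path):
    """Re-execute a notebook and compare new outputs with the stored ones."""
    original = nbformat.read(path, as_version=4)
    rerun = copy.deepcopy(original)
    try:
        ExecutePreprocessor(timeout=600).preprocess(rerun, {"metadata": {"path": "."}})
    except CellExecutionError as exc:
        return "exception", str(exc)
    if cell_outputs(rerun) == cell_outputs(original):
        return "same results", None
    return "different results", None


if __name__ == "__main__":
    print(reproduce("analysis.ipynb"))  # "analysis.ipynb" is a placeholder path
```

Analyzing a whole GitHub repository then amounts to cloning it, applying such a check to every notebook found, and aggregating the three outcomes the tool reports (reproduced with the same results, different results, or an exception).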
Related papers
- CodeRAG-Bench: Can Retrieval Augment Code Generation? [78.37076502395699]
We conduct a systematic, large-scale analysis of code generation using retrieval-augmented generation.
We first curate a comprehensive evaluation benchmark, CodeRAG-Bench, encompassing three categories of code generation tasks.
We examine top-performing models on CodeRAG-Bench by providing contexts retrieved from one or multiple sources.
arXiv Detail & Related papers (2024-06-20T16:59:52Z)
- How to Understand Whole Software Repository? [64.19431011897515]
An excellent understanding of the whole repository will be a critical step toward Automatic Software Engineering (ASE).
We develop a novel method named RepoUnderstander by guiding agents to comprehensively understand the whole repositories.
To better utilize the repository-level knowledge, we guide the agents to summarize, analyze, and plan.
arXiv Detail & Related papers (2024-06-03T15:20:06Z)
- Collaborative, Code-Proximal Dynamic Software Visualization within Code Editors [55.57032418885258]
This paper introduces the design and proof-of-concept implementation for a software visualization approach that can be embedded into code editors.
Our contribution differs from related work in that we use dynamic analysis of a software system's runtime behavior.
Our visualization approach enhances common remote pair programming tools and is collaboratively usable by employing shared code cities.
arXiv Detail & Related papers (2023-08-30T06:35:40Z)
- SuperNOVA: Design Strategies and Opportunities for Interactive Visualization in Computational Notebooks [34.04783941358773]
We analyze 163 interactive visualization tools for notebooks.
We identify key design implications and trade-offs.
We develop SuperNOVA, an open-source interactive browser to help researchers explore existing notebook visualization tools.
arXiv Detail & Related papers (2023-05-04T17:57:54Z)
- RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation [96.75695811963242]
RepoCoder is a framework to streamline the repository-level code completion process.
It incorporates a similarity-based retriever and a pre-trained code language model.
It consistently outperforms the vanilla retrieval-augmented code completion approach.
arXiv Detail & Related papers (2023-03-22T13:54:46Z)
- Static Analysis Driven Enhancements for Comprehension in Machine Learning Notebooks [7.142786325863891]
Jupyter notebooks enable developers to interleave code snippets with rich-text and in-line visualizations.
Recent studies have demonstrated that a large portion of Jupyter notebooks are undocumented and lack a narrative structure.
This paper presents HeaderGen, a novel tool-based approach that automatically annotates code cells with categorical markdown headers.
arXiv Detail & Related papers (2023-01-11T11:57:52Z)
- Deep learning for table detection and structure recognition: A survey [49.09628624903334]
The goal of this survey is to provide a thorough understanding of the major developments in the field of table detection.
We provide an analysis of both classic and new applications in the field.
The datasets and source code of the existing models are organized to provide the reader with a compass on this vast literature.
arXiv Detail & Related papers (2022-11-15T19:42:27Z)
- Pynblint: a Static Analyzer for Python Jupyter Notebooks [10.190501703364234]
Pynblint is a static analyzer for Jupyter notebooks written in Python.
It checks compliance of notebooks (and surrounding repositories) with a set of empirically validated best practices; a minimal example of such a check is sketched after this list.
arXiv Detail & Related papers (2022-05-24T09:56:03Z)
- Repro: An Open-Source Library for Improving the Reproducibility and Usability of Publicly Available Research Code [74.28810048824519]
Repro is an open-source library which aims at improving the usability of research code.
It provides a lightweight Python API for running software released by researchers within Docker containers.
arXiv Detail & Related papers (2022-04-29T01:54:54Z)
- StickyLand: Breaking the Linear Presentation of Computational Notebooks [5.1175396458764855]
StickyLand is a notebook extension for empowering users to freely organize their code in non-linear ways.
With sticky cells that are always shown on the screen, users can quickly access their notes, instantly observe experiment results, and easily build interactive dashboards.
arXiv Detail & Related papers (2022-02-22T18:25:54Z)
- You Only Write Thrice: Creating Documents, Computational Notebooks and Presentations From a Single Source [11.472707084860875]
Academic trade requires juggling multiple variants of the same content published in different formats.
We propose to significantly reduce this burden by maintaining a single source document in a version-controlled environment.
We offer a proof-of-concept workflow that composes Jupyter Book (an online document), Jupyter Notebook (a computational narrative) and reveal.js slides from a single markdown source file.
arXiv Detail & Related papers (2021-07-02T21:02:09Z)
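As a companion to the re-execution sketch above, here is a small, hypothetical static check in the spirit of Pynblint (not its actual code): it flags notebooks whose code cells were never executed or were executed out of order, two common warning signs for poor reproducibility.

```python
# Hypothetical static check in the spirit of Pynblint (not its real code):
# inspect stored execution counts without running any notebook code.
import nbformat


def execution_order_issues(path):
    """Return a list of best-practice violations for one notebook."""
    nb = nbformat.read(path, as_version=4)
    counts = [c.get("execution_count") for c in nb.cells if c.cell_type == "code"]
    issues = []
    if any(c is None for c in counts):
        issues.append("contains unexecuted code cells")
    executed = [c for c in counts if c is not None]
    if executed != sorted(executed):
        issues.append("code cells were executed out of order")
    return issues


print(execution_order_issues("analysis.ipynb"))  # placeholder notebook path
```

Checks like this need only nbformat and never run the notebook's code, which is why static linters are far cheaper to apply at repository scale than full re-execution.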
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.