ReproduceMeGit: A Visualization Tool for Analyzing Reproducibility of
Jupyter Notebooks
- URL: http://arxiv.org/abs/2006.12110v1
- Date: Mon, 22 Jun 2020 10:05:52 GMT
- Title: ReproduceMeGit: A Visualization Tool for Analyzing Reproducibility of
Jupyter Notebooks
- Authors: Sheeba Samuel and Birgitta König-Ries
- Abstract summary: We present ReproduceMeGit, a visualization tool for analyzing the reproducibility of Jupyter Notebooks hosted on GitHub.
The tool provides information on the number of notebooks that were successfully reproducible, those that resulted in exceptions, those with different results from the original notebooks, etc.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Computational notebooks have gained widespread adoption among researchers
from academia and industry as they support reproducible science. These
notebooks allow users to combine code, text, and visualizations for easy
sharing of experiments and results. They are widely shared on GitHub, which
currently hosts more than 100 million repositories, making it the largest host
of source code in the world. Recent reproducibility studies have indicated that
there are good and bad practices in writing these notebooks that can affect
their overall reproducibility. We present ReproduceMeGit, a visualization tool
for analyzing the reproducibility of Jupyter Notebooks. It helps repository
users and owners reproduce notebooks and directly analyze and assess the
reproducibility of any GitHub repository containing Jupyter Notebooks. The tool
provides information on the number of notebooks that were successfully
reproducible, those that resulted in exceptions, those with different results
from the original notebooks, etc. Each notebook in the repository along with
the provenance information of its execution can also be exported in RDF with
the integration of the ProvBook tool.
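To make the core check concrete, the following is a minimal, hypothetical Python sketch of what "reproducing" a notebook involves: re-executing it with nbconvert and comparing the fresh outputs against the outputs stored in the file. This is not ReproduceMeGit's actual implementation, it omits the ProvBook/RDF provenance export, and the notebook filename is a placeholder.

```python
# Hypothetical sketch (not the tool's real code): re-execute one notebook and
# classify the outcome as "same results", "different results", or "exception".
import copy
import nbformat
from nbconvert.preprocessors import CellExecutionError, ExecutePreprocessor


def cell_outputs(nb):
    """Collect the textual outputs of every code cell."""
    collected = []
    for cell in nb.cells:
        if cell.cell_type != "code":
            continue
        texts = []
        for out in cell.get("outputs", []):
            if out.output_type == "stream":
                texts.append(out.text)
            elif out.output_type in ("execute_result", "display_data"):
                texts.append(out.get("data", {}).get("text/plain", ""))
        collected.append("".join(texts))
    return collected


def reproduce(path):
    """Re-execute a notebook and compare new outputs with the stored ones."""
    original = nbformat.read(path, as_version=4)
    rerun = copy.deepcopy(original)
    try:
        ExecutePreprocessor(timeout=600).preprocess(rerun, {"metadata": {"path": "."}})
    except CellExecutionError as exc:
        return "exception", str(exc)
    if cell_outputs(rerun) == cell_outputs(original):
        return "same results", None
    return "different results", None


if __name__ == "__main__":
    print(reproduce("analysis.ipynb"))  # "analysis.ipynb" is a placeholder path
```

Analyzing a whole GitHub repository then amounts to cloning it, applying such a check to every notebook found, and aggregating the three outcomes the tool reports (reproduced with the same results, different results, or an exception).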
Related papers
- CodeRAG-Bench: Can Retrieval Augment Code Generation? [78.37076502395699]
We conduct a systematic, large-scale analysis of code generation using retrieval-augmented generation.
We first curate a comprehensive evaluation benchmark, CodeRAG-Bench, encompassing three categories of code generation tasks.
We examine top-performing models on CodeRAG-Bench by providing contexts retrieved from one or multiple sources.
arXiv Detail & Related papers (2024-06-20T16:59:52Z)
- How to Understand Whole Software Repository? [64.19431011897515]
An excellent understanding of the whole repository will be a critical step toward Automatic Software Engineering (ASE).
We develop a novel method named RepoUnderstander by guiding agents to comprehensively understand the whole repositories.
To better utilize the repository-level knowledge, we guide the agents to summarize, analyze, and plan.
arXiv Detail & Related papers (2024-06-03T15:20:06Z)
- Collaborative, Code-Proximal Dynamic Software Visualization within Code Editors [55.57032418885258]
This paper introduces the design and proof-of-concept implementation for a software visualization approach that can be embedded into code editors.
Our contribution differs from related work in that we use dynamic analysis of a software system's runtime behavior.
Our visualization approach enhances common remote pair programming tools and is collaboratively usable by employing shared code cities.
arXiv Detail & Related papers (2023-08-30T06:35:40Z)
- SuperNOVA: Design Strategies and Opportunities for Interactive Visualization in Computational Notebooks [34.04783941358773]
We analyze 163 interactive visualization tools for notebooks.
We identify key design implications and trade-offs.
We develop SuperNOVA, an open-source interactive browser to help researchers explore existing notebook visualization tools.
arXiv Detail & Related papers (2023-05-04T17:57:54Z)
- RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation [96.75695811963242]
RepoCoder is a framework to streamline the repository-level code completion process.
It incorporates a similarity-based retriever and a pre-trained code language model.
It consistently outperforms the vanilla retrieval-augmented code completion approach.
arXiv Detail & Related papers (2023-03-22T13:54:46Z)
- Static Analysis Driven Enhancements for Comprehension in Machine Learning Notebooks [7.142786325863891]
Jupyter notebooks enable developers to interleave code snippets with rich-text and in-line visualizations.
Recent studies have demonstrated that a large portion of Jupyter notebooks are undocumented and lack a narrative structure.
This paper presents HeaderGen, a novel tool-based approach that automatically annotates code cells with categorical markdown headers.
arXiv Detail & Related papers (2023-01-11T11:57:52Z)
- Deep learning for table detection and structure recognition: A survey [49.09628624903334]
The goal of this survey is to provide a thorough understanding of the major developments in the field of table detection.
We provide an analysis of both classic and new applications in the field.
The datasets and source code of the existing models are organized to provide the reader with a compass on this vast literature.
arXiv Detail & Related papers (2022-11-15T19:42:27Z)
- Pynblint: a Static Analyzer for Python Jupyter Notebooks [10.190501703364234]
Pynblint is a static analyzer for Jupyter notebooks written in Python.
It checks compliance of notebooks (and surrounding repositories) with a set of empirically validated best practices; a minimal example of such a check is sketched after this list.
arXiv Detail & Related papers (2022-05-24T09:56:03Z)
- Repro: An Open-Source Library for Improving the Reproducibility and Usability of Publicly Available Research Code [74.28810048824519]
Repro is an open-source library which aims at improving the usability of research code.
It provides a lightweight Python API for running software released by researchers within Docker containers.
arXiv Detail & Related papers (2022-04-29T01:54:54Z)
- StickyLand: Breaking the Linear Presentation of Computational Notebooks [5.1175396458764855]
StickyLand is a notebook extension for empowering users to freely organize their code in non-linear ways.
With sticky cells that are always shown on the screen, users can quickly access their notes, instantly observe experiment results, and easily build interactive dashboards.
arXiv Detail & Related papers (2022-02-22T18:25:54Z)
- You Only Write Thrice: Creating Documents, Computational Notebooks and Presentations From a Single Source [11.472707084860875]
Academic trade requires juggling multiple variants of the same content published in different formats.
We propose to significantly reduce this burden by maintaining a single source document in a version-controlled environment.
We offer a proof-of-concept workflow that composes Jupyter Book (an online document), Jupyter Notebook (a computational narrative) and reveal.js slides from a single markdown source file.
arXiv Detail & Related papers (2021-07-02T21:02:09Z)
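As a companion to the re-execution sketch above, here is a small, hypothetical static check in the spirit of Pynblint (not its actual code): it flags notebooks whose code cells were never executed or were executed out of order, two common warning signs for poor reproducibility.

```python
# Hypothetical static check in the spirit of Pynblint (not its real code):
# inspect stored execution counts without running any notebook code.
import nbformat


def execution_order_issues(path):
    """Return a list of best-practice violations for one notebook."""
    nb = nbformat.read(path, as_version=4)
    counts = [c.get("execution_count") for c in nb.cells if c.cell_type == "code"]
    issues = []
    if any(c is None for c in counts):
        issues.append("contains unexecuted code cells")
    executed = [c for c in counts if c is not None]
    if executed != sorted(executed):
        issues.append("code cells were executed out of order")
    return issues


print(execution_order_issues("analysis.ipynb"))  # placeholder notebook path
```

Checks like this need only nbformat and never run the notebook's code, which is why static linters are far cheaper to apply at repository scale than full re-execution.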
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.