StickyLand: Breaking the Linear Presentation of Computational Notebooks
- URL: http://arxiv.org/abs/2202.11086v1
- Date: Tue, 22 Feb 2022 18:25:54 GMT
- Title: StickyLand: Breaking the Linear Presentation of Computational Notebooks
- Authors: Zijie J. Wang, Katie Dai, W. Keith Edwards
- Abstract summary: StickyLand is a notebook extension for empowering users to freely organize their code in non-linear ways.
With sticky cells that are always shown on the screen, users can quickly access their notes, instantly observe experiment results, and easily build interactive dashboards.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: How can we better organize code in computational notebooks? Notebooks have
become a popular tool among data scientists, as they seamlessly weave text and
code together, supporting users to rapidly iterate and document code
experiments. However, it is often challenging to organize code in notebooks,
partially because there is a mismatch between the linear presentation of code
and the non-linear process of exploratory data analysis. We present StickyLand,
a notebook extension for empowering users to freely organize their code in
non-linear ways. With sticky cells that are always shown on the screen, users
can quickly access their notes, instantly observe experiment results, and
easily build interactive dashboards that support complex visual analytics. Case
studies highlight how our tool can enhance notebook users' productivity and
identify opportunities for future notebook designs. StickyLand is available at
https://github.com/xiaohk/stickyland.
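The abstract's core idea is that "sticky" cells escape the notebook's linear scroll order and stay on screen. As a conceptual sketch only (our own toy model, not StickyLand's actual API or data structures), the idea can be expressed as a linear cell list plus a set of pinned cells that are always included in the visible view:

```python
# Conceptual sketch (assumption: not StickyLand's real API): a notebook is
# a linear list of cells, plus a set of "sticky" cells that remain visible
# no matter where the user has scrolled.

class Notebook:
    def __init__(self, cells):
        self.cells = list(cells)   # linear order, as in Jupyter
        self.sticky = set()        # indices of always-visible cells

    def make_sticky(self, index):
        """Pin a cell so it stays on screen (StickyLand's floating panel)."""
        self.sticky.add(index)

    def visible_cells(self, viewport):
        """Cells the user sees: the scrolled-to window plus all sticky cells."""
        start, stop = viewport
        shown = set(range(start, stop)) | self.sticky
        return [self.cells[i] for i in sorted(shown)]

nb = Notebook(["load data", "clean data", "plot results", "notes"])
nb.make_sticky(3)  # keep the notes cell on screen while scrolled elsewhere
print(nb.visible_cells((0, 2)))  # ['load data', 'clean data', 'notes']
```

The point of the sketch is the decoupling: cell order is unchanged, but visibility is no longer a function of scroll position alone, which is what enables always-on notes and dashboard-style layouts.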
Related papers
- Contextualized Data-Wrangling Code Generation in Computational Notebooks [131.26365849822932]
We propose an automated approach, CoCoMine, to mine data-wrangling code generation examples with clear multi-modal contextual dependency.
We construct CoCoNote, a dataset containing 58,221 examples for Contextualized Data-wrangling Code generation in Notebooks.
Experiment results demonstrate the significance of incorporating data context in data-wrangling code generation.
arXiv Detail & Related papers (2024-09-20T14:49:51Z) - Predicting the Understandability of Computational Notebooks through Code Metrics Analysis [0.5277756703318045]
We employ a fine-tuned DistilBERT transformer to identify user comments associated with code understandability.
We established a criterion called User Opinion Code Understandability (UOCU), which considers the number of relevant comments, upvotes on those comments, total notebook views, and total notebook upvotes.
We trained machine learning models to predict notebook code understandability based solely on their metrics.
arXiv Detail & Related papers (2024-06-16T15:58:40Z) - Notably Inaccessible -- Data Driven Understanding of Data Science Notebook (In)Accessibility [13.428631054625797]
We perform a large-scale systematic analysis of 100,000 Jupyter notebooks to identify various accessibility challenges.
We make recommendations to improve accessibility of the artifacts of a notebook, suggest authoring practices, and propose changes to infrastructure to make notebooks accessible.
arXiv Detail & Related papers (2023-08-07T01:33:32Z) - InterCode: Standardizing and Benchmarking Interactive Coding with Execution Feedback [50.725076393314964]
We introduce InterCode, a lightweight, flexible, and easy-to-use framework of interactive coding as a standard reinforcement learning environment.
Our framework is language- and platform-agnostic, and uses self-contained Docker environments to provide safe and reproducible execution.
We demonstrate InterCode's viability as a testbed by evaluating multiple state-of-the-art LLMs configured with different prompting strategies.
arXiv Detail & Related papers (2023-06-26T17:59:50Z) - LongCoder: A Long-Range Pre-trained Language Model for Code Completion [56.813974784131624]
LongCoder employs a sliding window mechanism for self-attention and introduces two types of globally accessible tokens.
Bridge tokens are inserted throughout the input sequence to aggregate local information and facilitate global interaction, while memory tokens are included to highlight important statements that may be invoked later and need to be memorized.
arXiv Detail & Related papers (2023-06-26T17:59:24Z) - SuperNOVA: Design Strategies and Opportunities for Interactive Visualization in Computational Notebooks [34.04783941358773]
We analyze 163 interactive visualization tools for notebooks.
We identify key design implications and trade-offs.
We develop SuperNOVA, an open-source interactive browser to help researchers explore existing notebook visualization tools.
arXiv Detail & Related papers (2023-05-04T17:57:54Z) - STUNT: Few-shot Tabular Learning with Self-generated Tasks from Unlabeled Tables [64.0903766169603]
We propose a framework for few-shot semi-supervised learning, coined Self-generated Tasks from UNlabeled Tables (STUNT).
Our key idea is to self-generate diverse few-shot tasks by treating randomly chosen columns as a target label.
We then employ a meta-learning scheme to learn generalizable knowledge with the constructed tasks.
arXiv Detail & Related papers (2023-03-02T02:37:54Z) - MONAI Label: A framework for AI-assisted Interactive Labeling of 3D Medical Images [49.664220687980006]
The lack of annotated datasets is a major bottleneck for training new task-specific supervised machine learning models.
We present MONAI Label, a free and open-source framework that facilitates the development of applications based on artificial intelligence (AI) models.
arXiv Detail & Related papers (2022-03-23T12:33:11Z) - HAConvGNN: Hierarchical Attention Based Convolutional Graph Neural Network for Code Documentation Generation in Jupyter Notebooks [33.37494243822309]
We propose a hierarchical attention-based ConvGNN component to augment the Seq2Seq network.
We build a dataset with publicly available Kaggle notebooks and evaluate our model (HAConvGNN) against baseline models.
arXiv Detail & Related papers (2021-03-31T22:36:41Z) - SLADE: A Self-Training Framework For Distance Metric Learning [75.54078592084217]
We present a self-training framework, SLADE, to improve retrieval performance by leveraging additional unlabeled data.
We first train a teacher model on the labeled data and use it to generate pseudo labels for the unlabeled data.
We then train a student model on both the labeled and pseudo-labeled data to generate the final feature embeddings.
arXiv Detail & Related papers (2020-11-20T08:26:10Z) - ReproduceMeGit: A Visualization Tool for Analyzing Reproducibility of Jupyter Notebooks [0.0]
We present ReproduceMeGit, a visualization tool for analyzing the reproducibility of Jupyter Notebooks hosted on GitHub.
The tool reports how many notebooks were successfully reproduced, how many raised exceptions, and how many produced results that differ from the original notebooks.
arXiv Detail & Related papers (2020-06-22T10:05:52Z)
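Among the entries above, the STUNT recipe (self-generating few-shot tasks from an unlabeled table by treating a randomly chosen column as the target label) is concrete enough to sketch. The following is a minimal toy illustration under our own simplifying assumptions, not the paper's implementation:

```python
import random

# Toy sketch of STUNT-style task self-generation (assumption: our own
# simplification, not the authors' code). Each row is a dict of column
# values; one randomly chosen column becomes the pseudo-label and the
# remaining columns become the features.

def self_generate_task(rows, rng):
    """Turn an unlabeled table into one (features, pseudo-label) task."""
    columns = list(rows[0].keys())
    target = rng.choice(columns)                  # random column -> pseudo-label
    features = [c for c in columns if c != target]
    examples = [
        ({c: r[c] for c in features}, r[target])  # (x, pseudo-y) pairs
        for r in rows
    ]
    return target, examples

table = [
    {"age": 34, "income": 70, "region": "N"},
    {"age": 28, "income": 55, "region": "S"},
    {"age": 45, "income": 90, "region": "N"},
]
rng = random.Random(0)
target, task = self_generate_task(table, rng)
# Repeating this with fresh random columns yields the diverse few-shot
# tasks that the paper's meta-learning stage then trains on.
```

Each call produces a differently-labeled supervised task from the same unlabeled data, which is what makes the downstream meta-learning possible without any human annotation.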
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.