StickyLand: Breaking the Linear Presentation of Computational Notebooks
- URL: http://arxiv.org/abs/2202.11086v1
- Date: Tue, 22 Feb 2022 18:25:54 GMT
- Title: StickyLand: Breaking the Linear Presentation of Computational Notebooks
- Authors: Zijie J. Wang, Katie Dai, W. Keith Edwards
- Abstract summary: StickyLand is a notebook extension for empowering users to freely organize their code in non-linear ways.
With sticky cells that are always shown on the screen, users can quickly access their notes, instantly observe experiment results, and easily build interactive dashboards.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: How can we better organize code in computational notebooks? Notebooks have
become a popular tool among data scientists, as they seamlessly weave text and
code together, supporting users to rapidly iterate and document code
experiments. However, it is often challenging to organize code in notebooks,
partially because there is a mismatch between the linear presentation of code
and the non-linear process of exploratory data analysis. We present StickyLand,
a notebook extension for empowering users to freely organize their code in
non-linear ways. With sticky cells that are always shown on the screen, users
can quickly access their notes, instantly observe experiment results, and
easily build interactive dashboards that support complex visual analytics. Case
studies highlight how our tool can enhance notebook users' productivity and
identify opportunities for future notebook designs. StickyLand is available at
https://github.com/xiaohk/stickyland.
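The abstract's core idea is that "sticky" cells escape the notebook's linear scroll order and stay on screen. As a conceptual sketch only (our own toy model, not StickyLand's actual API or data structures), the idea can be expressed as a linear cell list plus a set of pinned cells that are always included in the visible view:

```python
# Conceptual sketch (assumption: not StickyLand's real API): a notebook is
# a linear list of cells, plus a set of "sticky" cells that remain visible
# no matter where the user has scrolled.

class Notebook:
    def __init__(self, cells):
        self.cells = list(cells)   # linear order, as in Jupyter
        self.sticky = set()        # indices of always-visible cells

    def make_sticky(self, index):
        """Pin a cell so it stays on screen (StickyLand's floating panel)."""
        self.sticky.add(index)

    def visible_cells(self, viewport):
        """Cells the user sees: the scrolled-to window plus all sticky cells."""
        start, stop = viewport
        shown = set(range(start, stop)) | self.sticky
        return [self.cells[i] for i in sorted(shown)]

nb = Notebook(["load data", "clean data", "plot results", "notes"])
nb.make_sticky(3)  # keep the notes cell on screen while scrolled elsewhere
print(nb.visible_cells((0, 2)))  # ['load data', 'clean data', 'notes']
```

The point of the sketch is the decoupling: cell order is unchanged, but visibility is no longer a function of scroll position alone, which is what enables always-on notes and dashboard-style layouts.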
Related papers
- Contextualized Data-Wrangling Code Generation in Computational Notebooks [131.26365849822932]
We propose an automated approach, CoCoMine, to mine data-wrangling code generation examples with clear multi-modal contextual dependency.
We construct CoCoNote, a dataset containing 58,221 examples for Contextualized Data-wrangling Code generation in Notebooks.
Experiment results demonstrate the significance of incorporating data context in data-wrangling code generation.
arXiv Detail & Related papers (2024-09-20T14:49:51Z) - Predicting the Understandability of Computational Notebooks through Code Metrics Analysis [0.5277756703318045]
We employ a fine-tuned DistilBERT transformer to identify user comments associated with code understandability.
We established a criterion called User Opinion Code Understandability (UOCU), which considers the number of relevant comments, upvotes on those comments, total notebook views, and total notebook upvotes.
We trained machine learning models to predict notebook code understandability based solely on their metrics.
arXiv Detail & Related papers (2024-06-16T15:58:40Z) - Notably Inaccessible -- Data Driven Understanding of Data Science Notebook (In)Accessibility [13.428631054625797]
We perform a large-scale systematic analysis of 100,000 Jupyter notebooks to identify various accessibility challenges.
We make recommendations to improve accessibility of the artifacts of a notebook, suggest authoring practices, and propose changes to infrastructure to make notebooks accessible.
arXiv Detail & Related papers (2023-08-07T01:33:32Z) - InterCode: Standardizing and Benchmarking Interactive Coding with Execution Feedback [50.725076393314964]
We introduce InterCode, a lightweight, flexible, and easy-to-use framework of interactive coding as a standard reinforcement learning environment.
Our framework is language- and platform-agnostic, and uses self-contained Docker environments to provide safe and reproducible execution.
We demonstrate InterCode's viability as a testbed by evaluating multiple state-of-the-art LLMs configured with different prompting strategies.
arXiv Detail & Related papers (2023-06-26T17:59:50Z) - LongCoder: A Long-Range Pre-trained Language Model for Code Completion [56.813974784131624]
LongCoder employs a sliding window mechanism for self-attention and introduces two types of globally accessible tokens.
Bridge tokens are inserted throughout the input sequence to aggregate local information and facilitate global interaction, while memory tokens are included to highlight important statements that may be invoked later and need to be memorized.
arXiv Detail & Related papers (2023-06-26T17:59:24Z) - SuperNOVA: Design Strategies and Opportunities for Interactive Visualization in Computational Notebooks [34.04783941358773]
We analyze 163 interactive visualization tools for notebooks.
We identify key design implications and trade-offs.
We develop SuperNOVA, an open-source interactive browser to help researchers explore existing notebook visualization tools.
arXiv Detail & Related papers (2023-05-04T17:57:54Z) - STUNT: Few-shot Tabular Learning with Self-generated Tasks from Unlabeled Tables [64.0903766169603]
We propose a framework for few-shot semi-supervised learning, coined Self-generated Tasks from UNlabeled Tables (STUNT).
Our key idea is to self-generate diverse few-shot tasks by treating randomly chosen columns as a target label.
We then employ a meta-learning scheme to learn generalizable knowledge with the constructed tasks.
arXiv Detail & Related papers (2023-03-02T02:37:54Z) - MONAI Label: A framework for AI-assisted Interactive Labeling of 3D Medical Images [49.664220687980006]
The lack of annotated datasets is a major bottleneck for training new task-specific supervised machine learning models.
We present MONAI Label, a free and open-source framework that facilitates the development of applications based on artificial intelligence (AI) models.
arXiv Detail & Related papers (2022-03-23T12:33:11Z) - HAConvGNN: Hierarchical Attention Based Convolutional Graph Neural Network for Code Documentation Generation in Jupyter Notebooks [33.37494243822309]
We propose a hierarchical attention-based ConvGNN component to augment the Seq2Seq network.
We build a dataset with publicly available Kaggle notebooks and evaluate our model (HAConvGNN) against baseline models.
arXiv Detail & Related papers (2021-03-31T22:36:41Z) - SLADE: A Self-Training Framework For Distance Metric Learning [75.54078592084217]
We present a self-training framework, SLADE, to improve retrieval performance by leveraging additional unlabeled data.
We first train a teacher model on the labeled data and use it to generate pseudo labels for the unlabeled data.
We then train a student model on both the labeled and pseudo-labeled data to generate the final feature embeddings.
arXiv Detail & Related papers (2020-11-20T08:26:10Z) - ReproduceMeGit: A Visualization Tool for Analyzing Reproducibility of Jupyter Notebooks [0.0]
We present ReproduceMeGit, a visualization tool for analyzing the reproducibility of Jupyter Notebooks hosted on GitHub.
The tool reports how many notebooks were successfully reproduced, how many raised exceptions, and how many produced results that differ from the original notebooks.
arXiv Detail & Related papers (2020-06-22T10:05:52Z)
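Among the entries above, the STUNT recipe (self-generating few-shot tasks from an unlabeled table by treating a randomly chosen column as the target label) is concrete enough to sketch. The following is a minimal toy illustration under our own simplifying assumptions, not the paper's implementation:

```python
import random

# Toy sketch of STUNT-style task self-generation (assumption: our own
# simplification, not the authors' code). Each row is a dict of column
# values; one randomly chosen column becomes the pseudo-label and the
# remaining columns become the features.

def self_generate_task(rows, rng):
    """Turn an unlabeled table into one (features, pseudo-label) task."""
    columns = list(rows[0].keys())
    target = rng.choice(columns)                  # random column -> pseudo-label
    features = [c for c in columns if c != target]
    examples = [
        ({c: r[c] for c in features}, r[target])  # (x, pseudo-y) pairs
        for r in rows
    ]
    return target, examples

table = [
    {"age": 34, "income": 70, "region": "N"},
    {"age": 28, "income": 55, "region": "S"},
    {"age": 45, "income": 90, "region": "N"},
]
rng = random.Random(0)
target, task = self_generate_task(table, rng)
# Repeating this with fresh random columns yields the diverse few-shot
# tasks that the paper's meta-learning stage then trains on.
```

Each call produces a differently-labeled supervised task from the same unlabeled data, which is what makes the downstream meta-learning possible without any human annotation.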
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.