Jup2Kub: algorithms and a system to translate a Jupyter Notebook
pipeline to a fault tolerant distributed Kubernetes deployment
- URL: http://arxiv.org/abs/2311.12308v1
- Date: Tue, 21 Nov 2023 02:54:06 GMT
- Title: Jup2Kub: algorithms and a system to translate a Jupyter Notebook
pipeline to a fault tolerant distributed Kubernetes deployment
- Authors: Jinli Duan, Shasha Dennis
- Abstract summary: Scientific facilitate computational, data manipulation, and sometimes visualization steps for scientific data analysis.
Jupyter notebooks struggle to scale with larger data sets, lack failure tolerance, and depend heavily on the stability of underlying tools and packages.
Jup2Kup translates from Jupyter notebooks into a distributed, high-performance environment, enhancing fault tolerance.
- Score: 0.9790236766474201
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Scientific workflows facilitate computational, data manipulation, and
sometimes visualization steps for scientific data analysis. They are vital for
reproducing and validating experiments, usually involving computational steps
in scientific simulations and data analysis. These workflows are often
developed by domain scientists using Jupyter notebooks, which are convenient
yet face limitations: they struggle to scale with larger data sets, lack
failure tolerance, and depend heavily on the stability of underlying tools and
packages. To address these issues, Jup2Kup has been developed. This software
system translates workflows from Jupyter notebooks into a distributed,
high-performance Kubernetes environment, enhancing fault tolerance. It also
manages software dependencies to maintain operational stability amidst changes
in tools and packages.
Related papers
- Cuvis.Ai: An Open-Source, Low-Code Software Ecosystem for Hyperspectral Processing and Classification [0.4038539043067986]
cuvis.ai is an open-source and low-code software ecosystem for data acquisition, preprocessing, and model training.
The package is written in Python and provides wrappers around common machine learning libraries.
arXiv Detail & Related papers (2024-11-18T06:33:40Z) - KGym: A Platform and Dataset to Benchmark Large Language Models on Linux Kernel Crash Resolution [59.20933707301566]
Large Language Models (LLMs) are consistently improving at increasingly realistic software engineering (SE) tasks.
In real-world software stacks, significant SE effort is spent developing foundational system software like the Linux kernel.
To evaluate if ML models are useful while developing such large-scale systems-level software, we introduce kGym and kBench.
arXiv Detail & Related papers (2024-07-02T21:44:22Z) - Untangling Knots: Leveraging LLM for Error Resolution in Computational Notebooks [4.318590074766604]
We propose a potential solution for resolving errors in computational notebooks via an iterative LLM-based agent.
We discuss the questions raised by this approach and share a novel dataset of computational notebooks containing bugs.
arXiv Detail & Related papers (2024-03-26T18:53:17Z) - Pynblint: a Static Analyzer for Python Jupyter Notebooks [10.190501703364234]
Pynblint is a static analyzer for Jupyter notebooks written in Python.
It checks compliance of notebooks (and surrounding repositories) with a set of empirically validated best practices.
arXiv Detail & Related papers (2022-05-24T09:56:03Z) - Satellite Image Time Series Analysis for Big Earth Observation Data [50.591267188664666]
This paper describes sits, an open-source R package for satellite image time series analysis using machine learning.
We show that this approach produces high accuracy for land use and land cover maps through a case study in the Cerrado biome.
arXiv Detail & Related papers (2022-04-24T15:23:25Z) - Kubric: A scalable dataset generator [73.78485189435729]
Kubric is a Python framework that interfaces with PyBullet and Blender to generate photo-realistic scenes, with rich annotations, and seamlessly scales to large jobs distributed over thousands of machines.
We demonstrate the effectiveness of Kubric by presenting a series of 13 different generated datasets for tasks ranging from studying 3D NeRF models to optical flow estimation.
arXiv Detail & Related papers (2022-03-07T18:13:59Z) - SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor framework and script language engines.
This approach however does not supply the needed procedures and pipelines for the actual deployment of machine learning capabilities in real production grade systems.
arXiv Detail & Related papers (2021-12-22T14:45:37Z) - Robust and Transferable Anomaly Detection in Log Data using Pre-Trained
Language Models [59.04636530383049]
Anomalies or failures in large computer systems, such as the cloud, have an impact on a large number of users.
We propose a framework for anomaly detection in log data, as a major troubleshooting source of system information.
arXiv Detail & Related papers (2021-02-23T09:17:05Z) - Superiority of Simplicity: A Lightweight Model for Network Device
Workload Prediction [58.98112070128482]
We propose a lightweight solution for series prediction based on historic observations.
It consists of a heterogeneous ensemble method composed of two models - a neural network and a mean predictor.
It achieves an overall $R2$ score of 0.10 on the available FedCSIS 2020 challenge dataset.
arXiv Detail & Related papers (2020-07-07T15:44:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.