scikit-package -- software packaging standards and roadmap for sharing reproducible scientific software
- URL: http://arxiv.org/abs/2507.03328v2
- Date: Tue, 08 Jul 2025 07:05:38 GMT
- Title: scikit-package -- software packaging standards and roadmap for sharing reproducible scientific software
- Authors: S. Lee, C. Myers, A. Yang, T. Zhang, S. J. L. Billinge,
- Abstract summary: scikit-package provides a roadmap to facilitate code reuse and sharing.<n>The goal of the project is to provide practical tools for scientists who are not professionally trained software engineers.
- Score: 0.9162364621797965
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Scientific advancement relies on the ability to share and reproduce results. When data analysis or calculations are carried out using software written by scientists there are special challenges around code versions, quality and code sharing. scikit-package provides a roadmap to facilitate code reuse and sharing with minimal effort through tutorials coupled with automated and centralized reusable workflows. The goal of the project is to provide pedagogical and practical tools for scientists who are not professionally trained software engineers to write more reusable and maintainable software code. Code reuse can occur at multiple levels of complexity-from turning a code block into a function within a single script, to publishing a publicly installable, fully tested, and documented software package scikit-package provides a community maintained set of tools, and a roadmap, to help scientists bring their software higher levels of reproducibility and shareability.
Related papers
- Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning [57.09163579304332]
We introduce PaperCoder, a framework that transforms machine learning papers into functional code repositories.<n>PaperCoder operates in three stages: planning, designs the system architecture with diagrams, identifies file dependencies, and generates configuration files.<n>We then evaluate PaperCoder on generating code implementations from machine learning papers based on both model-based and human evaluations.
arXiv Detail & Related papers (2025-04-24T01:57:01Z) - PyPackIT: Automated Research Software Engineering for Scientific Python Applications on GitHub [0.0]
PyPackIT is a user-friendly, ready-to-use software that enables scientists to focus on the scientific aspects of their projects.<n> PyPackIT offers a robust project infrastructure including a build-ready Python package skeleton, a fully operational documentation and test suite, and a control center for dynamic project management.
arXiv Detail & Related papers (2025-03-06T19:41:55Z) - ToolCoder: A Systematic Code-Empowered Tool Learning Framework for Large Language Models [81.12673534903979]
Tool learning has emerged as a crucial capability for large language models (LLMs) to solve complex real-world tasks through interaction with external tools.<n>We propose ToolCoder, a novel framework that reformulates tool learning as a code generation task.
arXiv Detail & Related papers (2025-02-17T03:42:28Z) - RepoGraph: Enhancing AI Software Engineering with Repository-level Code Graph [63.87660059104077]
We present RepoGraph, a plug-in module that manages a repository-level structure for modern AI software engineering solutions.<n>RepoGraph substantially boosts the performance of all systems, leading to a new state-of-the-art among open-source frameworks.
arXiv Detail & Related papers (2024-10-03T05:45:26Z) - Code Compass: A Study on the Challenges of Navigating Unfamiliar Codebases [2.808331566391181]
We propose a novel tool, Code, to address these issues.
Our study highlights a significant gap in current tools and methodologies.
Our formative study demonstrates how effectively the tool reduces the time developers spend navigating documentation.
arXiv Detail & Related papers (2024-05-10T06:58:31Z) - Collaborative, Code-Proximal Dynamic Software Visualization within Code
Editors [55.57032418885258]
This paper introduces the design and proof-of-concept implementation for a software visualization approach that can be embedded into code editors.
Our contribution differs from related work in that we use dynamic analysis of a software system's runtime behavior.
Our visualization approach enhances common remote pair programming tools and is collaboratively usable by employing shared code cities.
arXiv Detail & Related papers (2023-08-30T06:35:40Z) - CodeTF: One-stop Transformer Library for State-of-the-art Code LLM [72.1638273937025]
We present CodeTF, an open-source Transformer-based library for state-of-the-art Code LLMs and code intelligence.
Our library supports a collection of pretrained Code LLM models and popular code benchmarks.
We hope CodeTF is able to bridge the gap between machine learning/generative AI and software engineering.
arXiv Detail & Related papers (2023-05-31T05:24:48Z) - The Basis of Design Tools for Quantum Computing: Arrays, Decision
Diagrams, Tensor Networks, and ZX-Calculus [55.58528469973086]
Quantum computers promise to efficiently solve important problems classical computers never will.
A fully automated quantum software stack needs to be developed.
This work provides a look "under the hood" of today's tools and showcases how these means are utilized in them, e.g., for simulation, compilation, and verification of quantum circuits.
arXiv Detail & Related papers (2023-01-10T19:00:00Z) - Quantum Advantage Seeker with Kernels (QuASK): a software framework to
speed up the research in quantum machine learning [0.9217021281095907]
QuASK is an open-source quantum machine learning framework written in Python.
It implements most state-of-the-art algorithms to analyze the data through quantum kernels.
It can be used as a command-line tool to download datasets, pre-process them, quantum machine learning routines, analyze and visualize the results.
arXiv Detail & Related papers (2022-06-30T13:43:16Z) - Collective Knowledge: organizing research projects as a database of
reusable components and portable workflows with common APIs [0.2538209532048866]
This article provides the motivation and overview of the Collective Knowledge framework (CK or cKnowledge)
The CK concept is to decompose research projects into reusable components that encapsulate research artifacts.
The long-term goal is to accelerate innovation by connecting researchers and practitioners to share and reuse all their knowledge.
arXiv Detail & Related papers (2020-11-02T17:42:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.