Studying Popular Open Source Machine Learning Libraries and Their
Cross-Ecosystem Bindings
- URL: http://arxiv.org/abs/2201.07201v1
- Date: Tue, 18 Jan 2022 18:53:21 GMT
- Title: Studying Popular Open Source Machine Learning Libraries and Their
Cross-Ecosystem Bindings
- Authors: Hao Li and Cor-Paul Bezemer
- Abstract summary: Open source machine learning (ML) libraries allow developers to integrate advanced ML functionality into their own applications.
ML libraries are not available in all programming languages and software package ecosystems.
We conduct an in-depth study of 155 cross-ecosystem bindings and their development for 36 popular open source ML libraries.
- Score: 13.318005126654208
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Open source machine learning (ML) libraries allow developers to integrate
advanced ML functionality into their own applications. However, popular ML
libraries, such as TensorFlow, are not available natively in all programming
languages and software package ecosystems. Hence, developers who wish to use an
ML library that is not available in their programming language or ecosystem of
choice may need to resort to using a so-called binding library. Binding
libraries provide support across programming languages and package ecosystems
for a source library. For example, the Keras .NET binding provides support for
the Keras library in the NuGet (.NET) ecosystem even though the Keras library
was written in Python. In this paper, we conduct an in-depth study of 155
cross-ecosystem bindings and their development for 36 popular open source ML
libraries. Our study shows that for most popular ML libraries, only one package
ecosystem is officially supported (usually PyPI). Cross-ecosystem support,
which is available for 25% of the studied ML libraries, is usually provided
through community-maintained bindings, e.g., 73% of the bindings in the npm
ecosystem are community-maintained. Our study shows that the vast majority of
the studied bindings cover only a small portion of the source library releases,
and the delay for receiving support for a source library release is large.
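The two measures mentioned at the end of the abstract, release coverage and support delay, can be illustrated with a minimal sketch. The definitions used here are assumptions for illustration (coverage as the fraction of source releases a binding supports, delay as the number of days between a source release and the binding release that supports it); the paper's exact definitions are not given in this summary. The function names and the release data are hypothetical.

```python
from datetime import date


def release_coverage(source_versions, supported_versions):
    """Fraction of source library releases supported by a binding.

    Assumed definition, not necessarily the one used in the paper.
    """
    source = set(source_versions)
    if not source:
        return 0.0
    return len(source & set(supported_versions)) / len(source)


def support_delays(source_release_dates, binding_release_dates):
    """Days between each source release and the binding release supporting it.

    Only versions that the binding supports appear in the result.
    """
    return {
        version: (binding_release_dates[version] - released).days
        for version, released in source_release_dates.items()
        if version in binding_release_dates
    }


# Hypothetical release history, invented purely for illustration.
source = {
    "1.0": date(2021, 1, 10),
    "1.1": date(2021, 3, 1),
    "1.2": date(2021, 6, 15),
    "2.0": date(2021, 11, 1),
}
# Date on which the binding first supported each source version.
binding = {
    "1.0": date(2021, 2, 9),
    "2.0": date(2022, 1, 30),
}

coverage = release_coverage(source, binding)
delays = support_delays(source, binding)
print(f"coverage: {coverage:.0%}")  # coverage: 50%
print(delays)                       # {'1.0': 30, '2.0': 90}
```

In this made-up history the binding supports two of four source releases, and users wait 30 and 90 days for those two releases to become available.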
Related papers
- Contributing Back to the Ecosystem: A User Survey of NPM Developers [10.154686574810501]
Survey of 49 developers from the NPM ecosystem.
We find that developers are more likely to maintain their own packages rather than contribute to the ecosystem.
Our results open up new avenues into tool support and research into how to sustain these ecosystems.
arXiv Detail & Related papers (2024-07-01T00:15:55Z)
- PufferLib: Making Reinforcement Learning Libraries and Environments Play Nice [0.8702432681310401]
PufferLib provides one-line environment wrappers that eliminate common compatibility problems.
With PufferLib, you can use familiar libraries like CleanRL and SB3 to scale from classic benchmarks like Atari and Procgen to complex simulators like NetHack and Neural MMO.
All of our code is free and open-source software under the MIT license.
arXiv Detail & Related papers (2024-06-11T21:13:34Z)
- Analyzing the Accessibility of GitHub Repositories for PyPI and NPM Libraries [91.97201077607862]
Industrial applications heavily rely on open-source software (OSS) libraries, which provide various benefits.
To monitor the activities of such communities, a comprehensive list of repositories for the libraries of an ecosystem must be accessible.
In this study, we analyze the accessibility of GitHub repositories for PyPI and NPM libraries.
arXiv Detail & Related papers (2024-04-26T13:27:04Z)
- Causal-learn: Causal Discovery in Python [53.17423883919072]
Causal discovery aims at revealing causal relations from observational data.
causal-learn is an open-source Python library for causal discovery.
arXiv Detail & Related papers (2023-07-31T05:00:35Z)
- SequeL: A Continual Learning Library in PyTorch and JAX [50.33956216274694]
SequeL is a library for Continual Learning that supports both PyTorch and JAX frameworks.
It provides a unified interface for a wide range of Continual Learning algorithms, including regularization-based approaches, replay-based approaches, and hybrid approaches.
We release SequeL as an open-source library, enabling researchers and developers to easily experiment and extend the library for their own purposes.
arXiv Detail & Related papers (2023-04-21T10:00:22Z)
- An Empirical Study of Library Usage and Dependency in Deep Learning Frameworks [12.624032509149869]
PyTorch, Caffe, and scikit-learn are the most frequent combination in 18% and 14% of the projects.
Developers use two or three DL libraries in the same project and tend to use multiple different DL libraries in both the same function and the same file.
arXiv Detail & Related papers (2022-11-28T19:31:56Z)
- Code Librarian: A Software Package Recommendation System [65.05559087332347]
We present a recommendation engine called Librarian for open source libraries.
A candidate library package is recommended for a given context if: 1) it has been frequently used with the imported libraries in the program; 2) it has similar functionality to the imported libraries in the program; 3) it has similar functionality to the developer's implementation, and 4) it can be used efficiently in the context of the provided code.
arXiv Detail & Related papers (2022-10-11T12:30:05Z)
- Repro: An Open-Source Library for Improving the Reproducibility and Usability of Publicly Available Research Code [74.28810048824519]
Repro is an open-source library which aims at improving the usability of research code.
It provides a lightweight Python API for running software released by researchers within Docker containers.
arXiv Detail & Related papers (2022-04-29T01:54:54Z)
- skrl: Modular and Flexible Library for Reinforcement Learning [0.0]
skrl is an open-source modular library for reinforcement learning written in Python.
It allows loading, configuring, and operating NVIDIA Isaac Gym environments.
arXiv Detail & Related papers (2022-02-08T12:43:31Z)
- Solo-learn: A Library of Self-supervised Methods for Visual Representation Learning [83.02597612195966]
solo-learn is a library of self-supervised methods for visual representation learning.
Implemented in Python, using PyTorch and PyTorch Lightning, the library fits both research and industry needs.
arXiv Detail & Related papers (2021-08-03T22:19:55Z)