Related papers: The Stackage Repository: An Exploratory Study of its Evolution

The Stackage Repository: An Exploratory Study of its Evolution

URL: http://arxiv.org/abs/2310.10887v2
Date: Mon, 23 Oct 2023 01:03:20 GMT
Title: The Stackage Repository: An Exploratory Study of its Evolution
Authors: Paul Leger and Felipe Ruiz and Nicol\'as Sep\'ulveda and Ismael Figueroa and Nicol\'as Cardozo
Abstract summary: This paper conducts empirical research about the evolution of Stackage considering monad packages. To the best of our knowledge, this is the first large-scale analysis of the evolution of the Stackage repository regarding packages used and monads.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Context. Package repositories for a programming language are increasingly common. A repository can keep a register of the evolution of its packages. In the programming language Haskell, with its defining characteristic monads, we can find the Stackage repository, which is a curated repository for stable Haskell packages in the Hackage repository. Despite the widespread use of Stackage in its industrial target, we are not aware of much empirical research about how this repository has evolved, including the use of monads. Objective. This paper conducts empirical research about the evolution of Stackage considering monad packages through 22 Long-Term Support releases during the period 2014-2023. Focusing on five research questions, this evolution is analyzed in terms of packages with their dependencies and imports; including the most used monad packages. To the best of our knowledge, this is the first large-scale analysis of the evolution of the Stackage repository regarding packages used and monads. Method. We define six research questions regarding the repository's evolution, and analyze them on 51,716 packages (17.05 GB) spread over 22 releases. For each package, we parse its cabal file and source code to extract the data, which is analyzed in terms of dependencies and imports using Pandas scripts. Results. From the methodology we get different findings. For example, there are packages that depend on other packages whose versions are not available in a particular release of Stackage; opening a potential stability issue. The mtl and transformers are on the top 10 packages most used/imported across releases of the Stackage evolution. We discussed these findings with Stackage maintainers, which allowed us to refine the research questions.

Related papers

MutaGReP: Execution-Free Repository-Grounded Plan Search for Code-Use [92.28400093066212]
MutaGReP is an approach to search for plans that decompose a user request into natural language steps grounded in a large code repository. Our plans use less than 5% of the 128K context window for GPT-4o but rival the coding performance of GPT-4o with a context window filled with the repo.
arXiv Detail & Related papers (2025-02-21T18:58:17Z)
A First Look at Package-to-Group Mechanism: An Empirical Study of the Linux Distributions [20.491275902894273]
A package-to-group mechanism (P2G) is employed to enable unified installation, uninstallation, and updates of multiple packages at once. This paper takes Linux distributions as a case study and presents an empirical study focusing on its application trends, evolutionary patterns, group quality, and developer tendencies.
arXiv Detail & Related papers (2024-10-14T03:48:20Z)
How to Understand Whole Software Repository? [64.19431011897515]
An excellent understanding of the whole repository will be the critical path to Automatic Software Engineering (ASE) We develop a novel method named RepoUnderstander by guiding agents to comprehensively understand the whole repositories. To better utilize the repository-level knowledge, we guide the agents to summarize, analyze, and plan.
arXiv Detail & Related papers (2024-06-03T15:20:06Z)
DevEval: A Manually-Annotated Code Generation Benchmark Aligned with Real-World Code Repositories [83.5195424237358]
Existing benchmarks are poorly aligned with real-world code repositories. We propose a new benchmark named DevEval, which has three advances. DevEval comprises 1,874 testing samples from 117 repositories, covering 10 popular domains.
arXiv Detail & Related papers (2024-05-30T09:03:42Z)
Analyzing the Accessibility of GitHub Repositories for PyPI and NPM Libraries [91.97201077607862]
Industrial applications heavily rely on open-source software (OSS) libraries, which provide various benefits. To monitor the activities of such communities, a comprehensive list of repositories for the libraries of an ecosystem must be accessible. In this study, we analyze the accessibility of GitHub repositories for PyPI and NPM libraries.
arXiv Detail & Related papers (2024-04-26T13:27:04Z)
PyRadar: Towards Automatically Retrieving and Validating Source Code Repository Information for PyPI Packages [24.8919191161202]
Existing tools can only retrieve repository information for up to 70.5% of PyPI releases. This paper proposes PyRadar, a novel framework that utilizes the metadata and source distribution to retrieve and validate the repository information for PyPI releases.
arXiv Detail & Related papers (2024-04-25T12:27:59Z)
npm-follower: A Complete Dataset Tracking the NPM Ecosystem [5.931961380320841]
npm-follower is a dataset and crawling architecture which archives metadata and code of all packages and versions as they are published. The dataset currently includes over 35 million versions of packages, and grows at a rate of about 1 million versions per month.
arXiv Detail & Related papers (2023-08-24T04:05:49Z)
Analyzing the Evolution of Inter-package Dependencies in Operating Systems: A Case Study of Ubuntu [7.76541950830141]
An Operating System (OS) combines multiple interdependent software packages, which usually have their own independently developed architectures. For an evolutionary effort, designers/developers of OS can greatly benefit from fully understanding the system-wide dependency focused on individual files. We propose a framework, DepEx, aimed at discovering the detailed package relations at the level of individual binary files.
arXiv Detail & Related papers (2023-07-10T10:12:21Z)
DADApy: Distance-based Analysis of DAta-manifolds in Python [51.37841707191944]
DADApy is a python software package for analysing and characterising high-dimensional data. It provides methods for estimating the intrinsic dimension and the probability density, for performing density-based clustering and for comparing different distance metrics.
arXiv Detail & Related papers (2022-05-04T08:41:59Z)
Repro: An Open-Source Library for Improving the Reproducibility and Usability of Publicly Available Research Code [74.28810048824519]
Repro is an open-source library which aims at improving the usability of research code. It provides a lightweight Python API for running software released by researchers within Docker containers.
arXiv Detail & Related papers (2022-04-29T01:54:54Z)
PyHHMM: A Python Library for Heterogeneous Hidden Markov Models [63.01207205641885]
PyHHMM is an object-oriented Python implementation of Heterogeneous-Hidden Markov Models (HHMMs) PyHHMM emphasizes features not supported in similar available frameworks: a heterogeneous observation model, missing data inference, different model order selection criterias, and semi-supervised training. PyHHMM relies on the numpy, scipy, scikit-learn, and seaborn Python packages, and is distributed under the Apache-2.0 License.
arXiv Detail & Related papers (2022-01-12T07:32:36Z)
An Empirical Analysis of the R Package Ecosystem [0.0]
We analyze more than 25,000 packages, 150,000 releases, and 15 million files across two decades. We find that the historical growth of the ecosystem has been robust under all measures.
arXiv Detail & Related papers (2021-02-19T12:55:18Z)

This list is automatically generated from the titles and abstracts of the papers in this site.