A First Look at Package-to-Group Mechanism: An Empirical Study of the Linux Distributions
- URL: http://arxiv.org/abs/2410.10131v1
- Date: Mon, 14 Oct 2024 03:48:20 GMT
- Title: A First Look at Package-to-Group Mechanism: An Empirical Study of the Linux Distributions
- Authors: Dongming Jin, Nianyu Li, Kai Yang, Minghui Zhou, Zhi Jin,
- Abstract summary: A package-to-group mechanism (P2G) is employed to enable unified installation, uninstallation, and updates of multiple packages at once.
This paper takes Linux distributions as a case study and presents an empirical study focusing on its application trends, evolutionary patterns, group quality, and developer tendencies.
- Score: 20.491275902894273
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reusing third-party software packages is a common practice in software development. As the scale and complexity of open-source software (OSS) projects continue to grow (e.g., Linux distributions), the number of reused third-party packages has significantly increased. Therefore, maintaining effective package management is critical for developing and evolving OSS projects. To achieve this, a package-to-group mechanism (P2G) is employed to enable unified installation, uninstallation, and updates of multiple packages at once. To better understand this mechanism, this paper takes Linux distributions as a case study and presents an empirical study focusing on its application trends, evolutionary patterns, group quality, and developer tendencies. By analyzing 11,746 groups and 193,548 packages from 89 versions of 5 popular Linux distributions and conducting questionnaire surveys with Linux practitioners and researchers, we derive several key insights. Our findings show that P2G is increasingly being adopted, particularly in popular Linux distributions. P2G follows six evolutionary patterns (\eg splitting and merging groups). Interestingly, packages no longer managed through P2G are more likely to remain in Linux distributions rather than being directly removed. To assess the effectiveness of P2G, we propose a metric called {\sc GValue} to evaluate the quality of groups and identify issues such as inadequate group descriptions and insufficient group sizes. We also summarize five types of packages that tend to adopt P2G, including graphical desktops, networks, etc. To the best of our knowledge, this is the first study focusing on the P2G mechanisms. We expect our study can assist in the efficient management of packages and reduce the burden on practitioners in rapidly growing Linux distributions and other open-source software projects.
Related papers
- An Overview and Catalogue of Dependency Challenges in Open Source Software Package Registries [52.23798016734889]
This article provides a catalogue of dependency-related challenges that come with relying on OSS packages or libraries.
The catalogue is based on the scientific literature on empirical research that has been conducted to understand, quantify and overcome these challenges.
arXiv Detail & Related papers (2024-09-27T16:20:20Z) - A Systematic Approach to Evaluating Development Activity in Heterogeneous Package Management Systems for Overall System Health Assessment [0.0]
We develop a method to identify packages within a Linux distribution that show low development activity between versions of the OSS projects included in a release.
We use regular expressions to extract the epoch and upstream project major, minor, and patch versions for more than 6000 packages in the Ubuntu distribution.
arXiv Detail & Related papers (2024-09-06T19:58:20Z) - Uncovering and Mitigating the Impact of Frozen Package Versions for Fixed-Release Linux [38.53185042161599]
We study the ecosystem gap of fixed-release Linux caused by the evolution of mirrors.
We propose a novel package management approach allowing for separate dependency environments based on native Debian mirrors.
We present a working prototype, named ccenv, which can effectively remedy the inadequacy of current tools.
arXiv Detail & Related papers (2024-08-21T14:01:46Z) - An Empirical Study on Package-Level Deprecation in Python Ecosystem [6.0347124337922144]
Python, a widely adopted programming language, is renowned for its extensive and diverse third-party package ecosystem.
A significant number of OSS packages within the Python ecosystem are in poor maintenance, leading to potential risks in functionality and security.
This paper investigates the current practices of announcing, receiving, and handling package-level deprecation in the Python ecosystem.
arXiv Detail & Related papers (2024-08-19T18:08:21Z) - How to Understand Whole Software Repository? [64.19431011897515]
An excellent understanding of the whole repository will be the critical path to Automatic Software Engineering (ASE)
We develop a novel method named RepoUnderstander by guiding agents to comprehensively understand the whole repositories.
To better utilize the repository-level knowledge, we guide the agents to summarize, analyze, and plan.
arXiv Detail & Related papers (2024-06-03T15:20:06Z) - Analyzing the Evolution of Inter-package Dependencies in Operating
Systems: A Case Study of Ubuntu [7.76541950830141]
An Operating System (OS) combines multiple interdependent software packages, which usually have their own independently developed architectures.
For an evolutionary effort, designers/developers of OS can greatly benefit from fully understanding the system-wide dependency focused on individual files.
We propose a framework, DepEx, aimed at discovering the detailed package relations at the level of individual binary files.
arXiv Detail & Related papers (2023-07-10T10:12:21Z) - Partitioning Distributed Compute Jobs with Reinforcement Learning and
Graph Neural Networks [58.720142291102135]
Large-scale machine learning models are bringing advances to a broad range of fields.
Many of these models are too large to be trained on a single machine, and must be distributed across multiple devices.
We show that maximum parallelisation is sub-optimal in relation to user-critical metrics such as throughput and blocking rate.
arXiv Detail & Related papers (2023-01-31T17:41:07Z) - Tackling Long-Tailed Category Distribution Under Domain Shifts [50.21255304847395]
Existing approaches cannot handle the scenario where both issues exist.
We designed three novel core functional blocks including Distribution Calibrated Classification Loss, Visual-Semantic Mapping and Semantic-Similarity Guided Augmentation.
Two new datasets were proposed for this problem, named AWA2-LTS and ImageNet-LTS.
arXiv Detail & Related papers (2022-07-20T19:07:46Z) - Tangelo: An Open-source Python Package for End-to-end Chemistry
Workflows on Quantum Computers [85.21205677945196]
Tangelo is an open-source Python software package for the development of end-to-end chemistry on quantum computers.
It aims to support the design of successful experiments on quantum hardware, and to facilitate advances in quantum algorithm development.
arXiv Detail & Related papers (2022-06-24T17:44:00Z) - pymdp: A Python library for active inference in discrete state spaces [52.85819390191516]
pymdp is an open-source package for simulating active inference in Python.
We provide the first open-source package for simulating active inference with POMDPs.
arXiv Detail & Related papers (2022-01-11T12:18:44Z) - Underproduction: An Approach for Measuring Risk in Open Source Software [9.701036831490766]
'Underproduction' occurs when the supply of software engineering labor becomes out of alignment with the demand of people who rely on the software produced.
We present a conceptual framework for identifying relative underproduction in software as well as a statistical method for applying our framework to a comprehensive dataset.
arXiv Detail & Related papers (2021-02-27T23:18:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.