An Overview and Catalogue of Dependency Challenges in Open Source Software Package Registries
- URL: http://arxiv.org/abs/2409.18884v3
- Date: Tue, 19 Nov 2024 20:44:16 GMT
- Title: An Overview and Catalogue of Dependency Challenges in Open Source Software Package Registries
- Authors: Tom Mens, Alexandre Decan
- Abstract summary: This article provides a catalogue of dependency-related challenges that come with relying on OSS packages or libraries.
The catalogue is based on the scientific literature on empirical research that has been conducted to understand, quantify and overcome these challenges.
- Score: 52.23798016734889
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While open-source software has enabled significant levels of reuse to speed up software development, it has also given rise to the dreadful dependency hell that all software practitioners face on a regular basis. This article provides a catalogue of dependency-related challenges that come with relying on OSS packages or libraries. The catalogue is based on the scientific literature on empirical research that has been conducted to understand, quantify and overcome these challenges. Our overview of this very active research field of package dependency management can be used as a starting point for junior and senior researchers as well as practitioners that would like to learn more about research advances in dealing with the challenges that come with the dependency networks of large OSS package registries.
Related papers
- Why Authors and Maintainers Link (or Don't Link) Their PyPI Libraries to Code Repositories and Donation Platforms [83.16077040470975]
Metadata of libraries on the Python Package Index (PyPI) plays a critical role in supporting the transparency, trust, and sustainability of open-source libraries.
This paper presents a large-scale empirical study combining two targeted surveys sent to 50,000 PyPI authors and maintainers.
We analyze more than 1,400 responses using large language model (LLM)-based topic modeling to uncover key motivations and barriers related to linking repositories and donation platforms.
arXiv Detail & Related papers (2026-01-21T16:13:57Z)
- Advances and Frontiers of LLM-based Issue Resolution in Software Engineering: A Comprehensive Survey [59.3507264893654]
Issue resolution is a complex Software Engineering task integral to real-world development.
Benchmarks like SWE-bench revealed this task as profoundly difficult for large language models.
This paper presents a systematic survey of this emerging domain.
arXiv Detail & Related papers (2026-01-15T18:55:03Z)
- Deep Research: A Systematic Survey [118.82795024422722]
Deep Research (DR) aims to combine the reasoning capabilities of large language models with external tools, such as search engines.
This survey presents a comprehensive and systematic overview of deep research systems.
arXiv Detail & Related papers (2025-11-24T15:28:28Z)
- Uncovering Scientific Software Sustainability through Community Engagement and Software Quality Metrics [0.0]
This paper explores the sustainability of scientific open-source software (Sci-OSS) projects hosted on GitHub.
We map sustainability to repository metrics from the literature and mined data from ten prominent Sci-OSS projects.
Our visualization and analysis methods offer researchers, funders, and developers key insights into long-term software sustainability.
arXiv Detail & Related papers (2025-11-11T05:34:27Z)
- Executable Knowledge Graphs for Replicating AI Research [65.41207324831583]
Executable Knowledge Graphs (xKG) is a modular and pluggable knowledge base that automatically integrates technical insights, code snippets, and domain-specific knowledge extracted from scientific literature.
Code will be released at https://github.com/zjunlp/xKG.
arXiv Detail & Related papers (2025-10-20T17:53:23Z)
- Open Source, Hidden Costs: A Systematic Literature Review on OSS License Management [10.002122950923967]
Integrating third-party software components is a common practice in modern software development.
A lack of understanding may lead to disputes, which can pose serious legal and operational challenges.
arXiv Detail & Related papers (2025-07-03T14:02:15Z)
- WebThinker: Empowering Large Reasoning Models with Deep Research Capability [60.81964498221952]
WebThinker is a deep research agent that empowers large reasoning models to autonomously search the web, navigate web pages, and draft research reports during the reasoning process.
It also employs an Autonomous Think-Search-and-Draft strategy, allowing the model to seamlessly interleave reasoning, information gathering, and report writing in real time.
Our approach enhances LRM reliability and applicability in complex scenarios, paving the way for more capable and versatile deep research systems.
arXiv Detail & Related papers (2025-04-30T16:25:25Z)
- Insights into Dependency Maintenance Trends in the Maven Ecosystem [0.14999444543328289]
We present a quantitative analysis of the Neo4j dataset using the Goblin framework.
Our analysis reveals that releases with fewer dependencies have a higher number of missed releases.
Our study shows that the dependencies in the latest releases have positive freshness scores, indicating better software management efficacy.
arXiv Detail & Related papers (2025-03-28T22:20:24Z)
- Tracking Down Software Cluster Bombs: A Current State Analysis of the Free/Libre and Open Source Software (FLOSS) Ecosystem [0.43981305860983705]
This study provides a summary of the current state of available FLOSS package repositories.
It addresses the challenge of identifying problematic areas within a software ecosystem.
The results indicate that while there are well-maintained projects within the FLOSS ecosystem, there are also high-impact projects that are susceptible to supply chain attacks.
arXiv Detail & Related papers (2025-02-12T08:57:57Z)
- Making Software FAIR: A machine-assisted workflow for the research software lifecycle [2.682583873311538]
SoFAIR will extend the capabilities of widely used open scholarly infrastructures.
It will deliver and deploy an effective solution for the management of the research software lifecycle.
arXiv Detail & Related papers (2025-01-08T14:17:26Z)
- Towards Sample-Efficiency and Generalization of Transfer and Inverse Reinforcement Learning: A Comprehensive Literature Review [50.67937325077047]
This paper is devoted to a comprehensive review of realizing the sample efficiency and generalization of RL algorithms through transfer and inverse reinforcement learning (T-IRL).
Our findings indicate that a majority of recent research works have dealt with the aforementioned challenges by utilizing human-in-the-loop and sim-to-real strategies.
Under the IRL structure, training schemes that require a low number of experience transitions and extension of such frameworks to multi-agent and multi-intention problems have been the priority of researchers in recent years.
arXiv Detail & Related papers (2024-11-15T15:18:57Z)
- GEMS: Generative Expert Metric System through Iterative Prompt Priming [18.0413505095456]
Non-experts can find it unintuitive to create effective measures or transform theories into context-specific metrics.
This technical report addresses this challenge by examining software communities within large software corporations.
We propose a prompt-engineering framework inspired by neural activities, demonstrating that generative models can extract and summarize theories.
arXiv Detail & Related papers (2024-10-01T17:14:54Z)
- Multi-Source Knowledge Pruning for Retrieval-Augmented Generation: A Benchmark and Empirical Study [46.55831783809377]
Retrieval-augmented generation (RAG) is increasingly recognized as an effective approach to mitigating the hallucination of large language models (LLMs).
We develop PruningRAG, a plug-and-play RAG framework that uses multi-granularity pruning strategies to more effectively incorporate relevant context and mitigate the negative impact of misleading information.
arXiv Detail & Related papers (2024-09-03T03:31:37Z)
- A Survey of AIOps for Failure Management in the Era of Large Language Models [60.59720351854515]
This paper presents a comprehensive survey of AIOps technology for failure management in the LLM era.
It includes a detailed definition of AIOps tasks for failure management, the data sources for AIOps, and the LLM-based approaches adopted for AIOps.
arXiv Detail & Related papers (2024-06-17T05:13:24Z)
- How to Understand Whole Software Repository? [64.19431011897515]
An excellent understanding of the whole repository will be the critical path to Automatic Software Engineering (ASE).
We develop a novel method named RepoUnderstander by guiding agents to comprehensively understand entire repositories.
To better utilize the repository-level knowledge, we guide the agents to summarize, analyze, and plan.
arXiv Detail & Related papers (2024-06-03T15:20:06Z)
- The Code the World Depends On: A First Look at Technology Makers' Open Source Software Dependencies [3.6840775431698893]
Open-source software (OSS) supply chain security has become a topic of concern for organizations.
Patching an OSS vulnerability can require updating other dependent software products in addition to the original package.
We do not know what packages are most critical to patch, hindering efforts to improve OSS security where it is most needed.
arXiv Detail & Related papers (2024-04-17T21:44:38Z)
- Biomedical Open Source Software: Crucial Packages and Hidden Heroes [2.3960586265742574]
We map the dependencies of the software used in biomedical papers and find the packages critical to the software ecosystems.
We propose centrality metrics for the network of software dependencies, analyze three ecosystems (PyPI, CRAN, Bioconductor), and determine the packages with the highest centrality.
arXiv Detail & Related papers (2024-04-10T01:22:02Z)
- A Survey of Neural Code Intelligence: Paradigms, Advances and Beyond [84.95530356322621]
This survey presents a systematic review of the advancements in code intelligence.
It covers over 50 representative models and their variants, more than 20 categories of tasks, and over 680 related works.
Building on our examination of the developmental trajectories, we further investigate the emerging synergies between code intelligence and broader machine intelligence.
arXiv Detail & Related papers (2024-03-21T08:54:56Z)
- PyRCA: A Library for Metric-based Root Cause Analysis [66.72542200701807]
PyRCA is an open-source machine learning library of Root Cause Analysis (RCA) for Artificial Intelligence for IT Operations (AIOps).
It provides a holistic framework to uncover the complicated metric causal dependencies and automatically locate root causes of incidents.
arXiv Detail & Related papers (2023-06-20T09:55:10Z)
- Towards Measuring Vulnerabilities and Exposures in Open-Source Packages [0.0]
We provide an up-to-date overview of the open source landscape.
We discuss approaches to map entries of the Common Vulnerabilities and Exposures (CVE) list to open-source libraries.
We show the frequency and distribution of existing CVE entries with respect to popular programming languages.
arXiv Detail & Related papers (2022-06-29T10:51:23Z)
- Empirical Study on the Software Engineering Practices in Open Source ML Package Repositories [6.2894222252929985]
Modern Machine Learning technologies require considerable technical expertise and resources to develop, train and deploy such models.
Such discovery and reuse by practitioners and researchers are being addressed by public ML package repositories.
This paper conducts an exploratory study that analyzes the structure and contents of two popular ML package repositories.
arXiv Detail & Related papers (2020-12-02T18:52:56Z)