Understanding and Remediating Open-Source License Incompatibilities in
the PyPI Ecosystem
- URL: http://arxiv.org/abs/2308.05942v1
- Date: Fri, 11 Aug 2023 04:57:54 GMT
- Title: Understanding and Remediating Open-Source License Incompatibilities in
the PyPI Ecosystem
- Authors: Weiwei Xu, Hao He, Kai Gao, Minghui Zhou
- Abstract summary: We conduct a large-scale empirical study of license incompatibilities and their remediation practices in the PyPI ecosystem.
We propose SILENCE, an SMT-solver-based approach that recommends minimal-cost license incompatibility remediations in a package dependency graph.
- Score: 29.898303568884227
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The reuse and distribution of open-source software must be in compliance with
its accompanying open-source license. In modern packaging ecosystems,
maintaining such compliance is challenging because a package may have a complex
multi-layered dependency graph with many packages, any of which may have an
incompatible license. Although prior research finds that license
incompatibilities are prevalent, empirical evidence is still scarce in some
modern packaging ecosystems (e.g., PyPI). It also remains unclear how
developers remediate the license incompatibilities in the dependency graphs of
their packages (including direct and transitive dependencies), let alone any
automated approaches. To bridge this gap, we conduct a large-scale empirical
study of license incompatibilities and their remediation practices in the PyPI
ecosystem. We find that 7.27% of the PyPI package releases have license
incompatibilities and 61.3% of them are caused by transitive dependencies,
causing challenges in their remediation; for remediation, developers can apply
one of the five strategies: migration, removal, pinning versions, changing
their own licenses, and negotiation. Inspired by our findings, we propose
SILENCE, an SMT-solver-based approach to recommend license incompatibility
remediations with minimal costs in a package dependency graph. Our evaluation
shows that the remediations proposed by SILENCE can match 19 historical
real-world cases (except for migrations not covered by an existing knowledge
base) and have been accepted by five popular PyPI packages whose developers
were previously unaware of their license incompatibilities.
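
To make the abstract's setting concrete, here is a minimal sketch of detecting incompatibilities across direct and transitive dependencies, followed by a cheapest-remediation search. It is not the SILENCE implementation: exhaustive enumeration stands in for the SMT solver, only two of the five strategies (removal and changing one's own license) are modeled, and the package names, compatibility pairs, and costs are all hypothetical.

```python
from collections import deque
from itertools import combinations

# Hypothetical toy graph: package -> (license, direct dependencies).
PACKAGES = {
    "app": ("MIT", ["web", "utils"]),
    "web": ("Apache-2.0", ["parser"]),
    "utils": ("BSD-3-Clause", []),
    "parser": ("GPL-3.0-only", []),
}

# Assumed simplification: (package license, dependency license) pairs
# that this sketch treats as incompatible.
INCOMPATIBLE = {
    ("MIT", "GPL-3.0-only"),
    ("Apache-2.0", "GPL-3.0-only"),
    ("BSD-3-Clause", "GPL-3.0-only"),
}


def find_incompatibilities(root, packages, incompatible):
    """Breadth-first walk; return (dependency, depth) pairs that conflict with root."""
    root_license = packages[root][0]
    seen = {root}
    queue = deque((dep, 1) for dep in packages[root][1])
    conflicts = []
    while queue:
        name, depth = queue.popleft()
        if name in seen:
            continue
        seen.add(name)
        license_id, deps = packages[name]
        if (root_license, license_id) in incompatible:
            conflicts.append((name, depth))
        queue.extend((dep, depth + 1) for dep in deps)
    return conflicts


def remediate(root, packages, incompatible, remove_cost, relicense_cost):
    """Try every (license, removed direct deps) plan; return the cheapest valid one."""
    license_id, direct = packages[root]
    license_choices = [(license_id, 0)] + list(relicense_cost.items())
    best = None
    for new_license, lic_cost in license_choices:
        for k in range(len(direct) + 1):
            for removed in combinations(direct, k):
                trial = dict(packages)
                kept = [dep for dep in direct if dep not in removed]
                trial[root] = (new_license, kept)
                if find_incompatibilities(root, trial, incompatible):
                    continue  # plan still leaves a conflict
                cost = lic_cost + sum(remove_cost[dep] for dep in removed)
                if best is None or cost < best[0]:
                    best = (cost, new_license, removed)
    return best


conflicts = find_incompatibilities("app", PACKAGES, INCOMPATIBLE)
# Depth > 1 means the conflict arrives through a transitive dependency.
transitive = [name for name, depth in conflicts if depth > 1]
plan = remediate(
    "app",
    PACKAGES,
    INCOMPATIBLE,
    remove_cost={"web": 4, "utils": 1},      # hypothetical costs
    relicense_cost={"GPL-3.0-only": 2},      # hypothetical cost
)
print(conflicts, transitive, plan)
```

Enumeration grows exponentially with the number of candidate actions, which is one reason a solver-based formulation is attractive for real dependency graphs.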
Related papers
- An Overview and Catalogue of Dependency Challenges in Open Source Software Package Registries [52.23798016734889]
This article provides a catalogue of dependency-related challenges that come with relying on OSS packages or libraries.
The catalogue is based on the scientific literature on empirical research that has been conducted to understand, quantify and overcome these challenges.
arXiv Detail & Related papers (2024-09-27T16:20:20Z)
- An Empirical Study on Package-Level Deprecation in Python Ecosystem [6.0347124337922144]
Python, a widely adopted programming language, is renowned for its extensive and diverse third-party package ecosystem.
A significant number of OSS packages within the Python ecosystem are in poor maintenance, leading to potential risks in functionality and security.
This paper investigates the current practices of announcing, receiving, and handling package-level deprecation in the Python ecosystem.
arXiv Detail & Related papers (2024-08-19T18:08:21Z)
- Catch the Butterfly: Peeking into the Terms and Conflicts among SPDX Licenses [16.948633594354412]
The use of third-party libraries (TPLs) in software development has accelerated the creation of modern software.
Developers may inadvertently violate the licenses of TPLs, leading to legal issues.
There is a need for a high-quality license dataset that encompasses a broad range of mainstream licenses.
arXiv Detail & Related papers (2024-01-19T11:27:34Z)
- Less is More? An Empirical Study on Configuration Issues in Python PyPI Ecosystem [38.44692482370243]
Python is widely used in the open-source community, largely owing to the extensive support from diverse third-party libraries.
Third-party libraries can potentially lead to conflicts in dependencies, prompting researchers to develop dependency conflict detectors.
Endeavors have also been made to automatically infer dependencies.
arXiv Detail & Related papers (2023-10-19T09:07:51Z)
- Dependency Practices for Vulnerability Mitigation [4.710141711181836]
We analyze more than 450 vulnerabilities in the npm ecosystem to understand why dependent packages remain vulnerable.
We identify over 200,000 npm packages that are infected through their dependencies.
We use 9 features to build a prediction model that identifies packages that quickly adopt the vulnerability fix and prevent further propagation of vulnerabilities.
arXiv Detail & Related papers (2023-10-11T19:48:46Z)
- LiResolver: License Incompatibility Resolution for Open Source Software [13.28021004336228]
LiResolver is a fine-grained, scalable, and flexible tool to resolve license incompatibility issues for open source software.
Comprehensive experiments demonstrate the effectiveness of LiResolver, with 4.09% false positive (FP) rate and 0.02% false negative (FN) rate for incompatibility issue localization.
arXiv Detail & Related papers (2023-06-26T13:16:09Z)
- Analyzing Maintenance Activities of Software Libraries [65.268245109828]
Industrial applications heavily integrate open-source software libraries nowadays.
I want to introduce an automatic monitoring approach for industrial applications to identify open-source dependencies that show negative signs regarding their current or future maintenance activities.
arXiv Detail & Related papers (2023-06-09T16:51:25Z)
- Quality-Based Conditional Processing in Multi-Biometrics: Application to Sensor Interoperability [63.05238390013457]
We describe and evaluate the ATVS-UAM fusion approach submitted to the quality-based evaluation of the 2007 BioSecure Multimodal Evaluation Campaign.
Our approach is based on linear logistic regression, in which fused scores tend to be log-likelihood-ratios.
Results show that the proposed approach outperforms all the rule-based fusion schemes.
arXiv Detail & Related papers (2022-11-24T12:11:22Z)
- DIETERpy: a Python framework for The Dispatch and Investment Evaluation Tool with Endogenous Renewables [62.997667081978825]
DIETER is an open-source power sector model designed to analyze future settings with very high shares of variable renewable energy sources.
It minimizes overall system costs, including fixed and variable costs of various generation, flexibility and sector coupling options.
We introduce DIETERpy that builds on the existing model version, written in the General Algebraic Modeling System (GAMS) and enhances it with a Python framework.
arXiv Detail & Related papers (2020-10-02T09:27:33Z)
- Implicit Distributional Reinforcement Learning [61.166030238490634]
The paper proposes implicit distributional actor-critic (IDAC), built on two deep generator networks (DGNs) and a semi-implicit actor (SIA) powered by a flexible policy distribution.
We observe IDAC outperforms state-of-the-art algorithms on representative OpenAI Gym environments.
arXiv Detail & Related papers (2020-07-13T02:52:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.