A Large-Scale Empirical Study on Semantic Versioning in Golang Ecosystem
- URL: http://arxiv.org/abs/2309.02894v2
- Date: Mon, 18 Sep 2023 02:07:59 GMT
- Authors: Wenke Li, Feng Wu, Cai Fu, Fan Zhou
- Abstract summary: We conduct the first large-scale empirical study in the Go ecosystem to study SemVer compliance in terms of breaking changes and their impact.
We collect the first large-scale Go dataset with a dependency graph from GitHub, including 124K TPLs and 532K client programs.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Third-party libraries (TPLs) have become an essential component of software,
accelerating development and reducing maintenance costs. However, breaking
changes often occur during the upgrades of TPLs and prevent client programs
from moving forward. Semantic versioning (SemVer) has been applied to
standardize the versions of releases according to compatibility, but not all
releases comply with SemVer. Much prior work studies SemVer compliance in
ecosystems such as Java and JavaScript, but not in Golang (Go for short). Because
tools for detecting breaking changes and datasets for Go are lacking, developers of TPLs
do not know whether breaking changes occur and affect client programs, and
developers of client programs may hesitate to upgrade dependencies for fear of
breaking changes.
To bridge this gap, we conduct the first large-scale empirical study in the
Go ecosystem to study SemVer compliance in terms of breaking changes and their
impact. In detail, we propose GoSVI (Go Semantic Versioning Insight) to detect
breaking changes and analyze their impact by resolving identifiers in client
programs and comparing their types with breaking changes. Moreover, we collect
the first large-scale Go dataset with a dependency graph from GitHub, including
124K TPLs and 532K client programs. Based on the dataset, our results show that
86.3% of library upgrades comply with SemVer and 28.6% of non-major
upgrades introduce breaking changes. Furthermore, the tendency to comply with
SemVer has improved over time from 63.7% in 2018/09 to 92.2% in 2023/03.
Finally, we find 33.3% of downstream client programs may be affected by
breaking changes. These findings provide developers and users of TPLs with
valuable insights to help make decisions related to SemVer.
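
The abstract describes three steps: detecting breaking changes between two library versions, checking whether the version bump complies with SemVer, and checking whether a client uses a broken identifier. The Go sketch below illustrates that idea under simplifying assumptions and is not GoSVI itself. All names (`Version`, `ParseVersion`, `BreakingChanges`, `Complies`, `Affected`) are hypothetical. An API is modeled as a map from exported identifier to a signature string, whereas a real tool would compare resolved Go types, and v0 releases are treated like any other version, which may differ from the paper's rules.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// Version is a parsed "vMAJOR.MINOR.PATCH" tag; suffixes are ignored.
type Version struct{ Major, Minor, Patch int }

// ParseVersion parses tags such as "v1.4.2" (hypothetical helper).
func ParseVersion(tag string) (Version, error) {
	core := strings.TrimPrefix(tag, "v")
	// Drop pre-release ("-rc1") and build ("+meta") suffixes.
	if i := strings.IndexAny(core, "-+"); i >= 0 {
		core = core[:i]
	}
	parts := strings.Split(core, ".")
	if len(parts) != 3 {
		return Version{}, fmt.Errorf("malformed version %q", tag)
	}
	var nums [3]int
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return Version{}, fmt.Errorf("malformed version %q", tag)
		}
		nums[i] = n
	}
	return Version{nums[0], nums[1], nums[2]}, nil
}

// BreakingChanges returns, in sorted order, the exported identifiers that
// were removed or whose signature changed between two API snapshots.
// Newly added identifiers are not breaking.
func BreakingChanges(oldAPI, newAPI map[string]string) []string {
	var broken []string
	for name, oldSig := range oldAPI {
		if newSig, ok := newAPI[name]; !ok || newSig != oldSig {
			broken = append(broken, name)
		}
	}
	sort.Strings(broken)
	return broken
}

// Complies reports whether an upgrade follows SemVer under this sketch's
// assumption: breaking changes are allowed only with a major version bump.
func Complies(from, to Version, breaking []string) bool {
	return len(breaking) == 0 || to.Major > from.Major
}

// Affected reports whether a client that uses the given identifiers
// references at least one broken identifier.
func Affected(used, breaking []string) bool {
	set := make(map[string]bool, len(breaking))
	for _, b := range breaking {
		set[b] = true
	}
	for _, u := range used {
		if set[u] {
			return true
		}
	}
	return false
}

func main() {
	oldAPI := map[string]string{"Open": "func(string) error", "Close": "func() error"}
	newAPI := map[string]string{"Open": "func(string, int) error", "Close": "func() error"}
	from, _ := ParseVersion("v1.2.0")
	to, _ := ParseVersion("v1.3.0")
	broken := BreakingChanges(oldAPI, newAPI)
	fmt.Println("breaking:", broken)
	fmt.Println("complies:", Complies(from, to, broken))
	fmt.Println("client affected:", Affected([]string{"Close", "Open"}, broken))
}
```

In this example the minor upgrade changes the signature of `Open`, so it counts as a non-major upgrade with a breaking change, and a client calling `Open` would be flagged as potentially affected.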
Related papers
- Lingma SWE-GPT: An Open Development-Process-Centric Language Model for Automated Software Improvement [62.94719119451089]
The Lingma SWE-GPT series learns from and simulates real-world code submission activities.
Lingma SWE-GPT 72B resolves 30.20% of GitHub issues, marking a significant improvement in automatic issue resolution.
arXiv Detail & Related papers (2024-11-01T14:27:16Z)
- Evaluating Software Development Agents: Patch Patterns, Code Quality, and Issue Complexity in Real-World GitHub Scenarios [13.949319911378826]
This study evaluated 4,892 patches from 10 top-ranked agents on 500 real-world GitHub issues.
No single agent dominated, with 170 issues unresolved, indicating room for improvement.
Most agents maintained code reliability and security, avoiding new bugs or vulnerabilities.
Some agents increased code complexity, while many reduced code duplication and minimized code smells.
arXiv Detail & Related papers (2024-10-16T11:33:57Z)
- The Impact of SBOM Generators on Vulnerability Assessment in Python: A Comparison and a Novel Approach [56.4040698609393]
Software Bill of Materials (SBOM) has been promoted as a tool to increase transparency and verifiability in software composition.
Current SBOM generation tools often suffer from inaccuracies in identifying components and dependencies.
We propose PIP-sbom, a novel pip-inspired solution that addresses their shortcomings.
arXiv Detail & Related papers (2024-09-10T10:12:37Z)
- Towards Better Comprehension of Breaking Changes in the NPM Ecosystem [12.392457751450374]
We conduct a large-scale empirical study to investigate breaking changes in the NPM ecosystem.
We construct a dataset of explicitly documented breaking changes from 381 popular NPM projects.
We derive a taxonomy of JavaScript- and TypeScript-specific syntactic breaking changes and a taxonomy of the major types of behavioral breaking changes.
arXiv Detail & Related papers (2024-08-26T17:18:38Z)
- Impact of the Availability of ChatGPT on Software Development: A Synthetic Difference in Differences Estimation using GitHub Data [49.1574468325115]
ChatGPT is an AI tool that enhances software production efficiency.
We estimate ChatGPT's effects on the number of git pushes, repositories, and unique developers per 100,000 people.
These results suggest that AI tools like ChatGPT can substantially boost developer productivity, though further analysis is needed to address potential downsides such as low quality code and privacy concerns.
arXiv Detail & Related papers (2024-06-16T19:11:15Z)
- See to Believe: Using Visualization To Motivate Updating Third-party Dependencies [1.7914660044009358]
Security vulnerabilities introduced by applications using third-party dependencies are on the increase.
Developers are wary of library updates, even to fix vulnerabilities, citing either unawareness of the update or a migration effort that outweighs the benefit.
In this paper, we hypothesize that the dependency graph visualization (DGV) approach will motivate developers to update.
arXiv Detail & Related papers (2024-05-15T03:57:27Z)
- Empirical Analysis of Vulnerabilities Life Cycle in Golang Ecosystem [0.773844059806915]
A comprehensive investigation was undertaken to examine the life cycle of vulnerability in Golang.
It turned out that 66.10% of modules in the Golang ecosystem were affected by vulnerabilities.
An analysis of the reasons behind non-lagged and lagged vulnerabilities suggests that timely releasing and indexing of patch versions could significantly enhance ecosystem security.
arXiv Detail & Related papers (2023-12-31T14:53:51Z)
- MS-Former: Memory-Supported Transformer for Weakly Supervised Change Detection with Patch-Level Annotations [50.79913333804232]
We propose a memory-supported transformer (MS-Former) for weakly supervised change detection.
MS-Former consists of a bi-directional attention block (BAB) and a patch-level supervision scheme (PSS).
Experimental results on three benchmark datasets demonstrate the effectiveness of our proposed method in the change detection task.
arXiv Detail & Related papers (2023-11-16T09:57:29Z)
- Client-side Gradient Inversion Against Federated Learning from Poisoning [59.74484221875662]
Federated Learning (FL) enables distributed participants to train a global model without sharing data directly to a central server.
Recent studies have revealed that FL is vulnerable to gradient inversion attack (GIA), which aims to reconstruct the original training samples.
We propose Client-side poisoning Gradient Inversion (CGI), which is a novel attack method that can be launched from clients.
arXiv Detail & Related papers (2023-09-14T03:48:27Z)
- Multi-Granularity Detector for Vulnerability Fixes [13.653249890867222]
We propose MiDas (Multi-Granularity Detector for Vulnerability Fixes) to identify vulnerability-fixing commits.
MiDas constructs different neural networks for each level of code change granularity, corresponding to commit-level, file-level, hunk-level, and line-level.
MiDas outperforms the current state-of-the-art baseline in terms of AUC by 4.9% and 13.7% on Java and Python-based datasets.
arXiv Detail & Related papers (2023-05-23T10:06:28Z)