Towards Better Comprehension of Breaking Changes in the NPM Ecosystem
- URL: http://arxiv.org/abs/2408.14431v2
- Date: Mon, 14 Oct 2024 13:28:59 GMT
- Title: Towards Better Comprehension of Breaking Changes in the NPM Ecosystem
- Authors: Dezhen Kong, Jiakun Liu, Lingfeng Bao, David Lo
- Abstract summary: We conduct a large-scale empirical study to investigate breaking changes in the NPM ecosystem.
We construct a dataset of explicitly documented breaking changes from 381 popular NPM projects.
We derive a taxonomy of JavaScript- and TypeScript-specific syntactic breaking changes and a taxonomy of major types of behavioral breaking changes.
- Score: 12.392457751450374
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Breaking changes impose considerable effort on both downstream and upstream developers: downstream developers need to adapt to breaking changes, and upstream developers are responsible for identifying and documenting them. In the NPM ecosystem, characterized by frequent code changes and a high tolerance for breaking changes, this effort is even greater. To improve comprehension of breaking changes in the NPM ecosystem and to enhance breaking change detection tools, we conduct a large-scale empirical study of breaking changes in the NPM ecosystem. We construct a dataset of explicitly documented breaking changes from 381 popular NPM projects. We find that 95.4% of the detected breaking changes can be covered by developers' documentation, and about 19% of the breaking changes cannot be detected by regression testing. Then, in the process of investigating the source code of our collected breaking changes, we derive a taxonomy of JavaScript- and TypeScript-specific syntactic breaking changes and a taxonomy of major types of behavioral breaking changes. Additionally, we investigate why developers make breaking changes in NPM and find three major reasons: to reduce code redundancy, to improve identifier names, and to improve API design; each category contains several sub-items. We provide actionable implications for future research, e.g., automatic naming and renaming techniques should be applied in JavaScript projects to improve identifier names, and future research can try to detect more types of behavioral breaking changes. In presenting these implications, we also discuss the weaknesses of automatic renaming and breaking change detection approaches.
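The distinction the abstract draws between syntactic and behavioral breaking changes can be illustrated with a small TypeScript sketch. Everything in it is hypothetical and not taken from the paper: `libV1`, `libV2`, `removedExports`, and `behaviorDiffers` are invented names, and the two checks are deliberately naive stand-ins for the kind of detection the study discusses. The rename of `trimStr` to `trim` is a syntactic break that shows up in the API surface, while the changed default separator in `split` keeps the same signature and only shows up when the function is executed.

```typescript
// Hypothetical version 1 of a tiny string utility library.
const libV1 = {
  trimStr(input: string): string {
    return input.trim();
  },
  split(input: string, sep: string = ","): string[] {
    return input.split(sep);
  },
};

// Hypothetical version 2: `trimStr` is renamed to `trim` (syntactic break),
// and the default separator of `split` changes from "," to ";" (behavioral break).
const libV2 = {
  trim(input: string): string {
    return input.trim();
  },
  split(input: string, sep: string = ";"): string[] {
    return input.split(sep);
  },
};

// Naive syntactic check: names exported by the old API that the new API lacks.
function removedExports(oldApi: object, newApi: object): string[] {
  return Object.keys(oldApi).filter((name) => !(name in newApi));
}

// Naive behavioral check: run the same call against both versions and
// compare the serialized results.
function behaviorDiffers<T>(oldCall: () => T, newCall: () => T): boolean {
  return JSON.stringify(oldCall()) !== JSON.stringify(newCall());
}

const removed = removedExports(libV1, libV2);
const splitChanged = behaviorDiffers(
  () => libV1.split("a,b"),
  () => libV2.split("a,b"),
);
const trimChanged = behaviorDiffers(
  () => libV1.trimStr(" x "),
  () => libV2.trim(" x "),
);

console.log({ removed, splitChanged, trimChanged });
```

A downstream caller of `split("a,b")` would compile unchanged against the second version and still receive a different result, which is why a behavioral change of this kind can only be caught by a test that exercises the default separator.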
Related papers
- ChangeGuard: Validating Code Changes via Pairwise Learning-Guided Execution [16.130469984234956]
ChangeGuard is an approach that uses learning-guided execution to compare the runtime behavior of a modified function.
Our results show that the approach identifies semantics-changing code changes with a precision of 77.1% and a recall of 69.5%.
arXiv Detail & Related papers (2024-10-21T15:13:32Z)
- Enhancing Perception of Key Changes in Remote Sensing Image Change Captioning [49.24306593078429]
We propose a novel framework for remote sensing image change captioning, guided by Key Change Features and Instruction-tuned (KCFI).
KCFI includes a ViTs encoder for extracting bi-temporal remote sensing image features, a key feature perceiver for identifying critical change areas, and a pixel-level change detection decoder.
To validate the effectiveness of our approach, we compare it against several state-of-the-art change captioning methods on the LEVIR-CC dataset.
arXiv Detail & Related papers (2024-09-19T09:33:33Z)
- Understanding Code Change with Micro-Changes [9.321152185934105]
We present a catalog of micro-changes, together with an automated micro-change detector.
We found that our detector is capable of explaining more than 67% of the changes taking place in the systems under study.
arXiv Detail & Related papers (2024-09-16T01:47:25Z)
- Language Modeling with Editable External Knowledge [90.7714362827356]
This paper introduces ERASE, which improves model behavior when new documents are acquired.
It incrementally deletes or rewrites other entries in the knowledge base each time a document is added.
It improves accuracy relative to conventional retrieval-augmented generation by 7-13% (Mixtral-8x7B) and 6-10% (Llama-3-8B) absolute.
arXiv Detail & Related papers (2024-06-17T17:59:35Z)
- An Empirical Study of Token-based Micro Commits [1.4749940504074461]
In software development, developers frequently apply maintenance activities to the source code that change a few lines by a single commit.
In this paper, we define micro commits, a type of small change based on changed tokens.
We find that micro commits mainly replace a single name or literal token, and micro commits are more likely used to fix bugs.
arXiv Detail & Related papers (2024-05-15T07:52:13Z)
- Segment Any Change [64.23961453159454]
We propose a new type of change detection model that supports zero-shot prediction and generalization on unseen change types and data distributions.
AnyChange is built on the segment anything model (SAM) via our training-free adaptation method, bitemporal latent matching.
We also propose a point query mechanism to enable AnyChange's zero-shot object-centric change detection capability.
arXiv Detail & Related papers (2024-02-02T07:17:39Z)
- MS-Former: Memory-Supported Transformer for Weakly Supervised Change Detection with Patch-Level Annotations [50.79913333804232]
We propose a memory-supported transformer (MS-Former) for weakly supervised change detection.
MS-Former consists of a bi-directional attention block (BAB) and a patch-level supervision scheme (PSS).
Experimental results on three benchmark datasets demonstrate the effectiveness of our proposed method in the change detection task.
arXiv Detail & Related papers (2023-11-16T09:57:29Z)
- A Large-Scale Empirical Study on Semantic Versioning in Golang Ecosystem [38.357000816448405]
We conduct the first large-scale empirical study in the Go ecosystem to study SemVer compliance in terms of breaking changes and their impact.
We collect the first large-scale Go dataset with a dependency graph from GitHub, including 124K TPLs and 532K client programs.
arXiv Detail & Related papers (2023-09-06T10:33:00Z)
- Do code refactorings influence the merge effort? [80.1936417993664]
Multiple contributors frequently change the source code in parallel to implement new features, fix bugs, refactor existing code, and make other changes.
These simultaneous changes need to be merged into the same version of the source code.
Studies show that 10 to 20 percent of all merge attempts result in conflicts, which require the developer's manual intervention to complete the process.
arXiv Detail & Related papers (2023-05-10T13:24:59Z)
- Editing Factual Knowledge in Language Models [51.947280241185]
We present KnowledgeEditor, a method that can be used to edit this knowledge.
Besides being computationally efficient, KnowledgeEditor does not require any modifications in LM pre-training.
We show KnowledgeEditor's efficacy with two popular architectures and knowledge-intensive tasks.
arXiv Detail & Related papers (2021-04-16T15:24:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.