Detecting and removing bloated dependencies in CommonJS packages
- URL: http://arxiv.org/abs/2405.17939v2
- Date: Sat, 18 Jan 2025 07:29:36 GMT
- Title: Detecting and removing bloated dependencies in CommonJS packages
- Authors: Yuxin Liu, Deepika Tiwari, Cristian Bogdan, Benoit Baudry,
- Abstract summary: We present the first study to investigate bloated dependencies within server-side JavaScript applications.
We propose a trace-based dynamic analysis that monitors the OS file system to determine which dependencies are not accessed during runtime.
- Score: 6.115666382910127
- License:
- Abstract: JavaScript packages are notoriously prone to bloat, a factor that significantly impacts the performance and maintainability of web applications. While web bundlers and tree-shaking can mitigate this issue in client-side applications, state-of-the-art techniques have limitations on the detection and removal of bloat in server-side applications. In this paper, we present the first study to investigate bloated dependencies within server-side JavaScript applications, focusing on those built with the widely used and highly dynamic CommonJS module system. We propose a trace-based dynamic analysis that monitors the OS file system to determine which dependencies are not accessed during runtime. To evaluate our approach, we curate an original dataset of 91 CommonJS packages with a total of 50,488 dependencies. Compared to the state-of-the-art dynamic and static approaches, our trace-based analysis demonstrates higher accuracy in detecting bloated dependencies. Our analysis identifies 50.6% of the 50,488 dependencies as bloated: 13.8% of direct dependencies and 51.3% of indirect dependencies. Furthermore, removing only the direct bloated dependencies by cleaning the dependency configuration file can remove a significant share of unnecessary bloated indirect dependencies while preserving functional correctness.
Related papers
- DI-BENCH: Benchmarking Large Language Models on Dependency Inference with Testable Repositories at Scale [39.92722886613929]
DI-BENCH is a large-scale benchmark and evaluation framework designed to assess Large Language Models' capability on dependency inference.
The benchmark features 581 repositories with testing environments across Python, C#, Rust, and JavaScript.
Extensive experiments with textual and execution-based metrics reveal that the current best-performing model achieves only a 42.9% execution pass rate.
arXiv Detail & Related papers (2025-01-23T14:27:11Z) - Commit0: Library Generation from Scratch [77.38414688148006]
Commit0 is a benchmark that challenges AI agents to write libraries from scratch.
Agents are provided with a specification document outlining the library's API as well as a suite of interactive unit tests.
Commit0 also offers an interactive environment where models receive static analysis and execution feedback on the code they generate.
arXiv Detail & Related papers (2024-12-02T18:11:30Z) - A Preliminary Study on Self-Contained Libraries in the NPM Ecosystem [2.221643499902673]
The widespread of libraries within modern software ecosystems creates complex networks of dependencies.
One mitigation strategy involves reducing dependencies; libraries with zero dependencies become to self-contained.
This paper explores the characteristics of self-contained libraries within the NPM ecosystem.
arXiv Detail & Related papers (2024-06-17T09:33:49Z) - See to Believe: Using Visualization To Motivate Updating Third-party Dependencies [1.7914660044009358]
Security vulnerabilities introduced by applications using third-party dependencies are on the increase.
Developers are wary of library updates, even to fix vulnerabilities, citing that being unaware, or that the migration effort to update outweighs the decision.
In this paper, we hypothesize that the dependency graph visualization (DGV) approach will motivate developers to update.
arXiv Detail & Related papers (2024-05-15T03:57:27Z) - Dependency Practices for Vulnerability Mitigation [4.710141711181836]
We analyze more than 450 vulnerabilities in the npm ecosystem to understand why dependent packages remain vulnerable.
We identify over 200,000 npm packages that are infected through their dependencies.
We use 9 features to build a prediction model that identifies packages that quickly adopt the vulnerability fix and prevent further propagation of vulnerabilities.
arXiv Detail & Related papers (2023-10-11T19:48:46Z) - Analyzing Maintenance Activities of Software Libraries [65.268245109828]
Industrial applications heavily integrate open-source software libraries nowadays.
I want to introduce an automatic monitoring approach for industrial applications to identify open-source dependencies that show negative signs regarding their current or future maintenance activities.
arXiv Detail & Related papers (2023-06-09T16:51:25Z) - On the Security Blind Spots of Software Composition Analysis [46.1389163921338]
We present a novel approach to detect vulnerable clones in the Maven repository.
We retrieve over 53k potential vulnerable clones from Maven Central.
We detect 727 confirmed vulnerable clones and synthesize a testable proof-of-vulnerability project for each of those.
arXiv Detail & Related papers (2023-06-08T20:14:46Z) - Reference Twice: A Simple and Unified Baseline for Few-Shot Instance Segmentation [103.90033029330527]
Few-Shot Instance (FSIS) requires detecting and segmenting novel classes with limited support examples.
We introduce a unified framework, Reference Twice (RefT), to exploit the relationship between support and query features for FSIS.
arXiv Detail & Related papers (2023-01-03T15:33:48Z) - Demystifying Dependency Bugs in Deep Learning Stack [7.488059560714949]
This paper characterizes symptoms, root causes and fix patterns of dependency bugs (DBs) across the whole Deep Learning stack.
Our findings shed light on practical implications on dependency management.
arXiv Detail & Related papers (2022-07-21T07:56:03Z) - Pack Together: Entity and Relation Extraction with Levitated Marker [61.232174424421025]
We propose a novel span representation approach, named Packed Levitated Markers, to consider the dependencies between the spans (pairs) by strategically packing the markers in the encoder.
Our experiments show that our model with packed levitated markers outperforms the sequence labeling model by 0.4%-1.9% F1 on three flat NER tasks, and beats the token concat model on six NER benchmarks.
arXiv Detail & Related papers (2021-09-13T15:38:13Z) - Modeling Multi-Label Action Dependencies for Temporal Action
Localization [53.53490517832068]
Real-world videos contain many complex actions with inherent relationships between action classes.
We propose an attention-based architecture that models these action relationships for the task of temporal action localization in unoccurrence videos.
We show improved performance over state-of-the-art methods on multi-label action localization benchmarks.
arXiv Detail & Related papers (2021-03-04T13:37:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.