Related papers: An Investigation of Patch Porting Practices of the Linux Kernel Ecosystem

An Investigation of Patch Porting Practices of the Linux Kernel Ecosystem

URL: http://arxiv.org/abs/2402.05212v1
Date: Wed, 7 Feb 2024 19:38:48 GMT
Title: An Investigation of Patch Porting Practices of the Linux Kernel Ecosystem
Authors: Xingyu Li, Zheng Zhang, Zhiyun Qian, Trent Jaeger, Chengyu Song
Abstract summary: We investigate the responsiveness of patch porting in the Linux ecosystem. We find diverse patch porting strategies and competence levels that help explain the phenomenon. We offer recommendations based on our analysis of the general patch flow.
Score: 39.80455045213432
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Open-source software is increasingly reused, complicating the process of patching to repair bugs. In the case of Linux, a distinct ecosystem has formed, with Linux mainline serving as the upstream, stable or long-term-support (LTS) systems forked from mainline, and Linux distributions, such as Ubuntu and Android, as downstreams forked from stable or LTS systems for end-user use. Ideally, when a patch is committed in the Linux upstream, it should not introduce new bugs and be ported to all the applicable downstream branches in a timely fashion. However, several concerns have been expressed in prior work about the responsiveness of patch porting in this Linux ecosystem. In this paper, we mine the software repositories to investigate a range of Linux distributions in combination with Linux stable and LTS, and find diverse patch porting strategies and competence levels that help explain the phenomenon. Furthermore, we show concretely using three metrics, i.e., patch delay, patch rate, and bug inheritance ratio, that different porting strategies have different tradeoffs. We find that hinting tags(e.g., Cc stable tags and fixes tags) are significantly important to the prompt patch porting, but it is noteworthy that a substantial portion of patches remain devoid of these indicative tags. Finally, we offer recommendations based on our analysis of the general patch flow, e.g., interactions among various stakeholders in the ecosystem and automatic generation of hinting tags, as well as tailored suggestions for specific porting strategies.

Related papers

CrashFixer: A crash resolution agent for the Linux kernel [58.152358195983155]
This work builds upon kGym, which shares a benchmark for system-level Linux kernel bugs and a platform to run experiments on the Linux kernel. This paper introduces CrashFixer, the first LLM-based software repair agent that is applicable to Linux kernel bugs.
arXiv Detail & Related papers (2025-04-29T04:18:51Z)
Analyzing the Usage of Donation Platforms for PyPI Libraries [91.97201077607862]
This study analyzes the adoption of donation platforms in the PyPI ecosystem. GitHub Sponsors is the dominant platform, though many PyPI-listed links are outdated.
arXiv Detail & Related papers (2025-03-11T10:27:31Z)
Bomfather: An eBPF-based Kernel-level Monitoring Framework for Accurate Identification of Unknown, Unused, and Dynamically Loaded Dependencies in Modern Software Supply Chains [0.0]
Inaccuracies in dependency-tracking methods undermine the security and integrity of modern software supply chains. This paper introduces a kernel-level framework leveraging extended Berkeley Packet Filter (eBPF) to capture software build dependencies transparently in real time.
arXiv Detail & Related papers (2025-03-03T22:32:59Z)
Uncovering and Mitigating the Impact of Frozen Package Versions for Fixed-Release Linux [38.53185042161599]
We study the ecosystem gap of fixed-release Linux caused by the evolution of mirrors. We propose a novel package management approach allowing for separate dependency environments based on native Debian mirrors. We present a working prototype, named ccenv, which can effectively remedy the inadequacy of current tools.
arXiv Detail & Related papers (2024-08-21T14:01:46Z)
KGym: A Platform and Dataset to Benchmark Large Language Models on Linux Kernel Crash Resolution [59.20933707301566]
Large Language Models (LLMs) are consistently improving at increasingly realistic software engineering (SE) tasks. In real-world software stacks, significant SE effort is spent developing foundational system software like the Linux kernel. To evaluate if ML models are useful while developing such large-scale systems-level software, we introduce kGym and kBench.
arXiv Detail & Related papers (2024-07-02T21:44:22Z)
Patch2QL: Discover Cognate Defects in Open Source Software Supply Chain With Auto-generated Static Analysis Rules [1.9591497166224197]
We propose a novel technique for detecting cognate defects in OSS through the automatic generation of SAST rules. Specifically, it extracts key syntax and semantic information from pre- and post-patch versions of code. We have implemented a prototype tool called Patch2QL and applied it to fundamental OSS in C/C++.
arXiv Detail & Related papers (2024-01-23T02:23:11Z)
Empirical Analysis of Vulnerabilities Life Cycle in Golang Ecosystem [0.773844059806915]
A comprehensive investigation was undertaken to examine the life cycle of vulnerability in Golang. It turned out that 66.10% of modules in the Golang ecosystem were affected by vulnerabilities. By analyzing reasons behind non-lagged and lagged vulnerabilities, timely releasing and indexing patch versions could significantly enhance ecosystem security.
arXiv Detail & Related papers (2023-12-31T14:53:51Z)
RLTrace: Synthesizing High-Quality System Call Traces for OS Fuzz Testing [10.644829779197341]
We propose a deep reinforcement learning-based solution, called RLTrace, to synthesize diverse and comprehensive system call traces as the seed to fuzz OS kernels. During model training, the deep learning model interacts with OS kernels and infers optimal system call traces. Our evaluation shows that RLTrace outperforms other seed generators by producing more comprehensive system call traces.
arXiv Detail & Related papers (2023-10-04T06:46:00Z)
RAP-Gen: Retrieval-Augmented Patch Generation with CodeT5 for Automatic Program Repair [75.40584530380589]
We propose a novel Retrieval-Augmented Patch Generation framework (RAP-Gen) RAP-Gen explicitly leveraging relevant fix patterns retrieved from a list of previous bug-fix pairs. We evaluate RAP-Gen on three benchmarks in two programming languages, including the TFix benchmark in JavaScript, and Code Refinement and Defects4J benchmarks in Java.
arXiv Detail & Related papers (2023-09-12T08:52:56Z)
Multilevel Semantic Embedding of Software Patches: A Fine-to-Coarse Grained Approach Towards Security Patch Detection [6.838615442552715]
We introduce a multilevel Semantic Embedder for security patch detection, termed MultiSEM. This model harnesses word-centric vectors at a fine-grained level, emphasizing the significance of individual words. We further enrich this representation by assimilating patch descriptions to obtain a holistic semantic portrait.
arXiv Detail & Related papers (2023-08-29T11:41:21Z)
PatchMix Augmentation to Identify Causal Features in Few-shot Learning [55.64873998196191]
Few-shot learning aims to transfer knowledge learned from base with sufficient categories labelled data to novel categories with scarce known information. We propose a novel data augmentation strategy dubbed as PatchMix that can break this spurious dependency. We show that such an augmentation mechanism, different from existing ones, is able to identify the causal features.
arXiv Detail & Related papers (2022-11-29T08:41:29Z)
S3M: Siamese Stack (Trace) Similarity Measure [55.58269472099399]
We present S3M -- the first approach to computing stack trace similarity based on deep learning. It is based on a biLSTM encoder and a fully-connected classifier to compute similarity. Our experiments demonstrate the superiority of our approach over the state-of-the-art on both open-sourced data and a private JetBrains dataset.
arXiv Detail & Related papers (2021-03-18T21:10:41Z)
Rethinking Generative Zero-Shot Learning: An Ensemble Learning Perspective for Recognising Visual Patches [52.67723703088284]
We propose a novel framework called multi-patch generative adversarial nets (MPGAN) MPGAN synthesises local patch features and labels unseen classes with a novel weighted voting strategy. MPGAN has significantly greater accuracy than state-of-the-art methods.
arXiv Detail & Related papers (2020-07-27T05:49:44Z)

This list is automatically generated from the titles and abstracts of the papers in this site.