Policy-driven Software Bill of Materials on GitHub: An Empirical Study
- URL: http://arxiv.org/abs/2509.01255v1
- Date: Mon, 01 Sep 2025 08:45:39 GMT
- Title: Policy-driven Software Bill of Materials on GitHub: An Empirical Study
- Authors: Oleksii Novikov, Davide Fucci, Oleksandr Adamov, Daniel Mendez,
- Abstract summary: The Software Bill of Materials (SBOM) is a machine-readable list of all the software dependencies included in a software. Despite mandates from governments to use SBOM, research on this artifact is still in its early stages.
- Score: 14.398115591070727
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Background. The Software Bill of Materials (SBOM) is a machine-readable list of all the software dependencies included in a piece of software. SBOM emerged as a way to help secure the software supply chain. However, despite mandates from governments to use SBOM, research on this artifact is still in its early stages. Aims. We want to understand the current state of SBOM in open-source projects, focusing specifically on policy-driven SBOMs, i.e., SBOMs created to achieve security goals, such as enhancing project transparency and ensuring compliance, rather than being used as fixtures for tools or artificially generated for benchmarking or academic research purposes. Method. We performed a mining software repositories study to collect and carefully select SBOM files hosted on GitHub. We analyzed the information reported in policy-driven SBOMs and the vulnerabilities associated with the declared dependencies by means of descriptive statistics. Results. We show that only 0.56% of popular GitHub repositories contain a policy-driven SBOM. The declared dependencies contain 2,202 unique vulnerabilities, while 22% of these dependencies do not report licensing information. Conclusion. Our findings provide insights for SBOM usage to support security assessment and licensing.
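As a rough illustration of the licensing statistic reported in the abstract, the share of declared dependencies without license information could be computed along these lines. This is a minimal sketch, not the authors' method: it assumes a CycloneDX-style JSON document with a top-level `components` array whose entries may carry a `licenses` list, and the function name `share_without_license` and the sample data are hypothetical.

```python
import json


def share_without_license(sbom: dict) -> float:
    """Return the fraction of components with no license information.

    Assumes a CycloneDX-style layout: a top-level "components" list in
    which each component may have a non-empty "licenses" list.
    """
    components = sbom.get("components", [])
    if not components:
        return 0.0
    # A component counts as unlicensed when "licenses" is absent or empty.
    missing = sum(1 for c in components if not c.get("licenses"))
    return missing / len(components)


# Hypothetical sample SBOM, made up for illustration only.
SAMPLE = json.loads("""
{
  "bomFormat": "CycloneDX",
  "components": [
    {"name": "lib-a", "version": "1.0.0", "licenses": [{"license": {"id": "MIT"}}]},
    {"name": "lib-b", "version": "2.1.0", "licenses": [{"license": {"id": "Apache-2.0"}}]},
    {"name": "lib-c", "version": "0.3.2", "licenses": []},
    {"name": "lib-d", "version": "4.0.1", "licenses": [{"license": {"id": "BSD-3-Clause"}}]}
  ]
}
""")

if __name__ == "__main__":
    print(f"{share_without_license(SAMPLE):.0%} of components lack license data")
```

SPDX documents use a different layout, so a real analysis would need a separate reader per SBOM format.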
Related papers
- A Large Scale Empirical Analysis on the Adherence Gap between Standards and Tools in SBOM [54.38424417079265]
A Software Bill of Materials (SBOM) is a machine-readable artifact that organizes software information. Following standards, organizations have developed tools for generating and utilizing SBOMs. This paper presents the first large-scale, two-stage empirical analysis of the adherence gap, using our automated evaluation framework, SAP.
arXiv Detail & Related papers (2026-01-09T08:26:05Z)
- BinMetric: A Comprehensive Binary Analysis Benchmark for Large Language Models [50.17907898478795]
We introduce BinMetric, a benchmark designed to evaluate the performance of large language models on binary analysis tasks. BinMetric comprises 1,000 questions derived from 20 real-world open-source projects across 6 practical binary analysis tasks. Our empirical study on this benchmark investigates the binary analysis capabilities of various state-of-the-art LLMs, revealing their strengths and limitations in this field.
arXiv Detail & Related papers (2025-05-12T08:54:07Z)
- A Dataset of Software Bill of Materials for Evaluating SBOM Consumption Tools [6.081142345739704]
A Software Bill of Materials (SBOM) is a list of components used in software. Numerous tools support software dependency management through SBOMs. There is no publicly available dataset specifically designed for this purpose. We present a dataset of SBOMs generated from real-world Java projects.
arXiv Detail & Related papers (2025-04-09T13:35:02Z)
- Wild SBOMs: a Large-scale Dataset of Software Bills of Materials from Public Code [4.1920378271058425]
Developers gain productivity by reusing readily available Free and Open Source Software (FOSS) components. One approach to handle those difficulties is to use Software Bills of Materials (SBOMs). A large-scale study on SBOM practices based on SBOM files produced in the wild is still lacking.
arXiv Detail & Related papers (2025-03-19T09:20:28Z)
- Augmenting Software Bills of Materials with Software Vulnerability Description: A Preliminary Study on GitHub [10.609785671796873]
This paper reports the results of a preliminary study in which we augmented SBOMs of 40 open-source projects with information about Common Vulnerabilities and Exposures. Our augmented SBOMs have been evaluated by submitting pull requests and by asking project owners to answer a survey. Although, in most cases, augmented SBOMs were not directly accepted because owners required a continuous SBOM update, the received feedback shows the usefulness of the suggested SBOM augmentation.
arXiv Detail & Related papers (2025-03-18T08:04:22Z)
- SWE-Fixer: Training Open-Source LLMs for Effective and Efficient GitHub Issue Resolution [56.9361004704428]
Large Language Models (LLMs) have demonstrated remarkable proficiency across a variety of complex tasks. SWE-Fixer is a novel open-source framework designed to effectively and efficiently resolve GitHub issues. We assess our approach on the SWE-Bench Lite and Verified benchmarks, achieving competitive performance among open-source models.
arXiv Detail & Related papers (2025-01-09T07:54:24Z)
- Supply Chain Insecurity: The Lack of Integrity Protection in SBOM Solutions [0.0]
The Software Bill of Materials (SBOM) is paramount in ensuring software supply chain security. Under the Executive Order issued by President Biden, the adoption of the SBOM has become obligatory within the United States. We present an in-depth and systematic investigation of the trust that can be put into the output of SBOMs.
arXiv Detail & Related papers (2024-12-06T15:52:12Z)
- The Impact of SBOM Generators on Vulnerability Assessment in Python: A Comparison and a Novel Approach [56.4040698609393]
Software Bill of Materials (SBOM) has been promoted as a tool to increase transparency and verifiability in software composition.
Current SBOM generation tools often suffer from inaccuracies in identifying components and dependencies.
We propose PIP-sbom, a novel pip-inspired solution that addresses their shortcomings.
arXiv Detail & Related papers (2024-09-10T10:12:37Z)
- Alibaba LingmaAgent: Improving Automated Issue Resolution via Comprehensive Repository Exploration [64.19431011897515]
This paper presents Alibaba LingmaAgent, a novel Automated Software Engineering method designed to comprehensively understand and utilize whole software repositories for issue resolution. Our approach introduces a top-down method to condense critical repository information into a knowledge graph, reducing complexity, and employs a Monte Carlo tree search based strategy. In production deployment and evaluation at Alibaba Cloud, LingmaAgent automatically resolved 16.9% of in-house issues faced by development engineers, and solved 43.3% of problems after manual intervention.
arXiv Detail & Related papers (2024-06-03T15:20:06Z)
- On the Security Blind Spots of Software Composition Analysis [46.1389163921338]
We present a novel approach to detect vulnerable clones in the Maven repository.
We retrieve over 53k potential vulnerable clones from Maven Central.
We detect 727 confirmed vulnerable clones and synthesize a testable proof-of-vulnerability project for each of those.
arXiv Detail & Related papers (2023-06-08T20:14:46Z)
- SafePILCO: a software tool for safe and data-efficient policy synthesis [67.17251247987187]
SafePILCO is a software tool for safe and data-efficient policy search with reinforcement learning.
It extends the known PILCO algorithm, originally written in MATLAB, to support safe learning.
arXiv Detail & Related papers (2020-08-07T17:17:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.