Software Bills of Materials in Maven Central
- URL: http://arxiv.org/abs/2501.13832v1
- Date: Thu, 23 Jan 2025 16:56:40 GMT
- Title: Software Bills of Materials in Maven Central
- Authors: Yogya Gamage, Nadia Gonzalez Fernandez, Martin Monperrus, Benoit Baudry
- Abstract summary: There is little knowledge about how developers distribute Software Bills of Materials (SBOMs).
We mine SBOMs from Maven Central to assess the extent to which developers publish SBOMs along with the artifacts.
We present our methodology to mine SBOMs, as well as novel insights about SBOM publication.
- Score: 9.699225997570384
- Abstract: Software Bills of Materials (SBOMs) are essential to ensure the transparency and integrity of the software supply chain. There is a growing body of work that investigates the accuracy of SBOM generation tools and the challenges of producing complete SBOMs. Yet, there is little knowledge about how developers distribute SBOMs. In this work, we mine SBOMs from Maven Central to assess the extent to which developers publish SBOMs along with the artifacts. We develop our work on top of the Goblin framework, which consists of a Maven Central dependency graph and a Weaver that allows augmenting the dependency graph with additional data. For this study, we select a sample of 10% of release nodes from the Maven Central dependency graph and collect 14,071 SBOMs from 7,290 package releases. We then augment the Maven Central dependency graph with the collected SBOMs. We present our methodology to mine SBOMs, as well as novel insights about SBOM publication. Our dataset is the first set of SBOMs collected from a package registry. We make it available as a standalone dataset, which can be used for future research about SBOMs and package distribution.
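As a rough illustration of what looking for SBOMs next to a release could involve, the sketch below builds candidate SBOM URLs for one Maven coordinate and computes the average number of SBOMs per release from the figures in the abstract. It is not the authors' method: the function names, the `org.example:demo:1.0.0` coordinate, and the list of file suffixes are assumptions made for illustration, and the paper's actual search patterns may differ. The base URL and directory layout follow the standard Maven Central repository convention.

```python
# Hypothetical sketch, not the Goblin/Weaver implementation described in the paper.
MAVEN_CENTRAL = "https://repo1.maven.org/maven2"

# Assumed SBOM file suffixes (CycloneDX and SPDX naming); the paper may use others.
SBOM_SUFFIXES = ("-cyclonedx.json", "-cyclonedx.xml", ".spdx.json")


def candidate_sbom_urls(group_id, artifact_id, version):
    """Return the URLs where an SBOM might sit alongside a release's artifacts."""
    # Maven repositories map the dots of the groupId to directory separators.
    group_path = group_id.replace(".", "/")
    base = f"{MAVEN_CENTRAL}/{group_path}/{artifact_id}/{version}/{artifact_id}-{version}"
    return [base + suffix for suffix in SBOM_SUFFIXES]


def sboms_per_release(sbom_count, release_count):
    """Average number of SBOM files per release that publishes at least one."""
    return sbom_count / release_count


# Placeholder coordinate, used only to show the URL shape.
urls = candidate_sbom_urls("org.example", "demo", "1.0.0")

# Figures taken from the abstract: 14,071 SBOMs from 7,290 package releases.
ratio = sboms_per_release(14071, 7290)
```

With the abstract's figures, `ratio` comes out a little under two SBOM files per release, which would be consistent with releases often publishing an SBOM in more than one format, though the abstract itself does not state this.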
Related papers
- Categorical Schrödinger Bridge Matching [58.760054965084656]
The Schrödinger Bridge (SB) is a powerful framework for solving generative modeling tasks such as unpaired domain translation.
We provide a theoretical and algorithmic foundation for solving SB in discrete spaces using the recently introduced Iterative Markovian Fitting (IMF) procedure.
This enables us to develop a practical computational algorithm for SB which we call Categorical Schrödinger Bridge Matching (CSBM).
arXiv Detail & Related papers (2025-02-03T14:55:28Z)
- Supply Chain Insecurity: The Lack of Integrity Protection in SBOM Solutions [0.0]
The Software Bill of Materials (SBOM) is paramount in ensuring software supply chain security.
Under the Executive Order issued by President Biden, the adoption of the SBOM has become obligatory within the United States.
This work presents an in-depth and systematic investigation into the integrity of SBOMs.
arXiv Detail & Related papers (2024-12-06T15:52:12Z)
- MTU-Bench: A Multi-granularity Tool-Use Benchmark for Large Language Models [66.64809260956312]
We propose a multi-granularity tool-use benchmark for large language models called MTU-Bench.
Our MTU-Bench is collected by transforming existing high-quality datasets to simulate real-world tool usage scenarios.
Comprehensive experimental results demonstrate the effectiveness of our MTU-Bench.
arXiv Detail & Related papers (2024-10-15T15:46:17Z)
- InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning [58.7966588457529]
InfiMM-WebMath-40B is a high-quality dataset of interleaved image-text documents.
It comprises 24 million web pages, 85 million associated image URLs, and 40 billion text tokens, all meticulously extracted and filtered from CommonCrawl.
Our evaluations on text-only benchmarks show that, despite utilizing only 40 billion tokens, our dataset significantly enhances the performance of our 1.3B model.
Our models set a new state-of-the-art among open-source models on multi-modal math benchmarks such as MathVerse and We-Math.
arXiv Detail & Related papers (2024-09-19T08:41:21Z)
- SBOM Generation Tools in the Python Ecosystem: an In-Detail Analysis [2.828503885204035]
We analyze four popular SBOM generation tools using the CycloneDX standard.
We highlight issues related to dependency versions, metadata files, remote dependencies, and optional dependencies.
We identify a systematic issue with the lack of standards for metadata in the PyPI ecosystem.
arXiv Detail & Related papers (2024-09-02T12:48:10Z)
- xGen-MM (BLIP-3): A Family of Open Large Multimodal Models [157.44696790158784]
This report introduces xGen-MM, a framework for developing Large Multimodal Models (LMMs).
The framework comprises meticulously curated datasets, a training recipe, model architectures, and a resulting suite of LMMs.
Our models undergo rigorous evaluation across a range of tasks, including both single and multi-image benchmarks.
arXiv Detail & Related papers (2024-08-16T17:57:01Z)
- Masked Image Modeling: A Survey [73.21154550957898]
Masked image modeling emerged as a powerful self-supervised learning technique in computer vision.
We construct a taxonomy and review the most prominent papers in recent years.
We aggregate the performance results of various masked image modeling methods on the most popular datasets.
arXiv Detail & Related papers (2024-08-13T07:27:02Z)
- Decentralized Monte Carlo Tree Search for Partially Observable Multi-agent Pathfinding [49.730902939565986]
The Multi-Agent Pathfinding (MAPF) problem involves finding a set of conflict-free paths for a group of agents confined to a graph.
In this study, we focus on the decentralized MAPF setting, where the agents may observe the other agents only locally.
We propose a decentralized multi-agent Monte Carlo Tree Search (MCTS) method for MAPF tasks.
arXiv Detail & Related papers (2023-12-26T06:57:22Z)
- The Stackage Repository: An Exploratory Study of its Evolution [0.0]
This paper conducts empirical research about the evolution of Stackage considering monad packages.
To the best of our knowledge, this is the first large-scale analysis of the evolution of the Stackage repository regarding packages used and monads.
arXiv Detail & Related papers (2023-10-16T23:42:47Z)
- HiPart: Hierarchical Divisive Clustering Toolbox [0.0]
HiPart is an open-source Python library that provides efficient and interpretable implementations of divisive hierarchical clustering algorithms.
HiPart supports interactive visualizations for manipulating the execution steps, allowing direct intervention in the clustering outcome.
arXiv Detail & Related papers (2022-09-18T23:48:43Z)