Automating SBOM Generation with Zero-Shot Semantic Similarity
- URL: http://arxiv.org/abs/2403.08799v1
- Date: Sat, 3 Feb 2024 18:14:13 GMT
- Title: Automating SBOM Generation with Zero-Shot Semantic Similarity
- Authors: Devin Pereira, Christopher Molloy, Sudipta Acharya, Steven H. H. Ding
- Abstract summary: A Software-Bill-of-Materials (SBOM) is a comprehensive inventory detailing a software application's components and dependencies.
We propose an automated method for generating SBOMs to prevent disastrous supply-chain attacks.
Our test results are compelling, demonstrating the model's strong performance in the zero-shot classification task.
- Score: 2.169562514302842
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the growing complexity of software ecosystems and the emphasis on security and compliance, it is becoming increasingly important in the software industry for manufacturers to inventory the software used on their systems. A Software-Bill-of-Materials (SBOM) is a comprehensive inventory detailing a software application's components and dependencies. Current approaches rely on case-based reasoning, which identifies the software components embedded in binary files inconsistently. We propose a different route: an automated method for generating SBOMs to prevent disastrous supply-chain attacks. Remaining within static code analysis, we frame this problem as a semantic similarity task in which a transformer model is trained to relate a product name to its corresponding version strings. Our test results are compelling: the model performs strongly in the zero-shot classification task, which further demonstrates its potential for use in a real-world cybersecurity context.
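The semantic similarity framing in the abstract can be illustrated with a minimal sketch: embed a version string and each candidate product name, then rank the candidates by cosine similarity. This is not the authors' model. The character-trigram bag below is a toy stand-in for a trained transformer encoder, and the function names (`embed`, `cosine`, `rank_products`), the candidate list, and the sample version string are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def embed(text: str, n: int = 3) -> Counter:
    """Toy embedding: a bag of character n-grams.

    Assumption: this stands in for a transformer encoder, which the
    paper would use instead; it is not the authors' method.
    """
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_products(version_string: str, products: list[str]) -> list[tuple[str, float]]:
    """Rank candidate product names by similarity to a version string.

    No candidate needs to have been seen during training, which is what
    makes the matching step zero-shot.
    """
    query = embed(version_string)
    scored = [(name, cosine(query, embed(name))) for name in products]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical inputs: a version string as it might appear in a binary,
# and a small list of candidate product names.
candidates = ["OpenSSL", "zlib", "SQLite"]
ranking = rank_products("OpenSSL 1.1.1k  25 Mar 2021", candidates)
best_match = ranking[0][0]
```

Swapping `embed` for a real sentence encoder would leave `rank_products` unchanged, since the ranking depends only on the similarity scores.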
Related papers
- SecCodePLT: A Unified Platform for Evaluating the Security of Code GenAI [47.11178028457252]
We develop SecCodePLT, a unified and comprehensive evaluation platform for code GenAIs' risks.
For insecure code, we introduce a new methodology for data creation that combines experts with automatic generation.
For cyberattack helpfulness, we construct samples to prompt a model to generate actual attacks, along with dynamic metrics in our environment.
arXiv Detail & Related papers (2024-10-14T21:17:22Z) - The Impact of SBOM Generators on Vulnerability Assessment in Python: A Comparison and a Novel Approach [56.4040698609393]
Software Bill of Materials (SBOM) has been promoted as a tool to increase transparency and verifiability in software composition.
Current SBOM generation tools often suffer from inaccuracies in identifying components and dependencies.
We propose PIP-sbom, a novel pip-inspired solution that addresses their shortcomings.
arXiv Detail & Related papers (2024-09-10T10:12:37Z) - Exploring the extent of similarities in software failures across industries using LLMs [0.0]
This research utilizes the Failure Analysis Investigation with LLMs (FAIL) model to extract industry-specific information.
In previous work, news articles were collected from reputable sources and categorized by incident in a database.
This research extends these methods by categorizing articles into specific domains and types of software failures.
arXiv Detail & Related papers (2024-08-07T03:48:07Z) - SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering [79.07755560048388]
SWE-agent is a system that enables LM agents to autonomously use computers to solve software engineering tasks.
SWE-agent's custom agent-computer interface (ACI) significantly enhances an agent's ability to create and edit code files, navigate entire repositories, and execute tests and other programs.
We evaluate SWE-agent on SWE-bench and HumanEvalFix, achieving state-of-the-art performance on both with pass@1 rates of 12.5% and 87.7%, respectively.
arXiv Detail & Related papers (2024-05-06T17:41:33Z) - OmniBOR: A System for Automatic, Verifiable Artifact Resolution across Software Supply Chains [0.0]
OmniBOR is a minimalistic scheme for build tools to create an artifact dependency graph.
We present the architecture of OmniBOR, the underlying data representations, and two implementations that produce OmniBOR data and embed it into built software.
arXiv Detail & Related papers (2024-02-14T06:50:16Z) - A Novel Approach to Identify Security Controls in Source Code [4.598579706242066]
This paper enumerates a comprehensive list of commonly used security controls and creates a dataset for each one of them.
It uses the state-of-the-art NLP technique Bidirectional Encoder Representations from Transformers (BERT) and the Tactic Detector from our prior work to show that security controls can be identified with high confidence.
arXiv Detail & Related papers (2023-07-10T21:14:39Z) - A New Era in Software Security: Towards Self-Healing Software via Large Language Models and Formal Verification [8.733354577147093]
This paper introduces an innovative approach that combines Large Language Models (LLMs) with Formal Verification strategies for automatic software vulnerability repair.
We present the ESBMC-AI framework as a proof of concept, leveraging the well-recognized and industry-adopted Efficient SMT-based Context-Bounded Model Checker (ESBMC) and a pre-trained transformer model.
Our results demonstrate ESBMC-AI's capability to automate the detection and repair of issues such as buffer overflow, arithmetic overflow, and pointer dereference failures with high accuracy.
arXiv Detail & Related papers (2023-05-24T05:54:10Z) - MMRNet: Improving Reliability for Multimodal Object Detection and Segmentation for Bin Picking via Multimodal Redundancy [68.7563053122698]
We propose a reliable object detection and segmentation system with MultiModal Redundancy (MMRNet).
This is the first system that introduces the concept of multimodal redundancy to address sensor failure issues during deployment.
We present a new label-free multi-modal consistency (MC) score that utilizes the output from all modalities to measure the overall system output reliability and uncertainty.
arXiv Detail & Related papers (2022-10-19T19:15:07Z) - Realistic simulation of users for IT systems in cyber ranges [63.20765930558542]
We instrument each machine by means of an external agent to generate user activity.
This agent combines deterministic and deep-learning-based methods to adapt to different environments.
We also propose conditional text generation models to facilitate the creation of conversations and documents.
arXiv Detail & Related papers (2021-11-23T10:53:29Z) - Software Vulnerability Detection via Deep Learning over Disaggregated Code Graph Representation [57.92972327649165]
This work explores a deep learning approach to automatically learn the insecure patterns from code corpora.
Because code naturally admits graph structures with parsing, we develop a novel graph neural network (GNN) to exploit both the semantic context and structural regularity of a program.
arXiv Detail & Related papers (2021-09-07T21:24:36Z) - MalBERT: Using Transformers for Cybersecurity and Malicious Software Detection [0.0]
Transformers, a category of attention-based deep learning techniques, have recently shown impressive results in solving different tasks.
We propose a model based on BERT (Bidirectional Encoder Representations from Transformers) which performs a static analysis on the source code of Android applications.
The obtained results are promising and show the high performance obtained by Transformer-based models for malicious software detection.
arXiv Detail & Related papers (2021-03-05T17:09:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.