Semantic Similarity-Based Clustering of Findings From Security Testing Tools
- URL: http://arxiv.org/abs/2211.11057v1
- Date: Sun, 20 Nov 2022 19:03:19 GMT
- Title: Semantic Similarity-Based Clustering of Findings From Security Testing Tools
- Authors: Phillip Schneider, Markus Voggenreiter, Abdullah Gulraiz and Florian Matthes
- Abstract summary: In particular, it is common practice to use automated security testing tools that generate reports after inspecting a software artifact from multiple perspectives.
To identify these duplicate findings manually, a security expert has to invest resources like time, effort, and knowledge.
In this study, we investigated the potential of applying Natural Language Processing for clustering semantically similar security findings.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, software development in domains with high security
demands has transitioned from traditional methodologies to approaches uniting
modern software development and operations (DevOps). Key principles of
DevOps gained more importance and are now applied to security aspects of
software development, resulting in the automation of security-enhancing
activities. In particular, it is common practice to use automated security
testing tools that generate reports after inspecting a software artifact from
multiple perspectives. However, this raises the challenge of generating
duplicate security findings. To identify these duplicate findings manually, a
security expert has to invest resources like time, effort, and knowledge. A
partial automation of this process could reduce the analysis effort, encourage
DevOps principles, and diminish the chance of human error. In this study, we
investigated the potential of applying Natural Language Processing for
clustering semantically similar security findings to support the identification
of problem-specific duplicate findings. Towards this goal, we developed a web
application for annotating and assessing security testing tool reports and
published a human-annotated corpus of clustered security findings. In addition,
we performed a comparison of different semantic similarity techniques for
automatically grouping security findings. Finally, we assessed the resulting
clusters using both quantitative and qualitative evaluation methods.
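The clustering idea described in the abstract can be illustrated with a minimal sketch: represent each finding's description as a bag-of-words vector, compare vectors with cosine similarity, and greedily merge findings that exceed a similarity threshold. This is a hypothetical simplification for illustration only — the paper compares several semantic similarity techniques, none of which is reproduced exactly here, and the example findings below are invented.

```python
# Minimal sketch (assumption, not the paper's method): bag-of-words cosine
# similarity with greedy threshold-based clustering of security findings.
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a finding description and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_findings(findings, threshold=0.5):
    """Greedily assign each finding to the first cluster whose representative
    (its first member) is at least `threshold` similar; else open a new one."""
    vectors = [Counter(tokenize(f)) for f in findings]
    clusters = []  # each cluster is a list of indices into `findings`
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if cosine_similarity(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Invented example findings, as two tools might report them:
findings = [
    "SQL injection vulnerability in login form parameter",
    "Possible SQL injection in the login form input",
    "Outdated TLS version enabled on server",
]
clusters = cluster_findings(findings)  # the two SQL injection findings group together
```

A real system would likely replace the bag-of-words vectors with sentence embeddings, but the threshold-merge structure stays the same.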
Related papers
- Automated Security Findings Management: A Case Study in Industrial DevOps
We propose a methodology for the management of security findings in industrial DevOps projects.
As an instance of the methodology, we developed the Security Flama, a semantic knowledge base for the automated management of security findings.
arXiv Detail & Related papers (2024-01-12T14:35:51Z)
- Exploiting Library Vulnerability via Migration Based Automating Test Generation
In software development, developers extensively utilize third-party libraries to avoid implementing existing functionalities.
Vulnerability exploits, as code snippets provided for reproducing vulnerabilities after disclosure, contain a wealth of vulnerability-related information.
This study proposes a new method based on vulnerability exploits, called VESTA, which provides vulnerability exploit tests as the basis for developers to decide whether to update dependencies.
arXiv Detail & Related papers (2023-12-15T06:46:45Z)
- Applying Security Testing Techniques to Automotive Engineering
Security regression testing ensures that changes made to a system do not harm its security.
We present a systematic classification of available security regression testing approaches.
arXiv Detail & Related papers (2023-09-18T10:32:36Z)
- Constrained Adversarial Learning and its applicability to Automated Software Testing: a systematic review
This systematic review is focused on the current state-of-the-art of constrained data generation methods applied for adversarial learning and software testing.
It aims to guide researchers and developers to enhance testing tools with adversarial learning methods and improve the resilience and robustness of their digital systems.
arXiv Detail & Related papers (2023-03-14T00:27:33Z)
- CodeLMSec Benchmark: Systematically Evaluating and Finding Security Vulnerabilities in Black-Box Code Language Models
Large language models (LLMs) for automatic code generation have achieved breakthroughs in several programming tasks.
Training data for these models is usually collected from the Internet (e.g., from open-source repositories) and is likely to contain faults and security vulnerabilities.
This unsanitized training data can cause the language models to learn these vulnerabilities and propagate them during the code generation procedure.
arXiv Detail & Related papers (2023-02-08T11:54:07Z)
- Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks
This paper revisits prior work in this scope from the perspective of state-wise safe RL.
We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection.
To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
arXiv Detail & Related papers (2022-12-12T06:30:17Z)
- Realistic simulation of users for IT systems in cyber ranges
We instrument each machine by means of an external agent to generate user activity.
This agent combines both deterministic and deep learning-based methods to adapt to different environments.
We also propose conditional text generation models to facilitate the creation of conversations and documents.
arXiv Detail & Related papers (2021-11-23T10:53:29Z)
- Increasing the Confidence of Deep Neural Networks by Coverage Analysis
This paper presents a lightweight monitoring architecture based on coverage paradigms to enhance the model against different unsafe inputs.
Experimental results show that the proposed approach is effective in detecting both powerful adversarial examples and out-of-distribution inputs.
arXiv Detail & Related papers (2021-01-28T16:38:26Z)
- Dos and Don'ts of Machine Learning in Computer Security
Despite great potential, machine learning in security is prone to subtle pitfalls that undermine its performance.
We identify common pitfalls in the design, implementation, and evaluation of learning-based security systems.
We propose actionable recommendations to support researchers in avoiding or mitigating the pitfalls where possible.
arXiv Detail & Related papers (2020-10-19T13:09:31Z)
- Evaluating the Safety of Deep Reinforcement Learning Models using Semi-Formal Verification
We present a semi-formal verification approach for decision-making tasks based on interval analysis.
Our method obtains comparable results over standard benchmarks with respect to formal verifiers.
Our approach allows efficient evaluation of safety properties for decision-making models in practical applications.
arXiv Detail & Related papers (2020-10-19T11:18:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.