Semantic Similarity-Based Clustering of Findings From Security Testing Tools
- URL: http://arxiv.org/abs/2211.11057v1
- Date: Sun, 20 Nov 2022 19:03:19 GMT
- Title: Semantic Similarity-Based Clustering of Findings From Security Testing Tools
- Authors: Phillip Schneider, Markus Voggenreiter, Abdullah Gulraiz and Florian Matthes
- Abstract summary: It is common practice to use automated security testing
tools that generate reports after inspecting a software artifact from multiple
perspectives, which raises the challenge of duplicate findings. Identifying
these duplicates manually costs a security expert time, effort, and knowledge.
This study investigates the potential of applying Natural Language Processing
for clustering semantically similar security findings.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, software development in domains with high security
demands has transitioned from traditional methodologies to DevOps, which unites
modern approaches from software development and operations. Key principles of
DevOps have gained importance and are now applied to security aspects of
software development, resulting in the automation of security-enhancing
activities. In particular, it is common practice to use automated security
testing tools that generate reports after inspecting a software artifact from
multiple perspectives. However, this raises the challenge of generating
duplicate security findings. To identify these duplicate findings manually, a
security expert has to invest resources like time, effort, and knowledge. A
partial automation of this process could reduce the analysis effort, encourage
DevOps principles, and diminish the chance of human error. In this study, we
investigated the potential of applying Natural Language Processing for
clustering semantically similar security findings to support the identification
of problem-specific duplicate findings. Towards this goal, we developed a web
application for annotating and assessing security testing tool reports and
published a human-annotated corpus of clustered security findings. In addition,
we performed a comparison of different semantic similarity techniques for
automatically grouping security findings. Finally, we assessed the resulting
clusters using both quantitative and qualitative evaluation methods.
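
To make the approach concrete, here is a minimal sketch of semantic
similarity-based clustering with a quantitative check, assuming
sentence-transformer embeddings as one of the compared techniques; the model
name, distance threshold, example findings, and gold labels below are
illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch (not the paper's released code): group semantically similar
# security findings via sentence embeddings and agglomerative clustering.
from sentence_transformers import SentenceTransformer
from sklearn.cluster import AgglomerativeClustering
from sklearn.metrics import adjusted_rand_score
from sklearn.metrics.pairwise import cosine_distances

# Toy findings as two tools might report them; the first two describe the
# same underlying problem and should land in one cluster.
findings = [
    "SQL injection in /login: parameter 'user' is not sanitized",
    "Unsanitized 'user' parameter on the login endpoint allows SQL injection",
    "TLS 1.0 still enabled on port 443",
]

# Embed each finding into a dense vector space (assumed model choice).
model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(findings)

# Cluster on pairwise cosine distances; findings closer than the threshold
# end up in the same cluster and are duplicate candidates.
labels = AgglomerativeClustering(
    n_clusters=None,
    metric="precomputed",    # scikit-learn >= 1.2; older versions use affinity=
    linkage="average",
    distance_threshold=0.4,  # tuning knob, not a value from the paper
).fit_predict(cosine_distances(embeddings))

for label, finding in sorted(zip(labels, findings)):
    print(label, finding)

# Quantitative assessment against a human-annotated gold clustering,
# e.g., with the Adjusted Rand Index (gold labels invented here).
gold = [0, 0, 1]
print("ARI:", adjusted_rand_score(gold, labels))
```

Using a distance threshold instead of a fixed cluster count suits
deduplication, since the number of distinct underlying problems in a batch of
tool reports is not known in advance.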
Related papers
- SecCodePLT: A Unified Platform for Evaluating the Security of Code GenAI (2024-10-14)
  We develop SecCodePLT, a unified and comprehensive evaluation platform for code GenAIs' risks.
  For insecure code, we introduce a new methodology for data creation that combines experts with automatic generation.
  For cyberattack helpfulness, we construct samples that prompt a model to generate actual attacks, along with dynamic metrics in our environment.
- Hacking, The Lazy Way: LLM Augmented Pentesting (2024-09-14)
  "LLM Augmented Pentesting" is demonstrated through a tool named "Pentest Copilot".
  Our research includes a "chain of thought" mechanism to streamline token usage and boost performance.
  We propose a novel file analysis approach that enables LLMs to understand files.
- A Qualitative Study on Using ChatGPT for Software Security: Perception vs. Practicality (2024-08-01)
  ChatGPT is a Large Language Model (LLM) that can perform a variety of tasks with remarkable semantic understanding and accuracy.
  This study aims to understand the potential of ChatGPT as an emerging technology for supporting software security.
  The study finds that security practitioners view ChatGPT as beneficial for various software security tasks, including vulnerability detection, information retrieval, and penetration testing.
- Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress? (2024-07-31)
  We propose an empirical foundation for developing more meaningful safety metrics and define AI safety in a machine learning research context.
  We aim to provide a more rigorous framework for AI safety research, advancing the science of safety evaluations and clarifying the path towards measurable progress.
- Automated Security Findings Management: A Case Study in Industrial DevOps (2024-01-12)
  We propose a methodology for the management of security findings in industrial DevOps projects.
  As an instance of the methodology, we developed Security Flama, a semantic knowledge base for the automated management of security findings.
- Applying Security Testing Techniques to Automotive Engineering (2023-09-18)
  Security regression testing ensures that changes made to a system do not harm its security.
  We present a systematic classification of available security regression testing approaches.
- Constrained Adversarial Learning and its applicability to Automated Software Testing: a systematic review (2023-03-14)
  This systematic review focuses on the current state of the art of constrained data generation methods applied to adversarial learning and software testing.
  It aims to guide researchers and developers toward enhancing testing tools with adversarial learning methods and improving the resilience and robustness of their digital systems.
- CodeLMSec Benchmark: Systematically Evaluating and Finding Security Vulnerabilities in Black-Box Code Language Models (2023-02-08)
  Large language models (LLMs) for automatic code generation have achieved breakthroughs in several programming tasks.
  Training data for these models is usually collected from the Internet (e.g., from open-source repositories) and is likely to contain faults and security vulnerabilities.
  This unsanitized training data can cause the language models to learn these vulnerabilities and propagate them during code generation.
- Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks (2022-12-12)
  This paper revisits prior work in this scope from the perspective of state-wise safe RL.
  We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection.
  To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
- Dos and Don'ts of Machine Learning in Computer Security (2020-10-19)
  Despite great potential, machine learning in security is prone to subtle pitfalls that undermine its performance.
  We identify common pitfalls in the design, implementation, and evaluation of learning-based security systems.
  We propose actionable recommendations to support researchers in avoiding or mitigating the pitfalls where possible.
- Evaluating the Safety of Deep Reinforcement Learning Models using Semi-Formal Verification (2020-10-19)
  We present a semi-formal verification approach for decision-making tasks based on interval analysis.
  Our method obtains results comparable to formal verifiers on standard benchmarks.
  Our approach allows safety properties of decision-making models to be evaluated efficiently in practical applications.
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.