Semantic Similarity-Based Clustering of Findings From Security Testing Tools
- URL: http://arxiv.org/abs/2211.11057v1
- Date: Sun, 20 Nov 2022 19:03:19 GMT
- Title: Semantic Similarity-Based Clustering of Findings From Security Testing Tools
- Authors: Phillip Schneider, Markus Voggenreiter, Abdullah Gulraiz and Florian Matthes
- Abstract summary: It is common practice to use automated security testing
tools that generate reports after inspecting a software artifact from multiple
perspectives, which raises the challenge of duplicate findings. Identifying
these duplicates manually costs a security expert time, effort, and knowledge.
This study investigates the potential of applying Natural Language Processing
for clustering semantically similar security findings.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, software development in domains with high security
demands has transitioned from traditional methodologies to DevOps, which unites
modern approaches from software development and operations. Key principles of
DevOps have gained importance and are now applied to security aspects of
software development, resulting in the automation of security-enhancing
activities. In particular, it is common practice to use automated security
testing tools that generate reports after inspecting a software artifact from
multiple perspectives. However, this raises the challenge of generating
duplicate security findings. To identify these duplicate findings manually, a
security expert has to invest resources like time, effort, and knowledge. A
partial automation of this process could reduce the analysis effort, encourage
DevOps principles, and diminish the chance of human error. In this study, we
investigated the potential of applying Natural Language Processing for
clustering semantically similar security findings to support the identification
of problem-specific duplicate findings. Towards this goal, we developed a web
application for annotating and assessing security testing tool reports and
published a human-annotated corpus of clustered security findings. In addition,
we performed a comparison of different semantic similarity techniques for
automatically grouping security findings. Finally, we assessed the resulting
clusters using both quantitative and qualitative evaluation methods.
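
To make the approach concrete, here is a minimal sketch of semantic
similarity-based clustering with a quantitative check, assuming
sentence-transformer embeddings as one of the compared techniques; the model
name, distance threshold, example findings, and gold labels below are
illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch (not the paper's released code): group semantically similar
# security findings via sentence embeddings and agglomerative clustering.
from sentence_transformers import SentenceTransformer
from sklearn.cluster import AgglomerativeClustering
from sklearn.metrics import adjusted_rand_score
from sklearn.metrics.pairwise import cosine_distances

# Toy findings as two tools might report them; the first two describe the
# same underlying problem and should land in one cluster.
findings = [
    "SQL injection in /login: parameter 'user' is not sanitized",
    "Unsanitized 'user' parameter on the login endpoint allows SQL injection",
    "TLS 1.0 still enabled on port 443",
]

# Embed each finding into a dense vector space (assumed model choice).
model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(findings)

# Cluster on pairwise cosine distances; findings closer than the threshold
# end up in the same cluster and are duplicate candidates.
labels = AgglomerativeClustering(
    n_clusters=None,
    metric="precomputed",    # scikit-learn >= 1.2; older versions use affinity=
    linkage="average",
    distance_threshold=0.4,  # tuning knob, not a value from the paper
).fit_predict(cosine_distances(embeddings))

for label, finding in sorted(zip(labels, findings)):
    print(label, finding)

# Quantitative assessment against a human-annotated gold clustering,
# e.g., with the Adjusted Rand Index (gold labels invented here).
gold = [0, 0, 1]
print("ARI:", adjusted_rand_score(gold, labels))
```

Using a distance threshold instead of a fixed cluster count suits
deduplication, since the number of distinct underlying problems in a batch of
tool reports is not known in advance.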
Related papers
- SecCodePLT: A Unified Platform for Evaluating the Security of Code GenAI (2024-10-14)
  We develop SecCodePLT, a unified and comprehensive evaluation platform for code GenAIs' risks.
  For insecure code, we introduce a new methodology for data creation that combines experts with automatic generation.
  For cyberattack helpfulness, we construct samples that prompt a model to generate actual attacks, along with dynamic metrics in our environment.
- Hacking, The Lazy Way: LLM Augmented Pentesting (2024-09-14)
  "LLM Augmented Pentesting" is demonstrated through a tool named "Pentest Copilot".
  Our research includes a "chain of thought" mechanism to streamline token usage and boost performance.
  We propose a novel file analysis approach that enables LLMs to understand files.
- A Qualitative Study on Using ChatGPT for Software Security: Perception vs. Practicality (2024-08-01)
  ChatGPT is a Large Language Model (LLM) that can perform a variety of tasks with remarkable semantic understanding and accuracy.
  This study aims to understand the potential of ChatGPT as an emerging technology for supporting software security.
  The study finds that security practitioners view ChatGPT as beneficial for various software security tasks, including vulnerability detection, information retrieval, and penetration testing.
- Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress? (2024-07-31)
  We propose an empirical foundation for developing more meaningful safety metrics and define AI safety in a machine learning research context.
  We aim to provide a more rigorous framework for AI safety research, advancing the science of safety evaluations and clarifying the path towards measurable progress.
- Automated Security Findings Management: A Case Study in Industrial DevOps (2024-01-12)
  We propose a methodology for the management of security findings in industrial DevOps projects.
  As an instance of the methodology, we developed Security Flama, a semantic knowledge base for the automated management of security findings.
- Applying Security Testing Techniques to Automotive Engineering (2023-09-18)
  Security regression testing ensures that changes made to a system do not harm its security.
  We present a systematic classification of available security regression testing approaches.
- Constrained Adversarial Learning and its applicability to Automated Software Testing: a systematic review (2023-03-14)
  This systematic review focuses on the current state of the art of constrained data generation methods applied to adversarial learning and software testing.
  It aims to guide researchers and developers toward enhancing testing tools with adversarial learning methods and improving the resilience and robustness of their digital systems.
- CodeLMSec Benchmark: Systematically Evaluating and Finding Security Vulnerabilities in Black-Box Code Language Models (2023-02-08)
  Large language models (LLMs) for automatic code generation have achieved breakthroughs in several programming tasks.
  Training data for these models is usually collected from the Internet (e.g., from open-source repositories) and is likely to contain faults and security vulnerabilities.
  This unsanitized training data can cause the language models to learn these vulnerabilities and propagate them during code generation.
- Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks (2022-12-12)
  This paper revisits prior work in this scope from the perspective of state-wise safe RL.
  We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection.
  To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
- Dos and Don'ts of Machine Learning in Computer Security (2020-10-19)
  Despite great potential, machine learning in security is prone to subtle pitfalls that undermine its performance.
  We identify common pitfalls in the design, implementation, and evaluation of learning-based security systems.
  We propose actionable recommendations to support researchers in avoiding or mitigating the pitfalls where possible.
- Evaluating the Safety of Deep Reinforcement Learning Models using Semi-Formal Verification (2020-10-19)
  We present a semi-formal verification approach for decision-making tasks based on interval analysis.
  Our method obtains results comparable to formal verifiers on standard benchmarks.
  Our approach allows safety properties of decision-making models to be evaluated efficiently in practical applications.
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.