How Dataflow Diagrams Impact Software Security Analysis: an Empirical
Experiment
- URL: http://arxiv.org/abs/2401.04446v1
- Date: Tue, 9 Jan 2024 09:22:35 GMT
- Title: How Dataflow Diagrams Impact Software Security Analysis: an Empirical
Experiment
- Authors: Simon Schneider, Nicolás E. Díaz Ferreyra, Pierre-Jean Quéval,
Georg Simhandl, Uwe Zdun, Riccardo Scandariato
- Abstract summary: We present the findings of an empirical experiment conducted to investigate DFDs’ impact on the performance of analysts in a security analysis setting.
We found that the participants performed significantly better in answering the analysis tasks correctly in the model-supported condition.
We identified three open challenges of using DFDs for security analysis based on the insights gained in the experiment.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Models of software systems are used throughout the software development
lifecycle. Dataflow diagrams (DFDs), in particular, are well-established
resources for security analysis. Many techniques, such as threat modelling, are
based on DFDs of the analysed application. However, their impact on the
performance of analysts in a security analysis setting has not been explored
before. In this paper, we present the findings of an empirical experiment
conducted to investigate this effect. Following a within-groups design,
participants were asked to solve security-relevant tasks for a given
microservice application. In the control condition, the participants had to
examine the source code manually. In the model-supported condition, they were
additionally provided a DFD of the analysed application and traceability
information linking model items to artefacts in source code. We found that the
participants (n = 24) performed significantly better in answering the analysis
tasks correctly in the model-supported condition (41% increase in analysis
correctness). Further, participants who reported using the provided
traceability information performed better in giving evidence for their answers
(315% increase in correctness of evidence). Finally, we identified three open
challenges of using DFDs for security analysis based on the insights gained in
the experiment.
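The model-supported condition pairs a DFD with traceability information linking model items to source-code artefacts. As an illustrative sketch only (not the authors' tooling), such a DFD with traceability links can be represented as a small annotated graph; all class names, node names, and file paths below are hypothetical:

```python
# Illustrative sketch: a dataflow diagram (DFD) as nodes and flows, where each
# model item carries traceability links to the source-code artefacts (file
# paths) that implement it. All names and paths are hypothetical.
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    kind: str                                       # e.g. "service", "database"
    artefacts: list = field(default_factory=list)   # traceability links

@dataclass
class Flow:
    source: str
    target: str
    label: str
    artefacts: list = field(default_factory=list)

@dataclass
class DFD:
    nodes: dict = field(default_factory=dict)
    flows: list = field(default_factory=list)

    def add_node(self, node: Node) -> None:
        self.nodes[node.name] = node

    def add_flow(self, flow: Flow) -> None:
        self.flows.append(flow)

    def trace(self, item_name: str) -> list:
        """Return the source artefacts linked to a node or a flow label."""
        if item_name in self.nodes:
            return self.nodes[item_name].artefacts
        return [a for f in self.flows if f.label == item_name for a in f.artefacts]

# Hypothetical microservice application
dfd = DFD()
dfd.add_node(Node("api-gateway", "service", ["gateway/src/Gateway.java"]))
dfd.add_node(Node("orders-db", "database", ["orders/resources/schema.sql"]))
dfd.add_flow(Flow("api-gateway", "orders-db", "order-query",
                  ["orders/src/OrderRepository.java"]))

print(dfd.trace("order-query"))   # artefacts an analyst would inspect
```

In the experiment's terms, an analyst answering a security question about the `order-query` flow could follow the traceability link directly to the implementing artefact instead of searching the whole codebase manually.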
Related papers
- InsightBench: Evaluating Business Analytics Agents Through Multi-Step Insight Generation
We introduce InsightBench, a benchmark dataset with three key features.
It consists of 31 datasets representing diverse business use cases such as finance and incident management.
Unlike existing benchmarks focusing on answering single queries, InsightBench evaluates agents based on their ability to perform end-to-end data analytics.
arXiv Detail & Related papers (2024-07-08T22:06:09Z)
- LMD3: Language Model Data Density Dependence
We develop a methodology for analyzing language model task performance at the individual example level based on training data density estimation.
Experiments with paraphrasing as a controlled intervention on finetuning data demonstrate that increasing the support in the training distribution for specific test queries results in a measurable increase in density.
We conclude that our framework can provide statistical evidence of the dependence of a target model's predictions on subsets of its training data.
arXiv Detail & Related papers (2024-05-10T09:03:27Z)
- An Extensible Framework for Architecture-Based Data Flow Analysis for Information Security
Security-related properties are often analyzed based on data flow diagrams (DFDs).
We present an open and extensible framework for data flow analysis.
The framework is compatible with DFDs and can also extract data flows from the Palladio architectural description language.
arXiv Detail & Related papers (2024-03-14T13:52:41Z)
- DACO: Towards Application-Driven and Comprehensive Data Analysis via Code Generation
Data analysis is a crucial analytical process to generate in-depth studies and conclusive insights.
We propose to automatically generate high-quality answer annotations leveraging the code-generation capabilities of LLMs.
Human annotators judged our DACO-RL algorithm to produce more helpful answers than the SFT model in 57.72% of cases.
arXiv Detail & Related papers (2024-03-04T22:47:58Z)
- Are LLMs Capable of Data-based Statistical and Causal Reasoning? Benchmarking Advanced Quantitative Reasoning with Data
We introduce the Quantitative Reasoning with Data benchmark to evaluate Large Language Models' capability in statistical and causal reasoning with real-world data.
The benchmark comprises a dataset of 411 questions accompanied by data sheets from textbooks, online learning materials, and academic papers.
To compare models' quantitative reasoning abilities on data and text, we enrich the benchmark with an auxiliary set of 290 text-only questions, namely QRText.
arXiv Detail & Related papers (2024-02-27T16:15:03Z)
- When Dataflow Analysis Meets Large Language Models
This paper introduces LLMDFA, an LLM-powered dataflow analysis framework that analyzes arbitrary code snippets without requiring a compilation infrastructure.
Inspired by summary-based dataflow analysis, LLMDFA decomposes the problem into three sub-problems, which are effectively resolved by several essential strategies.
Our evaluation has shown that the design can mitigate the hallucination and improve the reasoning ability, obtaining high precision and recall in detecting dataflow-related bugs.
arXiv Detail & Related papers (2024-02-16T15:21:35Z)
- Analyzing Adversarial Inputs in Deep Reinforcement Learning
We present a comprehensive analysis of the characterization of adversarial inputs, through the lens of formal verification.
We introduce a novel metric, the Adversarial Rate, to classify models based on their susceptibility to such perturbations.
Our analysis empirically demonstrates how adversarial inputs can affect the safety of a given DRL system with respect to such perturbations.
arXiv Detail & Related papers (2024-02-07T21:58:40Z)
- 3D Human Pose Analysis via Diffusion Synthesis
PADS represents the first diffusion-based framework for tackling general 3D human pose analysis within the inverse problem framework.
Its performance has been validated on different benchmarks, signaling the adaptability and robustness of this pipeline.
arXiv Detail & Related papers (2024-01-17T02:59:34Z)
- Using causal inference to avoid fallouts in data-driven parametric analysis: a case study in the architecture, engineering, and construction industry
The decision-making process in real-world implementations has been affected by a growing reliance on data-driven models.
We investigated the synergetic pattern between the data-driven methods, empirical domain knowledge, and first-principles simulations.
arXiv Detail & Related papers (2023-09-11T13:54:58Z)
- HAlf-MAsked Model for Named Entity Sentiment analysis
We study different transformer-based solutions for named entity sentiment analysis (NESA) in the RuSentNE-23 evaluation.
We present several approaches to overcome this problem, including a novel technique of an additional pass over the given data with the entity masked.
Our proposed model achieves the best result on RuSentNE-23 evaluation data and demonstrates improved consistency in entity-level sentiment analysis.
arXiv Detail & Related papers (2023-08-30T06:53:24Z)
- Reinforced Approximate Exploratory Data Analysis
We are the first to consider the impact of sampling in interactive data exploration settings, where sampling introduces approximation errors.
We propose a Deep Reinforcement Learning (DRL) based framework which can optimize the sample selection in order to keep the analysis and insight generation flow intact.
arXiv Detail & Related papers (2022-12-12T20:20:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.