How Dataflow Diagrams Impact Software Security Analysis: an Empirical
Experiment
- URL: http://arxiv.org/abs/2401.04446v1
- Date: Tue, 9 Jan 2024 09:22:35 GMT
- Title: How Dataflow Diagrams Impact Software Security Analysis: an Empirical
Experiment
- Authors: Simon Schneider, Nicolás E. Díaz Ferreyra, Pierre-Jean Quéval,
Georg Simhandl, Uwe Zdun, Riccardo Scandariato
- Abstract summary: We present the findings of an empirical experiment conducted to investigate DFDs’ impact on the performance of analysts in a security analysis setting.
We found that the participants performed significantly better in answering the analysis tasks correctly in the model-supported condition.
We identified three open challenges of using DFDs for security analysis based on the insights gained in the experiment.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Models of software systems are used throughout the software development
lifecycle. Dataflow diagrams (DFDs), in particular, are well-established
resources for security analysis. Many techniques, such as threat modelling, are
based on DFDs of the analysed application. However, their impact on the
performance of analysts in a security analysis setting has not been explored
before. In this paper, we present the findings of an empirical experiment
conducted to investigate this effect. Following a within-groups design,
participants were asked to solve security-relevant tasks for a given
microservice application. In the control condition, the participants had to
examine the source code manually. In the model-supported condition, they were
additionally provided a DFD of the analysed application and traceability
information linking model items to artefacts in source code. We found that the
participants (n = 24) performed significantly better in answering the analysis
tasks correctly in the model-supported condition (41% increase in analysis
correctness). Further, participants who reported using the provided
traceability information performed better in giving evidence for their answers
(315% increase in correctness of evidence). Finally, we identified three open
challenges of using DFDs for security analysis based on the insights gained in
the experiment.
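The model-supported condition pairs a DFD with traceability information linking model items to source-code artefacts. As an illustrative sketch only (not the authors' tooling), such a DFD with traceability links can be represented as a small annotated graph; all class names, node names, and file paths below are hypothetical:

```python
# Illustrative sketch: a dataflow diagram (DFD) as nodes and flows, where each
# model item carries traceability links to the source-code artefacts (file
# paths) that implement it. All names and paths are hypothetical.
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    kind: str                                       # e.g. "service", "database"
    artefacts: list = field(default_factory=list)   # traceability links

@dataclass
class Flow:
    source: str
    target: str
    label: str
    artefacts: list = field(default_factory=list)

@dataclass
class DFD:
    nodes: dict = field(default_factory=dict)
    flows: list = field(default_factory=list)

    def add_node(self, node: Node) -> None:
        self.nodes[node.name] = node

    def add_flow(self, flow: Flow) -> None:
        self.flows.append(flow)

    def trace(self, item_name: str) -> list:
        """Return the source artefacts linked to a node or a flow label."""
        if item_name in self.nodes:
            return self.nodes[item_name].artefacts
        return [a for f in self.flows if f.label == item_name for a in f.artefacts]

# Hypothetical microservice application
dfd = DFD()
dfd.add_node(Node("api-gateway", "service", ["gateway/src/Gateway.java"]))
dfd.add_node(Node("orders-db", "database", ["orders/resources/schema.sql"]))
dfd.add_flow(Flow("api-gateway", "orders-db", "order-query",
                  ["orders/src/OrderRepository.java"]))

print(dfd.trace("order-query"))   # artefacts an analyst would inspect
```

In the experiment's terms, an analyst answering a security question about the `order-query` flow could follow the traceability link directly to the implementing artefact instead of searching the whole codebase manually.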
Related papers
- InsightBench: Evaluating Business Analytics Agents Through Multi-Step Insight Generation
We introduce InsightBench, a benchmark dataset with three key features.
It consists of 31 datasets representing diverse business use cases such as finance and incident management.
Unlike existing benchmarks focusing on answering single queries, InsightBench evaluates agents based on their ability to perform end-to-end data analytics.
arXiv Detail & Related papers (2024-07-08T22:06:09Z)
- LMD3: Language Model Data Density Dependence
We develop a methodology for analyzing language model task performance at the individual example level based on training data density estimation.
Experiments with paraphrasing as a controlled intervention on finetuning data demonstrate that increasing the support in the training distribution for specific test queries results in a measurable increase in density.
We conclude that our framework can provide statistical evidence of the dependence of a target model's predictions on subsets of its training data.
arXiv Detail & Related papers (2024-05-10T09:03:27Z)
- An Extensible Framework for Architecture-Based Data Flow Analysis for Information Security
Security-related properties are often analyzed based on data flow diagrams (DFDs).
We present an open and extensible framework for data flow analysis.
The framework is compatible with DFDs and can also extract data flows from the Palladio architectural description language.
arXiv Detail & Related papers (2024-03-14T13:52:41Z)
- DACO: Towards Application-Driven and Comprehensive Data Analysis via Code Generation
Data analysis is a crucial analytical process to generate in-depth studies and conclusive insights.
We propose to automatically generate high-quality answer annotations leveraging the code-generation capabilities of LLMs.
Human annotators judged our DACO-RL algorithm to produce more helpful answers than the SFT model in 57.72% of cases.
arXiv Detail & Related papers (2024-03-04T22:47:58Z)
- Are LLMs Capable of Data-based Statistical and Causal Reasoning? Benchmarking Advanced Quantitative Reasoning with Data
We introduce the Quantitative Reasoning with Data benchmark to evaluate Large Language Models' capability in statistical and causal reasoning with real-world data.
The benchmark comprises a dataset of 411 questions accompanied by data sheets from textbooks, online learning materials, and academic papers.
To compare models' quantitative reasoning abilities on data and text, we enrich the benchmark with an auxiliary set of 290 text-only questions, namely QRText.
arXiv Detail & Related papers (2024-02-27T16:15:03Z)
- When Dataflow Analysis Meets Large Language Models
This paper introduces LLMDFA, an LLM-powered dataflow analysis framework that analyzes arbitrary code snippets without requiring a compilation infrastructure.
Inspired by summary-based dataflow analysis, LLMDFA decomposes the problem into three sub-problems, which are effectively resolved by several essential strategies.
Our evaluation has shown that the design can mitigate the hallucination and improve the reasoning ability, obtaining high precision and recall in detecting dataflow-related bugs.
arXiv Detail & Related papers (2024-02-16T15:21:35Z)
- Analyzing Adversarial Inputs in Deep Reinforcement Learning
We present a comprehensive analysis of the characterization of adversarial inputs, through the lens of formal verification.
We introduce a novel metric, the Adversarial Rate, to classify models based on their susceptibility to such perturbations.
Our analysis empirically demonstrates how adversarial inputs can affect the safety of a given DRL system with respect to such perturbations.
arXiv Detail & Related papers (2024-02-07T21:58:40Z)
- 3D Human Pose Analysis via Diffusion Synthesis
PADS represents the first diffusion-based framework for tackling general 3D human pose analysis within the inverse problem framework.
Its performance has been validated on different benchmarks, signaling the adaptability and robustness of this pipeline.
arXiv Detail & Related papers (2024-01-17T02:59:34Z)
- Using causal inference to avoid fallouts in data-driven parametric analysis: a case study in the architecture, engineering, and construction industry
The decision-making process in real-world implementations has been affected by a growing reliance on data-driven models.
We investigated the synergetic pattern between the data-driven methods, empirical domain knowledge, and first-principles simulations.
arXiv Detail & Related papers (2023-09-11T13:54:58Z)
- HAlf-MAsked Model for Named Entity Sentiment analysis
We study different transformer-based solutions for named entity sentiment analysis (NESA) in the RuSentNE-23 evaluation.
We present several approaches to overcome this problem, including a novel technique of an additional pass over the given data with the entity masked.
Our proposed model achieves the best result on RuSentNE-23 evaluation data and demonstrates improved consistency in entity-level sentiment analysis.
arXiv Detail & Related papers (2023-08-30T06:53:24Z)
- Reinforced Approximate Exploratory Data Analysis
We are the first to consider the impact of sampling in interactive data exploration settings, where sampling introduces approximation errors.
We propose a Deep Reinforcement Learning (DRL) based framework which can optimize the sample selection in order to keep the analysis and insight generation flow intact.
arXiv Detail & Related papers (2022-12-12T20:20:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.