Quantitative Assurance and Synthesis of Controllers from Activity
Diagrams
- URL: http://arxiv.org/abs/2403.00169v1
- Date: Thu, 29 Feb 2024 22:40:39 GMT
- Title: Quantitative Assurance and Synthesis of Controllers from Activity
Diagrams
- Authors: Kangfeng Ye, Fang Yan, Simos Gerasimou
- Abstract summary: Probabilistic model checking is a widely used formal verification technique for automatically verifying qualitative and quantitative properties of probabilistic models.
However, capturing such systems, writing properties, and verifying them require domain knowledge, which makes the technique inaccessible to researchers and engineers who lack that expertise.
We propose a comprehensive verification framework for ADs, including a new profile for probability, time, and quality annotations, a semantics interpretation of ADs in three Markov models, and a set of transformation rules from activity diagrams to the PRISM language.
Most importantly, we developed transformation algorithms and implemented them, using model-based techniques, in a tool called QASCAD for fully automated verification.
- Score: 4.419843514606336
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Probabilistic model checking is a widely used formal verification technique
to automatically verify qualitative and quantitative properties of
probabilistic models. However, capturing such systems, writing corresponding
properties, and verifying them require domain knowledge, which makes the
technique inaccessible to researchers and engineers who lack that expertise.
Previous studies have extended UML activity diagrams (ADs), developed
transformations, and implemented accompanying tools for automation. That
research, however, is neither comprehensive nor fully open, making it hard to
evaluate, extend, adapt, and access. In this paper, we
propose a comprehensive verification framework for ADs, including a new profile
for probability, time, and quality annotations, a semantics interpretation of
ADs in three Markov models, and a set of transformation rules from activity
diagrams to the PRISM language, supported by PRISM and Storm. Most importantly,
we developed transformation algorithms and implemented them, using model-based
techniques, in a tool called QASCAD for fully automated verification.
We evaluated QASCAD on a case study in which multiple robots deliver items in
a hospital, and on six further examples from the literature. Taken together,
this work makes noteworthy contributions to the verification of ADs by
improving evaluation, extensibility, adaptability, and accessibility.
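To make the transformation idea concrete, the sketch below is a minimal illustration, not the QASCAD algorithm itself: the node names, probability values, state encoding, and helper function are hypothetical. It shows how a probability-annotated decision node of an activity diagram could be mapped to guarded commands of a PRISM DTMC module.

```python
# Minimal, hypothetical sketch of mapping probability-annotated activity-diagram
# nodes to PRISM guarded commands. This is NOT the QASCAD implementation; the
# node names, probabilities, and state encoding below are illustrative assumptions.

# Each activity node is encoded as an integer state; each node maps to a list of
# (probability, successor) pairs taken from the diagram's probability annotations.
STATES = {"pickup": 0, "deliver": 1, "recharge": 2, "done": 3}

FLOWS = {
    # From "pickup" the robot proceeds to delivery with probability 0.9,
    # or detours to recharge with probability 0.1.
    "pickup":   [(0.9, "deliver"), (0.1, "recharge")],
    "deliver":  [(1.0, "done")],
    "recharge": [(1.0, "pickup")],
}


def to_prism(flows: dict, states: dict, init: str = "pickup") -> str:
    """Emit a PRISM DTMC module whose commands mirror the annotated flows."""
    lines = [
        "dtmc",
        "",
        "module activity",
        f"  s : [0..{max(states.values())}] init {states[init]};",
    ]
    for src, branches in flows.items():
        updates = " + ".join(f"{p} : (s'={states[dst]})" for p, dst in branches)
        lines.append(f"  [] s={states[src]} -> {updates};")
    # Give terminal nodes a self-loop so the generated DTMC has no deadlock states.
    for name, idx in states.items():
        if name not in flows:
            lines.append(f"  [] s={idx} -> 1.0 : (s'={idx});")
    lines.append("endmodule")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_prism(FLOWS, STATES))
```

A quantitative query such as P=? [ F s=3 ] (the probability of eventually reaching the final node) is the kind of property that PRISM or Storm could then check on a model generated this way.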
Related papers
- Automatic Generation of Behavioral Test Cases For Natural Language Processing Using Clustering and Prompting [6.938766764201549] (arXiv, 2024-07-31)
This paper introduces an automated approach to developing test cases by exploiting the power of large language models and statistical techniques.
We analyze the behavioral test profiles across four different classification algorithms and discuss the limitations and strengths of those models.
- KGValidator: A Framework for Automatic Validation of Knowledge Graph Construction [2.9526207670430384] (arXiv, 2024-04-24)
We introduce a framework for consistency checking and validation when using generative models to validate knowledge graphs.
The design is easy to adapt and extend, and can be used to verify any kind of graph-structured data.
- AIDE: An Automatic Data Engine for Object Detection in Autonomous Driving [68.73885845181242] (arXiv, 2024-03-26)
We propose an Automatic Data Engine (AIDE) that automatically identifies issues, efficiently curates data, improves the model through auto-labeling, and verifies the model through the generation of diverse scenarios.
We further establish a benchmark for open-world detection on AV datasets to comprehensively evaluate various learning paradigms, demonstrating our method's superior performance at a reduced cost.
- Towards Automatic Translation of Machine Learning Visual Insights to Analytical Assertions [23.535630175567146] (arXiv, 2024-01-15)
We present our vision for developing an automated tool capable of translating visual properties observed in Machine Learning (ML) visualisations into Python assertions.
The tool aims to streamline the process of manually verifying these visualisations in the ML development cycle, which is critical as real-world data and assumptions often change post-deployment.
- QualEval: Qualitative Evaluation for Model Improvement [82.73561470966658] (arXiv, 2023-11-06)
We propose QualEval, which augments quantitative scalar metrics with automated qualitative evaluation as a vehicle for model improvement.
QualEval uses a powerful LLM reasoner and our novel flexible linear programming solver to generate human-readable insights.
We demonstrate that leveraging its insights improves the performance of the Llama 2 model by up to 15 percentage points.
- FIND: A Function Description Benchmark for Evaluating Interpretability Methods [86.80718559904854] (arXiv, 2023-09-07)
This paper introduces FIND (Function INterpretation and Description), a benchmark suite for evaluating automated interpretability methods.
FIND contains functions that resemble components of trained neural networks, and accompanying descriptions of the kind we seek to generate.
We evaluate methods that use pretrained language models to produce descriptions of function behavior in natural language and code.
- Scaling up Memory-Efficient Formal Verification Tools for Tree Ensembles [2.588973722689844] (arXiv, 2021-05-06)
We formalise and extend the VoTE algorithm presented earlier as a tool description.
We show how the separation of property checking from the core verification engine enables verification of versatile requirements.
We demonstrate the application of the tool in two case studies, namely digit recognition and aircraft collision avoidance.
- DirectDebug: Automated Testing and Debugging of Feature Models [55.41644538483948] (arXiv, 2021-02-11)
Variability models (e.g., feature models) are a common way to represent the variabilities and commonalities of software artifacts.
Complex and often large-scale feature models can become faulty, i.e., fail to represent the expected variability properties of the underlying software artifact.
- Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544] (arXiv, 2020-12-29)
We investigate three schemes to improve model generalization ability in few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We establish new state-of-the-art results in both few-shot and training-free settings.
- IFC models for (semi)automating common planning checks for building permits [0.0] (arXiv, 2020-11-03)
A tool was developed to extract the necessary information from IFC models to check representative regulations.
While the case study is specific in location, regulations, and input models, the types of issues encountered are a generally applicable example for automated code compliance checking.
- Universal Source-Free Domain Adaptation [57.37520645827318] (arXiv, 2020-04-09)
We propose a novel two-stage learning process for domain adaptation.
In the Procurement stage, we aim to equip the model for future source-free deployment, assuming no prior knowledge of the upcoming category-gap and domain-shift.
In the Deployment stage, the goal is to design a unified adaptation algorithm capable of operating across a wide range of category-gaps.
This list is automatically generated from the titles and abstracts of the papers in this site.