Evaluating the Effectiveness of GPT-4 Turbo in Creating Defeaters for
Assurance Cases
- URL: http://arxiv.org/abs/2401.17991v1
- Date: Wed, 31 Jan 2024 16:51:23 GMT
- Title: Evaluating the Effectiveness of GPT-4 Turbo in Creating Defeaters for
Assurance Cases
- Authors: Kimya Khakzad Shahandashti, Mithila Sivakumar, Mohammad Mahdi Mohajer,
Alvine B. Belle, Song Wang, Timothy C. Lethbridge
- Abstract summary: We use GPT-4 Turbo, an advanced Large Language Model (LLM) developed by OpenAI, to identify defeaters within ACs formalized using the Eliminative Argumentation (EA) notation.
Our initial evaluation gauges the model's proficiency in understanding and generating arguments within this framework.
The findings indicate that GPT-4 Turbo excels in EA notation and is capable of generating various types of defeaters.
- Score: 6.231203956284574
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Assurance cases (ACs) are structured arguments that support the verification
of the correct implementation of systems' non-functional requirements, such as
safety and security, thereby preventing system failures which could lead to
catastrophic outcomes, including loss of lives. ACs facilitate the
certification of systems in accordance with industrial standards, for example,
DO-178C and ISO 26262. Identifying defeaters, i.e., arguments that refute these ACs, is
essential for improving the robustness of and confidence in ACs. To automate this
task, we introduce a novel method that leverages the capabilities of GPT-4
Turbo, an advanced Large Language Model (LLM) developed by OpenAI, to identify
defeaters within ACs formalized using the Eliminative Argumentation (EA)
notation. Our initial evaluation gauges the model's proficiency in
understanding and generating arguments within this framework. The findings
indicate that GPT-4 Turbo excels in EA notation and is capable of generating
various types of defeaters.
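As an illustration of what such a query might look like, here is a minimal sketch using the OpenAI Python client; the model string, system prompt, and assurance-case fragment are assumptions made for this example and are not taken from the paper.

```python
# pip install openai
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

# A small assurance-case fragment rendered as plain text. The paper's exact EA
# encoding and prompt wording are not given in the abstract, so this is invented.
ac_fragment = """
Claim C1: The autonomous braking function is acceptably safe.
  Evidence E1: 10,000 km of road testing recorded no braking failures.
  Evidence E2: Static analysis of the braking controller reported no defects.
"""

response = client.chat.completions.create(
    model="gpt-4-turbo",  # the version string used in the study may differ
    messages=[
        {"role": "system",
         "content": "You analyze assurance cases using Eliminative Argumentation. "
                    "Propose rebutting, undermining, and undercutting defeaters."},
        {"role": "user",
         "content": f"Identify defeaters for the following fragment:\n{ac_fragment}"},
    ],
)
print(response.choices[0].message.content)
```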
Related papers
- Retrieval Augmented Generation Integrated Large Language Models in Smart Contract Vulnerability Detection [0.0]
The growth of Decentralized Finance (DeFi) has been accompanied by substantial financial losses due to smart contract vulnerabilities.
With attacks becoming more frequent, the necessity and demand for auditing services have escalated.
This study builds upon existing frameworks by integrating Retrieval-Augmented Generation (RAG) with large language models (LLMs)
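A minimal sketch of the retrieval-augmentation step, assuming a small plain-text knowledge base of known vulnerability patterns; the TF-IDF retriever, the Solidity snippet, and the prompt template are illustrative stand-ins rather than the study's actual pipeline.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Tiny stand-in knowledge base (a real system would index audited contracts
# and vulnerability reports).
knowledge_base = [
    "Reentrancy: external call made before the state update lets an attacker re-enter withdraw().",
    "Integer overflow in balance arithmetic when SafeMath is not used (pre-0.8 Solidity).",
    "Unchecked call return value: failures of send()/call() are silently ignored.",
    "tx.origin used for authorization can be bypassed via an intermediate malicious contract.",
]

contract_snippet = """
function withdraw(uint amount) public {
    require(balances[msg.sender] >= amount);
    (bool ok, ) = msg.sender.call{value: amount}("");
    balances[msg.sender] -= amount;
}
"""

# Retrieve the entries most similar to the contract under audit.
vec = TfidfVectorizer().fit(knowledge_base + [contract_snippet])
scores = cosine_similarity(vec.transform([contract_snippet]),
                           vec.transform(knowledge_base))[0]
top = [knowledge_base[i] for i in scores.argsort()[::-1][:2]]

# Augment the LLM prompt with the retrieved context before asking for findings.
prompt = ("Context from the vulnerability knowledge base:\n- " + "\n- ".join(top) +
          "\n\nAudit the following Solidity function and list likely vulnerabilities:\n" +
          contract_snippet)
print(prompt)  # this prompt would then be sent to the LLM
```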
arXiv Detail & Related papers (2024-07-20T10:46:42Z)
- Covert Malicious Finetuning: Challenges in Safeguarding LLM Adaptation [86.05704141217036]
Black-box finetuning is an emerging interface for adapting state-of-the-art language models to user needs.
We introduce covert malicious finetuning, a method to compromise model safety via finetuning while evading detection.
arXiv Detail & Related papers (2024-06-28T17:05:46Z)
- SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal Behaviors [64.9938658716425]
Existing evaluations of large language models' (LLMs) ability to recognize and reject unsafe user requests face three limitations.
First, existing methods often use coarse-grained taxonomies of unsafe topics and over-represent some fine-grained topics.
Second, the linguistic characteristics and formatting of prompts, such as different languages and dialects, are often overlooked or only implicitly considered in many evaluations.
Third, existing evaluations rely on large LLMs for evaluation, which can be expensive.
arXiv Detail & Related papers (2024-06-20T17:56:07Z)
- PVF (Parameter Vulnerability Factor): A Scalable Metric for Understanding AI Vulnerability Against SDCs in Model Parameters [7.652441604508354]
Parameter Vulnerability Factor (PVF) is a metric aiming to standardize the quantification of AI model vulnerability against parameter corruptions such as silent data corruptions (SDCs).
PVF can provide pivotal insights to AI hardware designers in balancing the tradeoff between fault protection and performance/efficiency.
We present several use cases on applying PVF to three types of tasks/models during inference -- recommendation (DLRM), vision classification (CNN), and text classification (BERT)
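The summary does not spell out how PVF is computed; a common way to estimate vulnerability to parameter corruptions is fault injection, i.e., flipping random bits in the parameters and measuring how often the model's predictions change. The sketch below illustrates that idea on a toy linear classifier and is an assumed operationalization, not the paper's procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "model": a linear classifier over 32 features with 10 classes.
W = rng.normal(size=(10, 32)).astype(np.float32)
b = np.zeros(10, dtype=np.float32)
X = rng.normal(size=(256, 32)).astype(np.float32)

def predict(W, b, X):
    return (X @ W.T + b).argmax(axis=1)

baseline = predict(W, b, X)

def flip_random_bit(arr, rng):
    """Copy arr and flip one random bit of one random float32 element."""
    out = arr.copy()
    bits = out.reshape(-1).view(np.uint32)
    bits[int(rng.integers(bits.size))] ^= np.uint32(1 << int(rng.integers(32)))
    return out

trials, mismatches = 2000, 0
for _ in range(trials):
    if not np.array_equal(predict(flip_random_bit(W, rng), b, X), baseline):
        mismatches += 1

print(f"Estimated PVF (share of bit flips that change predictions): {mismatches / trials:.3f}")
```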
arXiv Detail & Related papers (2024-05-02T21:23:34Z)
- FaultGuard: A Generative Approach to Resilient Fault Prediction in Smart Electrical Grids [53.2306792009435]
FaultGuard is the first framework for fault type and zone classification resilient to adversarial attacks.
We propose a low-complexity fault prediction model and an online adversarial training technique to enhance robustness.
Our model outclasses the state-of-the-art for resilient fault prediction benchmarking, with an accuracy of up to 0.958.
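The summary does not detail the online adversarial training technique; as one plausible illustration, the sketch below runs a generic FGSM-style adversarial training loop on a toy fault-type classifier. The architecture, perturbation budget, and synthetic data are stand-ins.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy fault-type classifier over fixed-length windows of grid measurements.
model = nn.Sequential(nn.Linear(24, 64), nn.ReLU(), nn.Linear(64, 4))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

X = torch.randn(512, 24)          # synthetic measurement windows
y = torch.randint(0, 4, (512,))   # synthetic fault-type labels
epsilon = 0.05                    # perturbation budget (assumed)

for epoch in range(20):
    # FGSM: perturb inputs in the direction that increases the loss.
    X_adv = X.clone().requires_grad_(True)
    loss_fn(model(X_adv), y).backward()
    X_adv = (X_adv + epsilon * X_adv.grad.sign()).detach()

    # Train on a mix of clean and adversarial windows.
    opt.zero_grad()
    (loss_fn(model(X), y) + loss_fn(model(X_adv), y)).backward()
    opt.step()
```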
arXiv Detail & Related papers (2024-03-26T08:51:23Z)
- GPT-4 and Safety Case Generation: An Exploratory Analysis [2.3361634876233817]
This paper explores generating safety cases with large language models (LLMs) and conversational interfaces (ChatGPT).
Our primary objective is to delve into the existing knowledge base of GPT-4, focusing on its understanding of the Goal Structuring Notation (GSN)
We perform four distinct experiments with GPT-4 to assess its capacity for generating safety cases within a defined system and application domain.
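For readers unfamiliar with GSN, the sketch below encodes a tiny goal structure (Goals decomposed via Strategies into sub-goals supported by Solutions) as a Python tree; the element identifiers and wording are invented for illustration and are not drawn from the paper's experiments.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    kind: str                     # "Goal", "Strategy", or "Solution"
    gid: str                      # element identifier, e.g. "G1"
    text: str
    children: List["Node"] = field(default_factory=list)

def render(node: Node, depth: int = 0) -> str:
    line = "  " * depth + f"{node.kind} {node.gid}: {node.text}"
    return "\n".join([line] + [render(c, depth + 1) for c in node.children])

case = Node("Goal", "G1", "The braking subsystem is acceptably safe", [
    Node("Strategy", "S1", "Argue over all identified hazards", [
        Node("Goal", "G2", "Hazard H1 (loss of braking) is mitigated", [
            Node("Solution", "Sn1", "HIL test report TR-042"),
        ]),
    ]),
])
print(render(case))
```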
arXiv Detail & Related papers (2023-12-09T22:28:48Z)
- Ring-A-Bell! How Reliable are Concept Removal Methods for Diffusion Models? [52.238883592674696]
Ring-A-Bell is a model-agnostic red-teaming tool for text-to-image (T2I) diffusion models.
It identifies problematic prompts that lead diffusion models to generate inappropriate content.
Our results show that Ring-A-Bell, by manipulating safe prompting benchmarks, can transform prompts that were originally regarded as safe to evade existing safety mechanisms.
arXiv Detail & Related papers (2023-10-16T02:11:20Z)
- A LLM Assisted Exploitation of AI-Guardian [57.572998144258705]
We evaluate the robustness of AI-Guardian, a recent defense to adversarial examples published at IEEE S&P 2023.
We write none of the code to attack this model, and instead prompt GPT-4 to implement all attack algorithms following our instructions and guidance.
This process was surprisingly effective and efficient, with the language model at times producing code from ambiguous instructions faster than the author of this paper could have done.
arXiv Detail & Related papers (2023-07-20T17:33:25Z)
- Security and Interpretability in Automotive Systems [0.0]
The lack of any sender authentication mechanism makes the Controller Area Network (CAN) vulnerable to security threats.
This thesis demonstrates a sender authentication technique that uses power consumption measurements of the electronic control units (ECUs) and a classification model to determine the transmitting states of the ECUs.
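A minimal sketch of that classification step, assuming per-frame power traces are available as fixed-length feature vectors; the synthetic traces, the random-forest model, and the trace length are illustrative assumptions rather than the thesis's setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for per-frame power traces: each ECU gets a slightly
# different mean consumption profile plus measurement noise.
n_ecus, frames_per_ecu, trace_len = 4, 200, 50
profiles = rng.normal(size=(n_ecus, trace_len))
X = np.vstack([profiles[i] + 0.3 * rng.normal(size=(frames_per_ecu, trace_len))
               for i in range(n_ecus)])
y = np.repeat(np.arange(n_ecus), frames_per_ecu)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                          random_state=0, stratify=y)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print("Held-out accuracy at identifying the transmitting ECU:", clf.score(X_te, y_te))
```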
arXiv Detail & Related papers (2022-12-23T01:33:09Z)
- Exploring Robustness of Unsupervised Domain Adaptation in Semantic Segmentation [74.05906222376608]
We propose adversarial self-supervision UDA (or ASSUDA) that maximizes the agreement between clean images and their adversarial examples by a contrastive loss in the output space.
This paper is rooted in two observations: (i) the robustness of UDA methods in semantic segmentation remains unexplored, which poses a security concern in this field; and (ii) although commonly used self-supervision (e.g., rotation and jigsaw) benefits image tasks such as classification and recognition, it fails to provide the critical supervision signals needed to learn discriminative representations for segmentation tasks.
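The sketch below illustrates the general idea of maximizing agreement between clean and adversarial predictions in the output space, using a single FGSM step and a KL consistency term as a stand-in for the paper's contrastive loss; the network, perturbation budget, and data are toy placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Toy per-pixel classifier: 3 semantic classes over 16x16 RGB images.
net = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(), nn.Conv2d(8, 3, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
x = torch.rand(4, 3, 16, 16)

# Craft an adversarial view of x (one FGSM step against the clean prediction).
with torch.no_grad():
    pseudo_label = net(x).argmax(dim=1)
x_adv = x.clone().requires_grad_(True)
F.cross_entropy(net(x_adv), pseudo_label).backward()
x_adv = (x_adv + 0.03 * x_adv.grad.sign()).clamp(0, 1).detach()

# Agreement term in the output space: pull the adversarial prediction toward
# the clean prediction.
opt.zero_grad()
log_p_clean = F.log_softmax(net(x), dim=1)
log_p_adv = F.log_softmax(net(x_adv), dim=1)
agreement = F.kl_div(log_p_adv, log_p_clean, reduction="batchmean", log_target=True)
agreement.backward()
opt.step()
```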
arXiv Detail & Related papers (2021-05-23T01:50:44Z)
- Runtime Safety Assurance Using Reinforcement Learning [37.61747231296097]
This paper aims to design a meta-controller capable of identifying unsafe situations with high accuracy.
We frame the design of the runtime safety assurance (RTSA) meta-controller as a Markov decision process (MDP) and use reinforcement learning (RL) to solve it.
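As a toy illustration of that framing, the sketch below trains a tabular Q-learning meta-controller that chooses between keeping an untrusted controller and switching to a safe fallback; the states, dynamics, and rewards are invented for the example and do not reflect the paper's environment.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative MDP: states 0=nominal, 1=degraded, 2=unsafe (absorbing);
# actions 0=keep the untrusted controller, 1=switch to the safe fallback.
n_states, n_actions = 3, 2

def step(s, a, rng):
    if s == 2:
        return 2, -10.0     # unsafe: large penalty every step
    if a == 1:
        return 0, -1.0      # switching is safe but costs performance
    return min(2, s + (rng.random() < 0.3)), 0.0  # keeping risks drifting to unsafe

Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.95, 0.1

for _ in range(2000):       # episodes
    s = 0
    for _ in range(20):     # steps per episode
        a = int(rng.integers(n_actions)) if rng.random() < eps else int(Q[s].argmax())
        s_next, r = step(s, a, rng)
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next

print("Greedy policy per state (0=keep, 1=switch):", Q.argmax(axis=1))
```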
arXiv Detail & Related papers (2020-10-20T20:54:46Z)