Knowledge-Augmented Reasoning for EUAIA Compliance and Adversarial Robustness of LLMs
- URL: http://arxiv.org/abs/2410.09078v1
- Date: Fri, 4 Oct 2024 18:23:14 GMT
- Title: Knowledge-Augmented Reasoning for EUAIA Compliance and Adversarial Robustness of LLMs
- Authors: Tomas Bueno Momcilovic, Dian Balta, Beat Buesser, Giulio Zizzo, Mark Purcell,
- Abstract summary: The EU AI Act (EUAIA) introduces requirements for AI systems which intersect with the processes required to establish adversarial robustness.
This paper presents a functional architecture that focuses on bridging the two properties.
We aim to support developers and auditors with a reasoning layer based on knowledge augmentation.
- Score: 1.368472250332885
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The EU AI Act (EUAIA) introduces requirements for AI systems which intersect with the processes required to establish adversarial robustness. However, given the ambiguous language of regulation and the dynamicity of adversarial attacks, developers of systems with highly complex models such as LLMs may find their effort to be duplicated without the assurance of having achieved either compliance or robustness. This paper presents a functional architecture that focuses on bridging the two properties, by introducing components with clear reference to their source. Taking the detection layer recommended by the literature, and the reporting layer required by the law, we aim to support developers and auditors with a reasoning layer based on knowledge augmentation (rules, assurance cases, contextual mappings). Our findings demonstrate a novel direction for ensuring LLMs deployed in the EU are both compliant and adversarially robust, which underpin trustworthiness.
Related papers
- Towards Assurance of LLM Adversarial Robustness using Ontology-Driven Argumentation [1.368472250332885]
We introduce a novel approach for assurance of large language models (LLMs) based on formal argumentation.
We structure state-of-the-art attacks and defenses, facilitating creation of a human-readable assurance case.
We provide implications for theory and practice, by targeting engineers, data scientists, users and auditors.
arXiv Detail & Related papers (2024-10-10T14:24:43Z) - COMPL-AI Framework: A Technical Interpretation and LLM Benchmarking Suite for the EU Artificial Intelligence Act [40.233017376716305]
The EU's Artificial Intelligence Act (AI Act) is a significant step towards responsible AI development.
It lacks clear technical interpretation, making it difficult to assess models' compliance.
This work presents COMPL-AI, a comprehensive framework consisting of the first technical interpretation of the Act.
arXiv Detail & Related papers (2024-10-10T14:23:51Z) - Towards Assuring EU AI Act Compliance and Adversarial Robustness of LLMs [1.368472250332885]
Large language models are prone to misuse and vulnerable to security threats.
The European Union's Artificial Intelligence Act seeks to enforce AI robustness in certain contexts.
arXiv Detail & Related papers (2024-10-04T18:38:49Z) - Developing Assurance Cases for Adversarial Robustness and Regulatory Compliance in LLMs [1.368472250332885]
We develop an approach to developing assurance cases for adversarial robustness and regulatory compliance in large language models (LLMs)
We propose a layered framework incorporating guardrails at various stages of deployment, aimed at mitigating these attacks and ensuring compliance with the EU AI Act.
We illustrate our method with two exemplary assurance cases, highlighting how different contexts demand tailored strategies to ensure robust and compliant AI systems.
arXiv Detail & Related papers (2024-10-04T18:14:29Z) - Cognitive LLMs: Towards Integrating Cognitive Architectures and Large Language Models for Manufacturing Decision-making [51.737762570776006]
LLM-ACTR is a novel neuro-symbolic architecture that provides human-aligned and versatile decision-making.
Our framework extracts and embeds knowledge of ACT-R's internal decision-making process as latent neural representations.
Our experiments on novel Design for Manufacturing tasks show both improved task performance as well as improved grounded decision-making capability.
arXiv Detail & Related papers (2024-08-17T11:49:53Z) - TRACE: TRansformer-based Attribution using Contrastive Embeddings in LLMs [50.259001311894295]
We propose a novel TRansformer-based Attribution framework using Contrastive Embeddings called TRACE.
We show that TRACE significantly improves the ability to attribute sources accurately, making it a valuable tool for enhancing the reliability and trustworthiness of large language models.
arXiv Detail & Related papers (2024-07-06T07:19:30Z) - Unveiling the Misuse Potential of Base Large Language Models via In-Context Learning [61.2224355547598]
Open-sourcing of large language models (LLMs) accelerates application development, innovation, and scientific progress.
Our investigation exposes a critical oversight in this belief.
By deploying carefully designed demonstrations, our research demonstrates that base LLMs could effectively interpret and execute malicious instructions.
arXiv Detail & Related papers (2024-04-16T13:22:54Z) - Data Poisoning for In-context Learning [49.77204165250528]
In-context learning (ICL) has been recognized for its innovative ability to adapt to new tasks.
This paper delves into the critical issue of ICL's susceptibility to data poisoning attacks.
We introduce ICLPoison, a specialized attacking framework conceived to exploit the learning mechanisms of ICL.
arXiv Detail & Related papers (2024-02-03T14:20:20Z) - Improving Open Information Extraction with Large Language Models: A
Study on Demonstration Uncertainty [52.72790059506241]
Open Information Extraction (OIE) task aims at extracting structured facts from unstructured text.
Despite the potential of large language models (LLMs) like ChatGPT as a general task solver, they lag behind state-of-the-art (supervised) methods in OIE tasks.
arXiv Detail & Related papers (2023-09-07T01:35:24Z) - Red Teaming Language Model Detectors with Language Models [114.36392560711022]
Large language models (LLMs) present significant safety and ethical risks if exploited by malicious users.
Recent works have proposed algorithms to detect LLM-generated text and protect LLMs.
We study two types of attack strategies: 1) replacing certain words in an LLM's output with their synonyms given the context; 2) automatically searching for an instructional prompt to alter the writing style of the generation.
arXiv Detail & Related papers (2023-05-31T10:08:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.