NeuralSentinel: Safeguarding Neural Network Reliability and
Trustworthiness
- URL: http://arxiv.org/abs/2402.07506v1
- Date: Mon, 12 Feb 2024 09:24:34 GMT
- Title: NeuralSentinel: Safeguarding Neural Network Reliability and
Trustworthiness
- Authors: Xabier Echeberria-Barrio, Mikel Gorricho, Selene Valencia, Francesco
Zola
- Abstract summary: We present NeuralSentinel (NS), a tool able to validate the reliability and trustworthiness of AI models.
NS helps non-expert staff increase their confidence in this new system by understanding the model's decisions.
This tool was deployed and used in a Hackathon event to evaluate the reliability of a skin cancer image detector.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The usage of Artificial Intelligence (AI) systems has increased
exponentially, thanks to their ability to reduce the amount of data to be
analyzed and the user effort required while preserving a high rate of accuracy.
However, introducing this new element into the loop has turned these models
into attack points that can compromise the reliability of the overall systems.
This new scenario has raised crucial challenges regarding the reliability and
trustworthiness of AI models, as well as the uncertainty in their decisions,
challenges that become even more pressing when the models are applied in
critical domains such as healthcare, chemical plants, and electrical plants.
To contain these issues, in this paper we present NeuralSentinel (NS), a tool
able to validate the reliability and trustworthiness of AI models. This tool
combines attack and defence strategies with explainability concepts to stress
an AI model and help non-expert staff increase their confidence in this new
system by understanding the model's decisions. NS provides a simple and
easy-to-use interface that helps humans in the loop deal with all the needed
information. This tool was deployed and
used in a Hackathon event to evaluate the reliability of a skin cancer image
detector. During the event, experts and non-experts attacked and defended the
detector, learning which factors were the most important for model
misclassification and which techniques were the most efficient. The event was
also used to detect NS's limitations and gather feedback for further
improvements.
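To make the kind of stress test described in the abstract concrete, the sketch below is a minimal, hypothetical example of combining an adversarial attack with a simple explainability technique on an image classifier. It is not NeuralSentinel's actual implementation: the FGSM attack, the gradient saliency map, the ResNet-18 stand-in model, and the random input tensor are all assumptions introduced for illustration.

```python
# Minimal sketch (hypothetical, not NeuralSentinel's code) of stressing an
# image classifier with an adversarial attack and explaining the result.
import torch
import torch.nn.functional as F
from torchvision import models

def fgsm_attack(model, image, label, epsilon=0.03):
    """Fast Gradient Sign Method: nudge the input in the direction that
    increases the loss, simulating an evasion attack on the classifier."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    return (image + epsilon * image.grad.sign()).clamp(0, 1).detach()

def saliency_map(model, image, label):
    """Gradient saliency: per-pixel importance of the input for the predicted
    class, a basic explanation of what drives the model's decision."""
    image = image.clone().detach().requires_grad_(True)
    model(image)[0, label].backward()
    return image.grad.abs().max(dim=1).values  # collapse colour channels

if __name__ == "__main__":
    model = models.resnet18(weights=None).eval()   # stand-in classifier
    x = torch.rand(1, 3, 224, 224)                 # placeholder image tensor
    y = model(x).argmax(dim=1)                     # clean prediction
    x_adv = fgsm_attack(model, x, y)               # stress the model
    flipped = (model(x_adv).argmax(dim=1) != y).item()
    print("prediction flipped by the attack:", flipped)
    print("saliency map shape:", tuple(saliency_map(model, x_adv, y.item()).shape))
```

In a workflow like the one described above, a flipped prediction would flag a reliability issue, while the saliency map would give human-in-the-loop staff a visual indication of which regions of the image drove the misclassification.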
Related papers
- Analyzing Adversarial Inputs in Deep Reinforcement Learning [53.3760591018817]
We present a comprehensive analysis of the characterization of adversarial inputs, through the lens of formal verification.
We introduce a novel metric, the Adversarial Rate, to classify models based on their susceptibility to such perturbations.
Our analysis empirically demonstrates how adversarial inputs can affect the safety of a given DRL system with respect to such perturbations.
arXiv Detail & Related papers (2024-02-07T21:58:40Z) - Building Trustworthy NeuroSymbolic AI Systems: Consistency, Reliability,
Explainability, and Safety [11.933469815219544]
We present the CREST framework that shows how Consistency, Reliability, user-level Explainability, and Safety are built on NeuroSymbolic methods.
This article focuses on Large Language Models (LLMs) as the chosen AI system within the CREST framework.
arXiv Detail & Related papers (2023-12-05T06:13:55Z) - Building Safe and Reliable AI systems for Safety Critical Tasks with
Vision-Language Processing [1.2183405753834557]
Current AI algorithms are unable to identify common causes for failure detection.
Additional techniques are required to quantify the quality of predictions.
This thesis will focus on vision-language data processing for tasks like classification, image captioning, and vision question answering.
arXiv Detail & Related papers (2023-08-06T18:05:59Z) - Understanding and Enhancing Robustness of Concept-based Models [41.20004311158688]
We study robustness of concept-based models to adversarial perturbations.
In this paper, we first propose and analyze different malicious attacks to evaluate the security vulnerability of concept based models.
We then propose a potential general adversarial training-based defense mechanism to increase robustness of these systems to the proposed malicious attacks.
arXiv Detail & Related papers (2022-11-29T10:43:51Z) - Inter-Domain Fusion for Enhanced Intrusion Detection in Power Systems:
An Evidence Theoretic and Meta-Heuristic Approach [0.0]
False alerts due to compromised IDS in ICS networks can lead to severe economic and operational damage.
This work presents an approach for reducing false alerts in CPS power systems by dealing with uncertainty without prior distribution of alerts.
arXiv Detail & Related papers (2021-11-20T00:05:39Z) - Federated Learning with Unreliable Clients: Performance Analysis and
Mechanism Design [76.29738151117583]
Federated Learning (FL) has become a promising tool for training effective machine learning models among distributed clients.
However, low quality models could be uploaded to the aggregator server by unreliable clients, leading to a degradation or even a collapse of training.
We model these unreliable behaviors of clients and propose a defensive mechanism to mitigate such a security risk.
arXiv Detail & Related papers (2021-05-10T08:02:27Z) - On the benefits of robust models in modulation recognition [53.391095789289736]
Deep Neural Networks (DNNs) using convolutional layers are state-of-the-art in many tasks in communications.
In other domains, like image classification, DNNs have been shown to be vulnerable to adversarial perturbations.
We propose a novel framework to test the robustness of current state-of-the-art models.
arXiv Detail & Related papers (2021-03-27T19:58:06Z) - Non-Singular Adversarial Robustness of Neural Networks [58.731070632586594]
Adversarial robustness has become an emerging challenge for neural networks owing to their over-sensitivity to small input perturbations.
We formalize the notion of non-singular adversarial robustness for neural networks through the lens of joint perturbations to data inputs as well as model weights.
arXiv Detail & Related papers (2021-02-23T20:59:30Z) - Trustworthy AI [75.99046162669997]
Brittleness to minor adversarial changes in the input data, the lack of ability to explain decisions, and bias in the training data are some of the most prominent limitations.
We propose the tutorial on Trustworthy AI to address six critical issues in enhancing user and public trust in AI systems.
arXiv Detail & Related papers (2020-11-02T20:04:18Z) - A Safety Framework for Critical Systems Utilising Deep Neural Networks [13.763070043077633]
This paper presents a principled novel safety argument framework for critical systems that utilise deep neural networks.
The approach allows various forms of predictions, e.g., future reliability of passing some demands, or confidence on a required reliability level.
It is supported by a Bayesian analysis using operational data and the recent verification and validation techniques for deep learning.
arXiv Detail & Related papers (2020-03-07T23:35:05Z) - Adversarial vs behavioural-based defensive AI with joint, continual and
active learning: automated evaluation of robustness to deception, poisoning
and concept drift [62.997667081978825]
Recent advancements in Artificial Intelligence (AI) have brought new capabilities to behavioural analysis (UEBA) for cyber-security.
In this paper, we present a solution to effectively mitigate this attack by improving the detection process and efficiently leveraging human expertise.
arXiv Detail & Related papers (2020-01-13T13:54:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.