NeuralSentinel: Safeguarding Neural Network Reliability and
Trustworthiness
- URL: http://arxiv.org/abs/2402.07506v1
- Date: Mon, 12 Feb 2024 09:24:34 GMT
- Title: NeuralSentinel: Safeguarding Neural Network Reliability and
Trustworthiness
- Authors: Xabier Echeberria-Barrio, Mikel Gorricho, Selene Valencia, Francesco
Zola
- Abstract summary: We present NeuralSentinel (NS), a tool able to validate the reliability and trustworthiness of AI models.
NS helps non-expert staff increase their confidence in this new system by understanding the model's decisions.
This tool was deployed and used in a Hackathon event to evaluate the reliability of a skin cancer image detector.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The usage of Artificial Intelligence (AI) systems has increased
exponentially, thanks to their ability to reduce the amount of data to be
analyzed and the user effort required while preserving a high rate of accuracy.
However, introducing this new element into the loop has turned these models
into attack points that can compromise the reliability of the overall systems.
This new scenario has raised crucial challenges regarding the reliability and
trustworthiness of AI models, as well as the uncertainty in their decisions,
challenges that become even more pressing when the models are applied in
critical domains such as healthcare, chemical plants, and electrical plants.
To contain these issues, in this paper we present NeuralSentinel (NS), a tool
able to validate the reliability and trustworthiness of AI models. This tool
combines attack and defence strategies with explainability concepts to stress
an AI model and help non-expert staff increase their confidence in this new
system by understanding the model's decisions. NS provides a simple and
easy-to-use interface that helps humans in the loop deal with all the needed
information. This tool was deployed and
used in a Hackathon event to evaluate the reliability of a skin cancer image
detector. During the event, experts and non-experts attacked and defended the
detector, learning which factors were the most important for model
misclassification and which techniques were the most efficient. The event was
also used to detect NS's limitations and gather feedback for further
improvements.
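To make the kind of stress test described in the abstract concrete, the sketch below is a minimal, hypothetical example of combining an adversarial attack with a simple explainability technique on an image classifier. It is not NeuralSentinel's actual implementation: the FGSM attack, the gradient saliency map, the ResNet-18 stand-in model, and the random input tensor are all assumptions introduced for illustration.

```python
# Minimal sketch (hypothetical, not NeuralSentinel's code) of stressing an
# image classifier with an adversarial attack and explaining the result.
import torch
import torch.nn.functional as F
from torchvision import models

def fgsm_attack(model, image, label, epsilon=0.03):
    """Fast Gradient Sign Method: nudge the input in the direction that
    increases the loss, simulating an evasion attack on the classifier."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    return (image + epsilon * image.grad.sign()).clamp(0, 1).detach()

def saliency_map(model, image, label):
    """Gradient saliency: per-pixel importance of the input for the predicted
    class, a basic explanation of what drives the model's decision."""
    image = image.clone().detach().requires_grad_(True)
    model(image)[0, label].backward()
    return image.grad.abs().max(dim=1).values  # collapse colour channels

if __name__ == "__main__":
    model = models.resnet18(weights=None).eval()   # stand-in classifier
    x = torch.rand(1, 3, 224, 224)                 # placeholder image tensor
    y = model(x).argmax(dim=1)                     # clean prediction
    x_adv = fgsm_attack(model, x, y)               # stress the model
    flipped = (model(x_adv).argmax(dim=1) != y).item()
    print("prediction flipped by the attack:", flipped)
    print("saliency map shape:", tuple(saliency_map(model, x_adv, y.item()).shape))
```

In a workflow like the one described above, a flipped prediction would flag a reliability issue, while the saliency map would give human-in-the-loop staff a visual indication of which regions of the image drove the misclassification.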
Related papers
- Analyzing Adversarial Inputs in Deep Reinforcement Learning [53.3760591018817]
We present a comprehensive analysis of the characterization of adversarial inputs, through the lens of formal verification.
We introduce a novel metric, the Adversarial Rate, to classify models based on their susceptibility to such perturbations.
Our analysis empirically demonstrates how adversarial inputs can affect the safety of a given DRL system with respect to such perturbations.
arXiv Detail & Related papers (2024-02-07T21:58:40Z) - Building Trustworthy NeuroSymbolic AI Systems: Consistency, Reliability,
Explainability, and Safety [11.933469815219544]
We present the CREST framework that shows how Consistency, Reliability, user-level Explainability, and Safety are built on NeuroSymbolic methods.
This article focuses on Large Language Models (LLMs) as the chosen AI system within the CREST framework.
arXiv Detail & Related papers (2023-12-05T06:13:55Z) - Building Safe and Reliable AI systems for Safety Critical Tasks with
Vision-Language Processing [1.2183405753834557]
Current AI algorithms are unable to identify common causes for failure detection.
Additional techniques are required to quantify the quality of predictions.
This thesis will focus on vision-language data processing for tasks like classification, image captioning, and vision question answering.
arXiv Detail & Related papers (2023-08-06T18:05:59Z) - Understanding and Enhancing Robustness of Concept-based Models [41.20004311158688]
We study robustness of concept-based models to adversarial perturbations.
In this paper, we first propose and analyze different malicious attacks to evaluate the security vulnerability of concept based models.
We then propose a potential general adversarial training-based defense mechanism to increase robustness of these systems to the proposed malicious attacks.
arXiv Detail & Related papers (2022-11-29T10:43:51Z) - Inter-Domain Fusion for Enhanced Intrusion Detection in Power Systems:
An Evidence Theoretic and Meta-Heuristic Approach [0.0]
False alerts due to compromised IDS in ICS networks can lead to severe economic and operational damage.
This work presents an approach for reducing false alerts in CPS power systems by dealing with uncertainty without prior distribution of alerts.
arXiv Detail & Related papers (2021-11-20T00:05:39Z) - Federated Learning with Unreliable Clients: Performance Analysis and
Mechanism Design [76.29738151117583]
Federated Learning (FL) has become a promising tool for training effective machine learning models among distributed clients.
However, low quality models could be uploaded to the aggregator server by unreliable clients, leading to a degradation or even a collapse of training.
We model these unreliable behaviors of clients and propose a defensive mechanism to mitigate such a security risk.
arXiv Detail & Related papers (2021-05-10T08:02:27Z) - On the benefits of robust models in modulation recognition [53.391095789289736]
Deep Neural Networks (DNNs) using convolutional layers are state-of-the-art in many tasks in communications.
In other domains, like image classification, DNNs have been shown to be vulnerable to adversarial perturbations.
We propose a novel framework to test the robustness of current state-of-the-art models.
arXiv Detail & Related papers (2021-03-27T19:58:06Z) - Non-Singular Adversarial Robustness of Neural Networks [58.731070632586594]
Adversarial robustness has become an emerging challenge for neural networks owing to their over-sensitivity to small input perturbations.
We formalize the notion of non-singular adversarial robustness for neural networks through the lens of joint perturbations to data inputs as well as model weights.
arXiv Detail & Related papers (2021-02-23T20:59:30Z) - Trustworthy AI [75.99046162669997]
Brittleness to minor adversarial changes in the input data, the lack of ability to explain decisions, and bias in the training data are some of the most prominent limitations.
We propose the tutorial on Trustworthy AI to address six critical issues in enhancing user and public trust in AI systems.
arXiv Detail & Related papers (2020-11-02T20:04:18Z) - A Safety Framework for Critical Systems Utilising Deep Neural Networks [13.763070043077633]
This paper presents a principled novel safety argument framework for critical systems that utilise deep neural networks.
The approach allows various forms of predictions, e.g., future reliability of passing some demands, or confidence on a required reliability level.
It is supported by a Bayesian analysis using operational data and the recent verification and validation techniques for deep learning.
arXiv Detail & Related papers (2020-03-07T23:35:05Z) - Adversarial vs behavioural-based defensive AI with joint, continual and
active learning: automated evaluation of robustness to deception, poisoning
and concept drift [62.997667081978825]
Recent advancements in Artificial Intelligence (AI) have brought new capabilities to behavioural analysis (UEBA) for cyber-security.
In this paper, we present a solution to effectively mitigate this attack by improving the detection process and efficiently leveraging human expertise.
arXiv Detail & Related papers (2020-01-13T13:54:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.