Related papers: Towards integration of Privacy Enhancing Technologies in Explainable Artificial Intelligence

URL: http://arxiv.org/abs/2507.04528v1
Date: Sun, 06 Jul 2025 20:45:34 GMT
Title: Towards integration of Privacy Enhancing Technologies in Explainable Artificial Intelligence
Authors: Sonal Allana, Rozita Dara, Xiaodong Lin, Pulei Xiong
Abstract summary: We explore Privacy Enhancing Technologies (PETs) as a defense mechanism against attribute inference on explanations provided by feature-based XAI methods. In the best case, PETs integration in explanations reduced the risk of the attack by 49.47%, while maintaining model utility and explanation quality.
Score: 3.212506950784342
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Explainable Artificial Intelligence (XAI) is a crucial pathway in mitigating the risk of non-transparency in the decision-making process of black-box Artificial Intelligence (AI) systems. However, despite the benefits, XAI methods are found to leak the privacy of individuals whose data is used in training or querying the models. Researchers have demonstrated privacy attacks that exploit explanations to infer sensitive personal information of individuals. Currently, there is a lack of defenses against known privacy attacks targeting explanations when vulnerable XAI methods are used in production and machine-learning-as-a-service systems. To address this gap, in this article, we explore Privacy Enhancing Technologies (PETs) as a defense mechanism against attribute inference on explanations provided by feature-based XAI methods. We empirically evaluate three types of PETs, namely synthetic training data, differentially private training and noise addition, on two categories of feature-based XAI. Our evaluation determines different responses from the mitigation methods and side-effects of PETs on other system properties such as utility and performance. In the best case, PETs integration in explanations reduced the risk of the attack by 49.47%, while maintaining model utility and explanation quality. Through our evaluation, we identify strategies for using PETs in XAI for maximizing benefits and minimizing the success of this privacy attack on sensitive personal information.
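Illustration (not from the paper): the abstract names noise addition as one of the evaluated PETs but does not say which noise distribution is used or how it is calibrated. The sketch below is only a rough picture of the idea, assuming zero-mean Laplace noise is added independently to each value of a feature-attribution vector before the explanation is released. The function name add_laplace_noise, the scale parameter and the example values are hypothetical, and the sketch makes no claim of a formal differential privacy guarantee.

```python
import random
from typing import List, Optional


def add_laplace_noise(attributions: List[float], scale: float,
                      seed: Optional[int] = None) -> List[float]:
    """Return a copy of `attributions` with independent Laplace(0, scale) noise added.

    Hypothetical helper for illustration only; the paper's actual noise
    mechanism and calibration are not given in the abstract.
    """
    if scale < 0:
        raise ValueError("scale must be non-negative")
    if scale == 0:
        # No perturbation requested: return an unchanged copy.
        return list(attributions)
    rng = random.Random(seed)
    rate = 1.0 / scale
    # The difference of two i.i.d. exponential draws with rate 1/scale
    # follows a Laplace distribution with mean 0 and the given scale.
    return [a + rng.expovariate(rate) - rng.expovariate(rate)
            for a in attributions]


# Made-up attribution vector for a single prediction (assumed values).
explanation = [0.42, -0.13, 0.05, 0.31]
noisy = add_laplace_noise(explanation, scale=0.05, seed=0)
```

A larger scale hides more of each attribution value from an attacker but also distorts the explanation more, which is the privacy/explanation-quality trade-off the abstract refers to.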

Related papers

Enhancing IoMT Security with Explainable Machine Learning: A Case Study on the CICIOMT2024 Dataset [0.0]
Explainable Artificial Intelligence (XAI) enhances the transparency and interpretability of AI models. In cybersecurity, particularly within the Internet of Medical Things (IoMT), the black-box nature of AI-driven threat detection poses a significant challenge. This study compares two ensemble learning techniques, bagging and boosting, for cyber-attack classification in IoMT environments.
arXiv Detail & Related papers (2025-09-10T09:17:46Z)
A Systematic Survey of Model Extraction Attacks and Defenses: State-of-the-Art and Perspectives [65.3369988566853]
Recent studies have demonstrated that adversaries can replicate a target model's functionality. Model Extraction Attacks (MEAs) pose threats to intellectual property, privacy, and system security. We propose a novel taxonomy that classifies MEAs according to attack mechanisms, defense approaches, and computing environments.
arXiv Detail & Related papers (2025-08-20T19:49:59Z)
Differential Privacy in Machine Learning: From Symbolic AI to LLMs [49.1574468325115]
Differential privacy provides a formal framework to mitigate privacy risks. It ensures that the inclusion or exclusion of any single data point does not significantly alter the output of an algorithm.
arXiv Detail & Related papers (2025-06-13T11:30:35Z)
Study on the Helpfulness of Explainable Artificial Intelligence [0.0]
Legal, business, and ethical requirements motivate using effective XAI. We propose to evaluate XAI methods via the user's ability to successfully perform a proxy task. In other words, we address the helpfulness of XAI for human decision-making.
arXiv Detail & Related papers (2024-10-14T14:03:52Z)
Privacy Implications of Explainable AI in Data-Driven Systems [0.0]
Machine learning (ML) models suffer from a lack of interpretability. The absence of transparency, often referred to as the black box nature of ML models, undermines trust. XAI techniques address this challenge by providing frameworks and methods to explain the internal decision-making processes.
arXiv Detail & Related papers (2024-06-22T08:51:58Z)
Privacy-Enhancing Technologies for Artificial Intelligence-Enabled Systems [0.0]
Artificial intelligence (AI) models introduce privacy vulnerabilities to systems. These vulnerabilities exist during model development, deployment, and inference phases. We propose the use of several privacy-enhancing technologies (PETs) to defend AI-enabled systems.
arXiv Detail & Related papers (2024-04-04T15:14:40Z)
The Frontier of Data Erasure: Machine Unlearning for Large Language Models [56.26002631481726]
Large Language Models (LLMs) are foundational to AI advancements. LLMs pose risks by potentially memorizing and disseminating sensitive, biased, or copyrighted information. Machine unlearning emerges as a cutting-edge solution to mitigate these concerns.
arXiv Detail & Related papers (2024-03-23T09:26:15Z)
Privacy Risks in Reinforcement Learning for Household Robots [42.675213619562975]
Privacy emerges as a pivotal concern within the realm of embodied AI, as the robot accesses substantial personal information. This paper proposes an attack on the training process of the value-based algorithm and the gradient-based algorithm, utilizing gradient inversion to reconstruct states, actions, and supervisory signals.
arXiv Detail & Related papers (2023-06-15T16:53:26Z)
Adaptive cognitive fit: Artificial intelligence augmented management of information facets and representations [62.997667081978825]
Explosive growth in big data technologies and artificial intelligence (AI) applications has led to increasing pervasiveness of information facets. Information facets, such as equivocality and veracity, can dominate and significantly influence human perceptions of information. We suggest that artificially intelligent technologies that can adapt information representations to overcome cognitive limitations are necessary.
arXiv Detail & Related papers (2022-04-25T02:47:25Z)
Counterfactual Explanations as Interventions in Latent Space [62.997667081978825]
Counterfactual explanations aim to provide end users with a set of features that need to be changed in order to achieve a desired outcome. Current approaches rarely take into account the feasibility of actions needed to achieve the proposed explanations. We present Counterfactual Explanations as Interventions in Latent Space (CEILS), a methodology to generate counterfactual explanations.
arXiv Detail & Related papers (2021-06-14T20:48:48Z)
Trustworthy AI [75.99046162669997]
Brittleness to minor adversarial changes in the input data, limited ability to explain decisions, and bias in training data are some of the most prominent limitations of AI systems. We propose the tutorial on Trustworthy AI to address six critical issues in enhancing user and public trust in AI systems.
arXiv Detail & Related papers (2020-11-02T20:04:18Z)
Adversarial vs behavioural-based defensive AI with joint, continual and active learning: automated evaluation of robustness to deception, poisoning and concept drift [62.997667081978825]
Recent advancements in Artificial Intelligence (AI) have brought new capabilities to behavioural analysis (UEBA) for cyber-security. In this paper, we present a solution to effectively mitigate this attack by improving the detection process and efficiently leveraging human expertise.
arXiv Detail & Related papers (2020-01-13T13:54:36Z)

This list is automatically generated from the titles and abstracts of the papers on this site.