Against Algorithmic Exploitation of Human Vulnerabilities
- URL: http://arxiv.org/abs/2301.04993v1
- Date: Thu, 12 Jan 2023 13:15:24 GMT
- Title: Against Algorithmic Exploitation of Human Vulnerabilities
- Authors: Inga Strümke and Marija Slavkovik and Clemens Stachl
- Abstract summary: We are concerned with the problem of machine learning models inadvertently modelling vulnerabilities.
We describe common vulnerabilities, and illustrate cases where they are likely to play a role in algorithmic decision-making.
We propose a set of requirements for methods to detect the potential for vulnerability modelling.
- Score: 2.6918074738262194
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Decisions such as which movie to watch next, which song to listen to, or
which product to buy online, are increasingly influenced by recommender systems
and user models that incorporate information on users' past behaviours,
preferences, and digitally created content. Machine learning models that enable
recommendations and that are trained on user data may unintentionally leverage
information on human characteristics that are considered vulnerabilities, such
as depression, young age, or gambling addiction. The use of algorithmic
decisions based on latent vulnerable state representations could be considered
manipulative and could have a deteriorating impact on the condition of
vulnerable individuals. In this paper, we address the problem of machine
learning models inadvertently modelling vulnerabilities, and we aim to raise
awareness so that this issue is considered in legislation and AI ethics.
Hence, we define and describe common vulnerabilities, and illustrate cases
where they are likely to play a role in algorithmic decision-making. We propose
a set of requirements for methods to detect the potential for vulnerability
modelling, detect whether vulnerable groups are treated differently by a model,
and detect whether a model has created an internal representation of
vulnerability. We conclude that explainable artificial intelligence methods may
be necessary for detecting vulnerability exploitation by machine learning-based
recommendation systems.
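The detection requirements sketched in the abstract can be made concrete with two simple checks: a group-level comparison of model outputs, and a probing classifier over the model's internal representations. The sketch below is not from the paper; it uses synthetic data and hypothetical variable names purely to illustrate the idea.

```python
# Illustrative sketch (not the paper's method): two checks inspired by the
# requirements above, applied to a hypothetical recommender. All names, data,
# and signal strengths are assumptions for demonstration only.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Synthetic stand-ins: user embeddings produced by a trained recommender,
# per-user recommendation scores for some item category, and a (normally
# unobserved) vulnerability label such as a self-reported gambling problem.
n_users = 1000
user_embeddings = rng.normal(size=(n_users, 16))      # internal representations
vulnerable = rng.random(n_users) < 0.1                # ~10% vulnerable group
user_embeddings[:, 0] += 1.0 * vulnerable             # inject a weak signal
scores = rng.normal(size=n_users) + 0.5 * vulnerable  # model output per user

# Check 1: are vulnerable users treated differently by the model?
# Here: compare mean recommendation scores between the two groups.
gap = scores[vulnerable].mean() - scores[~vulnerable].mean()
print(f"mean score gap (vulnerable vs. rest): {gap:.3f}")

# Check 2: has the model formed an internal representation of vulnerability?
# Here: a probing classifier predicts the vulnerability label from the
# embeddings; accuracy well above the base rate suggests the representation
# encodes the attribute.
probe = LogisticRegression(max_iter=1000)
acc = cross_val_score(probe, user_embeddings, vulnerable, cv=5).mean()
base_rate = max(vulnerable.mean(), 1 - vulnerable.mean())
print(f"probe accuracy: {acc:.3f} (base rate: {base_rate:.3f})")
```

In practice, ground-truth vulnerability labels are rarely available for such audits, which is one reason the authors argue that explainable AI methods may be necessary for detecting vulnerability exploitation.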
Related papers
- Verification of Machine Unlearning is Fragile [48.71651033308842]
We introduce two novel adversarial unlearning processes capable of circumventing both types of verification strategies.
This study highlights the vulnerabilities and limitations in machine unlearning verification, paving the way for further research into the safety of machine unlearning.
arXiv Detail & Related papers (2024-08-01T21:37:10Z)
- Analyzing Adversarial Inputs in Deep Reinforcement Learning [53.3760591018817]
We present a comprehensive analysis of the characterization of adversarial inputs, through the lens of formal verification.
We introduce a novel metric, the Adversarial Rate, to classify models based on their susceptibility to such perturbations.
Our analysis empirically demonstrates how adversarial inputs can affect the safety of a given DRL system with respect to such perturbations.
arXiv Detail & Related papers (2024-02-07T21:58:40Z)
- Machine Unlearning: Solutions and Challenges [21.141664917477257]
Machine learning models may inadvertently memorize sensitive, unauthorized, or malicious data, posing risks of privacy breaches, security vulnerabilities, and performance degradation.
To address these issues, machine unlearning has emerged as a critical technique to selectively remove specific training data points' influence on trained models.
This paper provides a comprehensive taxonomy and analysis of the solutions in machine unlearning.
arXiv Detail & Related papers (2023-08-14T10:45:51Z)
- Self-Destructing Models: Increasing the Costs of Harmful Dual Uses of Foundation Models [103.71308117592963]
We present an algorithm for training self-destructing models leveraging techniques from meta-learning and adversarial learning.
In a small-scale experiment, we show MLAC can largely prevent a BERT-style model from being re-purposed to perform gender identification.
arXiv Detail & Related papers (2022-11-27T21:43:45Z)
- Explainability for identification of vulnerable groups in machine learning models [1.7403133838762446]
Machine learning fairness as a field is focused on the just treatment of individuals and groups under information processing.
This raises new challenges on how and when to protect vulnerable individuals and groups under machine learning.
Neither existing fairness nor existing explainability methods allow us to ascertain if a prediction model identifies vulnerability.
arXiv Detail & Related papers (2022-03-01T09:44:19Z)
- When and How to Fool Explainable Models (and Humans) with Adversarial Examples [1.439518478021091]
We explore the possibilities and limits of adversarial attacks for explainable machine learning models.
First, we extend the notion of adversarial examples to fit in explainable machine learning scenarios.
Next, we propose a comprehensive framework to study whether adversarial examples can be generated for explainable models.
arXiv Detail & Related papers (2021-07-05T11:20:55Z)
- Individual Explanations in Machine Learning Models: A Survey for Practitioners [69.02688684221265]
The use of sophisticated statistical models that influence decisions in domains of high societal relevance is on the rise.
Many governments, institutions, and companies are reluctant to adopt them, as their output is often difficult to explain in human-interpretable ways.
Recently, the academic literature has proposed a substantial amount of methods for providing interpretable explanations to machine learning models.
arXiv Detail & Related papers (2021-04-09T01:46:34Z)
- Explainable Adversarial Attacks in Deep Neural Networks Using Activation Profiles [69.9674326582747]
This paper presents a visual framework to investigate neural network models subjected to adversarial examples.
We show how observing these elements can quickly pinpoint exploited areas in a model.
arXiv Detail & Related papers (2021-03-18T13:04:21Z)
- (Un)fairness in Post-operative Complication Prediction Models [20.16366948502659]
We consider a real-life example of risk estimation before surgery and investigate the potential for bias or unfairness of a variety of algorithms.
Our approach creates transparent documentation of potential bias so that the users can apply the model carefully.
arXiv Detail & Related papers (2020-11-03T22:11:19Z)
- Plausible Counterfactuals: Auditing Deep Learning Classifiers with Realistic Adversarial Examples [84.8370546614042]
The black-box nature of Deep Learning models has posed unanswered questions about what they learn from data.
A Generative Adversarial Network (GAN) and multi-objective optimization are used to furnish a plausible attack against the audited model.
Its utility is showcased within a human face classification task, unveiling the enormous potential of the proposed framework.
arXiv Detail & Related papers (2020-03-25T11:08:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.