Protecting Society from AI Misuse: When are Restrictions on Capabilities Warranted?
- URL: http://arxiv.org/abs/2303.09377v3
- Date: Wed, 29 Mar 2023 14:46:46 GMT
- Title: Protecting Society from AI Misuse: When are Restrictions on Capabilities Warranted?
- Authors: Markus Anderljung and Julian Hazell
- Abstract summary: We argue that targeted interventions on certain capabilities will be warranted to prevent some misuses of AI.
These restrictions may include controlling who can access certain types of AI models, what they can be used for, and whether outputs are filtered or can be traced back to their user.
We apply this reasoning to three examples: predicting novel toxins, creating harmful images, and automating spear phishing campaigns.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Artificial intelligence (AI) systems will increasingly be used to cause harm
as they grow more capable. In fact, AI systems are already starting to be used
to automate fraudulent activities, violate human rights, create harmful fake
images, and identify dangerous toxins. To prevent some misuses of AI, we argue
that targeted interventions on certain capabilities will be warranted. These
restrictions may include controlling who can access certain types of AI models,
what they can be used for, whether outputs are filtered or can be traced back
to their user, and the resources needed to develop them. We also contend that
some restrictions on non-AI capabilities needed to cause harm will be required.
Though capability restrictions risk reducing use more than misuse (facing an
unfavorable Misuse-Use Tradeoff), we argue that interventions on capabilities
are warranted when other interventions are insufficient, the potential harm
from misuse is high, and there are targeted ways to intervene on capabilities.
We provide a taxonomy of interventions that can reduce AI misuse, focusing on
the specific steps required for a misuse to cause harm (the Misuse Chain), and
a framework to determine if an intervention is warranted. We apply this
reasoning to three examples: predicting novel toxins, creating harmful images,
and automating spear phishing campaigns.
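As a purely illustrative sketch (not from the paper), the snippet below composes the restriction types the abstract names (access control, use restrictions, traceability, and output filtering) around a single model call. All identifiers and policy rules here (Request, serve, AUTHORIZED_USERS, the toy banned-term check) are hypothetical stand-ins, not anything the authors propose.

```python
# Hypothetical sketch: layering the capability restrictions listed in the
# abstract around one model endpoint. Requires Python 3.9+.
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class Request:
    user_id: str
    prompt: str

@dataclass
class AuditRecord:
    user_id: str
    prompt: str
    timestamp: str

AUTHORIZED_USERS = {"vetted-lab-001"}      # who can access the model
BANNED_USE_TERMS = ("synthesize toxin",)   # what it may be used for (toy rule)
audit_log: list[AuditRecord] = []          # traceability of outputs to users

def is_authorized(req: Request) -> bool:
    return req.user_id in AUTHORIZED_USERS

def violates_use_policy(prompt: str) -> bool:
    return any(term in prompt.lower() for term in BANNED_USE_TERMS)

def filtered(output: str) -> str:
    # Placeholder output filter; a real system would use a trained classifier.
    return "[withheld]" if "toxin" in output.lower() else output

def serve(req: Request, model) -> str:
    if not is_authorized(req):              # access restriction
        return "access denied"
    if violates_use_policy(req.prompt):     # use restriction
        return "request refused"
    audit_log.append(AuditRecord(           # traceability
        req.user_id, req.prompt, datetime.now(timezone.utc).isoformat()))
    return filtered(model(req.prompt))      # output filtering
```

For example, serve(Request("vetted-lab-001", "summarize this paper"), model=lambda p: p.upper()) passes every check, while an unknown user_id is rejected at the access step before the model is ever invoked.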
Related papers
- Fake It Until You Break It: On the Adversarial Robustness of AI-generated Image Detectors [14.284639462471274]
We evaluate state-of-the-art AI-generated image (AIGI) detectors under different attack scenarios.
Attacks can significantly reduce detection accuracy to the extent that the risks of relying on detectors outweigh their benefits.
We propose a simple defense mechanism to make CLIP-based detectors, which are currently the best-performing detectors, robust against these attacks.
arXiv Detail & Related papers (2024-10-02T14:11:29Z)
- Societal Adaptation to Advanced AI [1.2607853680700076]
Existing strategies for managing risks from advanced AI systems often focus on affecting what AI systems are developed and how they diffuse.
We urge a complementary approach: increasing societal adaptation to advanced AI.
We introduce a conceptual framework which helps identify adaptive interventions that avoid, defend against and remedy potentially harmful uses of AI systems.
arXiv Detail & Related papers (2024-05-16T17:52:12Z)
- A Technological Perspective on Misuse of Available AI [41.94295877935867]
Potential malicious misuse of civilian artificial intelligence (AI) poses serious threats to security on a national and international level.
We show how already existing and openly available AI technology could be misused.
We develop three exemplary use cases of potentially misused AI that threaten political, digital and physical security.
arXiv Detail & Related papers (2024-03-22T16:30:58Z)
- Control Risk for Potential Misuse of Artificial Intelligence in Science [85.91232985405554]
We aim to raise awareness of the dangers of AI misuse in science.
We highlight real-world examples of misuse in chemical science.
We propose a system called SciGuard to control misuse risks for AI models in science.
arXiv Detail & Related papers (2023-12-11T18:50:57Z)
- Managing extreme AI risks amid rapid progress [171.05448842016125]
We describe risks that include large-scale social harms, malicious uses, and irreversible loss of human control over autonomous AI systems.
There is a lack of consensus about how exactly such risks arise, and how to manage them.
Present governance initiatives lack the mechanisms and institutions to prevent misuse and recklessness, and barely address autonomous systems.
arXiv Detail & Related papers (2023-10-26T17:59:06Z)
- Absolutist AI [0.0]
Training AI systems with absolute constraints could yield considerable progress on many AI safety problems.
It provides a guardrail for avoiding the very worst outcomes of misalignment.
It could prevent AIs from causing catastrophes for the sake of very valuable consequences.
arXiv Detail & Related papers (2023-07-19T03:40:37Z)
- Fairness in AI and Its Long-Term Implications on Society [68.8204255655161]
We take a closer look at AI fairness and analyze how a lack of AI fairness can deepen biases over time.
We discuss how biased models can lead to more negative real-world outcomes for certain groups.
If these issues persist, they could be reinforced by interactions with other risks and have severe implications for society in the form of social unrest.
arXiv Detail & Related papers (2023-04-16T11:22:59Z)
- DDoD: Dual Denial of Decision Attacks on Human-AI Teams [29.584936458736813]
We propose Dual Denial of Decision (DDoD) attacks against collaborative Human-AI teams.
We discuss how such attacks aim to deplete both computational and human resources, and significantly impair decision-making capabilities.
arXiv Detail & Related papers (2022-12-07T22:30:17Z)
- Seamful XAI: Operationalizing Seamful Design in Explainable AI [59.89011292395202]
Mistakes in AI systems are inevitable, arising from both technical limitations and sociotechnical gaps.
We propose that seamful design can foster AI explainability by revealing sociotechnical and infrastructural mismatches.
We explore this process with 43 AI practitioners and real end-users.
arXiv Detail & Related papers (2022-11-12T21:54:05Z)
- When to Ask for Help: Proactive Interventions in Autonomous Reinforcement Learning [57.53138994155612]
A long-term goal of reinforcement learning is to design agents that can autonomously interact and learn in the world.
A critical challenge is the presence of irreversible states that require external assistance to recover from, such as when a robot arm has pushed an object off a table.
We propose an algorithm that efficiently learns to detect and avoid states that are irreversible, and proactively asks for help in case the agent does enter them.
arXiv Detail & Related papers (2022-10-19T17:57:24Z)
- Overcoming Failures of Imagination in AI Infused System Development and Deployment [71.9309995623067]
NeurIPS 2020 requested that research paper submissions include impact statements on "potential nefarious uses and the consequences of failure".
We argue that frameworks of harms must be context-aware and consider a wider range of potential stakeholders and system affordances, as well as viable proxies for assessing harms in the widest sense.
arXiv Detail & Related papers (2020-11-26T18:09:52Z)