A Safe Harbor for AI Evaluation and Red Teaming
- URL: http://arxiv.org/abs/2403.04893v1
- Date: Thu, 7 Mar 2024 20:55:08 GMT
- Title: A Safe Harbor for AI Evaluation and Red Teaming
- Authors: Shayne Longpre, Sayash Kapoor, Kevin Klyman, Ashwin Ramaswami, Rishi
Bommasani, Borhane Blili-Hamelin, Yangsibo Huang, Aviya Skowron, Zheng-Xin
Yong, Suhas Kotha, Yi Zeng, Weiyan Shi, Xianjun Yang, Reid Southen, Alexander
Robey, Patrick Chao, Diyi Yang, Ruoxi Jia, Daniel Kang, Sandy Pentland,
Arvind Narayanan, Percy Liang, Peter Henderson
- Abstract summary: Some researchers fear that conducting such research or releasing their findings will result in account suspensions or legal reprisal.
We propose that major AI developers commit to providing a legal and technical safe harbor.
We believe these commitments are a necessary step towards more inclusive and unimpeded community efforts to tackle the risks of generative AI.
- Score: 124.89885800509505
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Independent evaluation and red teaming are critical for identifying the risks
posed by generative AI systems. However, the terms of service and enforcement
strategies used by prominent AI companies to deter model misuse have
disincentives on good faith safety evaluations. This causes some researchers to
fear that conducting such research or releasing their findings will result in
account suspensions or legal reprisal. Although some companies offer researcher
access programs, they are an inadequate substitute for independent research
access, as they have limited community representation, receive inadequate
funding, and lack independence from corporate incentives. We propose that major
AI developers commit to providing a legal and technical safe harbor,
indemnifying public interest safety research and protecting it from the threat
of account suspensions or legal reprisal. These proposals emerged from our
collective experience conducting safety, privacy, and trustworthiness research
on generative AI systems, where norms and incentives could be better aligned
with public interests, without exacerbating model misuse. We believe these
commitments are a necessary step towards more inclusive and unimpeded community
efforts to tackle the risks of generative AI.
Related papers
- Particip-AI: A Democratic Surveying Framework for Anticipating Future AI Use Cases, Harms and Benefits [54.648819983899614]
Particip-AI is a framework to gather current and future AI use cases and their harms and benefits from non-expert public.
We gather responses from 295 demographically diverse participants.
arXiv Detail & Related papers (2024-03-21T19:12:37Z) - The risks of risk-based AI regulation: taking liability seriously [46.90451304069951]
The development and regulation of AI seems to have reached a critical stage.
Some experts are calling for a moratorium on the training of AI systems more powerful than GPT-4.
This paper analyses the most advanced legal proposal, the European Union's AI Act.
arXiv Detail & Related papers (2023-11-03T12:51:37Z) - Taking control: Policies to address extinction risks from AI [0.0]
We argue that voluntary commitments from AI companies would be an inappropriate and insufficient response.
We describe three policy proposals that would meaningfully address the threats from advanced AI.
arXiv Detail & Related papers (2023-10-31T15:53:14Z) - Managing extreme AI risks amid rapid progress [171.05448842016125]
We describe risks that include large-scale social harms, malicious uses, and irreversible loss of human control over autonomous AI systems.
There is a lack of consensus about how exactly such risks arise, and how to manage them.
Present governance initiatives lack the mechanisms and institutions to prevent misuse and recklessness, and barely address autonomous systems.
arXiv Detail & Related papers (2023-10-26T17:59:06Z) - Is the U.S. Legal System Ready for AI's Challenges to Human Values? [16.510834081597377]
This study investigates how effectively U.S. laws confront the challenges posed by Generative AI to human values.
We identify notable gaps and uncertainties within the existing legal framework regarding the protection of fundamental values.
We advocate for legal frameworks that evolve to recognize new threats and provide proactive, auditable guidelines to industry stakeholders.
arXiv Detail & Related papers (2023-08-30T09:19:06Z) - Dual Governance: The intersection of centralized regulation and
crowdsourced safety mechanisms for Generative AI [1.2691047660244335]
Generative Artificial Intelligence (AI) has seen mainstream adoption lately, especially in the form of consumer-facing, open-ended, text and image generating models.
The potential for generative AI to displace human creativity and livelihoods has also been under intense scrutiny.
Existing and proposed centralized regulations by governments to rein in AI face criticisms such as not having sufficient clarity or uniformity.
Decentralized protections via crowdsourced safety tools and mechanisms are a potential alternative.
arXiv Detail & Related papers (2023-08-02T23:25:21Z) - Both eyes open: Vigilant Incentives help Regulatory Markets improve AI
Safety [69.59465535312815]
Regulatory Markets for AI is a proposal designed with adaptability in mind.
It involves governments setting outcome-based targets for AI companies to achieve.
We warn that it is alarmingly easy to stumble on incentives which would prevent Regulatory Markets from achieving this goal.
arXiv Detail & Related papers (2023-03-06T14:42:05Z) - Filling gaps in trustworthy development of AI [20.354549569362035]
Growing awareness of potential risks from AI systems has spurred action to address those risks.
But the principles often leave a gap between the "what" and the "how" of trustworthy AI development.
There is thus an urgent need for concrete methods that both enable AI developers to prevent harm and allow them to demonstrate their trustworthiness.
arXiv Detail & Related papers (2021-12-14T22:45:28Z) - Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable
Claims [59.64274607533249]
AI developers need to make verifiable claims to which they can be held accountable.
This report suggests various steps that different stakeholders can take to improve the verifiability of claims made about AI systems.
We analyze ten mechanisms for this purpose--spanning institutions, software, and hardware--and make recommendations aimed at implementing, exploring, or improving those mechanisms.
arXiv Detail & Related papers (2020-04-15T17:15:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.