AI Red-Teaming is a Sociotechnical System. Now What?
- URL: http://arxiv.org/abs/2412.09751v1
- Date: Thu, 12 Dec 2024 22:48:19 GMT
- Title: AI Red-Teaming is a Sociotechnical System. Now What?
- Authors: Tarleton Gillespie, Ryland Shaw, Mary L. Gray, Jina Suh
- Abstract summary: As generative AI technologies find more and more real-world applications, the importance of testing their performance and safety seems paramount.
Red-teaming has quickly become the primary approach to test AI models--prioritized by AI companies.
We highlight the importance of understanding the values and assumptions behind red-teaming, the labor involved, and the psychological impacts on red-teamers.
- Abstract: As generative AI technologies find more and more real-world applications, the importance of testing their performance and safety seems paramount. "Red-teaming" has quickly become the primary approach to test AI models--prioritized by AI companies, and enshrined in AI policy and regulation. Members of red teams act as adversaries, probing AI systems to test their safety mechanisms and uncover vulnerabilities. Yet we know too little about this work and its implications. This essay calls for collaboration between computer scientists and social scientists to study the sociotechnical systems surrounding AI technologies, including the work of red-teaming, to avoid repeating the mistakes of the recent past. We highlight the importance of understanding the values and assumptions behind red-teaming, the labor involved, and the psychological impacts on red-teamers.
Related papers
- Lessons From Red Teaming 100 Generative AI Products [1.5285633805077958]
In recent years, AI red teaming has emerged as a practice for probing the safety and security of generative AI systems.
We offer practical recommendations aimed at aligning red teaming efforts with real world risks.
arXiv Detail & Related papers (2025-01-13T11:36:33Z)
- Attack Atlas: A Practitioner's Perspective on Challenges and Pitfalls in Red Teaming GenAI [52.138044013005]
As generative AI, particularly large language models (LLMs), becomes increasingly integrated into production applications, new attack surfaces and vulnerabilities emerge, putting a focus on adversarial threats in natural language and multi-modal systems.
Red-teaming has gained importance in proactively identifying weaknesses in these systems, while blue-teaming works to protect against such adversarial attacks.
This work aims to bridge the gap between academic insights and practical security measures for the protection of generative AI systems.
arXiv Detail & Related papers (2024-09-23T10:18:10Z) - The Human Factor in AI Red Teaming: Perspectives from Social and Collaborative Computing [4.933252611303578]
Rapid progress in general-purpose AI has sparked significant interest in "red teaming."
This raises questions about how red teamers are selected, about biases and blind spots in how tests are conducted, and about harmful content's psychological effects on red teamers.
Future studies may explore topics ranging from fairness to mental health and other areas of potential harm.
arXiv Detail & Related papers (2024-07-10T16:02:13Z)
- Red-Teaming for Generative AI: Silver Bullet or Security Theater? [42.35800543892003]
We argue that while red-teaming may be a valuable big-tent idea for characterizing GenAI harm mitigations, and industry may effectively apply red-teaming and other strategies behind closed doors to safeguard AI, treating red-teaming as a panacea for every possible risk verges on security theater.
To move toward a more robust toolbox of evaluations for generative AI, we synthesize our recommendations into a question bank meant to guide and scaffold future AI red-teaming practices.
arXiv Detail & Related papers (2024-01-29T05:46:14Z)
- A Red Teaming Framework for Securing AI in Maritime Autonomous Systems [0.0]
We propose one of the first red team frameworks for evaluating the AI security of maritime autonomous systems.
This framework is a multi-part checklist, which can be tailored to different systems and requirements.
We demonstrate that this framework is highly effective for a red team to use to uncover numerous vulnerabilities in the AI of a real-world maritime autonomous system.
arXiv Detail & Related papers (2023-12-08T14:59:07Z)
- Managing extreme AI risks amid rapid progress [171.05448842016125]
We describe risks that include large-scale social harms, malicious uses, and irreversible loss of human control over autonomous AI systems.
There is a lack of consensus about how exactly such risks arise, and how to manage them.
Present governance initiatives lack the mechanisms and institutions to prevent misuse and recklessness, and barely address autonomous systems.
arXiv Detail & Related papers (2023-10-26T17:59:06Z)
- The Promise and Peril of Artificial Intelligence -- Violet Teaming Offers a Balanced Path Forward [56.16884466478886]
This paper reviews emerging issues with opaque and uncontrollable AI systems.
It proposes an integrative framework called violet teaming to develop reliable and responsible AI.
Violet teaming emerged from AI safety research as a way to manage risks proactively by design.
arXiv Detail & Related papers (2023-08-28T02:10:38Z)
- Proceedings of the Artificial Intelligence for Cyber Security (AICS) Workshop at AAAI 2022 [55.573187938617636]
The workshop will focus on the application of AI to problems in cyber security.
Cyber systems generate large volumes of data, and utilizing it effectively is beyond human capabilities.
arXiv Detail & Related papers (2022-02-28T18:27:41Z)
- A User-Centred Framework for Explainable Artificial Intelligence in Human-Robot Interaction [70.11080854486953]
We propose a user-centred framework for XAI that focuses on its social-interactive aspect.
The framework aims to provide a structure for interactive XAI solutions designed for non-expert users.
arXiv Detail & Related papers (2021-09-27T09:56:23Z)