Effective Automation to Support the Human Infrastructure in AI Red Teaming
- URL: http://arxiv.org/abs/2503.22116v1
- Date: Fri, 28 Mar 2025 03:36:15 GMT
- Title: Effective Automation to Support the Human Infrastructure in AI Red Teaming
- Authors: Alice Qian Zhang, Jina Suh, Mary L. Gray, Hong Shen,
- Abstract summary: We argue for a balanced approach that combines human expertise with automated tools to strengthen AI risk assessment.<n>We highlight key challenges in scaling automated red teaming, including considerations around worker proficiency, agency, and context-awareness.
- Score: 5.463538170874778
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: As artificial intelligence (AI) systems become increasingly embedded in critical societal functions, the need for robust red teaming methodologies continues to grow. In this forum piece, we examine emerging approaches to automating AI red teaming, with a particular focus on how the application of automated methods affects human-driven efforts. We discuss the role of labor in automated red teaming processes, the benefits and limitations of automation, and its broader implications for AI safety and labor practices. Drawing on existing frameworks and case studies, we argue for a balanced approach that combines human expertise with automated tools to strengthen AI risk assessment. Finally, we highlight key challenges in scaling automated red teaming, including considerations around worker proficiency, agency, and context-awareness.
Related papers
- Human-Centered AI and Autonomy in Robotics: Insights from a Bibliometric Study [3.794859469462058]
Human-Centered AI (HCAI) aims to balance human control and automation.
This paper presents a bibliometric analysis of intelligent autonomous robotic systems using SciMAT and VOSViewer.
The findings highlight academic trends, emerging topics, and AI's role in self-adaptive robotic behaviour.
arXiv Detail & Related papers (2025-04-28T14:45:48Z) - Lessons From Red Teaming 100 Generative AI Products [1.5285633805077958]
In recent years, AI red teaming has emerged as a practice for probing the safety and security of generative AI systems.<n>We offer practical recommendations aimed at aligning red teaming efforts with real world risks.
arXiv Detail & Related papers (2025-01-13T11:36:33Z) - AI red-teaming is a sociotechnical challenge: on values, labor, and harms [3.0001147629373195]
"Red-teaming" has quickly become the primary approach to test AI models.
We highlight the importance of understanding the values and assumptions behind red-teaming.
arXiv Detail & Related papers (2024-12-12T22:48:19Z) - Exploring the Adversarial Vulnerabilities of Vision-Language-Action Models in Robotics [68.36528819227641]
This paper systematically quantifies the robustness of VLA-based robotic systems.<n>We introduce two untargeted attack objectives that leverage spatial foundations to destabilize robotic actions, and a targeted attack objective that manipulates the robotic trajectory.<n>We design an adversarial patch generation approach that places a small, colorful patch within the camera's view, effectively executing the attack in both digital and physical environments.
arXiv Detail & Related papers (2024-11-18T01:52:20Z) - Attack Atlas: A Practitioner's Perspective on Challenges and Pitfalls in Red Teaming GenAI [52.138044013005]
generative AI, particularly large language models (LLMs), become increasingly integrated into production applications.
New attack surfaces and vulnerabilities emerge and put a focus on adversarial threats in natural language and multi-modal systems.
Red-teaming has gained importance in proactively identifying weaknesses in these systems, while blue-teaming works to protect against such adversarial attacks.
This work aims to bridge the gap between academic insights and practical security measures for the protection of generative AI systems.
arXiv Detail & Related papers (2024-09-23T10:18:10Z) - Human-Centered Automation [0.3626013617212666]
The paper argues for the emerging area of Human-Centered Automation (HCA), which prioritizes user needs and preferences in the design and development of automation systems.
The paper discusses the limitations of existing automation approaches, the challenges in integrating AI and RPA, and the benefits of human-centered automation for productivity, innovation, and democratizing access to these technologies.
arXiv Detail & Related papers (2024-05-24T22:12:28Z) - Position Paper: Agent AI Towards a Holistic Intelligence [53.35971598180146]
We emphasize developing Agent AI -- an embodied system that integrates large foundation models into agent actions.
In this paper, we propose a novel large action model to achieve embodied intelligent behavior, the Agent Foundation Model.
arXiv Detail & Related papers (2024-02-28T16:09:56Z) - Red-Teaming for Generative AI: Silver Bullet or Security Theater? [42.35800543892003]
We argue that while red-teaming may be a valuable big-tent idea for characterizing GenAI harm mitigations, industry may effectively apply red-teaming and other strategies behind closed doors to safeguard AI.
To move toward a more robust toolbox of evaluations for generative AI, we synthesize our recommendations into a question bank meant to guide and scaffold future AI red-teaming practices.
arXiv Detail & Related papers (2024-01-29T05:46:14Z) - A Red Teaming Framework for Securing AI in Maritime Autonomous Systems [0.0]
We propose one of the first red team frameworks for evaluating the AI security of maritime autonomous systems.
This framework is a multi-part checklist, which can be tailored to different systems and requirements.
We demonstrate this framework to be highly effective for a red team to use to uncover numerous vulnerabilities within a real-world maritime autonomous systems AI.
arXiv Detail & Related papers (2023-12-08T14:59:07Z) - ProAgent: From Robotic Process Automation to Agentic Process Automation [87.0555252338361]
Large Language Models (LLMs) have emerged human-like intelligence.
This paper introduces Agentic Process Automation (APA), a groundbreaking automation paradigm using LLM-based agents for advanced automation.
We then instantiate ProAgent, an agent designed to craft from human instructions and make intricate decisions by coordinating specialized agents.
arXiv Detail & Related papers (2023-11-02T14:32:16Z) - AI Maintenance: A Robustness Perspective [91.28724422822003]
We introduce highlighted robustness challenges in the AI lifecycle and motivate AI maintenance by making analogies to car maintenance.
We propose an AI model inspection framework to detect and mitigate robustness risks.
Our proposal for AI maintenance facilitates robustness assessment, status tracking, risk scanning, model hardening, and regulation throughout the AI lifecycle.
arXiv Detail & Related papers (2023-01-08T15:02:38Z) - Automating Privilege Escalation with Deep Reinforcement Learning [71.87228372303453]
In this work, we exemplify the potential threat of malicious actors using deep reinforcement learning to train automated agents.
We present an agent that uses a state-of-the-art reinforcement learning algorithm to perform local privilege escalation.
Our agent is usable for generating realistic attack sensor data for training and evaluating intrusion detection systems.
arXiv Detail & Related papers (2021-10-04T12:20:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.