Locating Risk: Task Designers and the Challenge of Risk Disclosure in RAI Content Work
- URL: http://arxiv.org/abs/2505.24246v1
- Date: Fri, 30 May 2025 06:08:50 GMT
- Title: Locating Risk: Task Designers and the Challenge of Risk Disclosure in RAI Content Work
- Authors: Alice Qian Zhang, Ryland Shaw, Laura Dabbish, Jina Suh, Hong Shen,
- Abstract summary: Crowd workers are often tasked with responsible AI (RAI) content work. While prior efforts have highlighted the risks to worker well-being associated with RAI content work, far less attention has been paid to how these risks are communicated to workers. This study investigates how task designers approach risk disclosure in crowdsourced RAI tasks.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: As AI systems are increasingly tested and deployed in open-ended and high-stakes domains, crowd workers are often tasked with responsible AI (RAI) content work. These tasks include labeling violent content, moderating disturbing text, or simulating harmful behavior for red teaming exercises to shape AI system behaviors. While prior efforts have highlighted the risks to worker well-being associated with RAI content work, far less attention has been paid to how these risks are communicated to workers. Existing transparency frameworks and guidelines such as model cards, datasheets, and crowdworksheets focus on documenting model information and dataset collection processes, but they overlook an important aspect: disclosing well-being risks to workers. In the absence of standard workflows or clear guidance, the consistent application of content warnings, consent flows, or other forms of well-being risk disclosure remains unclear. This study investigates how task designers approach risk disclosure in crowdsourced RAI tasks. Drawing on interviews with 23 task designers across academic and industry sectors, we examine how well-being risk is recognized, interpreted, and communicated in practice. Our findings surface a need to support task designers in identifying and communicating well-being risk not only to support crowdworker well-being but also to strengthen the ethical integrity and technical efficacy of AI development pipelines.
Related papers
- Capability-Oriented Training Induced Alignment Risk [101.37328448441208]
We investigate whether language models, when trained with reinforcement learning, will spontaneously learn to exploit flaws to maximize their reward. Our experiments show that models consistently learn to exploit these vulnerabilities, discovering opportunistic strategies that significantly increase their reward at the expense of task correctness or safety. Our findings suggest that future AI safety work must extend beyond content moderation to rigorously auditing and securing the training environments and reward mechanisms themselves.
arXiv Detail & Related papers (2026-02-12T16:13:14Z)
- The Missing Half: Unveiling Training-time Implicit Safety Risks Beyond Deployment [148.80266237240713]
Implicit training-time safety risks are driven by a model's internal incentives and contextual background information. We present the first systematic study of this problem, introducing a taxonomy with five risk levels, ten fine-grained risk categories, and three incentive types. Our results identify an overlooked yet urgent safety challenge in training.
arXiv Detail & Related papers (2026-02-04T04:23:58Z)
- Responsible Diffusion: A Comprehensive Survey on Safety, Ethics, and Trust in Diffusion Models [69.22690439422531]
Diffusion models (DMs) have been investigated in various domains due to their ability to generate high-quality data. As with traditional deep learning systems, DMs also face potential threats. This survey comprehensively elucidates a framework of such threats and their countermeasures.
arXiv Detail & Related papers (2025-09-25T02:51:43Z)
- Worker Discretion Advised: Co-designing Risk Disclosure in Crowdsourced Responsible AI (RAI) Content Work [12.492380198885295]
Responsible AI (RAI) content work often exposes crowd workers to potentially harmful content. We conduct co-design sessions with 29 task designers, workers, and platform representatives. We identify design tensions and map the sociotechnical tradeoffs that shape disclosure practices.
arXiv Detail & Related papers (2025-09-15T17:05:34Z)
- Teaching Language Models To Gather Information Proactively [53.85419549904644]
Large language models (LLMs) are increasingly expected to function as collaborative partners. In this work, we introduce a new task paradigm: proactive information gathering. We design a scalable framework that generates partially specified, real-world tasks, masking key information. Within this setup, our core innovation is a reinforcement finetuning strategy that rewards questions that elicit genuinely new, implicit user information.
arXiv Detail & Related papers (2025-07-28T23:50:09Z)
- AILuminate: Introducing v1.0 of the AI Risk and Reliability Benchmark from MLCommons [62.374792825813394]
This paper introduces AILuminate v1.0, the first comprehensive industry-standard benchmark for assessing AI-product risk and reliability. The benchmark evaluates an AI system's resistance to prompts designed to elicit dangerous, illegal, or undesirable behavior in 12 hazard categories.
arXiv Detail & Related papers (2025-02-19T05:58:52Z)
- Interactive Agents to Overcome Ambiguity in Software Engineering [61.40183840499932]
AI agents are increasingly being deployed to automate tasks, often based on ambiguous and underspecified user instructions. Making unwarranted assumptions and failing to ask clarifying questions can lead to suboptimal outcomes. We study the ability of LLM agents to handle ambiguous instructions in interactive code generation settings by evaluating the performance of proprietary and open-weight models.
arXiv Detail & Related papers (2025-02-18T17:12:26Z)
- Human services organizations and the responsible integration of AI: Considering ethics and contextualizing risk(s) [0.0]
The authors argue that ethical concerns about AI deployment vary significantly based on implementation context and specific use cases. They propose a dimensional risk assessment approach that considers factors like data sensitivity, professional oversight requirements, and potential impact on client wellbeing.
arXiv Detail & Related papers (2025-01-20T19:38:21Z)
- WATCHDOG: an ontology-aWare risk AssessmenT approaCH via object-oriented DisruptiOn Graphs [0.9387233631570749]
The Common Ontology of Value and Risk (COVER) highlights how the role of objects and their relationships remains pivotal to performing transparent, complete and accountable risk assessment. We operationalize some of the notions proposed by COVER by presenting a new framework for risk assessment: WATCHDOG.
arXiv Detail & Related papers (2024-12-18T15:44:04Z)
- Risks and NLP Design: A Case Study on Procedural Document QA [52.557503571760215]
We argue that clearer assessments of risks and harms to users will be possible when we specialize the analysis to more concrete applications and their plausible users.
We conduct a risk-oriented error analysis that could then inform the design of a future system to be deployed with lower risk of harm and better performance.
arXiv Detail & Related papers (2024-08-16T17:23:43Z)
- EARBench: Towards Evaluating Physical Risk Awareness for Task Planning of Foundation Model-based Embodied AI Agents [53.717918131568936]
Embodied artificial intelligence (EAI) integrates advanced AI models into physical entities for real-world interaction. Foundation models as the "brain" of EAI agents for high-level task planning have shown promising results. However, the deployment of these agents in physical environments presents significant safety challenges. This study introduces EARBench, a novel framework for automated physical risk assessment in EAI scenarios.
arXiv Detail & Related papers (2024-08-08T13:19:37Z)
- Do as I do (Safely): Mitigating Task-Specific Fine-tuning Risks in Large Language Models [93.08860674071636]
We show how malicious actors can subtly manipulate the structure of almost any task-specific dataset to foster dangerous model behaviors. We propose a novel mitigation strategy that mixes in safety data which mimics the task format and prompting style of the user data.
arXiv Detail & Related papers (2024-06-12T18:33:11Z)
- AI and the Iterable Epistopics of Risk [1.26404863283601]
The risks AI presents to society are broadly understood to be manageable through general calculus.
This paper elaborates on how risk is apprehended and managed by a regulator, a developer, and a cyber-security expert.
arXiv Detail & Related papers (2024-04-29T13:33:22Z)
- Affirmative safety: An approach to risk management for high-risk AI [6.133009503054252]
We argue that entities developing or deploying high-risk AI systems should be required to present evidence of affirmative safety.
We propose a risk management approach for advanced AI in which model developers must provide evidence that their activities keep certain risks below regulator-set thresholds.
arXiv Detail & Related papers (2024-04-14T20:48:55Z)
- The Reasoning Under Uncertainty Trap: A Structural AI Risk [0.0]
This report provides an exposition of what makes reasoning under uncertainty (RUU) so challenging for both humans and machines.
We detail how this misuse risk connects to a wider network of underlying structural risks.
arXiv Detail & Related papers (2024-01-29T17:16:57Z)