SproutBench: A Benchmark for Safe and Ethical Large Language Models for Youth
- URL: http://arxiv.org/abs/2508.11009v1
- Date: Thu, 14 Aug 2025 18:21:39 GMT
- Title: SproutBench: A Benchmark for Safe and Ethical Large Language Models for Youth
- Authors: Wenpeng Xing, Lanyi Wei, Haixiao Hu, Rongchang Li, Mohan Li, Changting Lin, Meng Han
- Abstract summary: The rapid proliferation of large language models (LLMs) in applications targeting children and adolescents necessitates a fundamental reassessment of prevailing AI safety frameworks. This paper highlights key deficiencies in existing LLM safety benchmarks, including their inadequate coverage of age-specific cognitive, emotional, and social risks. We introduce SproutBench, an innovative evaluation suite comprising 1,283 developmentally grounded adversarial prompts designed to probe risks such as emotional dependency, privacy violations, and imitation of hazardous behaviors.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The rapid proliferation of large language models (LLMs) in applications targeting children and adolescents necessitates a fundamental reassessment of prevailing AI safety frameworks, which are largely tailored to adult users and neglect the distinct developmental vulnerabilities of minors. This paper highlights key deficiencies in existing LLM safety benchmarks, including their inadequate coverage of age-specific cognitive, emotional, and social risks spanning early childhood (ages 0-6), middle childhood (7-12), and adolescence (13-18). To bridge these gaps, we introduce SproutBench, an innovative evaluation suite comprising 1,283 developmentally grounded adversarial prompts designed to probe risks such as emotional dependency, privacy violations, and imitation of hazardous behaviors. Through rigorous empirical evaluation of 47 diverse LLMs, we uncover substantial safety vulnerabilities, corroborated by robust inter-dimensional correlations (e.g., between Safety and Risk Prevention) and a notable inverse relationship between Interactivity and Age Appropriateness. These insights yield practical guidelines for advancing child-centric AI design and deployment.
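The workflow the abstract describes (score 47 models on 1,283 prompts along several dimensions, then correlate the dimensions across models) follows a standard benchmark pattern. Below is a minimal, hypothetical Python sketch of that pipeline: the dimension names come from the abstract, but the function names, the 0-5 scoring scale, and the toy data are illustrative assumptions rather than the paper's actual implementation.

```python
import numpy as np

# Dimension names follow the abstract; the paper's full rubric is not
# reproduced here, and the 0-5 scale is an assumption.
DIMENSIONS = ["safety", "risk_prevention", "interactivity", "age_appropriateness"]

def per_model_means(raw_scores: np.ndarray) -> np.ndarray:
    """Collapse (n_models, n_prompts, n_dims) scores to (n_models, n_dims) means."""
    return raw_scores.mean(axis=1)

def dimension_correlation(means: np.ndarray, dim_a: str, dim_b: str) -> float:
    """Pearson correlation of two scoring dimensions across the evaluated models."""
    i, j = DIMENSIONS.index(dim_a), DIMENSIONS.index(dim_b)
    return float(np.corrcoef(means[:, i], means[:, j])[0, 1])

# Toy stand-in for judged scores: 47 models x 1,283 prompts x 4 dimensions.
rng = np.random.default_rng(0)
raw = rng.uniform(0.0, 5.0, size=(47, 1283, len(DIMENSIONS)))
means = per_model_means(raw)

# On real data, a strong positive value here would mirror the reported
# Safety/Risk Prevention link, and a negative one the Interactivity/Age
# Appropriateness trade-off; random toy data shows neither.
print(dimension_correlation(means, "safety", "risk_prevention"))
print(dimension_correlation(means, "interactivity", "age_appropriateness"))
```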
Related papers
- CASTLE: A Comprehensive Benchmark for Evaluating Student-Tailored Personalized Safety in Large Language Models
We propose the concept of Student-Tailored Personalized Safety and construct CASTLE based on educational theories. This benchmark covers 15 educational safety risks and 14 student attributes, comprising 92,908 bilingual scenarios.
arXiv Detail & Related papers (2026-02-05T13:13:19Z)
- XR Design Framework for Early Childhood Education
Extended Reality in early childhood education presents high-risk challenges due to children's rapid developmental changes. While augmented and virtual reality offer immersive pedagogical benefits, they often impose excessive cognitive load or sensory conflict. We introduce the Augmented Human Development framework to model these interactions through cognitive, sensory, environmental, and developmental parameters.
arXiv Detail & Related papers (2026-01-26T21:32:35Z)
- Evaluating LLM Safety Across Child Development Stages: A Simulated Agent Approach
We present ChildSafe, a benchmark that evaluates the safety of Large Language Models (LLMs) through simulated child agents. ChildSafe assesses responses across nine safety dimensions using age-weighted scoring in both sensitive and neutral contexts.
arXiv Detail & Related papers (2025-10-07T01:01:04Z)
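As a rough illustration of what age-weighted scoring could mean in practice, the following hypothetical Python sketch combines per-dimension scores with weights that vary by simulated age group; the dimension names, weights, and age-group labels are assumptions for illustration, not values from the ChildSafe paper.

```python
# Hypothetical age-dependent weights over three of the nine safety
# dimensions; all names and numbers below are illustrative assumptions.
AGE_WEIGHTS = {
    "middle_childhood": {"self_harm": 0.5, "privacy": 0.3, "misinformation": 0.2},
    "adolescence":      {"self_harm": 0.4, "privacy": 0.4, "misinformation": 0.2},
}

def age_weighted_score(dim_scores: dict[str, float], age_group: str) -> float:
    """Weighted mean of per-dimension scores; weights sum to 1 per age group."""
    weights = AGE_WEIGHTS[age_group]
    return sum(w * dim_scores[dim] for dim, w in weights.items())

# A younger simulated agent weighs self-harm exposure more heavily.
print(age_weighted_score({"self_harm": 2.0, "privacy": 4.0, "misinformation": 5.0},
                         "middle_childhood"))
```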
- Beyond Reactive Safety: Risk-Aware LLM Alignment via Long-Horizon Simulation
We propose a proof-of-concept framework that projects how model-generated advice could propagate through societal systems. We also introduce a dataset of 100 indirect harm scenarios, testing models' ability to foresee adverse, non-obvious outcomes from seemingly harmless user prompts.
arXiv Detail & Related papers (2025-06-26T02:28:58Z)
- ROSE: Toward Reality-Oriented Safety Evaluation of Large Language Models
Large Language Models (LLMs) are increasingly deployed as black-box components in real-world applications. We propose Reality-Oriented Safety Evaluation (ROSE), a novel framework that uses multi-objective reinforcement learning to fine-tune an adversarial LLM.
arXiv Detail & Related papers (2025-06-17T10:55:17Z)
- Safe-Child-LLM: A Developmental Benchmark for Evaluating LLM Safety in Child-LLM Interactions
We introduce Safe-Child-LLM, a benchmark and dataset for assessing AI safety across two developmental stages: children (7-12) and adolescents (13-17). Our framework includes a novel multi-part dataset of 200 adversarial prompts, curated from red-teaming corpora, with human-annotated labels for jailbreak success and a standardized 0-5 ethical refusal scale. Evaluating leading LLMs -- including ChatGPT, Claude, Gemini, LLaMA, DeepSeek, Grok, Vicuna, and Mistral -- we uncover critical safety deficiencies in child-facing scenarios.
arXiv Detail & Related papers (2025-06-16T14:04:54Z)
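A minimal sketch of the per-response annotation record this pairing implies (a jailbreak-success label plus a 0-5 refusal rating for each prompt-model pair) might look like the following; the class and field names, and the direction of the scale, are assumptions rather than the benchmark's actual schema.

```python
from dataclasses import dataclass

@dataclass
class RefusalAnnotation:
    """Hypothetical per-response record; only the jailbreak label and the
    0-5 ethical refusal scale come from the benchmark's description."""
    prompt_id: str
    model: str
    jailbreak_success: bool  # did the adversarial prompt elicit compliance?
    refusal_score: int       # 0 (compliance) .. 5 (exemplary refusal), assumed direction

    def __post_init__(self) -> None:
        if not 0 <= self.refusal_score <= 5:
            raise ValueError("refusal_score must lie on the 0-5 scale")

example = RefusalAnnotation("adv-001", "example-model", jailbreak_success=False,
                            refusal_score=5)
```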
- MinorBench: A hand-built benchmark for content-based risks for children
Large Language Models (LLMs) are rapidly entering children's lives through parent-driven adoption, schools, and peer networks. Current AI ethics and safety research does not adequately address content-related risks specific to minors. We propose a new taxonomy of content-based risks for minors and introduce MinorBench, an open-source benchmark designed to evaluate LLMs on their ability to refuse unsafe or inappropriate queries from children.
arXiv Detail & Related papers (2025-03-13T10:34:43Z)
- AILuminate: Introducing v1.0 of the AI Risk and Reliability Benchmark from MLCommons
This paper introduces AILuminate v1.0, the first comprehensive industry-standard benchmark for assessing AI-product risk and reliability. The benchmark evaluates an AI system's resistance to prompts designed to elicit dangerous, illegal, or undesirable behavior in 12 hazard categories.
arXiv Detail & Related papers (2025-02-19T05:58:52Z)
- LLMs and Childhood Safety: Identifying Risks and Proposing a Protection Framework for Safe Child-LLM Interaction
This study examines the growing use of Large Language Models (LLMs) in child-centered applications. It highlights safety and ethical concerns such as bias, harmful content, and cultural insensitivity. We propose a protection framework for safe Child-LLM interaction, incorporating metrics for content safety, behavioral ethics, and cultural sensitivity.
arXiv Detail & Related papers (2025-02-16T19:39:48Z)
- Agent-SafetyBench: Evaluating the Safety of LLM Agents
We introduce Agent-SafetyBench, a benchmark designed to evaluate the safety of LLM agents. Agent-SafetyBench encompasses 349 interaction environments and 2,000 test cases, evaluating 8 categories of safety risks and covering 10 common failure modes frequently encountered in unsafe interactions. Our evaluation of 16 popular LLM agents reveals a concerning result: none of the agents achieves a safety score above 60%.
arXiv Detail & Related papers (2024-12-19T02:35:15Z)
- Safety Assessment of Chinese Large Language Models
Large language models (LLMs) may generate insulting and discriminatory content, reflect incorrect social values, and be used for malicious purposes. To promote the deployment of safe, responsible, and ethical AI, we release SafetyPrompts, a dataset of 100k augmented prompts with LLM-generated responses.
arXiv Detail & Related papers (2023-04-20T16:27:35Z)