LLM Safety for Children
- URL: http://arxiv.org/abs/2502.12552v1
- Date: Tue, 18 Feb 2025 05:26:27 GMT
- Title: LLM Safety for Children
- Authors: Prasanjit Rath, Hari Shrawgi, Parag Agrawal, Sandipan Dandapat,
- Abstract summary: The study acknowledges the diverse nature of children, which is often overlooked by standard safety evaluations.
We develop Child User Models that reflect the varied personalities and interests of children, informed by literature in child care and psychology.
- Score: 9.935219917903858
- License:
- Abstract: This paper analyzes the safety of Large Language Models (LLMs) in interactions with children below the age of 18. Despite the transformative applications of LLMs in various aspects of children's lives, such as education and therapy, there remains a significant gap in understanding and mitigating potential content harms specific to this demographic. The study acknowledges the diverse nature of children, which is often overlooked by standard safety evaluations, and proposes a comprehensive approach to evaluating LLM safety specifically for children. We list potential risks that children may encounter when using LLM-powered applications. Additionally, we develop Child User Models that reflect the varied personalities and interests of children, informed by literature in child care and psychology. These user models aim to bridge the existing gap in child safety literature across various fields. We use the Child User Models to evaluate the safety of six state-of-the-art LLMs. Our observations reveal significant safety gaps in LLMs, particularly in categories harmful to children but not to adults.
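The paper itself does not include code in this digest, but the minimal sketch below illustrates how persona-conditioned safety probing of the kind described in the abstract might be set up. The ChildUserModel fields, the harm categories, and the helper callables (llm_generate, judge) are illustrative assumptions, not the paper's actual taxonomy or implementation.
```python
# Illustrative sketch only: persona fields, harm categories, and the judge/generate
# callables below are assumptions for exposition, not the paper's released code.
from dataclasses import dataclass


@dataclass
class ChildUserModel:
    """A simplified child persona used to condition safety probes."""
    age: int
    interests: list[str]
    temperament: str  # e.g. "impulsive", "anxious", "curious"

    def system_prompt(self) -> str:
        return (
            f"You are chatting with a {self.age}-year-old child who is "
            f"{self.temperament} and interested in {', '.join(self.interests)}."
        )


# Hypothetical harm categories; the paper's actual categories may differ.
HARM_CATEGORIES = ["self-harm", "unsafe challenges", "age-inappropriate content"]


def evaluate(llm_generate, judge, personas, probes):
    """Run each probe under each child persona and tally judged harms.

    llm_generate(system_prompt, probe) -> model reply (str)
    judge(reply, category) -> True if the reply is harmful in that category
    """
    failures = {category: 0 for category in HARM_CATEGORIES}
    for persona in personas:
        for probe in probes:
            reply = llm_generate(persona.system_prompt(), probe)
            for category in HARM_CATEGORIES:
                if judge(reply, category):
                    failures[category] += 1
    return failures
```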
Related papers
- LLMs and Childhood Safety: Identifying Risks and Proposing a Protection Framework for Safe Child-LLM Interaction [8.018569128518187]
This study examines the growing use of Large Language Models (LLMs) in child-centered applications.
It highlights safety and ethical concerns such as bias, harmful content, and cultural insensitivity.
We propose a protection framework for safe Child-LLM interaction, incorporating metrics for content safety, behavioral ethics, and cultural sensitivity.
arXiv Detail & Related papers (2025-02-16T19:39:48Z)
- Internal Activation as the Polar Star for Steering Unsafe LLM Behavior [50.463399903987245]
We introduce SafeSwitch, a framework that dynamically regulates unsafe outputs by monitoring and utilizing the model's internal states.
Our empirical results show that SafeSwitch reduces harmful outputs by over 80% on safety benchmarks while maintaining strong utility.
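As a rough illustration of the idea of gating generation on internal states, the sketch below scores a prompt's hidden representation with a linear probe and falls back to a refusal when the score is high. The placeholder model name, the untrained probe, the 0.5 threshold, and the refusal string are assumptions for illustration, not SafeSwitch's actual design.
```python
# Minimal sketch of internal-state monitoring; all specifics are assumed, not from the paper.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"  # placeholder model; not one of the models studied in the paper
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)
model.eval()

# Linear probe over the last hidden layer; in practice it would be trained on
# labeled safe/unsafe prompts. It is left untrained here purely for illustration.
probe = torch.nn.Linear(model.config.hidden_size, 1)


def guarded_generate(prompt: str, threshold: float = 0.5) -> str:
    """Score the prompt's internal representation; refuse if it looks unsafe."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs, output_hidden_states=True).hidden_states[-1]
        unsafe_score = torch.sigmoid(probe(hidden.mean(dim=1))).item()
    if unsafe_score > threshold:
        return "I can't help with that."  # divert to a fixed refusal path
    out = model.generate(**inputs, max_new_tokens=64)
    return tokenizer.decode(out[0], skip_special_tokens=True)
```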
arXiv Detail & Related papers (2025-02-03T04:23:33Z)
- Evaluation of LLM Vulnerabilities to Being Misused for Personalized Disinformation Generation [0.5070610131852027]
Large language models (LLMs) can be effectively misused for generating disinformation news articles.
This study fills this gap by evaluating the vulnerabilities of recent open and closed LLMs.
Our results demonstrate the need for stronger safety filters and disclaimers.
arXiv Detail & Related papers (2024-12-18T09:48:53Z)
- Benchmarking LLMs for Mimicking Child-Caregiver Language in Interaction [4.109949110722246]
LLMs can generate human-like dialogues, yet their ability to simulate early child-adult interactions remains largely unexplored.
We found that state-of-the-art LLMs can approximate child-caregiver dialogues at the word and utterance level, but they struggle to reproduce the discursive patterns of children and caregivers, exaggerate alignment, and fail to reach the level of diversity shown by humans.
arXiv Detail & Related papers (2024-12-12T14:43:03Z)
- SafeBench: A Safety Evaluation Framework for Multimodal Large Language Models [75.67623347512368]
We propose SafeBench, a comprehensive framework designed for conducting safety evaluations of MLLMs.
Our framework consists of a comprehensive harmful query dataset and an automated evaluation protocol.
Based on our framework, we conducted large-scale experiments on 15 widely-used open-source MLLMs and 6 commercial MLLMs.
arXiv Detail & Related papers (2024-10-24T17:14:40Z)
- ALERT: A Comprehensive Benchmark for Assessing Large Language Models' Safety through Red Teaming [64.86326523181553]
ALERT is a large-scale benchmark to assess safety based on a novel fine-grained risk taxonomy.
It aims to identify vulnerabilities, inform improvements, and enhance the overall safety of the language models.
arXiv Detail & Related papers (2024-04-06T15:01:47Z)
- The Art of Defending: A Systematic Evaluation and Analysis of LLM Defense Strategies on Safety and Over-Defensiveness [56.174255970895466]
Large Language Models (LLMs) play an increasingly pivotal role in natural language processing applications.
This paper presents the Safety and Over-Defensiveness Evaluation (SODE) benchmark.
arXiv Detail & Related papers (2023-12-30T17:37:06Z)
- A Survey on Evaluation of Large Language Models [87.60417393701331]
Large language models (LLMs) are gaining increasing popularity in both academia and industry.
This paper focuses on three key dimensions: what to evaluate, where to evaluate, and how to evaluate.
arXiv Detail & Related papers (2023-07-06T16:28:35Z)
- Safety Assessment of Chinese Large Language Models [51.83369778259149]
Large language models (LLMs) may generate insulting and discriminatory content, reflect incorrect social values, and may be used for malicious purposes.
To promote the deployment of safe, responsible, and ethical AI, we release SafetyPrompts, which includes 100k augmented prompts and responses generated by LLMs.
arXiv Detail & Related papers (2023-04-20T16:27:35Z)