Safety Assessment of Chinese Large Language Models
- URL: http://arxiv.org/abs/2304.10436v1
- Date: Thu, 20 Apr 2023 16:27:35 GMT
- Title: Safety Assessment of Chinese Large Language Models
- Authors: Hao Sun, Zhexin Zhang, Jiawen Deng, Jiale Cheng, Minlie Huang
- Abstract summary: Large language models (LLMs) may generate insulting and discriminatory content, reflect incorrect social values, and may be used for malicious purposes.
To promote the deployment of safe, responsible, and ethical AI, we release SafetyPrompts including 100k augmented prompts and responses by LLMs.
- Score: 51.83369778259149
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the rapid popularity of large language models such as ChatGPT and GPT-4,
a growing amount of attention is paid to their safety concerns. These models
may generate insulting and discriminatory content, reflect incorrect social
values, and may be used for malicious purposes such as fraud and dissemination
of misleading information. Evaluating and enhancing their safety is
particularly essential for the wide application of large language models
(LLMs). To further promote the safe deployment of LLMs, we develop a Chinese
LLM safety assessment benchmark. Our benchmark explores the comprehensive
safety performance of LLMs from two perspectives: 8 kinds of typical safety
scenarios and 6 types of more challenging instruction attacks. Our benchmark is
based on a straightforward process in which it provides the test prompts and
evaluates the safety of the generated responses from the evaluated model. In
evaluation, we utilize the LLM's strong evaluation ability and develop it as a
safety evaluator by prompting. On top of this benchmark, we conduct safety
assessments and analyze 15 LLMs including the OpenAI GPT series and other
well-known Chinese LLMs, where we observe some interesting findings. For
example, we find that instruction attacks are more likely to expose safety
issues of all LLMs. Moreover, to promote the development and deployment of
safe, responsible, and ethical AI, we publicly release SafetyPrompts including
100k augmented prompts and responses by LLMs.
Related papers
Err
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.