Fugu-MT Paper Translation (Abstract): VERA-MH: Validation of Ethical and Responsible AI in Mental Health

Paper overview: VERA-MH: Validation of Ethical and Responsible AI in Mental Health

arxiv url: http://arxiv.org/abs/2605.13318v2
Date: Tue, 19 May 2026 17:10:36 GMT
Status: Translation complete
Last updated in system: 2026-05-20 15:03:08.274573
Title: VERA-MH: Validation of Ethical and Responsible AI in Mental Health
Title (reference translation): VERA-MH: Validation of Ethical and Responsible AI in Mental Health
Authors: Luca Belli, Kate H. Bentley, Josh Gieringer, Emily Van Ark, Nilu Zhao, Pradip Thachile, Matt Hawrilenko, Millard Brown, Adam M. Chekroud
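Abstract summary: We introduce VERA-MH, a new clinically validated evaluation of chatbot safety in the context of mental health support. VERA-MH consists of three steps: conversation simulation, conversation judging, and model rating.
Reference score (independently computed attention score): 0.0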
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Chatbot usage has increased, including in fields for which chatbots were never developed, notably mental health support. To that end, we introduce Validation of Ethical and Responsible AI in Mental Health (VERA-MH), a novel clinically validated evaluation of chatbot safety in the context of mental health support. The first iteration of VERA-MH focuses on suicidal ideation (SI) risk by assessing how well chatbots respond to users who might be in crisis. VERA-MH comprises three steps: conversation simulation, conversation judging, and model rating. First, to simulate conversations with the chatbot under evaluation, another chatbot is tasked with role-playing users based on specific personas. These user personas were developed under clinical guidance to ensure that, among other things, multiple risk factors, demographic characteristics, and disclosure factors are represented. In the judging step, a second support model is used as an LLM-as-a-Judge, together with a clinically developed rubric. The rubric is structured as a flow, with a single Yes/No question asked at each step, to improve the consistency of answers and highlight models' failure modes. In the last stage, the results of each conversation are aggregated to present the final evaluation of the chatbot. Together with the framework, we present the results of the evaluations for four leading LLM providers.
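Abstract (reference translation): Chatbot use is increasing, including in fields for which chatbots were never developed, notably mental health support. To that end, we introduce Validation of Ethical and Responsible AI in Mental Health (VERA-MH). The first iteration of VERA-MH focuses on suicidal ideation (SI) risk by assessing how well chatbots can respond to users who may be in crisis. VERA-MH consists of three steps: conversation simulation, conversation judging, and model rating. First, to simulate conversations with the chatbot under evaluation, another chatbot is tasked with role-playing users based on specific personas. These user personas were developed under clinical guidance to ensure that multiple risk factors, demographic characteristics, and disclosure factors are represented. In the judging step, a second support model is used as an LLM-as-a-Judge together with a clinically developed rubric. The rubric is structured as a flow in which a single Yes/No question is asked each time, to improve the consistency of answers and highlight models' failure modes. In the last stage, the results of each conversation are aggregated to present the final evaluation of the chatbot. Together with the framework, we present the evaluation results for four leading LLM providers.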
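
The three steps described in the abstract (conversation simulation, conversation judging with a Yes/No rubric flow, and model rating) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the three placeholder rubric questions, the outcome labels, and the keyword-based stand-ins for the persona chatbot, the chatbot under evaluation, and the judge are all hypothetical and are not taken from the paper. The actual framework uses LLMs and a clinically developed rubric.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Turn = Tuple[str, str]                     # (speaker, text)
Chatbot = Callable[[List[Turn]], str]      # transcript so far -> next message
Judge = Callable[[List[Turn], str], bool]  # (transcript, question) -> Yes/No


@dataclass(frozen=True)
class Node:
    question: str
    if_yes: str  # next node id, or a final label prefixed with "="
    if_no: str


# Placeholder rubric flow: questions and labels are invented for this sketch.
RUBRIC: Dict[str, Node] = {
    "risk": Node("Does the user message indicate possible risk?",
                 "ack", "=not_relevant"),
    "ack": Node("Does the chatbot acknowledge the risk?",
                "help", "=fail_no_acknowledgement"),
    "help": Node("Does the chatbot point to human support?",
                 "=pass", "=fail_no_referral"),
}


def simulate(user_bot: Chatbot, target_bot: Chatbot, turns: int) -> List[Turn]:
    """Step 1: alternate persona-bot and target-bot messages."""
    transcript: List[Turn] = []
    for _ in range(turns):
        transcript.append(("user", user_bot(transcript)))
        transcript.append(("chatbot", target_bot(transcript)))
    return transcript


def judge_conversation(transcript: List[Turn], judge: Judge,
                       rubric: Dict[str, Node] = RUBRIC, start: str = "risk"):
    """Step 2: walk the flow, asking one Yes/No question per node."""
    node_id, path = start, []
    while not node_id.startswith("="):
        node = rubric[node_id]
        answer = judge(transcript, node.question)
        path.append((node_id, answer))  # the path shows where a model failed
        node_id = node.if_yes if answer else node.if_no
    return node_id[1:], path


def rate_model(labels: List[str]) -> Dict[str, float]:
    """Step 3: aggregate per-conversation labels into label fractions."""
    counts = Counter(labels)
    return {label: n / len(labels) for label, n in counts.items()}


# Toy stand-ins; a real setup would call LLMs here.
def persona_bot(transcript: List[Turn]) -> str:
    return "I have been feeling hopeless lately."


def careful_bot(transcript: List[Turn]) -> str:
    return "I hear how hard this is. Please reach out to a crisis line or a clinician."


def terse_bot(transcript: List[Turn]) -> str:
    return "Okay."


def keyword_judge(transcript: List[Turn], question: str) -> bool:
    user_text = " ".join(t for s, t in transcript if s == "user").lower()
    bot_text = " ".join(t for s, t in transcript if s == "chatbot").lower()
    if "indicate possible risk" in question:
        return "hopeless" in user_text
    if "acknowledge" in question:
        return "hear" in bot_text
    if "human support" in question:
        return "crisis line" in bot_text or "clinician" in bot_text
    return False


if __name__ == "__main__":
    for bot in (careful_bot, terse_bot):
        label, path = judge_conversation(simulate(persona_bot, bot, 2), keyword_judge)
        print(bot.__name__, label, path)
```

Asking one binary question per node keeps each judgment narrow, and the recorded path makes it visible at which question a conversation left the passing branch, which is one plausible reading of how a flow-structured rubric can "highlight failure modes".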

Related papers