Fugu-MT Paper Translation (Summary): Evaluation of GPT-based large language generative AI models as study aids for the national licensure examination for registered dietitians in Japan

Paper summary: Evaluation of GPT-based large language generative AI models as study aids for the national licensure examination for registered dietitians in Japan

arxiv url: http://arxiv.org/abs/2508.10011v1
Date: Tue, 05 Aug 2025 03:33:11 GMT
Status: Translation complete
Last updated in system: 2025-08-15 22:24:48.002558
Title: Evaluation of GPT-based large language generative AI models as study aids for the national licensure examination for registered dietitians in Japan
Authors: Yuta Nagamori, Mikoto Kosai, Yuji Kawai, Haruka Marumo, Misaki Shibuya, Tatsuya Negishi, Masaki Imanishi, Yasumasa Ikeda, Koichiro Tsuchiya, Asuka Sawai, Licht Miyamoto
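Abstract summary: Generative artificial intelligence (AI) based on large language models (LLMs) has shown remarkable progress across various professional fields. This study aims to evaluate the potential of current LLM-based AI models as study aids for nutrition students. Bing-Precise and Bing-Creative generally outperformed the other models in every subject field except Nutrition Education.
Reference score (attention score calculated independently by the site): 0.0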
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Generative artificial intelligence (AI) based on large language models (LLMs), such as ChatGPT, has demonstrated remarkable progress across various professional fields, including medicine and education. However, its performance in nutritional education, especially on the Japanese national licensure examination for registered dietitians, remains underexplored. This study aimed to evaluate the potential of current LLM-based generative AI models as study aids for nutrition students. Questions from the Japanese national examination for registered dietitians were used as prompts for ChatGPT and three Bing models (Precise, Creative, Balanced), based on GPT-3.5 and GPT-4. Each question was entered in an independent session, and model responses were analyzed for accuracy, consistency, and response time. Additional prompt engineering, including role assignment, was tested to assess potential performance improvements. Bing-Precise (66.2%) and Bing-Creative (61.4%) surpassed the passing threshold (60%), while Bing-Balanced (43.3%) and ChatGPT (42.8%) did not. Bing-Precise and Bing-Creative generally outperformed the others across subject fields except Nutrition Education, where all models underperformed. None of the models consistently provided the same correct responses across repeated attempts, highlighting limitations in answer stability. ChatGPT showed greater consistency in response patterns but lower accuracy. Prompt engineering had minimal effect, except for a modest improvement when correct answers and explanations were explicitly provided. While some generative AI models marginally exceeded the passing threshold, overall accuracy and answer consistency remained suboptimal. Moreover, all the models demonstrated notable limitations in answer consistency and robustness. Further advancements are needed to ensure reliable and stable AI-based study aids for dietitian licensure preparation.
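
The abstract reports two kinds of measurement: accuracy against a 60% passing threshold, and consistency of answers across repeated attempts. The sketch below shows one way such metrics could be computed. It is not the authors' procedure: the data, the function names, and the choice to pool accuracy over all attempts are assumptions made for illustration only.

```python
# Hypothetical data, not taken from the paper.
# ANSWER_KEY maps a question id to the correct choice number.
# RESPONSES maps a question id to the choices a model returned over
# repeated independent sessions.
ANSWER_KEY = {"Q1": 3, "Q2": 1, "Q3": 5, "Q4": 2}
RESPONSES = {
    "Q1": [3, 3, 3],
    "Q2": [1, 4, 1],
    "Q3": [2, 2, 2],
    "Q4": [2, 2, 5],
}
PASSING_THRESHOLD = 0.60  # passing threshold stated in the abstract


def pooled_accuracy(responses, answer_key):
    """Share of all attempts that match the answer key (assumed definition)."""
    total = sum(len(attempts) for attempts in responses.values())
    correct = sum(
        choice == answer_key[question]
        for question, attempts in responses.items()
        for choice in attempts
    )
    return correct / total


def always_correct_rate(responses, answer_key):
    """Share of questions answered correctly on every attempt."""
    stable = sum(
        all(choice == answer_key[question] for choice in attempts)
        for question, attempts in responses.items()
    )
    return stable / len(responses)


def same_answer_rate(responses):
    """Share of questions where every attempt gave the same choice, right or wrong."""
    identical = sum(len(set(attempts)) == 1 for attempts in responses.values())
    return identical / len(responses)


def passes(accuracy, threshold=PASSING_THRESHOLD):
    """True when the accuracy meets or exceeds the passing threshold."""
    return accuracy >= threshold


print(pooled_accuracy(RESPONSES, ANSWER_KEY))
print(always_correct_rate(RESPONSES, ANSWER_KEY))
print(same_answer_rate(RESPONSES))
```

With this toy data, 7 of 12 attempts are correct, only one of four questions is answered correctly every time, and two of four questions receive an identical answer on every attempt. Keeping the last two rates separate mirrors the abstract's distinction between a model that is consistent and one that is consistently correct.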

Related papers