Fugu-MT paper translation (abstract): Automated Educational Question Generation at Different Bloom's Skill Levels using Large Language Models: Strategies and Evaluation

Paper abstract: Automated Educational Question Generation at Different Bloom's Skill Levels using Large Language Models: Strategies and Evaluation

arxiv url: http://arxiv.org/abs/2408.04394v1
Date: Thu, 8 Aug 2024 11:56:57 GMT
Status: Translation complete
Last updated in system: 2024-08-09 15:48:23.171925
Title: Automated Educational Question Generation at Different Bloom's Skill Levels using Large Language Models: Strategies and Evaluation
Authors: Nicy Scaria, Suma Dharani Chenna, Deepak Subramani
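Abstract summary: We examine the ability of five state-of-the-art large language models to generate diverse, high-quality questions at different cognitive levels. Our findings suggest that LLMs can generate relevant, high-quality educational questions of different cognitive levels when given adequate information.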
Reference score (independently computed attention score): 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Developing questions that are pedagogically sound, relevant, and promote learning is a challenging and time-consuming task for educators. Modern-day large language models (LLMs) generate high-quality content across multiple domains, potentially helping educators to develop high-quality questions. Automated educational question generation (AEQG) is important in scaling online education catering to a diverse student population. Past attempts at AEQG have shown limited abilities to generate questions at higher cognitive levels. In this study, we examine the ability of five state-of-the-art LLMs of different sizes to generate diverse and high-quality questions of different cognitive levels, as defined by Bloom's taxonomy. We use advanced prompting techniques with varying complexity for AEQG. We conducted expert and LLM-based evaluations to assess the linguistic and pedagogical relevance and quality of the questions. Our findings suggest that LLMs can generate relevant and high-quality educational questions of different cognitive levels when prompted with adequate information, although there is a significant variance in the performance of the five LLMs considered. We also show that automated evaluation is not on par with human evaluation.
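The abstract does not reproduce the prompts themselves. As a minimal sketch of what prompting for a specific Bloom's level could look like, the Python below assumes the six levels of the revised Bloom's taxonomy and a hypothetical `build_prompt` helper. The optional level definition is only a simple stand-in for "prompting with varying complexity"; the names, wording, and template are illustrative assumptions, not the authors' prompts.

```python
# Hypothetical sketch: these are NOT the prompts used in the paper.
# Level names follow the revised Bloom's taxonomy; the short descriptions
# are common paraphrases, assumed here for illustration only.
BLOOM_LEVELS = {
    "Remember": "recall facts and basic concepts",
    "Understand": "explain ideas or concepts",
    "Apply": "use information in new situations",
    "Analyze": "draw connections among ideas",
    "Evaluate": "justify a stand or decision",
    "Create": "produce new or original work",
}


def build_prompt(topic: str, level: str, include_definition: bool = True) -> str:
    """Return a question-generation prompt targeting one Bloom's level."""
    if level not in BLOOM_LEVELS:
        raise ValueError(f"unknown Bloom's level: {level!r}")
    lines = [
        f"Write one exam question on the topic '{topic}'.",
        f"Target Bloom's taxonomy level: {level}.",
    ]
    # A richer prompt variant adds the definition of the target level.
    if include_definition:
        lines.append(f"At this level, students {BLOOM_LEVELS[level]}.")
    return "\n".join(lines)


if __name__ == "__main__":
    # Print one prompt per level for an example topic.
    for level in BLOOM_LEVELS:
        print(build_prompt("gradient descent", level))
        print()
```

The resulting string would be sent to whichever LLM is under study; comparing outputs with and without `include_definition` is one simple way to probe how much context a model needs.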

Related papers

Benchmarking the Pedagogical Knowledge of Large Language Models [4.417539128489408]
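This paper introduces The Pedagogy Benchmark, a new dataset for evaluating large language models on their pedagogical knowledge. The benchmarks are built on carefully curated questions drawn from professional development exams for teachers. Results are reported for 97 models, with accuracies on the pedagogical knowledge questions ranging from 28% to 89%.
Paper reference translation (metadata) (2025-06-23T14:49:01Z)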
Enhancing the Learning Experience: Using Vision-Language Models to Generate Questions for Educational Videos [6.689443785478135]
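We investigate the utility of vision-language models for generating learning-oriented questions for educational videos. The study outlines the effectiveness of current vision-language models and highlights the need for fine-tuning and for resolving open challenges.
Paper reference translation (metadata) (2025-05-03T11:37:31Z)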
EducationQ: Evaluating LLMs' Teaching Capabilities Through Multi-Agent Dialogue Framework [9.76455227840645]
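Large language models (LLMs) increasingly serve as educational tools, but evaluating their teaching capabilities remains difficult. We introduce EducationQ, a multi-agent dialogue framework that assesses teaching capabilities effectively by simulating dynamic scenarios.
Paper reference translation (metadata) (2025-04-21T07:48:20Z)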
YouLeQD: Decoding the Cognitive Complexity of Questions and Engagement in Online Educational Videos from Learners' Perspectives [1.2084539012992408]
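The YouLeQD dataset contains learner-posed questions drawn from comments on YouTube lecture videos. We developed two RoBERTa-based classification models to detect questions and analyze their cognitive complexity.
Paper reference translation (metadata) (2025-01-20T19:54:38Z)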
The Future of Learning in the Age of Generative AI: Automated Question Generation and Assessment with Large Language Models [0.0]
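Large language models (LLMs) and generative AI have revolutionized natural language processing (NLP). This chapter explores the transformative potential of LLMs in automated question generation and answer assessment.
Paper reference translation (metadata) (2024-10-12T15:54:53Z)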
Research on the Application of Large Language Models in Automatic Question Generation: A Case Study of ChatGLM in the Context of High School Information Technology Curriculum [3.0753648264454547]
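The model is guided to generate diverse questions, which are then comprehensively evaluated by domain experts. The results suggest that ChatGLM outperforms human-written questions in clarity and in teachers' willingness to use them.
Paper reference translation (metadata) (2024-08-21T11:38:32Z)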
Dr.Academy: A Benchmark for Evaluating Questioning Capability in Education for Large Language Models [30.759154473275043]
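This study introduces a benchmark for evaluating the questioning capability of large language models (LLMs) acting as teachers in education. Four metrics (relevance, coverage, representativeness, and consistency) are applied to evaluate the educational quality of LLM outputs. The results suggest that GPT-4 shows significant potential in general, humanities, and science education.
Paper reference translation (metadata) (2024-08-20T15:36:30Z)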
Knowledge Tagging System on Math Questions via LLMs with Flexible Demonstration Retriever [48.5585921817745]
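Large language models (LLMs) are used to automate the knowledge tagging task. We show strong zero-shot and few-shot performance on knowledge tagging for math questions. With the proposed reinforcement-learning-based demonstration retriever, the potential of LLMs of different sizes can be exploited.
Paper reference translation (metadata) (2024-06-19T23:30:01Z)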
LOVA3: Learning to Visual Question Answering, Asking and Assessment [61.51687164769517]
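Answering questions, asking questions, and assessment are three essential human traits for understanding the world and acquiring knowledge. Current multimodal large language models (MLLMs) focus mainly on question answering and often neglect the potential of questioning and assessment skills. LOVA3 is an innovative framework named "Learning tO Visual Question Answering, Asking and Assessment".
Paper reference translation (metadata) (2024-05-23T18:21:59Z)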
Adapting Large Language Models for Education: Foundational Capabilities, Potentials, and Challenges [60.62904929065257]
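Large language models (LLMs) offer the possibility of solving this problem by interpreting individual requests. This paper reviews recent LLM research on education-related capabilities, including mathematics, writing, programming, reasoning, and knowledge-based question answering.
Paper reference translation (metadata) (2023-12-27T14:37:32Z)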
Automating question generation from educational text [1.9325905076281444]
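The use of question-based activities (QBAs) is widespread in education and forms an integral part of the learning and assessment process. We design and evaluate an automated question generation tool for formative and summative assessment in schools.
Paper reference translation (metadata) (2023-09-26T15:18:44Z)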
UKP-SQuARE: An Interactive Tool for Teaching Question Answering [61.93372227117229]
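The exponential growth of question answering (QA) has made it an essential topic in any natural language processing (NLP) course. This paper introduces UKP-SQuARE as a platform for QA education. Students can run, compare, and analyze various QA models from different perspectives.
Paper reference translation (metadata) (2023-05-31T11:29:04Z)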
Do Large Language Models Know What They Don't Know? [74.65014158544011]
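Large language models (LLMs) have a wealth of knowledge that lets them excel at various natural language processing (NLP) tasks. Despite this vast knowledge, LLMs are limited by the amount of information they can accommodate and comprehend. This study aims to evaluate the self-knowledge ability of LLMs.
Paper reference translation (metadata) (2023-05-29T15:30:13Z)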
Neural Multi-Task Learning for Teacher Question Detection in Online Classrooms [50.19997675066203]
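We build an end-to-end neural network framework that automatically detects questions in teachers' audio recordings. Incorporating multi-task learning makes it possible to deepen the understanding of semantic relations among different types of questions.
Paper reference translation (metadata) (2020-05-16T02:17:04Z)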
R2DE: a NLP approach to estimating IRT parameters of newly generated questions [3.364554138758565]
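R2DE is a model that can assess newly generated multiple-choice questions by looking at the text of the question. In particular, it can estimate the difficulty and the discrimination of each question.
Paper reference translation (metadata) (2020-01-21T14:31:01Z)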

The related papers list is generated automatically from the titles and abstracts of papers on this site.

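This is information about the specified paper.
The operator of this site does not guarantee the quality of this site (including all information and translations) and assumes no responsibility for any results arising from the use of this site (including all information and translations).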