Automated Generation of Multiple-Choice Cloze Questions for Assessing
English Vocabulary Using GPT-turbo 3.5
- URL: http://arxiv.org/abs/2403.02078v1
- Date: Mon, 4 Mar 2024 14:24:47 GMT
- Authors: Qiao Wang, Ralph Rose, Naho Orita, Ayaka Sugawara
- Abstract summary: We evaluate a new method for automatically generating multiple-choice cloze questions using large language models (LLMs).
The VocaTT engine is written in Python and comprises three basic steps: pre-processing target word lists, generating sentences and candidate word options, and finally selecting suitable word options.
Results showed a 75% rate of well-formedness for sentences and 66.85% rate for suitable word options.
- Score: 5.525336037820985
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A common way of assessing language learners' mastery of vocabulary is via
multiple-choice cloze (i.e., fill-in-the-blank) questions. But the creation of
test items can be laborious for individual teachers or in large-scale language
programs. In this paper, we evaluate a new method for automatically generating
these types of questions using large language models (LLM). The VocaTT
(vocabulary teaching and training) engine is written in Python and comprises
three basic steps: pre-processing target word lists, generating sentences and
candidate word options using GPT, and finally selecting suitable word options.
To test the efficiency of this system, 60 questions were generated targeting
academic words. The generated items were reviewed by expert reviewers who
judged the well-formedness of the sentences and word options, adding comments
to items judged not well-formed. Results showed a 75% rate of well-formedness
for sentences and 66.85% rate for suitable word options. This is a marked
improvement over the generator used earlier in our research which did not take
advantage of GPT's capabilities. Post-hoc qualitative analysis reveals several
points for improvement in future work including cross-referencing
part-of-speech tagging, better sentence validation, and improving GPT prompts.
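The three steps named in the abstract (pre-processing the word list, generating sentences and candidate options, selecting suitable options) can be illustrated with a minimal sketch. This is not the VocaTT implementation: the function names, the filtering rules, and the `stub_generate` stand-in for the GPT call are all hypothetical, chosen only to show how such a pipeline might be wired together.

```python
import re
from typing import Callable, Dict, List, Optional

# Hypothetical interface: maps a target word to a sentence containing it
# plus a list of candidate distractors. An LLM call would be plugged in here.
Generator = Callable[[str], Dict[str, object]]


def preprocess(words: List[str]) -> List[str]:
    """Step 1: lowercase, trim, and de-duplicate the target word list."""
    seen = set()
    cleaned = []
    for word in words:
        word = word.strip().lower()
        if word and word not in seen:
            seen.add(word)
            cleaned.append(word)
    return cleaned


def select_options(target: str, candidates: List[str], k: int = 3) -> List[str]:
    """Step 3: keep the first k distinct candidates that differ from the target."""
    chosen: List[str] = []
    for cand in candidates:
        cand = cand.strip().lower()
        if cand and cand != target and cand not in chosen:
            chosen.append(cand)
    return chosen[:k]


def make_item(target: str, generate: Generator, k: int = 3) -> Optional[dict]:
    """Step 2 plus assembly: generate a sentence, blank the target, add options."""
    output = generate(target)
    sentence = str(output["sentence"])
    pattern = re.compile(rf"\b{re.escape(target)}\b", re.IGNORECASE)
    if not pattern.search(sentence):
        return None  # assumed validation: the sentence must contain the target
    distractors = select_options(target, list(output["candidates"]), k)
    if len(distractors) < k:
        return None  # not enough usable distractors
    return {
        "stem": pattern.sub("_____", sentence, count=1),
        "options": sorted(distractors + [target]),
        "answer": target,
    }


def stub_generate(target: str) -> Dict[str, object]:
    """Stand-in for the LLM step; returns fixed, made-up output."""
    return {
        "sentence": f"The committee will {target} the proposal next week.",
        "candidates": ["evaluate", "ignore", "Ignore", "paint", "borrow"],
    }


if __name__ == "__main__":
    for word in preprocess([" Evaluate ", "evaluate", ""]):
        print(make_item(word, stub_generate))
```

Passing the generator in as a callable keeps the sketch runnable without network access; a real system would replace `stub_generate` with a model call and would likely need stricter checks, such as the part-of-speech cross-referencing and sentence validation the abstract lists as future work.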
Related papers
- Answer Candidate Type Selection: Text-to-Text Language Model for Closed
Book Question Answering Meets Knowledge Graphs [62.20354845651949]
We present a novel approach which works on top of the pre-trained Text-to-Text QA system to address this issue.
Our simple yet effective method performs filtering and re-ranking of generated candidates based on their types derived from Wikidata "instance_of" property.
arXiv Detail & Related papers (2023-10-10T20:49:43Z)
- ChatPRCS: A Personalized Support System for English Reading
Comprehension based on ChatGPT [3.847982502219679]
This paper presents a novel personalized support system for reading comprehension, referred to as ChatPRCS.
ChatPRCS employs methods including reading comprehension proficiency prediction, question generation, and automatic evaluation.
arXiv Detail & Related papers (2023-09-22T11:46:44Z)
- Can Language Models Learn to Listen? [96.01685069483025]
We present a framework for generating appropriate facial responses from a listener in dyadic social interactions based on the speaker's words.
Our approach autoregressively predicts a response of a listener: a sequence of listener facial gestures, quantized using a VQ-VAE.
We show that our generated listener motion is fluent and reflective of language semantics through quantitative metrics and a qualitative user study.
arXiv Detail & Related papers (2023-08-21T17:59:02Z)
- Controllable Emphasis with zero data for text-to-speech [57.12383531339368]
A simple but effective method to achieve emphasized speech consists in increasing the predicted duration of the emphasized word.
We show that this is significantly better than spectrogram modification techniques, improving naturalness by 7.3% and testers' correct identification of the emphasized word in a sentence by 40% on a reference female en-US voice.
arXiv Detail & Related papers (2023-07-13T21:06:23Z)
- Automatic Generation of Multiple-Choice Questions [7.310488568715925]
We present two methods to tackle the challenge of question-answer pair (QAP) generation.
The first is a deep-learning-based end-to-end question generation system based on the T5 Transformer with preprocessing and postprocessing pipelines.
The second is a sequence-learning-based scheme that generates adequate QAPs via meta-sequence representations of sentences.
arXiv Detail & Related papers (2023-03-25T22:45:54Z)
- TEMPERA: Test-Time Prompting via Reinforcement Learning [57.48657629588436]
We propose Test-time Prompt Editing using Reinforcement learning (TEMPERA).
In contrast to prior prompt generation methods, TEMPERA can efficiently leverage prior knowledge.
Our method achieves 5.33x on average improvement in sample efficiency when compared to the traditional fine-tuning methods.
arXiv Detail & Related papers (2022-11-21T22:38:20Z)
- Vector Representations of Idioms in Conversational Systems [1.6507910904669727]
We utilize the Potential Idiomatic Expression (PIE)-English idioms corpus for the two tasks that we investigate.
We achieve a state-of-the-art (SoTA) result of 98% macro F1 score on the classification task by using the SoTA T5 model.
The results show that the model trained on the idiom corpus generates more fitting responses to prompts containing idioms 71.9% of the time.
arXiv Detail & Related papers (2022-05-07T14:50:05Z)
- Few-Shot Bot: Prompt-Based Learning for Dialogue Systems [58.27337673451943]
Learning to converse using only a few examples is a great challenge in conversational AI.
The current best conversational models are either good chit-chatters (e.g., BlenderBot) or goal-oriented systems (e.g., MinTL).
We propose prompt-based few-shot learning which does not require gradient-based fine-tuning but instead uses a few examples as the only source of learning.
arXiv Detail & Related papers (2021-10-15T14:36:45Z)
- Speaker-Conditioned Hierarchical Modeling for Automated Speech Scoring [60.55025339250815]
We propose a novel deep learning technique for non-native ASS, called speaker-conditioned hierarchical modeling.
In our technique, we take advantage of the fact that oral proficiency tests rate multiple responses for a candidate. We extract context from these responses and feed them as additional speaker-specific context to our network to score a particular response.
arXiv Detail & Related papers (2021-08-30T07:00:28Z)
- An Automated Multiple-Choice Question Generation Using Natural Language
Processing Techniques [0.913755431537592]
We present an NLP-based system for automatic multiple-choice question generation (MCQG) for Computer-Based Testing Examination (CBTE).
We used NLP techniques to extract keywords, i.e., the important words in a given lesson material.
To validate that the system is not perverse, five lesson materials were used to check the effectiveness and efficiency of the system.
arXiv Detail & Related papers (2021-03-26T22:39:59Z)
- Automated Utterance Generation [5.220940151628735]
Using relevant utterances as features in question-answering has been shown to improve both the precision and recall for retrieving the right answer by a conversational assistant.
We propose an utterance generation system which 1) uses extractive summarization to extract important sentences from the description, 2) uses multiple paraphrasing techniques to generate a diverse set of paraphrases of the title and summary sentences, and 3) selects good candidate paraphrases with the help of a novel candidate selection algorithm.
arXiv Detail & Related papers (2020-04-07T15:35:54Z)