CDGP: Automatic Cloze Distractor Generation based on Pre-trained Language Model
- URL: http://arxiv.org/abs/2403.10326v1
- Date: Fri, 15 Mar 2024 14:14:26 GMT
- Title: CDGP: Automatic Cloze Distractor Generation based on Pre-trained Language Model
- Authors: Shang-Hsuan Chiang, Ssu-Cheng Wang, Yao-Chung Fan
- Abstract summary: We explore the employment of pre-trained language models (PLMs) as an alternative for candidate distractor generation.
Experiments show that the PLM-enhanced model brings a substantial performance improvement.
- Score: 2.2169618382995764
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Manually designing cloze tests consumes enormous time and effort. The major challenge lies in wrong-option (distractor) selection: carefully designed distractors improve the effectiveness of learner ability assessment. This motivates the idea of automatically generating cloze distractors. In this paper, we investigate cloze distractor generation by exploring the employment of pre-trained language models (PLMs) as an alternative for candidate distractor generation. Experiments show that the PLM-enhanced model brings a substantial performance improvement. Our best-performing model advances the state-of-the-art result from 14.94 to 34.17 (NDCG@10 score). Our code and dataset are available at https://github.com/AndyChiangSH/CDGP.
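The NDCG@10 figure reported above is the standard normalized discounted cumulative gain over the top 10 ranked distractor candidates. A minimal stdlib sketch of the metric (the `ndcg_at_k` helper is illustrative, not the paper's evaluation code):

```python
import math

def ndcg_at_k(relevances, k=10):
    """NDCG@k for a ranked list of graded relevance scores.

    `relevances[i]` is the relevance of the item ranked at position i
    (e.g. 1 if a generated distractor matches a human-authored one).
    """
    def dcg(rels):
        # Discount each relevance by log2(rank + 1), ranks starting at 1.
        return sum(rel / math.log2(i + 2) for i, rel in enumerate(rels[:k]))

    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0
```

A perfect ranking scores 1.0; pushing relevant candidates down the list lowers the score.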
Related papers
- Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision [120.40788744292739]
We propose a two-player paradigm that separates the roles of reasoning and critique models.
We first propose AutoMathCritique, an automated and scalable framework for collecting critique data.
We demonstrate that the critique models consistently improve the actor's performance on difficult queries at test-time.
arXiv Detail & Related papers (2024-11-25T17:11:54Z) - Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens [53.99177152562075]
Scaling up autoregressive models in vision has not proven as beneficial as in large language models.
We focus on two critical factors: whether models use discrete or continuous tokens, and whether tokens are generated in a random or fixed order using BERT- or GPT-like transformer architectures.
Our results show that while all models scale effectively in terms of validation loss, their evaluation performance -- measured by FID, GenEval score, and visual quality -- follows different trends.
arXiv Detail & Related papers (2024-10-17T17:59:59Z) - Just Say What You Want: Only-prompting Self-rewarding Online Preference Optimization [64.34767799614328]
Current self-rewarding approaches rely heavily on the discriminator's judgment capabilities.
We propose a novel, only-prompting self-rewarding online algorithm that generates preference datasets without relying on judgment capabilities.
arXiv Detail & Related papers (2024-09-26T04:41:08Z) - Improving Automated Distractor Generation for Math Multiple-choice Questions with Overgenerate-and-rank [44.04217284677347]
We propose a novel method to enhance the quality of generated distractors through overgenerate-and-rank.
Our ranking model increases alignment with human-authored distractors, although human-authored ones are still preferred over generated ones.
arXiv Detail & Related papers (2024-04-19T00:25:44Z) - Challenging Forgets: Unveiling the Worst-Case Forget Sets in Machine Unlearning [9.998859702421417]
Machine unlearning (MU) aims to eliminate the influence of chosen data points on model performance.
Despite various MU methods for data influence erasure, evaluations have largely focused on random data forgetting.
We propose identifying the data subset that presents the most significant challenge for influence erasure, pinpointing the worst-case forget set.
arXiv Detail & Related papers (2024-03-12T06:50:32Z) - Robust Preference Learning for Storytelling via Contrastive Reinforcement Learning [53.92465205531759]
Controlled automated story generation seeks to generate natural language stories satisfying constraints from natural language critiques or preferences.
We train a contrastive bi-encoder model to align stories with human critiques, building a general purpose preference model.
We further fine-tune the contrastive reward model using a prompt-learning technique to increase story generation robustness.
arXiv Detail & Related papers (2022-10-14T13:21:33Z) - Ensembling Off-the-shelf Models for GAN Training [55.34705213104182]
We find that pretrained computer vision models can significantly improve performance when used in an ensemble of discriminators.
We propose an effective selection mechanism, by probing the linear separability between real and fake samples in pretrained model embeddings.
Our method can improve GAN training in both limited data and large-scale settings.
arXiv Detail & Related papers (2021-12-16T18:59:50Z) - Utilizing Self-supervised Representations for MOS Prediction [51.09985767946843]
Existing evaluations usually require clean references or parallel ground truth data.
Subjective tests, on the other hand, do not need any additional clean or parallel data and correlate better with human perception.
We develop an automatic evaluation approach that correlates well with human perception while not requiring ground truth data.
arXiv Detail & Related papers (2021-04-07T09:44:36Z) - Controllable Generation from Pre-trained Language Models via Inverse Prompting [47.23315683944257]
We propose an innovative method, inverse prompting, to better control text generation.
Inverse prompting uses generated text to inversely predict the prompt during beam search.
Our results show that our proposed method substantially outperforms the baselines.
arXiv Detail & Related papers (2021-03-19T08:36:52Z) - EnD: Entangling and Disentangling deep representations for bias correction [7.219077740523682]
We propose EnD, a regularization strategy whose aim is to prevent deep models from learning unwanted biases.
In particular, we insert an "information bottleneck" at a certain point of the deep neural network, where we disentangle the information about the bias.
Experiments show that EnD effectively improves the generalization on unbiased test sets.
arXiv Detail & Related papers (2021-03-02T20:55:42Z) - Better Distractions: Transformer-based Distractor Generation and Multiple Choice Question Filtering [4.168157981135697]
We train a GPT-2 language model to generate three distractors for a given question and text context.
Next, we train a BERT language model to answer multiple choice questions (MCQs) and use this model as a filter.
arXiv Detail & Related papers (2020-10-19T15:23:24Z)
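The generate-then-filter pipeline in the last entry (a generator proposes distractors, an MCQ-answering model acts as a filter) can be sketched as follows. This is a hedged mock: `qa_model` is a hypothetical stand-in interface for a BERT MCQ answerer, not the paper's actual code.

```python
def filter_distractors(question, candidates, answer, qa_model):
    """Keep candidate distractors that do not break the question.

    qa_model(question, options) -> predicted option string; here a
    hypothetical stand-in for a trained MCQ-answering model. A candidate
    is discarded when the model prefers it over the true answer, i.e.
    the distractor is so plausible the question is no longer answerable.
    """
    kept = []
    for cand in candidates:
        predicted = qa_model(question, [answer, cand])
        if predicted == answer:
            kept.append(cand)
    return kept
```

For example, a mock model that always picks the first option keeps every candidate, while one that always picks the second rejects them all; a real filter sits between these extremes.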
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.