CDGP: Automatic Cloze Distractor Generation based on Pre-trained Language Model
- URL: http://arxiv.org/abs/2403.10326v1
- Date: Fri, 15 Mar 2024 14:14:26 GMT
- Title: CDGP: Automatic Cloze Distractor Generation based on Pre-trained Language Model
- Authors: Shang-Hsuan Chiang, Ssu-Cheng Wang, Yao-Chung Fan
- Abstract summary: We explore the employment of pre-trained language models (PLMs) as an alternative for candidate distractor generation.
Experiments show that the PLM-enhanced model brings a substantial performance improvement.
- Score: 2.2169618382995764
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Manually designing cloze tests consumes enormous time and effort. The major challenge lies in wrong-option (distractor) selection: carefully designed distractors improve the effectiveness of learner ability assessment. This motivates the idea of automatically generating cloze distractors. In this paper, we investigate cloze distractor generation by exploring the employment of pre-trained language models (PLMs) as an alternative for candidate distractor generation. Experiments show that the PLM-enhanced model brings a substantial performance improvement. Our best-performing model advances the state-of-the-art result from 14.94 to 34.17 (NDCG@10 score). Our code and dataset are available at https://github.com/AndyChiangSH/CDGP.
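The NDCG@10 figure reported above is the standard normalized discounted cumulative gain over the top 10 ranked distractor candidates. A minimal stdlib sketch of the metric (the `ndcg_at_k` helper is illustrative, not the paper's evaluation code):

```python
import math

def ndcg_at_k(relevances, k=10):
    """NDCG@k for a ranked list of graded relevance scores.

    `relevances[i]` is the relevance of the item ranked at position i
    (e.g. 1 if a generated distractor matches a human-authored one).
    """
    def dcg(rels):
        # Discount each relevance by log2(rank + 1), ranks starting at 1.
        return sum(rel / math.log2(i + 2) for i, rel in enumerate(rels[:k]))

    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0
```

A perfect ranking scores 1.0; pushing relevant candidates down the list lowers the score.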
Related papers
- Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision [120.40788744292739]
We propose a two-player paradigm that separates the roles of reasoning and critique models.
We first propose AutoMathCritique, an automated and scalable framework for collecting critique data.
We demonstrate that the critique models consistently improve the actor's performance on difficult queries at test-time.
arXiv Detail & Related papers (2024-11-25T17:11:54Z) - Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens [53.99177152562075]
Scaling up autoregressive models in vision has not proven as beneficial as in large language models.
We focus on two critical factors: whether models use discrete or continuous tokens, and whether tokens are generated in a random or fixed order using BERT- or GPT-like transformer architectures.
Our results show that while all models scale effectively in terms of validation loss, their evaluation performance -- measured by FID, GenEval score, and visual quality -- follows different trends.
arXiv Detail & Related papers (2024-10-17T17:59:59Z) - Just Say What You Want: Only-prompting Self-rewarding Online Preference Optimization [64.34767799614328]
Current self-rewarding approaches rely heavily on the discriminator's judgment capabilities.
We propose a novel, only-prompting self-rewarding online algorithm that generates preference datasets without relying on judgment capabilities.
arXiv Detail & Related papers (2024-09-26T04:41:08Z) - Improving Automated Distractor Generation for Math Multiple-choice Questions with Overgenerate-and-rank [44.04217284677347]
We propose a novel method to enhance the quality of generated distractors through overgenerate-and-rank.
Our ranking model increases alignment with human-authored distractors, although human-authored ones are still preferred over generated ones.
arXiv Detail & Related papers (2024-04-19T00:25:44Z) - Challenging Forgets: Unveiling the Worst-Case Forget Sets in Machine Unlearning [9.998859702421417]
Machine unlearning (MU) aims to eliminate the influence of chosen data points on model performance.
Despite various MU methods for data influence erasure, evaluations have largely focused on random data forgetting.
We propose identifying the data subset that presents the most significant challenge for influence erasure, pinpointing the worst-case forget set.
arXiv Detail & Related papers (2024-03-12T06:50:32Z) - Robust Preference Learning for Storytelling via Contrastive Reinforcement Learning [53.92465205531759]
Controlled automated story generation seeks to generate natural language stories satisfying constraints from natural language critiques or preferences.
We train a contrastive bi-encoder model to align stories with human critiques, building a general purpose preference model.
We further fine-tune the contrastive reward model using a prompt-learning technique to increase story generation robustness.
arXiv Detail & Related papers (2022-10-14T13:21:33Z) - Ensembling Off-the-shelf Models for GAN Training [55.34705213104182]
We find that pretrained computer vision models can significantly improve performance when used in an ensemble of discriminators.
We propose an effective selection mechanism, by probing the linear separability between real and fake samples in pretrained model embeddings.
Our method can improve GAN training in both limited data and large-scale settings.
arXiv Detail & Related papers (2021-12-16T18:59:50Z) - Utilizing Self-supervised Representations for MOS Prediction [51.09985767946843]
Existing evaluations usually require clean references or parallel ground truth data.
Subjective tests, on the other hand, do not need any additional clean or parallel data and correlate better with human perception.
We develop an automatic evaluation approach that correlates well with human perception while not requiring ground truth data.
arXiv Detail & Related papers (2021-04-07T09:44:36Z) - Controllable Generation from Pre-trained Language Models via Inverse Prompting [47.23315683944257]
We propose an innovative method, inverse prompting, to better control text generation.
Inverse prompting uses generated text to inversely predict the prompt during beam search.
Our results show that our proposed method substantially outperforms the baselines.
arXiv Detail & Related papers (2021-03-19T08:36:52Z) - EnD: Entangling and Disentangling deep representations for bias correction [7.219077740523682]
We propose EnD, a regularization strategy whose aim is to prevent deep models from learning unwanted biases.
In particular, we insert an "information bottleneck" at a certain point of the deep neural network, where we disentangle the information about the bias.
Experiments show that EnD effectively improves the generalization on unbiased test sets.
arXiv Detail & Related papers (2021-03-02T20:55:42Z) - Better Distractions: Transformer-based Distractor Generation and Multiple Choice Question Filtering [4.168157981135697]
We train a GPT-2 language model to generate three distractors for a given question and text context.
Next, we train a BERT language model to answer multiple choice questions (MCQs) and use this model as a filter.
arXiv Detail & Related papers (2020-10-19T15:23:24Z)
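The generate-then-filter pipeline in the last entry (a generator proposes distractors, an MCQ-answering model acts as a filter) can be sketched as follows. This is a hedged mock: `qa_model` is a hypothetical stand-in interface for a BERT MCQ answerer, not the paper's actual code.

```python
def filter_distractors(question, candidates, answer, qa_model):
    """Keep candidate distractors that do not break the question.

    qa_model(question, options) -> predicted option string; here a
    hypothetical stand-in for a trained MCQ-answering model. A candidate
    is discarded when the model prefers it over the true answer, i.e.
    the distractor is so plausible the question is no longer answerable.
    """
    kept = []
    for cand in candidates:
        predicted = qa_model(question, [answer, cand])
        if predicted == answer:
            kept.append(cand)
    return kept
```

For example, a mock model that always picks the first option keeps every candidate, while one that always picks the second rejects them all; a real filter sits between these extremes.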
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.