Amortised Design Optimization for Item Response Theory
- URL: http://arxiv.org/abs/2307.09891v1
- Date: Wed, 19 Jul 2023 10:42:56 GMT
- Title: Amortised Design Optimization for Item Response Theory
- Authors: Antti Keurulainen, Isak Westerlund, Oskar Keurulainen, Andrew Howes
- Abstract summary: In education, Item Response Theory (IRT) is used to infer student abilities and characteristics of test items from student responses.
In response, we propose incorporating amortised experimental design into IRT.
The computational cost is shifted to a precomputing phase by training a Deep Reinforcement Learning (DRL) agent with synthetic data.
- Score: 5.076871870091048
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Item Response Theory (IRT) is a well-known method for assessing responses
from humans in education and psychology. In education, IRT is used to infer
student abilities and characteristics of test items from student responses.
Interactions with students are expensive, calling for methods that efficiently
gather information for inferring student abilities. Methods based on Optimal
Experimental Design (OED) are computationally costly, making them inapplicable
for interactive applications. In response, we propose incorporating amortised
experimental design into IRT. Here, the computational cost is shifted to a
precomputing phase by training a Deep Reinforcement Learning (DRL) agent with
synthetic data. The agent is trained to select optimally informative test items
for the distribution of students, and to conduct amortised inference
conditioned on the experiment outcomes. During deployment, the agent estimates
parameters from data and suggests the next test item for the student in close
to real time, taking into account the history of experiments and outcomes.
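The abstract is easier to follow with the model written out. Below is a minimal, illustrative sketch (not the authors' code) of the two ingredients being amortised: the two-parameter logistic (2PL) IRT response model, and a greedy Optimal Experimental Design step that always asks the item carrying the most Fisher information at the current ability estimate. The item bank, the prior, and the grid posterior are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical item bank: discrimination a_j and difficulty b_j per item.
N_ITEMS = 50
a = rng.uniform(0.5, 2.0, N_ITEMS)   # discriminations (assumed range)
b = rng.normal(0.0, 1.0, N_ITEMS)    # difficulties (assumed distribution)

def p_correct(theta, a, b):
    """2PL IRT: probability of a correct response given ability theta."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def fisher_info(theta, a, b):
    """Fisher information an item carries about theta at the current estimate."""
    p = p_correct(theta, a, b)
    return a**2 * p * (1.0 - p)

def posterior_mean(asked, answers, grid=np.linspace(-4, 4, 161)):
    """Grid approximation of E[theta | data] under a N(0, 1) prior."""
    log_post = -0.5 * grid**2
    for j, y in zip(asked, answers):
        p = p_correct(grid, a[j], b[j])
        log_post += np.where(y, np.log(p), np.log(1.0 - p))
    w = np.exp(log_post - log_post.max())
    w /= w.sum()
    return float(w @ grid)

# One adaptive test for a synthetic student with true ability theta_true.
theta_true = rng.normal()
asked, answers = [], []
theta_hat = 0.0                      # prior mean as the initial estimate
for step in range(10):
    info = fisher_info(theta_hat, a, b)
    info[asked] = -np.inf            # never repeat an item
    j = int(np.argmax(info))         # greedy OED: most informative item
    y = rng.random() < p_correct(theta_true, a[j], b[j])
    asked.append(j); answers.append(y)
    theta_hat = posterior_mean(asked, answers)
    print(f"step {step}: item {j}, correct={y}, theta_hat={theta_hat:+.2f}")
print(f"true theta = {theta_true:+.2f}")
```

For richer models and larger item banks, the per-step information search and posterior update above are exactly the computations that make classical OED too slow for interactive use; the paper shifts that cost to a precomputing phase by training a DRL agent on synthetic students, so that deployment reduces to a fast forward pass.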
Related papers
- Implicit assessment of language learning during practice as accurate as explicit testing [0.5749787074942512]
We use Item Response Theory (IRT) in computer-aided language learning for assessment of student ability in two contexts.
We first aim to replace exhaustive tests with efficient but accurate adaptive tests.
Second, we explore whether we can accurately estimate learner ability directly from the context of practice with exercises, without testing.
arXiv Detail & Related papers (2024-09-24T14:40:44Z)
- Switchable Decision: Dynamic Neural Generation Networks [98.61113699324429]
We propose a switchable decision to accelerate inference by dynamically assigning resources for each data instance.
Our method benefits from less cost during inference while keeping the same accuracy.
arXiv Detail & Related papers (2024-05-07T17:44:54Z)
- An Experimental Design Framework for Label-Efficient Supervised Finetuning of Large Language Models [55.01592097059969]
Supervised finetuning on instruction datasets has played a crucial role in achieving the remarkable zero-shot generalization capabilities of large language models.
Active learning is effective in identifying useful subsets of samples to annotate from an unlabeled pool.
We propose using experimental design to circumvent the computational bottlenecks of active learning.
arXiv Detail & Related papers (2024-01-12T16:56:54Z)
- Effect Size Estimation for Duration Recommendation in Online Experiments: Leveraging Hierarchical Models and Objective Utility Approaches [13.504353263032359]
The selection of the assumed effect size (AES) critically determines the duration of an experiment, and hence its accuracy and efficiency.
Traditionally, experimenters determine AES based on domain knowledge, but this method becomes impractical for online experimentation services managing numerous experiments.
We propose two solutions for data-driven AES selection for online experimentation services.
arXiv Detail & Related papers (2023-12-20T09:34:28Z)
- Adaptive Instrument Design for Indirect Experiments [48.815194906471405]
Unlike RCTs, indirect experiments estimate treatment effects by leveraging conditional instrumental variables.
In this paper, we take initial steps towards enhancing sample efficiency for indirect experiments by adaptively designing a data collection policy.
Our main contribution is a practical computational procedure that utilizes influence functions to search for an optimal data collection policy.
arXiv Detail & Related papers (2023-12-05T02:38:04Z)
- Query-Dependent Prompt Evaluation and Optimization with Offline Inverse RL [62.824464372594576]
We aim to enhance arithmetic reasoning ability of Large Language Models (LLMs) through zero-shot prompt optimization.
We identify a previously overlooked objective of query dependency in such optimization.
We introduce Prompt-OIRL, which harnesses offline inverse reinforcement learning to draw insights from offline prompting demonstration data.
arXiv Detail & Related papers (2023-09-13T01:12:52Z)
- Amortised Experimental Design and Parameter Estimation for User Models of Pointing [5.076871870091048]
We show how experiments can be designed so as to gather data and infer parameters as efficiently as possible.
We train a policy for choosing experimental designs with simulated participants.
Our solution learns which experiments provide the most useful data for parameter estimation by interacting with in-silico agents sampled from the model space.
arXiv Detail & Related papers (2023-07-19T10:17:35Z)
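This entry and the main paper share the same precompute recipe: train a selection policy against simulated participants so that choosing the next design at deployment is cheap. The sketch below is a hypothetical, stripped-down version of that recipe using REINFORCE; the Rasch (1PL) items, the tabular policy over a discretised ability estimate, and all hyperparameters are illustrative assumptions, not the authors' architecture.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed setup: Rasch (1PL) items; the policy is a logit table indexed by a
# discretised ability estimate. All sizes here are illustrative.
N_ITEMS, N_BINS, HORIZON = 20, 9, 8
difficulties = np.linspace(-2.5, 2.5, N_ITEMS)
bin_edges = np.linspace(-2.0, 2.0, N_BINS - 1)
logits = np.zeros((N_BINS, N_ITEMS))        # policy parameters

def p_correct(theta, b):
    return 1.0 / (1.0 + np.exp(-(theta - b)))

def posterior_mean(asked, answers, grid=np.linspace(-4, 4, 81)):
    log_post = -0.5 * grid**2               # N(0, 1) prior over ability
    for j, y in zip(asked, answers):
        p = p_correct(grid, difficulties[j])
        log_post += np.where(y, np.log(p), np.log(1.0 - p))
    w = np.exp(log_post - log_post.max()); w /= w.sum()
    return float(w @ grid)

baseline, lr = 0.0, 0.1
for episode in range(3000):
    theta_true = rng.normal()               # sample a synthetic participant
    asked, answers, trace = [], [], []
    theta_hat = 0.0
    for _ in range(HORIZON):
        s = int(np.digitize(theta_hat, bin_edges))
        probs = np.exp(logits[s] - logits[s].max()); probs /= probs.sum()
        j = int(rng.choice(N_ITEMS, p=probs))
        y = rng.random() < p_correct(theta_true, difficulties[j])
        asked.append(j); answers.append(y); trace.append((s, j, probs))
        theta_hat = posterior_mean(asked, answers)
    reward = -(theta_hat - theta_true) ** 2  # accurate inference = high reward
    baseline += 0.01 * (reward - baseline)   # running-mean variance reduction
    for s, j, probs in trace:                # REINFORCE policy-gradient update
        grad = -probs; grad[j] += 1.0        # d/dlogits log softmax(j)
        logits[s] += lr * (reward - baseline) * grad
```

After training, selecting an item is a table lookup and a softmax, which is the amortisation: the expensive trial and error against in-silico participants happens once, offline. The papers use deep networks conditioned on the full experiment history; the coarse tabular state here only keeps the sketch short.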
- Task-specific experimental design for treatment effect estimation [59.879567967089145]
Randomised controlled trials (RCTs) are the standard for causal inference.
Recent work has proposed more sample-efficient alternatives to RCTs, but these are not adaptable to the downstream application for which the causal effect is sought.
We develop a task-specific approach to experimental design and derive sampling strategies customised to particular downstream applications.
arXiv Detail & Related papers (2023-06-08T18:10:37Z)
- Use-Case-Grounded Simulations for Explanation Evaluation [23.584251632331046]
We introduce Use-Case-Grounded Simulated Evaluations (SimEvals).
SimEvals involve training algorithmic agents that take as input the information content that would be presented to each participant in a human subject study.
We run a comprehensive evaluation on three real-world use cases to demonstrate that SimEvals can effectively identify which explanation methods will help humans for each use case.
arXiv Detail & Related papers (2022-06-05T20:12:19Z)
- Do Deep Neural Networks Always Perform Better When Eating More Data? [82.6459747000664]
We design experiments under both Independent and Identically Distributed (IID) and Out-of-Distribution (OOD) conditions.
Under the IID condition, the amount of information determines the effectiveness of each sample, while the contribution of samples and the differences between classes determine the amount of class information.
Under the OOD condition, the cross-domain degree of samples determines their contributions, and the bias-fitting caused by irrelevant elements is a significant factor in cross-domain performance.
arXiv Detail & Related papers (2022-05-30T15:40:33Z)
- Active Learning-Based Optimization of Scientific Experimental Design [1.9705094859539976]
Active learning (AL) is a machine learning approach that can achieve greater accuracy with fewer labeled training instances.
This article performs a retrospective study on a drug response dataset using the proposed AL scheme.
It shows that scientific experimental design, instead of being manually set, can be optimized by AL.
arXiv Detail & Related papers (2021-12-29T20:02:35Z)
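The loop this entry describes (fit a model, query the most informative next experiment, refit) can be made concrete with pool-based uncertainty sampling. The sketch below is a generic illustration, not the paper's setup: the synthetic pool, the logistic model, and the query budget are placeholder assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)

# Synthetic stand-in for an experimental pool: two noisy response classes.
X_pool = np.vstack([rng.normal(-1, 1, (200, 2)), rng.normal(1, 1, (200, 2))])
y_pool = np.array([0] * 200 + [1] * 200)  # oracle labels ("run the experiment")

labeled = [0, 1, 200, 201]                # tiny seed set with both classes
unlabeled = [i for i in range(len(X_pool)) if i not in labeled]

for round_ in range(10):
    model = LogisticRegression().fit(X_pool[labeled], y_pool[labeled])
    # Uncertainty sampling: query the pool point the model is least sure about.
    proba = model.predict_proba(X_pool[unlabeled])[:, 1]
    pick = unlabeled[int(np.argmin(np.abs(proba - 0.5)))]
    labeled.append(pick)
    unlabeled.remove(pick)
    acc = model.score(X_pool, y_pool)
    print(f"round {round_}: {len(labeled)} labels, pool accuracy {acc:.3f}")
```

In a retrospective study like the one described, the oracle label would be a recorded experimental outcome (here, a drug response), so each query simulates running one more experiment.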
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.