Towards Continuous Compounding Effects and Agile Practices in
Educational Experimentation
- URL: http://arxiv.org/abs/2112.01243v1
- Date: Wed, 17 Nov 2021 13:10:51 GMT
- Title: Towards Continuous Compounding Effects and Agile Practices in
Educational Experimentation
- Authors: Luis M. Vaquero, Niall Twomey, Miguel Patricio Dias, Massimo Camplani,
Robert Hardman
- Abstract summary: This paper defines a framework for categorising different experimental processes.
The next generation of education technology successes will be heralded by embracing the full set of processes.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Randomised control trials are currently the definitive gold standard approach
for formal educational experiments. Although conclusions from these experiments
are highly credible, their relatively slow experimentation rate, high expense
and rigid framework can be seen to limit scope on: 1. $\textit{metrics}$:
automation of the consistent rigorous computation of hundreds of metrics for
every experiment; 2. $\textit{concurrency}$: fast automated releases of
hundreds of concurrent experiments daily; and 3. $\textit{safeguards}$: safety
net tests and ramping up/rolling back treatments quickly to minimise negative
impact. This paper defines a framework for categorising different experimental
processes, and places a particular emphasis on technology readiness.
On the basis of our analysis, our thesis is that the next generation of
education technology successes will be heralded by recognising the context of
experiments and collectively embracing the full set of processes that are at
hand: from rapid ideation and prototyping produced in small scale experiments
on the one hand, to influencing recommendations of best teaching practices with
large-scale and technology-enabled online A/B testing on the other. A key
benefit of the latter is that the running costs tend towards zero (leading to
'free experimentation'). This offers low-risk opportunities to explore and
drive value through well-planned lasting campaigns that iterate quickly at a
large scale. Importantly, because these experimental platforms are so
adaptable, the cumulative effect of the experimental campaign delivers
compounding value exponentially over time even if each individual experiment
delivers a small effect.
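The compounding claim above can be illustrated with a small arithmetic sketch. It is not taken from the paper: it assumes every experiment in a campaign is shipped, that each delivers the same relative lift, and that lifts combine multiplicatively and independently. The function names and the 1% figure are hypothetical, chosen only for illustration.

```python
def compounded_lift(per_experiment_lift: float, n_experiments: int) -> float:
    """Cumulative relative lift after n experiments.

    Assumption (not from the paper): each experiment multiplies the
    metric by (1 + per_experiment_lift), independently of the others.
    """
    return (1.0 + per_experiment_lift) ** n_experiments - 1.0


def additive_lift(per_experiment_lift: float, n_experiments: int) -> float:
    """Baseline for comparison: lifts that merely add up, without compounding."""
    return per_experiment_lift * n_experiments


if __name__ == "__main__":
    lift = 0.01  # hypothetical 1% improvement per experiment
    for n in (10, 50, 100):
        print(
            f"n={n:3d}  additive={additive_lift(lift, n):.3f}  "
            f"compounded={compounded_lift(lift, n):.3f}"
        )
```

Under these assumptions, one hundred experiments at 1% each leave the metric at roughly 2.7 times its starting value, against 2 times if the lifts only added up. Real campaigns will deviate from this: effects vary in size, some experiments fail, and concurrent treatments can interact.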
Related papers
- Efficient Biological Data Acquisition through Inference Set Design [3.9633147697178996]
In this work, we aim to select the smallest set of candidates in order to achieve some desired level of accuracy for the system as a whole.
We call this mechanism inference set design, and propose the use of a confidence-based active learning solution to prune out challenging examples.
arXiv Detail & Related papers (2024-10-25T15:34:03Z)
- Adaptive Experimentation When You Can't Experiment [55.86593195947978]
This paper introduces the confounded pure exploration transductive linear bandit (CPET-LB) problem.
Online services can employ a properly randomized encouragement that incentivizes users toward a specific treatment.
arXiv Detail & Related papers (2024-06-15T20:54:48Z)
- An Experimental Design Framework for Label-Efficient Supervised Finetuning of Large Language Models [55.01592097059969]
Supervised finetuning on instruction datasets has played a crucial role in achieving the remarkable zero-shot generalization capabilities.
Active learning is effective in identifying useful subsets of samples to annotate from an unlabeled pool.
We propose using experimental design to circumvent the computational bottlenecks of active learning.
arXiv Detail & Related papers (2024-01-12T16:56:54Z)
- GFlowNets for AI-Driven Scientific Discovery [74.27219800878304]
We present a new probabilistic machine learning framework called GFlowNets.
GFlowNets can be applied in the modeling, hypotheses generation and experimental design stages of the experimental science loop.
We argue that GFlowNets can become a valuable tool for AI-driven scientific discovery.
arXiv Detail & Related papers (2023-02-01T17:29:43Z)
- Fair Effect Attribution in Parallel Online Experiments [57.13281584606437]
A/B tests serve the purpose of reliably identifying the effect of changes introduced in online services.
It is common for online platforms to run a large number of simultaneous experiments by splitting incoming user traffic randomly.
Despite a perfect randomization between different groups, simultaneous experiments can interact with each other and create a negative impact on average population outcomes.
arXiv Detail & Related papers (2022-10-15T17:15:51Z)
- A Reinforcement Learning Approach to Estimating Long-term Treatment Effects [13.371851720834918]
A limitation with randomized experiments is that they do not easily extend to measure long-term effects.
We take a reinforcement learning (RL) approach that estimates the average reward in a Markov process.
Motivated by real-world scenarios where the observed state transition is nonstationary, we develop a new algorithm for a class of nonstationary problems.
arXiv Detail & Related papers (2022-10-14T05:33:19Z)
- Synthetically Controlled Bandits [2.8292841621378844]
This paper presents a new dynamic approach to experiment design in settings where, due to interference or other concerns, experimental units are coarse.
Our new design, dubbed Synthetically Controlled Thompson Sampling (SCTS), minimizes the regret associated with experimentation at no practically meaningful loss to inferential ability.
arXiv Detail & Related papers (2022-02-14T22:58:13Z)
- Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping [62.78338049381917]
Fine-tuning pretrained contextual word embedding models to supervised downstream tasks has become commonplace in natural language processing.
We experiment with four datasets from the GLUE benchmark, fine-tuning BERT hundreds of times on each while varying only the random seeds.
We find substantial performance increases compared to previously reported results, and we quantify how the performance of the best-found model varies as a function of the number of fine-tuning trials.
arXiv Detail & Related papers (2020-02-15T02:40:10Z)
- Dynamic Causal Effects Evaluation in A/B Testing with a Reinforcement Learning Framework [68.96770035057716]
A/B testing is a business strategy to compare a new product with an old one in pharmaceutical, technological, and traditional industries.
This paper introduces a reinforcement learning framework for carrying out A/B testing in online experiments.
arXiv Detail & Related papers (2020-02-05T10:25:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.