Towards Continuous Compounding Effects and Agile Practices in
Educational Experimentation
- URL: http://arxiv.org/abs/2112.01243v1
- Date: Wed, 17 Nov 2021 13:10:51 GMT
- Title: Towards Continuous Compounding Effects and Agile Practices in
Educational Experimentation
- Authors: Luis M. Vaquero, Niall Twomey, Miguel Patricio Dias, Massimo Camplani,
Robert Hardman
- Abstract summary: This paper defines a framework for categorising different experimental processes.
The next generation of education technology successes will be heralded by embracing the full set of processes.
- Score: 2.7094829962573304
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Randomised control trials are currently the definitive gold standard approach
for formal educational experiments. Although conclusions from these experiments
are highly credible, their relatively slow experimentation rate, high expense
and rigid framework limit their scope on three fronts: 1. $\textit{metrics}$:
automation of the consistent, rigorous computation of hundreds of metrics for
every experiment; 2. $\textit{concurrency}$: fast automated releases of
hundreds of concurrent experiments daily; and 3. $\textit{safeguards}$: safety
net tests and ramping up/rolling back treatments quickly to minimise negative
impact. This paper defines a framework for categorising different experimental
processes, and places a particular emphasis on technology readiness.
On the basis of our analysis, our thesis is that the next generation of
education technology successes will be heralded by recognising the context of
experiments and collectively embracing the full set of processes that are at
hand: from rapid ideation and prototyping produced in small scale experiments
on the one hand, to influencing recommendations of best teaching practices with
large-scale and technology-enabled online A/B testing on the other. A key
benefit of the latter is that the running costs tend towards zero (leading to
`free experimentation'). This offers low-risk opportunities to explore and
drive value through well-planned lasting campaigns that iterate quickly at a
large scale. Importantly, because these experimental platforms are so
adaptable, the cumulative effect of the experimental campaign delivers
compounding value exponentially over time even if each individual experiment
delivers a small effect.
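To make the compounding arithmetic concrete, here is a minimal Python sketch; the 0.5% lift per experiment and the 200-experiment campaign size are illustrative assumptions, not figures from the paper.

```python
# Minimal sketch of the `compounding value' claim above. The 0.5% lift per
# experiment and the 200-experiment campaign are illustrative assumptions,
# not figures from the paper.

def compounded_lift(per_experiment_lift: float, num_experiments: int) -> float:
    """Cumulative relative lift after shipping `num_experiments` changes,
    each contributing `per_experiment_lift` (e.g. 0.005 = 0.5%)."""
    return (1.0 + per_experiment_lift) ** num_experiments - 1.0

# Each experiment delivers only a small effect, but the campaign compounds:
print(f"{compounded_lift(0.005, 200):.1%}")  # ~171% cumulative improvement
```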
Related papers
- Efficient Biological Data Acquisition through Inference Set Design [3.9633147697178996]
In this work, we aim to select the smallest set of candidates in order to achieve some desired level of accuracy for the system as a whole.
We call this mechanism inference set design, and propose the use of an uncertainty-based active learning solution to prune out challenging examples.
arXiv Detail & Related papers (2024-10-25T15:34:03Z)
- Adaptive Experimentation When You Can't Experiment [55.86593195947978]
This paper introduces the confounded pure exploration transductive linear bandit (CPET-LB) problem.
Online services can employ a properly randomized encouragement that incentivizes users toward a specific treatment.
arXiv Detail & Related papers (2024-06-15T20:54:48Z)
- An Experimental Design Framework for Label-Efficient Supervised Finetuning of Large Language Models [55.01592097059969]
Supervised finetuning on instruction datasets has played a crucial role in achieving the remarkable zero-shot generalization capabilities of modern large language models.
Active learning is effective in identifying useful subsets of samples to annotate from an unlabeled pool.
We propose using experimental design to circumvent the computational bottlenecks of active learning.
arXiv Detail & Related papers (2024-01-12T16:56:54Z)
- GFlowNets for AI-Driven Scientific Discovery [74.27219800878304]
We present a new probabilistic machine learning framework called GFlowNets.
GFlowNets can be applied in the modeling, hypothesis generation, and experimental design stages of the experimental science loop.
We argue that GFlowNets can become a valuable tool for AI-driven scientific discovery.
arXiv Detail & Related papers (2023-02-01T17:29:43Z)
- Fair Effect Attribution in Parallel Online Experiments [57.13281584606437]
A/B tests serve the purpose of reliably identifying the effect of changes introduced in online services.
It is common for online platforms to run a large number of simultaneous experiments by splitting incoming user traffic randomly.
Despite perfect randomization between groups, simultaneous experiments can interact with each other and negatively impact average population outcomes.
arXiv Detail & Related papers (2022-10-15T17:15:51Z)
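As background on the entry above: the random traffic splitting it describes is commonly implemented with salted hashing, so that each concurrent experiment randomises the same user population independently. A minimal sketch of that standard pattern follows; the experiment names and user id are hypothetical.

```python
import hashlib

def assign_variant(user_id: str, experiment: str,
                   variants: tuple = ("control", "treatment")) -> str:
    """Deterministically bucket one user for one experiment.

    Salting the hash with the experiment name gives each concurrent
    experiment an independent randomisation of the same users, which is
    the setting in which the interaction effects studied above can arise.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]

# One user is randomised independently into each concurrent experiment
# (experiment names are hypothetical):
for exp in ("new_hint_ui", "adaptive_quiz_order"):
    print(exp, "->", assign_variant("user-42", exp))
```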
- A Reinforcement Learning Approach to Estimating Long-term Treatment Effects [13.371851720834918]
A limitation of randomized experiments is that they do not easily extend to measure long-term effects.
We take a reinforcement learning (RL) approach that estimates the average reward in a Markov process.
Motivated by real-world scenarios where the observed state transition is nonstationary, we develop a new algorithm for a class of nonstationary problems.
arXiv Detail & Related papers (2022-10-14T05:33:19Z)
- Synthetically Controlled Bandits [2.8292841621378844]
This paper presents a new dynamic approach to experiment design in settings where, due to interference or other concerns, experimental units are coarse.
Our new design, dubbed Synthetically Controlled Thompson Sampling (SCTS), minimizes the regret associated with experimentation at no practically meaningful loss to inferential ability.
arXiv Detail & Related papers (2022-02-14T22:58:13Z)
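For context on the entry above: SCTS builds on Thompson sampling. The sketch below shows only plain Beta-Bernoulli Thompson sampling, the textbook baseline rather than the synthetically controlled variant the paper proposes, and the conversion rates are made up.

```python
import random

# Textbook Beta-Bernoulli Thompson sampling; NOT the Synthetically
# Controlled Thompson Sampling (SCTS) variant proposed above.
# The true conversion rates are made up for illustration.
true_rates = {"control": 0.10, "treatment": 0.12}
posteriors = {arm: [1, 1] for arm in true_rates}  # Beta(alpha=1, beta=1) priors

for _ in range(10_000):
    # Sample a plausible rate per arm from its posterior; play the best draw.
    arm = max(posteriors, key=lambda a: random.betavariate(*posteriors[a]))
    reward = random.random() < true_rates[arm]
    posteriors[arm][0] += reward       # alpha tracks successes
    posteriors[arm][1] += 1 - reward   # beta tracks failures

for arm, (a, b) in posteriors.items():
    print(f"{arm}: plays={a + b - 2}, posterior mean={a / (a + b):.3f}")
```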
- Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping [62.78338049381917]
Fine-tuning pretrained contextual word embedding models to supervised downstream tasks has become commonplace in natural language processing.
We experiment with four datasets from the GLUE benchmark, fine-tuning BERT hundreds of times on each while varying only the random seeds.
We find substantial performance increases compared to previously reported results, and we quantify how the performance of the best-found model varies as a function of the number of fine-tuning trials.
arXiv Detail & Related papers (2020-02-15T02:40:10Z)
- Dynamic Causal Effects Evaluation in A/B Testing with a Reinforcement Learning Framework [68.96770035057716]
A/B testing is a business strategy to compare a new product with an old one in pharmaceutical, technological, and traditional industries.
This paper introduces a reinforcement learning framework for carrying out A/B testing in online experiments.
arXiv Detail & Related papers (2020-02-05T10:25:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.