Related papers: Adaptive Experiments Under High-Dimensional and Data Sparse Settings: Applications for Educational Platforms

URL: http://arxiv.org/abs/2501.03999v2
Date: Mon, 24 Feb 2025 05:29:00 GMT
Title: Adaptive Experiments Under High-Dimensional and Data Sparse Settings: Applications for Educational Platforms
Authors: Haochen Song, Ilya Musabirov, Ananya Bhattacharjee, Audrey Durand, Meredith Franklin, Anna Rafferty, Joseph Jay Williams
Abstract summary: Traditional adaptive policies, such as Thompson Sampling, struggle with scalability in high-dimensional and sparse settings. We propose a framework for determining the feasible number of treatments given a sample size. We present comparative evaluations of WAPTS across various sample sizes and treatment conditions.
Score: 10.565276803897325
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In online educational platforms, adaptive experiment designs play a critical role in personalizing learning pathways, instructional sequencing, and content recommendations. Traditional adaptive policies, such as Thompson Sampling, struggle with scalability in high-dimensional and sparse settings, such as when there is a large number of treatments (arms) but resources such as funding and time are limited and the sample size is constrained by the number of students in a classroom. Furthermore, under-exploration in large-scale educational interventions can lead to suboptimal learning recommendations. To address these challenges, we build upon the concept of lenient regret, which tolerates limited suboptimal selections to enhance exploratory learning, and propose a framework for determining the feasible number of treatments given a sample size. We illustrate these ideas with a case study in online educational learnersourcing, where adaptive algorithms dynamically allocate peer-crafted interventions to other students during an active recall exercise. Our proposed Weighted Allocation Probability Adjusted Thompson Sampling (WAPTS) algorithm enhances the efficiency of treatment allocation by adjusting sampling weights to balance exploration and exploitation in data-sparse environments. We present comparative evaluations of WAPTS across various sample sizes (N=50, 300, 1000) and treatment conditions, demonstrating its ability to mitigate under-exploration while optimizing learning outcomes.
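For orientation, the baseline the abstract refers to can be sketched as follows. This is a minimal illustration of standard Beta-Bernoulli Thompson Sampling, not the paper's WAPTS algorithm, whose weighting scheme is not described on this page. The function names, the uniform Beta(1, 1) prior, the binary reward model, and the threshold form of lenient regret (gaps at or below a tolerance `epsilon` are not counted) are all assumptions made for this sketch.

```python
import random


def thompson_sampling(true_means, n_students, seed=0):
    """Run standard Beta-Bernoulli Thompson Sampling (assumed baseline, not WAPTS).

    true_means: hypothetical success probability of each treatment (arm).
    n_students: sample size, e.g. 50, 300, or 1000 as in the abstract.
    Returns the number of students allocated to each arm.
    """
    rng = random.Random(seed)
    k = len(true_means)
    successes = [0] * k
    failures = [0] * k
    pulls = [0] * k
    for _ in range(n_students):
        # Draw one sample from each arm's Beta posterior (Beta(1, 1) prior).
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1) for a in range(k)
        ]
        # Assign the student to the arm with the largest sampled value.
        arm = max(range(k), key=samples.__getitem__)
        reward = 1 if rng.random() < true_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls


def lenient_regret(true_means, pulls, epsilon):
    """Regret that ignores arms whose gap to the best arm is at most epsilon.

    This threshold form is one simple way to express lenient regret and is
    an assumption of this sketch; epsilon = 0 gives ordinary regret.
    """
    best = max(true_means)
    return sum(
        n * (best - m) for m, n in zip(true_means, pulls) if best - m > epsilon
    )


if __name__ == "__main__":
    # Hypothetical setting: 10 treatments, only 50 students.
    means = [0.5] * 9 + [0.6]
    allocation = thompson_sampling(means, n_students=50)
    print("students per arm:", allocation)
    print("lenient regret (epsilon=0.05):", lenient_regret(means, allocation, 0.05))
```

With many arms and a small sample, each arm receives only a handful of observations, which is the data-sparse regime the abstract describes. Varying `n_students` and the number of arms in this sketch gives a rough feel for how the allocation changes.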

Related papers

Few-shot Steerable Alignment: Adapting Rewards and LLM Policies with Neural Processes [50.544186914115045]
Large language models (LLMs) are increasingly embedded in everyday applications. Ensuring their alignment with the diverse preferences of individual users has become a critical challenge. We present a novel framework for few-shot steerable alignment.
arXiv Detail & Related papers (2024-12-18T16:14:59Z)
A Systematic Examination of Preference Learning through the Lens of Instruction-Following [83.71180850955679]
We use a novel synthetic data generation pipeline to generate 48,000 unique instruction-following prompts. With our synthetic prompts, we use two preference dataset curation methods: rejection sampling (RS) and Monte Carlo Tree Search (MCTS). Experiments reveal that shared prefixes in preference pairs, as generated by MCTS, provide marginal but consistent improvements. High-contrast preference pairs generally outperform low-contrast pairs; however, combining both often yields the best performance.
arXiv Detail & Related papers (2024-12-18T15:38:39Z)
Quantifying User Coherence: A Unified Framework for Cross-Domain Recommendation Analysis [69.37718774071793]
This paper introduces novel information-theoretic measures for understanding recommender systems. We evaluate 7 recommendation algorithms across 9 datasets, revealing the relationships between our measures and standard performance metrics.
arXiv Detail & Related papers (2024-10-03T13:02:07Z)
Adaptive teachers for amortized samplers [76.88721198565861]
We propose an adaptive training distribution (the teacher) to guide the training of the primary amortized sampler (the student). We validate the effectiveness of this approach in a synthetic environment designed to present an exploration challenge.
arXiv Detail & Related papers (2024-10-02T11:33:13Z)
Submodular Maximization Approaches for Equitable Client Selection in Federated Learning [4.167345675621377]
In a conventional Federated Learning framework, client selection for training typically involves the random sampling of a subset of clients in each iteration. This paper introduces two novel methods, namely SUBTRUNC and UNIONFL, designed to address the limitations of random client selection.
arXiv Detail & Related papers (2024-08-24T22:40:31Z)
Denoising Pre-Training and Customized Prompt Learning for Efficient Multi-Behavior Sequential Recommendation [69.60321475454843]
We propose DPCPL, the first pre-training and prompt-tuning paradigm tailored for Multi-Behavior Sequential Recommendation. In the pre-training stage, we propose a novel Efficient Behavior Miner (EBM) to filter out the noise at multiple time scales. Subsequently, we propose to tune the pre-trained model in a highly efficient manner with the proposed Customized Prompt Learning (CPL) module.
arXiv Detail & Related papers (2024-08-21T06:48:38Z)
Optimization-Driven Adaptive Experimentation [7.948144726705323]
Real-world experiments involve batched & delayed feedback, non-stationarity, multiple objectives & constraints, and (often some) personalization. Tailoring adaptive methods to address these challenges on a per-problem basis is infeasible, and static designs remain the de facto standard. We present a mathematical programming formulation that can flexibly incorporate a wide range of objectives, constraints, and statistical procedures.
arXiv Detail & Related papers (2024-08-08T16:29:09Z)
Adaptive Experimentation When You Can't Experiment [55.86593195947978]
This paper introduces the confounded pure exploration transductive linear bandit (CPET-LB) problem. Online services can employ a properly randomized encouragement that incentivizes users toward a specific treatment.
arXiv Detail & Related papers (2024-06-15T20:54:48Z)
Spurious Feature Eraser: Stabilizing Test-Time Adaptation for Vision-Language Foundation Model [86.9619638550683]
Vision-language foundation models have exhibited remarkable success across a multitude of downstream tasks due to their scalability on extensive image-text paired data. However, these models display significant limitations when applied to downstream tasks, such as fine-grained image classification, as a result of "decision shortcuts".
arXiv Detail & Related papers (2024-03-01T09:01:53Z)
Experiment Planning with Function Approximation [49.50254688629728]
We study the problem of experiment planning with function approximation in contextual bandit problems. We propose two experiment planning strategies compatible with function approximation. We show that a uniform sampler achieves competitive optimality rates in the setting where the number of actions is small.
arXiv Detail & Related papers (2024-01-10T14:40:23Z)
Adaptive Experimental Design for Policy Learning [9.54473759331265]
We study an optimal adaptive experimental design for policy learning with multiple treatment arms. In the sampling stage, the planner assigns treatment arms adaptively over sequentially arriving experimental units. After the experiment, the planner recommends an individualized assignment rule to the population.
arXiv Detail & Related papers (2024-01-08T09:29:07Z)
Best Arm Identification with Fixed Budget: A Large Deviation Perspective [54.305323903582845]
We present sred, a truly adaptive algorithm that can reject arms in any round based on the observed empirical gaps between the rewards of various arms.
arXiv Detail & Related papers (2023-12-19T13:17:43Z)
Latent Alignment with Deep Set EEG Decoders [44.128689862889715]
We introduce the Latent Alignment method that won the Benchmarks for EEG Transfer Learning competition. We present its formulation as a deep set applied on the set of trials from a given subject. Our experimental results show that performing statistical distribution alignment at later stages in a deep learning model is beneficial to the classification accuracy.
arXiv Detail & Related papers (2023-11-29T12:40:45Z)
Optimal Sample Selection Through Uncertainty Estimation and Its Application in Deep Learning [22.410220040736235]
We present a theoretically optimal solution for addressing both coreset selection and active learning. Our proposed method, COPS, is designed to minimize the expected loss of a model trained on subsampled data.
arXiv Detail & Related papers (2023-09-05T14:06:33Z)
Tackling Diverse Minorities in Imbalanced Classification [80.78227787608714]
Imbalanced datasets are commonly observed in various real-world applications, presenting significant challenges in training classifiers. We propose generating synthetic samples iteratively by mixing data samples from both minority and majority classes. We demonstrate the effectiveness of our proposed framework through extensive experiments conducted on seven publicly available benchmark datasets.
arXiv Detail & Related papers (2023-08-28T18:48:34Z)
SPEC: Summary Preference Decomposition for Low-Resource Abstractive Summarization [21.037841262371355]
We present a framework to transfer few-shot learning processes from source corpora to the target corpus. Our methods achieve state-of-the-art performance on six diverse corpora with 30.11%/33.95%/27.51% and 26.74%/31.14%/24.48% average improvements on ROUGE-1/2/L under 10- and 100-example settings.
arXiv Detail & Related papers (2023-03-24T14:07:03Z)
Personalized Algorithmic Recourse with Preference Elicitation [20.78332455864586]
We introduce PEAR, the first human-in-the-loop approach capable of providing personalized algorithmic recourse tailored to the needs of any end-user. PEAR builds on insights from Bayesian Preference Elicitation to iteratively refine an estimate of the costs of actions by asking choice set queries to the target user. Our empirical evaluation on real-world datasets highlights how PEAR produces high-quality personalized recourse in only a handful of iterations.
arXiv Detail & Related papers (2022-05-27T03:12:18Z)
Statistical Inference After Adaptive Sampling for Longitudinal Data [9.468593929311867]
We develop novel methods to perform a variety of statistical analyses on adaptively sampled data via Z-estimation. We develop novel theoretical tools for empirical processes on non-i.i.d., adaptively sampled longitudinal data which may be of independent interest.
arXiv Detail & Related papers (2022-02-14T23:48:13Z)
CMW-Net: Learning a Class-Aware Sample Weighting Mapping for Robust Deep Learning [55.733193075728096]
Modern deep neural networks can easily overfit to biased training data containing corrupted labels or class imbalance. Sample re-weighting methods are popularly used to alleviate this data bias issue. We propose a meta-model capable of adaptively learning an explicit weighting scheme directly from data.
arXiv Detail & Related papers (2022-02-11T13:49:51Z)
Few-shot Quality-Diversity Optimization [50.337225556491774]
Quality-Diversity (QD) optimization has been shown to be an effective tool for dealing with deceptive minima and sparse rewards in Reinforcement Learning. We show that, given examples from a task distribution, information about the paths taken by optimization in parameter space can be leveraged to build a prior population, which, when used to initialize QD methods in unseen environments, allows for few-shot adaptation. Experiments carried out in both sparse and dense reward settings using robotic manipulation and navigation benchmarks show that it considerably reduces the number of generations that are required for QD optimization in these environments.
arXiv Detail & Related papers (2021-09-14T17:12:20Z)
AdaS: Adaptive Scheduling of Stochastic Gradients [50.80697760166045]
We introduce the notions of "knowledge gain" and "mapping condition" and propose a new algorithm called Adaptive Scheduling (AdaS). Experimentation reveals that, using the derived metrics, AdaS exhibits: (a) faster convergence and superior generalization over existing adaptive learning methods; and (b) lack of dependence on a validation set to determine when to stop training.
arXiv Detail & Related papers (2020-06-11T16:36:31Z)
Progressive Multi-Stage Learning for Discriminative Tracking [25.94944743206374]
We propose a joint discriminative learning scheme with the progressive multi-stage optimization policy of sample selection for robust visual tracking. The proposed scheme presents a novel time-weighted and detection-guided self-paced learning strategy for easy-to-hard sample selection. Experiments on the benchmark datasets demonstrate the effectiveness of the proposed learning framework.
arXiv Detail & Related papers (2020-04-01T07:01:30Z)
Adaptive Experience Selection for Policy Gradient [8.37609145576126]
Experience replay is a commonly used approach to improve sample efficiency. Gradient estimators using past trajectories typically have high variance. Existing sampling strategies for experience replay, like uniform sampling or prioritised experience replay, do not explicitly try to control the variance of the gradient estimates. We propose an online learning algorithm, adaptive experience selection (AES), to adaptively learn an experience sampling distribution that explicitly minimises this variance.
arXiv Detail & Related papers (2020-02-17T13:16:37Z)

This list is automatically generated from the titles and abstracts of the papers on this site.