Balancing Test Accuracy and Security in Computerized Adaptive Testing
- URL: http://arxiv.org/abs/2305.18312v1
- Date: Thu, 18 May 2023 18:32:51 GMT
- Title: Balancing Test Accuracy and Security in Computerized Adaptive Testing
- Authors: Wanyong Feng, Aritra Ghosh, Stephen Sireci, Andrew S. Lan
- Abstract summary: Bilevel optimization-based CAT (BOBCAT) is a framework that learns a data-driven question selection algorithm.
However, it suffers from high question exposure and test overlap rates, which potentially affect test security.
This paper introduces C-BOBCAT, a constrained version of BOBCAT, and shows it is effective through extensive experiments on two real-world adult testing datasets.
- Score: 18.121437613260618
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Computerized adaptive testing (CAT) is a form of personalized testing that
accurately measures students' knowledge levels while reducing test length.
Bilevel optimization-based CAT (BOBCAT) is a recent framework that learns a
data-driven question selection algorithm to effectively reduce test length and
improve test accuracy. However, it suffers from high question exposure and test
overlap rates, which potentially affect test security. This paper introduces
C-BOBCAT, a constrained version of BOBCAT, to address these problems by changing
its optimization setup, enabling us to trade off test accuracy for question
exposure and test overlap rates. We show that C-BOBCAT is effective through
extensive experiments on two real-world adult testing datasets.
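As a rough illustration of this trade-off (illustrative notation only; the paper's actual objective may differ), one can picture a penalized outer-level objective in which a coefficient lambda >= 0 controls how much accuracy is exchanged for lower exposure and overlap:

    \min_{\theta} \; \mathcal{L}_{\mathrm{accuracy}}(\theta)
      + \lambda \, \Omega_{\mathrm{exposure/overlap}}(\theta),
    \qquad \lambda \ge 0

Here theta denotes the question-selection parameters; lambda = 0 recovers unconstrained BOBCAT, while larger lambda spreads question usage more evenly at some cost in accuracy.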
Related papers
- Active Test-Time Adaptation: Theoretical Analyses and An Algorithm [51.84691955495693]
Test-time adaptation (TTA) addresses distribution shifts for streaming test data in unsupervised settings.
We propose the novel problem setting of active test-time adaptation (ATTA) that integrates active learning within the fully TTA setting.
arXiv Detail & Related papers (2024-04-07T22:31:34Z)
- Survey of Computerized Adaptive Testing: A Machine Learning Perspective [66.26687542572974]
Computerized Adaptive Testing (CAT) provides an efficient and tailored method for assessing the proficiency of examinees.
This paper aims to provide a machine learning-focused survey on CAT, presenting a fresh perspective on this adaptive testing method.
arXiv Detail & Related papers (2024-03-31T15:09:47Z)
- Fine-Grained Assertion-Based Test Selection [6.9290255098776425]
Regression test selection techniques aim to reduce test execution time by selecting only the tests affected by code changes.
We propose a novel approach that increases selection precision by analyzing test code at the statement level and treating test assertions as the unit of selection.
arXiv Detail & Related papers (2024-03-24T04:07:30Z)
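A minimal sketch of the assertion-level selection idea in the entry above, under assumed inputs (the assertion-to-statement dependency map is hypothetical here; in practice it would come from program analysis):

    # Hypothetical sketch: pick tests with at least one assertion that
    # depends on a changed statement, rather than whole-test dependencies.
    def select_tests(assertion_deps, changed):
        """assertion_deps: {test: {assertion_id: set of statement ids}}"""
        return {
            test
            for test, assertions in assertion_deps.items()
            if any(deps & changed for deps in assertions.values())
        }

    deps = {
        "test_add": {"a1": {"calc.py:12"}, "a2": {"calc.py:20"}},
        "test_sub": {"a1": {"calc.py:30"}},
    }
    print(select_tests(deps, {"calc.py:20"}))  # {'test_add'}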
- FlaKat: A Machine Learning-Based Categorization Framework for Flaky Tests [3.0846824529023382]
Flaky tests can pass or fail non-deterministically, without any changes to the software system.
State-of-the-art research incorporates machine learning solutions into flaky test detection and achieves reasonably good accuracy.
arXiv Detail & Related papers (2024-03-01T22:00:44Z)
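To make "flaky" concrete, a contrived example of a non-deterministic test (ours, not from the FlaKat paper): the assertion depends on simulated latency, so it can pass or fail across runs with no change to the code under test.

    import random
    import unittest

    class FlakyExample(unittest.TestCase):
        def test_remote_call_is_fast(self):
            # Stand-in for a network call whose latency varies run to run.
            simulated_latency = random.uniform(0.0, 2.0)
            # Outcome flips between runs with no code change: a flaky test.
            self.assertLess(simulated_latency, 1.0)

    if __name__ == "__main__":
        unittest.main()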
- Precise Error Rates for Computationally Efficient Testing [75.63895690909241]
We revisit the question of simple-versus-simple hypothesis testing with an eye towards computational complexity.
An existing test based on linear spectral statistics achieves the best possible tradeoff curve between type I and type II error rates.
arXiv Detail & Related papers (2023-11-01T04:41:16Z)
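For reference, the standard textbook definitions behind the tradeoff curve mentioned above (general notation, not specific to this paper): for a test \varphi of H_0 against H_1,

    \alpha(\varphi) = \Pr_{H_0}[\varphi \text{ rejects } H_0], \qquad
    \beta(\varphi)  = \Pr_{H_1}[\varphi \text{ accepts } H_0],

and the tradeoff curve records the smallest beta achievable at each alpha over the class of tests considered (here, computationally efficient ones).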
- Addressing Selection Bias in Computerized Adaptive Testing: A User-Wise Aggregate Influence Function Approach [14.175555669521987]
We propose a user-wise aggregate influence function method to tackle the selection bias issue.
Our intuition is to filter out users whose response data is heavily biased in an aggregate manner.
arXiv Detail & Related papers (2023-08-23T04:57:21Z)
- A Comprehensive Survey on Test-Time Adaptation under Distribution Shifts [143.14128737978342]
Test-time adaptation, an emerging paradigm, has the potential to adapt a pre-trained model to unlabeled data during testing, before making predictions.
Recent progress in this paradigm highlights the significant benefits of utilizing unlabeled data for training self-adapted models prior to inference.
arXiv Detail & Related papers (2023-03-27T16:32:21Z)
- DELTA: degradation-free fully test-time adaptation [59.74287982885375]
We find that two unfavorable defects are concealed in prevalent adaptation methodologies such as test-time batch normalization (BN) and self-learning.
First, we reveal that the normalization statistics in test-time BN are determined entirely by the currently received test samples, resulting in inaccurate estimates.
Second, we show that during test-time adaptation, the parameter update is biased towards some dominant classes.
arXiv Detail & Related papers (2023-01-30T15:54:00Z)
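A toy illustration of the test-time BN defect described in the DELTA entry above (our sketch, not the paper's code): statistics estimated from a small streaming batch are far noisier than population-level statistics.

    import torch

    torch.manual_seed(0)
    x = torch.randn(512, 8) * 3.0 + 1.0   # "full" test distribution, mean ~1.0
    small_batch = x[:4]                   # a small streaming test batch

    # Test-time BN uses only the current batch: a noisy mean estimate.
    batch_mean = small_batch.mean(dim=0)
    # Population-like (running) statistics are much more stable.
    full_mean = x.mean(dim=0)

    print("batch mean abs error:", (batch_mean - 1.0).abs().mean().item())
    print("full  mean abs error:", (full_mean - 1.0).abs().mean().item())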
- BOBCAT: Bilevel Optimization-Based Computerized Adaptive Testing [3.756550107432323]
Computerized adaptive testing (CAT) refers to a form of testing that is personalized to every student/test taker.
We propose BOBCAT, a Bilevel Optimization-Based framework for CAT that directly learns a data-driven question selection algorithm from training data.
arXiv Detail & Related papers (2021-08-17T00:40:23Z)
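Schematically, the bilevel structure referenced in the BOBCAT entry above can be written as follows (simplified notation; see the paper for the exact formulation): an outer problem over selection-policy parameters theta, with an inner problem that adapts per-student parameters phi_i on the selected questions:

    \min_{\theta} \sum_{i} \mathcal{L}^{\mathrm{outer}}_i\big(\phi_i^{*}(\theta)\big)
    \quad \text{s.t.} \quad
    \phi_i^{*}(\theta) = \arg\min_{\phi} \mathcal{L}^{\mathrm{inner}}_i(\phi; \theta),

where the inner loss fits student i's observed responses on policy-selected questions and the outer loss measures prediction accuracy on held-out responses.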
- Noisy Adaptive Group Testing using Bayesian Sequential Experimental Design [63.48989885374238]
When the infection prevalence of a disease is low, Dorfman showed 80 years ago that testing groups of people can prove more efficient than testing people individually.
Our goal in this paper is to propose new group testing algorithms that can operate in a noisy setting.
arXiv Detail & Related papers (2020-04-26T23:41:33Z)
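A quick arithmetic check of Dorfman's classical observation above (standard two-stage pooling analysis, separate from this paper's Bayesian sequential method): with prevalence p and pool size k, two-stage pooling needs an expected 1/k + 1 - (1-p)^k tests per person.

    def dorfman_tests_per_person(p: float, k: int) -> float:
        # One pooled test shared by k people, plus k individual retests
        # whenever the pool is positive (probability 1 - (1-p)^k).
        return 1.0 / k + (1.0 - (1.0 - p) ** k)

    # At 2% prevalence, pools of 8 need ~0.27 tests per person vs. 1 individually.
    for k in (4, 8, 16):
        print(k, round(dorfman_tests_per_person(0.02, k), 3))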
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.