Balancing Test Accuracy and Security in Computerized Adaptive Testing
- URL: http://arxiv.org/abs/2305.18312v1
- Date: Thu, 18 May 2023 18:32:51 GMT
- Title: Balancing Test Accuracy and Security in Computerized Adaptive Testing
- Authors: Wanyong Feng, Aritra Ghosh, Stephen Sireci, Andrew S. Lan
- Abstract summary: Bilevel optimization-based CAT (BOBCAT) is a framework that learns a data-driven question selection algorithm.
However, it suffers from high question exposure and test overlap rates, which potentially affect test security.
This paper introduces C-BOBCAT, a constrained version of BOBCAT, and shows it is effective through extensive experiments on two real-world adult testing datasets.
- Score: 18.121437613260618
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Computerized adaptive testing (CAT) is a form of personalized testing that
accurately measures students' knowledge levels while reducing test length.
Bilevel optimization-based CAT (BOBCAT) is a recent framework that learns a
data-driven question selection algorithm to effectively reduce test length and
improve test accuracy. However, it suffers from high question exposure and test
overlap rates, which potentially affect test security. This paper introduces
C-BOBCAT, a constrained version of BOBCAT, to address these problems by changing
its optimization setup, enabling us to trade off test accuracy for question
exposure and test overlap rates. We show that C-BOBCAT is effective through
extensive experiments on two real-world adult testing datasets.
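As a rough illustration of this trade-off (illustrative notation only; the paper's actual objective may differ), one can picture a penalized outer-level objective in which a coefficient lambda >= 0 controls how much accuracy is exchanged for lower exposure and overlap:

    \min_{\theta} \; \mathcal{L}_{\mathrm{accuracy}}(\theta)
      + \lambda \, \Omega_{\mathrm{exposure/overlap}}(\theta),
    \qquad \lambda \ge 0

Here theta denotes the question-selection parameters; lambda = 0 recovers unconstrained BOBCAT, while larger lambda spreads question usage more evenly at some cost in accuracy.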
Related papers
- Active Test-Time Adaptation: Theoretical Analyses and An Algorithm [51.84691955495693]
Test-time adaptation (TTA) addresses distribution shifts for streaming test data in unsupervised settings.
We propose the novel problem setting of active test-time adaptation (ATTA) that integrates active learning within the fully TTA setting.
arXiv Detail & Related papers (2024-04-07T22:31:34Z)
- Survey of Computerized Adaptive Testing: A Machine Learning Perspective [66.26687542572974]
Computerized Adaptive Testing (CAT) provides an efficient and tailored method for assessing the proficiency of examinees.
This paper aims to provide a machine learning-focused survey on CAT, presenting a fresh perspective on this adaptive testing method.
arXiv Detail & Related papers (2024-03-31T15:09:47Z)
- Fine-Grained Assertion-Based Test Selection [6.9290255098776425]
Regression test selection techniques aim to reduce test execution time by selecting only the tests affected by code changes.
We propose a novel approach that increases selection precision by analyzing test code at the statement level and treating test assertions as the unit of selection.
arXiv Detail & Related papers (2024-03-24T04:07:30Z)
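A minimal sketch of the assertion-level selection idea in the entry above, under assumed inputs (the assertion-to-statement dependency map is hypothetical here; in practice it would come from program analysis):

    # Hypothetical sketch: pick tests with at least one assertion that
    # depends on a changed statement, rather than whole-test dependencies.
    def select_tests(assertion_deps, changed):
        """assertion_deps: {test: {assertion_id: set of statement ids}}"""
        return {
            test
            for test, assertions in assertion_deps.items()
            if any(deps & changed for deps in assertions.values())
        }

    deps = {
        "test_add": {"a1": {"calc.py:12"}, "a2": {"calc.py:20"}},
        "test_sub": {"a1": {"calc.py:30"}},
    }
    print(select_tests(deps, {"calc.py:20"}))  # {'test_add'}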
- FlaKat: A Machine Learning-Based Categorization Framework for Flaky Tests [3.0846824529023382]
Flaky tests can pass or fail non-deterministically, without any changes to the software system.
State-of-the-art research incorporates machine learning solutions into flaky test detection and achieves reasonably good accuracy.
arXiv Detail & Related papers (2024-03-01T22:00:44Z)
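To make "flaky" concrete, a contrived example of a non-deterministic test (ours, not from the FlaKat paper): the assertion depends on simulated latency, so it can pass or fail across runs with no change to the code under test.

    import random
    import unittest

    class FlakyExample(unittest.TestCase):
        def test_remote_call_is_fast(self):
            # Stand-in for a network call whose latency varies run to run.
            simulated_latency = random.uniform(0.0, 2.0)
            # Outcome flips between runs with no code change: a flaky test.
            self.assertLess(simulated_latency, 1.0)

    if __name__ == "__main__":
        unittest.main()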
- Precise Error Rates for Computationally Efficient Testing [75.63895690909241]
We revisit the question of simple-versus-simple hypothesis testing with an eye towards computational complexity.
An existing test based on linear spectral statistics achieves the best possible tradeoff curve between type I and type II error rates.
arXiv Detail & Related papers (2023-11-01T04:41:16Z)
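For reference, the standard textbook definitions behind the tradeoff curve mentioned above (general notation, not specific to this paper): for a test \varphi of H_0 against H_1,

    \alpha(\varphi) = \Pr_{H_0}[\varphi \text{ rejects } H_0], \qquad
    \beta(\varphi)  = \Pr_{H_1}[\varphi \text{ accepts } H_0],

and the tradeoff curve records the smallest beta achievable at each alpha over the class of tests considered (here, computationally efficient ones).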
- Addressing Selection Bias in Computerized Adaptive Testing: A User-Wise Aggregate Influence Function Approach [14.175555669521987]
We propose a user-wise aggregate influence function method to tackle the selection bias issue.
Our intuition is to filter out users whose response data is heavily biased in an aggregate manner.
arXiv Detail & Related papers (2023-08-23T04:57:21Z)
- A Comprehensive Survey on Test-Time Adaptation under Distribution Shifts [143.14128737978342]
Test-time adaptation, an emerging paradigm, has the potential to adapt a pre-trained model to unlabeled data during testing, before making predictions.
Recent progress in this paradigm highlights the significant benefits of utilizing unlabeled data for training self-adapted models prior to inference.
arXiv Detail & Related papers (2023-03-27T16:32:21Z)
- DELTA: degradation-free fully test-time adaptation [59.74287982885375]
We find that two unfavorable defects are concealed in prevalent adaptation methodologies such as test-time batch normalization (BN) and self-learning.
First, we reveal that the normalization statistics in test-time BN are determined entirely by the currently received test samples, resulting in inaccurate estimates.
Second, we show that during test-time adaptation, the parameter update is biased towards some dominant classes.
arXiv Detail & Related papers (2023-01-30T15:54:00Z)
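A toy illustration of the test-time BN defect described in the DELTA entry above (our sketch, not the paper's code): statistics estimated from a small streaming batch are far noisier than population-level statistics.

    import torch

    torch.manual_seed(0)
    x = torch.randn(512, 8) * 3.0 + 1.0   # "full" test distribution, mean ~1.0
    small_batch = x[:4]                   # a small streaming test batch

    # Test-time BN uses only the current batch: a noisy mean estimate.
    batch_mean = small_batch.mean(dim=0)
    # Population-like (running) statistics are much more stable.
    full_mean = x.mean(dim=0)

    print("batch mean abs error:", (batch_mean - 1.0).abs().mean().item())
    print("full  mean abs error:", (full_mean - 1.0).abs().mean().item())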
- BOBCAT: Bilevel Optimization-Based Computerized Adaptive Testing [3.756550107432323]
Computerized adaptive testing (CAT) refers to a form of testing that is personalized to every student/test taker.
We propose BOBCAT, a Bilevel Optimization-Based framework for CAT that directly learns a data-driven question selection algorithm from training data.
arXiv Detail & Related papers (2021-08-17T00:40:23Z)
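Schematically, the bilevel structure referenced in the BOBCAT entry above can be written as follows (simplified notation; see the paper for the exact formulation): an outer problem over selection-policy parameters theta, with an inner problem that adapts per-student parameters phi_i on the selected questions:

    \min_{\theta} \sum_{i} \mathcal{L}^{\mathrm{outer}}_i\big(\phi_i^{*}(\theta)\big)
    \quad \text{s.t.} \quad
    \phi_i^{*}(\theta) = \arg\min_{\phi} \mathcal{L}^{\mathrm{inner}}_i(\phi; \theta),

where the inner loss fits student i's observed responses on policy-selected questions and the outer loss measures prediction accuracy on held-out responses.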
- Noisy Adaptive Group Testing using Bayesian Sequential Experimental Design [63.48989885374238]
When the infection prevalence of a disease is low, Dorfman showed 80 years ago that testing groups of people can prove more efficient than testing people individually.
Our goal in this paper is to propose new group testing algorithms that can operate in a noisy setting.
arXiv Detail & Related papers (2020-04-26T23:41:33Z)
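A quick arithmetic check of Dorfman's classical observation above (standard two-stage pooling analysis, separate from this paper's Bayesian sequential method): with prevalence p and pool size k, two-stage pooling needs an expected 1/k + 1 - (1-p)^k tests per person.

    def dorfman_tests_per_person(p: float, k: int) -> float:
        # One pooled test shared by k people, plus k individual retests
        # whenever the pool is positive (probability 1 - (1-p)^k).
        return 1.0 / k + (1.0 - (1.0 - p) ** k)

    # At 2% prevalence, pools of 8 need ~0.27 tests per person vs. 1 individually.
    for k in (4, 8, 16):
        print(k, round(dorfman_tests_per_person(0.02, k), 3))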
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.