The competent Computational Thinking test (cCTt): Development and
validation of an unplugged Computational Thinking test for upper primary
school
- URL: http://arxiv.org/abs/2203.05980v2
- Date: Wed, 4 May 2022 19:35:04 GMT
- Title: The competent Computational Thinking test (cCTt): Development and
validation of an unplugged Computational Thinking test for upper primary
school
- Authors: Laila El-Hamamsy, María Zapata-Cáceres, Estefanía Martín
Barroso, Francesco Mondada, Jessica Dehler Zufferey, Barbara Bruno
- Abstract summary: The competent CT test (cCTt) is an unplugged CT test targeting 7-9 year-old students.
The expert evaluation indicates that the cCTt shows good face, construct, and content validity.
The psychometric analysis of the student data demonstrates adequate reliability, difficulty, and discriminability.
- Score: 0.8367620276482053
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: With the increasing importance of Computational Thinking (CT) at all levels
of education, it is essential to have valid and reliable assessments.
Currently, there is a lack of such assessments in upper primary school. That is
why we present the development and validation of the competent CT test (cCTt),
an unplugged CT test targeting 7-9 year-old students. In the first phase, 37
experts evaluated the validity of the cCTt through a survey and focus group. In
the second phase, the test was administered to 1519 students. We employed
Classical Test Theory, Item Response Theory, and Confirmatory Factor Analysis
to assess the instrument's psychometric properties. The expert evaluation
indicates that the cCTt shows good face, construct, and content validity.
Furthermore, the psychometric analysis of the student data demonstrates
adequate reliability, difficulty, and discriminability for the target age
groups. Finally, shortened variants of the test are established through
Confirmatory Factor Analysis. To conclude, the proposed cCTt is a valid and
reliable instrument, for use by researchers and educators alike, which expands
the portfolio of validated CT assessments across compulsory education. Future
assessments looking at capturing CT in a more exhaustive manner might consider
combining the cCTt with other forms of assessments.
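The Classical Test Theory quantities named in the abstract (reliability, difficulty, and discriminability) can be illustrated with a minimal sketch. It is not the authors' analysis pipeline: the function names and the toy response matrix are hypothetical, and it assumes items are scored 0/1, reliability is estimated with Cronbach's alpha, and discriminability is taken as the corrected item-total correlation. Item Response Theory and Confirmatory Factor Analysis are not covered here.

```python
from statistics import mean, pvariance


def item_difficulty(responses):
    """CTT difficulty index: proportion of students answering each item correctly."""
    n_items = len(responses[0])
    return [mean([row[j] for row in responses]) for j in range(n_items)]


def cronbach_alpha(responses):
    """Cronbach's alpha from item variances and total-score variance."""
    k = len(responses[0])
    item_vars = [pvariance([row[j] for row in responses]) for j in range(k)]
    total_var = pvariance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def pearson(x, y):
    """Pearson correlation; assumes both inputs have non-zero variance."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def item_discrimination(responses):
    """Corrected item-total correlation: each item vs. the sum of the other items."""
    k = len(responses[0])
    result = []
    for j in range(k):
        item = [row[j] for row in responses]
        rest = [sum(row) - row[j] for row in responses]
        result.append(pearson(item, rest))
    return result


# Hypothetical toy data: 6 students (rows) x 4 items (columns), 1 = correct.
toy = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]

if __name__ == "__main__":
    print("difficulty:", item_difficulty(toy))
    print("alpha:", cronbach_alpha(toy))
    print("discrimination:", item_discrimination(toy))
```

In this sketch, a higher difficulty index means an easier item, and items with low or negative discrimination would be candidates for review. Real analyses of this kind are usually run with dedicated psychometric packages rather than hand-written functions.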
Related papers
- CTBench: A Comprehensive Benchmark for Evaluating Language Model Capabilities in Clinical Trial Design [15.2100541345819]
CTBench is introduced as a benchmark to assess language models (LMs) in aiding clinical study design.
It consists of two datasets: "CT-Repo," containing baseline features from 1,690 clinical trials sourced from clinicaltrials.gov, and "CT-Pub," a subset of 100 trials with more comprehensive baseline features gathered from relevant publications.
arXiv Detail & Related papers (2024-06-25T18:52:48Z)
- ConSiDERS-The-Human Evaluation Framework: Rethinking Human Evaluation for Generative Large Language Models [53.00812898384698]
We argue that human evaluation of generative large language models (LLMs) should be a multidisciplinary undertaking.
We highlight how cognitive biases can conflate fluent information and truthfulness, and how cognitive uncertainty affects the reliability of rating scores such as Likert.
We propose the ConSiDERS-The-Human evaluation framework consisting of 6 pillars -- Consistency, Scoring Criteria, Differentiating, User Experience, Responsible, and Scalability.
arXiv Detail & Related papers (2024-05-28T22:45:28Z)
- Survey of Computerized Adaptive Testing: A Machine Learning Perspective [66.26687542572974]
Computerized Adaptive Testing (CAT) provides an efficient and tailored method for assessing the proficiency of examinees.
This paper aims to provide a machine learning-focused survey on CAT, presenting a fresh perspective on this adaptive testing method.
arXiv Detail & Related papers (2024-03-31T15:09:47Z)
- Analyzing-Evaluating-Creating: Assessing Computational Thinking and Problem Solving in Visual Programming Domains [21.14335914575035]
Computational thinking (CT) and problem-solving skills are increasingly integrated into K-8 school curricula worldwide.
We have developed ACE, a novel test focusing on the three higher cognitive levels in Bloom's taxonomy.
We evaluate the psychometric properties of ACE through a study conducted with 371 students in grades 3-7 from 10 schools.
arXiv Detail & Related papers (2024-03-18T20:18:34Z)
- TEASMA: A Practical Methodology for Test Adequacy Assessment of Deep Neural Networks [4.528286105252983]
TEASMA is a comprehensive and practical methodology designed to accurately assess the adequacy of test sets for Deep Neural Networks.
We evaluate TEASMA with four state-of-the-art test adequacy metrics: Distance-based Surprise Coverage (DSC), Likelihood-based Surprise Coverage (LSC), Input Distribution Coverage (IDC), and Mutation Score (MS).
arXiv Detail & Related papers (2023-08-02T17:56:05Z)
- Revisiting Computer-Aided Tuberculosis Diagnosis [56.80999479735375]
Tuberculosis (TB) is a major global health threat, causing millions of deaths annually.
Computer-aided tuberculosis diagnosis (CTD) using deep learning has shown promise, but progress is hindered by limited training data.
We establish a large-scale dataset, namely the Tuberculosis X-ray (TBX11K) dataset, which contains 11,200 chest X-ray (CXR) images with corresponding bounding box annotations for TB areas.
This dataset enables the training of sophisticated detectors for high-quality CTD.
arXiv Detail & Related papers (2023-07-06T08:27:48Z)
- Effective Matching of Patients to Clinical Trials using Entity Extraction and Neural Re-ranking [8.200196331837576]
Clinical trials (CTs) often fail due to inadequate patient recruitment.
This paper tackles the challenges of CT retrieval by presenting an approach that addresses the patient-to-trials paradigm.
arXiv Detail & Related papers (2023-07-01T16:42:39Z)
- From Static Benchmarks to Adaptive Testing: Psychometrics in AI Evaluation [60.14902811624433]
We discuss a paradigm shift from static evaluation methods to adaptive testing.
This involves estimating the characteristics and value of each test item in the benchmark and dynamically adjusting items in real-time.
We analyze the current approaches, advantages, and underlying reasons for adopting psychometrics in AI evaluation.
arXiv Detail & Related papers (2023-06-18T09:54:33Z)
- The competent Computational Thinking test (cCTt): a valid, reliable and gender-fair test for longitudinal CT studies in grades 3-6 [0.06282171844772422]
This study investigated whether the competent Computational Thinking test (cCTt) could evaluate learning reliably from grades 3 to 6 (ages 7-11) using data from 2709 students.
The findings indicate that the cCTt is valid, reliable and gender-fair for grades 3-6, although more complex items would be beneficial for grades 5-6.
arXiv Detail & Related papers (2023-05-31T03:29:04Z)
- Towards Reliable Medical Image Segmentation by utilizing Evidential Calibrated Uncertainty [52.03490691733464]
We introduce DEviS, an easily implementable foundational model that seamlessly integrates into various medical image segmentation networks.
By leveraging subjective logic theory, we explicitly model probability and uncertainty for the problem of medical image segmentation.
DEviS incorporates an uncertainty-aware filtering module, which utilizes the metric of uncertainty-calibrated error to filter reliable data.
arXiv Detail & Related papers (2023-01-01T05:02:46Z)
- Dual-Consistency Semi-Supervised Learning with Uncertainty Quantification for COVID-19 Lesion Segmentation from CT Images [49.1861463923357]
We propose an uncertainty-guided dual-consistency learning network (UDC-Net) for semi-supervised COVID-19 lesion segmentation from CT images.
Our proposed UDC-Net improves the fully supervised method by 6.3% in Dice and outperforms other competitive semi-supervised approaches by significant margins.
arXiv Detail & Related papers (2021-04-07T16:23:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.