An NLP Crosswalk Between the Common Core State Standards and NAEP Item Specifications
- URL: http://arxiv.org/abs/2405.17284v2
- Date: Fri, 31 May 2024 21:30:44 GMT
- Title: An NLP Crosswalk Between the Common Core State Standards and NAEP Item Specifications
- Authors: Gregory Camilli
- Abstract summary: I describe an NLP-based procedure that can be used to support subject matter experts in establishing a crosswalk between item specifications and content standards.
The procedure is used to evaluate the match of the Common Core State Standards for mathematics at grade 4 to the corresponding item specifications for the 2026 National Assessment of Educational Progress.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Natural language processing (NLP) is rapidly developing for applications in educational assessment. In this paper, I describe an NLP-based procedure that can be used to support subject matter experts in establishing a crosswalk between item specifications and content standards. This paper extends recent work by proposing and demonstrating the use of multivariate similarity based on embedding vectors for sentences or texts. In particular, a hybrid regression procedure is demonstrated for establishing the match of each content standard to multiple item specifications. The procedure is used to evaluate the match of the Common Core State Standards (CCSS) for mathematics at grade 4 to the corresponding item specifications for the 2026 National Assessment of Educational Progress (NAEP).
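The abstract's pairing of univariate similarity with a regression over multiple item specifications can be sketched as follows. This is a minimal illustration, not the paper's implementation: the embeddings are random stand-ins (in practice they would come from a sentence-embedding model), and all variable names and the toy dimensions are assumptions for the example.

```python
# Sketch: match a content standard to multiple item specifications.
# Univariate view: cosine similarity to each spec embedding.
# Multivariate view: regress the standard's embedding on all spec
# embeddings, so the coefficients reflect each spec's contribution.
import numpy as np

rng = np.random.default_rng(0)
dim = 8        # toy embedding dimensionality
n_specs = 4    # number of item specifications

specs = rng.normal(size=(n_specs, dim))            # one embedding per item spec
standard = specs[1] + 0.1 * rng.normal(size=dim)   # a standard close to spec 1

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Univariate similarities of the standard to each item specification.
sims = [cosine(standard, s) for s in specs]

# Regression of the standard's embedding on the spec embeddings:
# solve specs.T @ coefs ~= standard in the least-squares sense.
coefs, *_ = np.linalg.lstsq(specs.T, standard, rcond=None)

best = int(np.argmax(sims))
print("cosine similarities:", np.round(sims, 3))
print("regression weights:", np.round(coefs, 3))
print("best-matching spec:", best)
```

In this toy setup the standard was constructed near spec 1, so both the largest cosine similarity and the dominant regression weight should point to that specification; with real sentence embeddings, the regression view lets one standard be explained by a combination of several item specifications rather than a single nearest neighbor.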
Related papers
- Exploring Ordinality in Text Classification: A Comparative Study of Explicit and Implicit Techniques [3.197435100145382]
Ordinal Classification (OC) is a widely encountered challenge in Natural Language Processing (NLP)
Previous approaches to tackling OC have primarily focused on modifying existing loss functions, or creating novel ones, that explicitly account for the ordinal nature of labels.
With the advent of Pretrained Language Models (PLMs), it became possible to tackle ordinality through the implicit semantics of the labels as well.
arXiv Detail & Related papers (2024-05-20T04:31:04Z) - Standardize: Aligning Language Models with Expert-Defined Standards for Content Generation [4.1205832766381985]
We introduce Standardize, a retrieval-style in-context learning-based framework to guide large language models to align with expert-defined standards.
Our findings show that models gain a 40% and 100% increase in precise accuracy for Llama2 and GPT-4, respectively.
arXiv Detail & Related papers (2024-02-19T23:18:18Z) - WYWEB: A NLP Evaluation Benchmark For Classical Chinese [10.138128038929237]
We introduce the WYWEB evaluation benchmark, which consists of nine NLP tasks in classical Chinese.
We evaluate existing pre-trained language models, all of which struggle with this benchmark.
arXiv Detail & Related papers (2023-05-23T15:15:11Z) - An Inclusive Notion of Text [69.36678873492373]
We argue that clarity on the notion of text is crucial for reproducible and generalizable NLP.
We introduce a two-tier taxonomy of linguistic and non-linguistic elements that are available in textual sources and can be used in NLP modeling.
arXiv Detail & Related papers (2022-11-10T14:26:43Z) - Revise and Resubmit: An Intertextual Model of Text-based Collaboration in Peer Review [52.359007622096684]
Peer review is a key component of the publishing process in most fields of science.
Existing NLP studies focus on the analysis of individual texts, but editorial assistance often requires modeling interactions between pairs of texts.
arXiv Detail & Related papers (2022-04-22T16:39:38Z) - Unsupervised Attention-based Sentence-Level Meta-Embeddings from Contextualised Language Models [15.900069711477542]
We propose a sentence-level meta-embedding learning method that takes independently trained contextualised word embedding models as input.
Our proposed method is unsupervised and is not tied to a particular downstream task.
Experimental results show that our proposed unsupervised sentence-level meta-embedding method outperforms previously proposed sentence-level meta-embedding methods.
arXiv Detail & Related papers (2022-04-16T08:20:24Z) - Benchmarking Generalization via In-Context Instructions on 1,600+ Language Tasks [95.06087720086133]
Natural-Instructions v2 is a collection of 1,600+ diverse language tasks and their expert written instructions.
The benchmark covers 70+ distinct task types, such as tagging, in-filling, and rewriting.
This benchmark enables large-scale evaluation of cross-task generalization of the models.
arXiv Detail & Related papers (2022-04-16T03:12:30Z) - CUGE: A Chinese Language Understanding and Generation Evaluation Benchmark [144.05723617401674]
General-purpose language intelligence evaluation has been a longstanding goal for natural language processing.
We argue that for general-purpose language intelligence evaluation, the benchmark itself needs to be comprehensive and systematic.
We propose CUGE, a Chinese Language Understanding and Generation Evaluation benchmark with the following features.
arXiv Detail & Related papers (2021-12-27T11:08:58Z) - Multi-view Subword Regularization [111.04350390045705]
Multi-view Subword Regularization (MVR) is a method that enforces consistency between predictions made from inputs tokenized by the standard and probabilistic segmentations.
Results on the XTREME multilingual benchmark show that MVR brings consistent improvements of up to 2.5 points over using standard segmentation algorithms.
arXiv Detail & Related papers (2021-03-15T16:07:42Z) - NEMO: Frequentist Inference Approach to Constrained Linguistic Typology Feature Prediction in SIGTYP 2020 Shared Task [83.43738174234053]
We employ frequentist inference to represent correlations between typological features and use this representation to train simple multi-class estimators that predict individual features.
Our best configuration achieved a micro-averaged accuracy of 0.66 on 149 test languages.
arXiv Detail & Related papers (2020-10-12T19:25:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.