Blueprinting the Future: Automatic Item Categorization using
Hierarchical Zero-Shot and Few-Shot Classifiers
- URL: http://arxiv.org/abs/2312.03561v1
- Date: Wed, 6 Dec 2023 15:51:49 GMT
- Title: Blueprinting the Future: Automatic Item Categorization using
Hierarchical Zero-Shot and Few-Shot Classifiers
- Authors: Ting Wang, Keith Stelter, Jenn Floyd, Thomas O'Neill, Nathaniel
Hendrix, Andrew Bazemore, Kevin Rode, Warren Newton
- Abstract summary: This study unveils a novel approach employing the zero-shot and few-shot Generative Pretrained Transformer (GPT) for hierarchical item categorization.
The hierarchical nature of examination blueprints is navigated seamlessly, allowing for a tiered classification of items across multiple levels.
An initial simulation with artificial data demonstrates the efficacy of this method, achieving an average F1 score of 92.91%.
- Score: 6.907552533477328
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the testing industry, precise item categorization is pivotal to align exam
questions with the designated content domains outlined in the assessment
blueprint. Traditional methods either entail manual classification, which is
laborious and error-prone, or utilize machine learning requiring extensive
training data, often leading to model underfit or overfit issues. This study
unveils a novel approach employing the zero-shot and few-shot Generative
Pretrained Transformer (GPT) classifier for hierarchical item categorization,
minimizing the necessity for training data, and instead, leveraging human-like
language descriptions to define categories. Through a structured Python
dictionary, the hierarchical nature of examination blueprints is navigated
seamlessly, allowing for a tiered classification of items across multiple
levels. An initial simulation with artificial data demonstrates the efficacy of
this method, achieving an average F1 score of 92.91%.
This method was further applied to real exam items from the 2022 In-Training
Examination (ITE) conducted by the American Board of Family Medicine (ABFM),
reclassifying 200 items according to a newly formulated blueprint in just 15
minutes, a task that traditionally could span several days of work by editors
and physicians. This innovative approach not only drastically cuts down
classification time but also ensures a consistent, principle-driven
categorization, minimizing human biases and discrepancies. The ability to
refine classifications by adjusting definitions adds to its robustness and
sustainability.
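The tiered classification over a structured Python dictionary described in the abstract can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: the blueprint categories, the `classify_item` helper, and the keyword-overlap `score` function (a self-contained stand-in for the zero-shot GPT classifier the paper actually uses) are all assumptions made for the example.

```python
# Sketch of hierarchical item categorization over a blueprint dictionary.
# The blueprint content and the scoring function are hypothetical stand-ins:
# the paper uses a zero-shot/few-shot GPT classifier with natural-language
# category definitions, replaced here by keyword overlap so the example runs
# without an API.

blueprint = {
    "Cardiovascular": {
        "Hypertension": ["blood pressure", "hypertension", "antihypertensive"],
        "Heart Failure": ["ejection fraction", "heart failure", "edema"],
    },
    "Respiratory": {
        "Asthma": ["wheezing", "asthma", "bronchodilator"],
        "Pneumonia": ["pneumonia", "consolidation", "antibiotic"],
    },
}

def score(item_text, keywords):
    """Stand-in for the zero-shot classifier: count keyword matches."""
    text = item_text.lower()
    return sum(kw in text for kw in keywords)

def classify_item(item_text, tree):
    """Walk the blueprint one level at a time (tiered classification)."""
    path = []
    node = tree
    while isinstance(node, dict):
        # Score every category at the current level, then descend into
        # the best-scoring branch before classifying the next level.
        def level_score(category):
            sub = node[category]
            kws = sub if isinstance(sub, list) else [
                kw for leaf in sub.values() for kw in leaf
            ]
            return score(item_text, kws)
        best = max(node, key=level_score)
        path.append(best)
        node = node[best]
    return path

item = "A patient with wheezing improves after a bronchodilator."
print(classify_item(item, blueprint))  # ['Respiratory', 'Asthma']
```

In the paper's setting, the stand-in `score` would be replaced by a GPT prompt that presents the natural-language definitions of the categories at the current level; the tiered walk itself is what lets the method handle multi-level blueprints with a single generic routine.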
Related papers
- Co-training for Low Resource Scientific Natural Language Inference [65.37685198688538]
We propose a novel co-training method that assigns weights based on the training dynamics of the classifiers to the distantly supervised labels.
By assigning importance weights instead of filtering out examples based on an arbitrary threshold on the predicted confidence, we maximize the usage of automatically labeled data.
The proposed method obtains an improvement of 1.5% in Macro F1 over the distant supervision baseline, and substantial improvements over several other strong SSL baselines.
arXiv Detail & Related papers (2024-06-20T18:35:47Z)
- Learn to Categorize or Categorize to Learn? Self-Coding for Generalized Category Discovery [49.1865089933055]
We propose a novel, efficient and self-supervised method capable of discovering previously unknown categories at test time.
A salient feature of our approach is the assignment of minimum length category codes to individual data instances.
Experimental evaluations, bolstered by state-of-the-art benchmark comparisons, testify to the efficacy of our solution.
arXiv Detail & Related papers (2023-10-30T17:45:32Z)
- Task-Specific Embeddings for Ante-Hoc Explainable Text Classification [6.671252951387647]
We propose an alternative training objective in which we learn task-specific embeddings of text.
Our proposed objective learns embeddings such that all texts that share the same target class label should be close together.
We present extensive experiments which show that the benefits of ante-hoc explainability and incremental learning come at no cost in overall classification accuracy.
arXiv Detail & Related papers (2022-11-30T19:56:25Z)
- Parametric Classification for Generalized Category Discovery: A Baseline Study [70.73212959385387]
Generalized Category Discovery (GCD) aims to discover novel categories in unlabelled datasets using knowledge learned from labelled samples.
We investigate the failure of parametric classifiers, verify the effectiveness of previous design choices when high-quality supervision is available, and identify unreliable pseudo-labels as a key problem.
We propose a simple yet effective parametric classification method that benefits from entropy regularisation, achieves state-of-the-art performance on multiple GCD benchmarks and shows strong robustness to unknown class numbers.
arXiv Detail & Related papers (2022-11-21T18:47:11Z)
- Novel Class Discovery without Forgetting [72.52222295216062]
We identify and formulate a new, pragmatic problem setting of NCDwF: Novel Class Discovery without Forgetting.
We propose a machine learning model to incrementally discover novel categories of instances from unlabeled data.
We introduce experimental protocols based on CIFAR-10, CIFAR-100 and ImageNet-1000 to measure the trade-off between knowledge retention and novel class discovery.
arXiv Detail & Related papers (2022-07-21T17:54:36Z)
- Cluster & Tune: Boost Cold Start Performance in Text Classification [21.957605438780224]
In real-world scenarios, a text classification task often begins with a cold start, when labeled data is scarce.
We suggest a method to boost the performance of such models by adding an intermediate unsupervised classification task.
arXiv Detail & Related papers (2022-03-20T15:29:34Z)
- Theoretical Insights Into Multiclass Classification: A High-dimensional Asymptotic View [82.80085730891126]
We provide the first modern, precise analysis of linear multiclass classification.
Our analysis reveals that the classification accuracy is highly distribution-dependent.
The insights gained may pave the way for a precise understanding of other classification algorithms.
arXiv Detail & Related papers (2020-11-16T05:17:29Z)
- Predicting Classification Accuracy When Adding New Unobserved Classes [8.325327265120283]
We study how a classifier's performance can be used to extrapolate its expected accuracy on a larger, unobserved set of classes.
We formulate a robust neural-network-based algorithm, "CleaneX", which learns to estimate the accuracy of such classifiers on arbitrarily large sets of classes.
arXiv Detail & Related papers (2020-10-28T14:37:25Z)
- Pre-training Is (Almost) All You Need: An Application to Commonsense Reasoning [61.32992639292889]
Fine-tuning of pre-trained transformer models has become the standard approach for solving common NLP tasks.
We introduce a new scoring method that casts a plausibility ranking task in a full-text format.
We show that our method provides a much more stable training phase across random restarts.
arXiv Detail & Related papers (2020-04-29T10:54:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.