CLICKER: A Computational LInguistics Classification Scheme for
Educational Resources
- URL: http://arxiv.org/abs/2112.08578v1
- Date: Thu, 16 Dec 2021 02:40:43 GMT
- Title: CLICKER: A Computational LInguistics Classification Scheme for
Educational Resources
- Authors: Swapnil Hingmire, Irene Li, Rena Kawamura, Benjamin Chen, Alexander
Fabbri, Xiangru Tang, Yixin Liu, Thomas George, Tammy Liao, Wai Pan Wong,
Vanessa Yan, Richard Zhou, Girish K. Palshikar, Dragomir Radev
- Abstract summary: A classification scheme of a scientific subject gives an overview of its body of knowledge.
A comprehensive classification system like CCS or Mathematics Subject Classification (MSC) does not exist for Computational Linguistics (CL) and Natural Language Processing (NLP).
We propose a classification scheme -- CLICKER for CL/NLP based on the analysis of online lectures from 77 university courses on this subject.
- Score: 47.48935730905393
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: A classification scheme of a scientific subject gives an overview of its body
of knowledge. It can also be used to facilitate access to research articles and
other materials related to the subject. For example, the ACM Computing
Classification System (CCS) is used in the ACM Digital Library search interface
and also for indexing computer science papers. We observed that a comprehensive
classification system like CCS or Mathematics Subject Classification (MSC) does
not exist for Computational Linguistics (CL) and Natural Language Processing
(NLP). We propose a classification scheme -- CLICKER for CL/NLP based on the
analysis of online lectures from 77 university courses on this subject. The
currently proposed taxonomy includes 334 topics and focuses on educational
aspects of CL/NLP; it is based primarily, but not exclusively, on lecture notes
from NLP courses. We discuss how such a taxonomy can help in various real-world
applications, including tutoring platforms, resource retrieval, resource
recommendation, prerequisite chain learning, and survey generation.
Related papers
- Are Large Language Models Good Classifiers? A Study on Edit Intent Classification in Scientific Document Revisions [62.12545440385489]
Large language models (LLMs) have brought substantial advancements in text generation, but their potential for enhancing classification tasks remains underexplored.
We propose a framework for thoroughly investigating fine-tuning LLMs for classification, including both generation- and encoding-based approaches.
We instantiate this framework in edit intent classification (EIC), a challenging and underexplored classification task.
arXiv Detail & Related papers (2024-10-02T20:48:28Z)
- Empowering Interdisciplinary Research with BERT-Based Models: An Approach Through SciBERT-CNN with Topic Modeling [0.0]
This paper introduces a novel approach using the SciBERT model and CNNs to systematically categorize academic abstracts.
The CNN uses convolution and pooling to enhance feature extraction and reduce dimensionality.
arXiv Detail & Related papers (2024-04-16T05:21:47Z)
- Hierarchical Multi-Label Classification of Scientific Documents [47.293189105900524]
We introduce a new dataset for hierarchical multi-label text classification of scientific papers called SciHTC.
This dataset contains 186,160 papers and 1,233 categories from the ACM CCS tree.
Our best model achieves a Macro-F1 score of 34.57% which shows that this dataset provides significant research opportunities.
arXiv Detail & Related papers (2022-11-05T04:12:57Z)
- TaxoCom: Topic Taxonomy Completion with Hierarchical Discovery of Novel Topic Clusters [57.59286394188025]
We propose a novel framework for topic taxonomy completion, named TaxoCom.
TaxoCom discovers novel sub-topic clusters of terms and documents.
Our comprehensive experiments on two real-world datasets demonstrate that TaxoCom generates a high-quality topic taxonomy in terms of term coherency and topic coverage.
arXiv Detail & Related papers (2022-01-18T07:07:38Z)
- A Systematic Literature Review of Automated ICD Coding and Classification Systems using Discharge Summaries [5.156484100374058]
Codification of free-text clinical narratives has long been recognised to be beneficial for secondary uses such as funding, insurance claim processing and research.
The current scenario of assigning codes is a manual process which is very expensive, time-consuming and error prone.
This systematic literature review provides a comprehensive overview of automated clinical coding systems.
arXiv Detail & Related papers (2021-07-12T03:55:17Z)
- One-Class Classification: A Survey [96.17410674315816]
One-Class Classification (OCC) is a special case of multi-class classification, where data observed during training is from a single positive class.
We provide a survey of classical statistical and recent deep learning-based OCC methods for visual recognition.
arXiv Detail & Related papers (2021-01-08T15:30:29Z)
- A Survey on Curriculum Learning [48.36129047271622]
Curriculum learning (CL) is a training strategy that trains a machine learning model from easier data to harder data.
As an easy-to-use plug-in, the CL strategy has demonstrated its power in improving the generalization capacity and convergence rate of various models.
arXiv Detail & Related papers (2020-10-25T17:15:04Z)
- AutoMSC: Automatic Assignment of Mathematics Subject Classification Labels [4.001125251113153]
We investigate the feasibility of automatically assigning a coarse-grained primary classification using the Mathematics Subject Classification scheme.
We find that our method achieves an F1-score of over 77%, which is remarkably close to the agreement of zbMATH and MR.
arXiv Detail & Related papers (2020-05-25T13:26:45Z)