Key Phrase Classification in Complex Assignments
- URL: http://arxiv.org/abs/2003.07019v1
- Date: Mon, 16 Mar 2020 04:25:37 GMT
- Title: Key Phrase Classification in Complex Assignments
- Authors: Manikandan Ravikiran
- Abstract summary: We show that the task of classifying key phrases is ambiguous even at a human level, producing a Cohen's kappa of 0.77 on a new data set.
Both pretrained language models and simple TFIDF SVM classifiers produce similar results, with the former producing on average 0.6 F1 higher than the latter.
- Score: 5.067828201066184
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Complex assignments typically consist of open-ended questions with large and
diverse content in the context of both classroom and online graduate programs.
With the sheer scale of these programs comes a variety of problems in peer and
expert feedback, including rogue reviews. As such, with the hope of identifying
important content needed for the review, in this work we present the very first
work on key phrase classification with a detailed empirical study on
traditional and most recent language modeling approaches. From this study, we
find that the task of classification of key phrases is ambiguous at a human
level producing Cohen's kappa of 0.77 on a new data set. Both pretrained
language models and simple TFIDF SVM classifiers produce similar results, with the
former producing on average 0.6 F1 higher than the latter. We finally derive
practical advice from our extensive empirical and model interpretability
results for those interested in key phrase classification from educational
reports in the future.
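The agreement figure quoted in the abstract is a Cohen's kappa, which corrects the raw agreement between two annotators for the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e). The following is a minimal sketch of that statistic in plain Python, not the paper's own code; the function name `cohen_kappa` and the label values `relevant` and `irrelevant` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    # p_o: fraction of items on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # p_e: chance agreement, from each annotator's own label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both annotators used one identical label throughout; kappa is
        # undefined here, and this sketch treats it as perfect agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical annotations of four key phrases (illustrative labels only).
annotator_1 = ["relevant", "relevant", "irrelevant", "irrelevant"]
annotator_2 = ["relevant", "irrelevant", "irrelevant", "irrelevant"]
kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

In this toy example the annotators agree on three of four phrases (p_o = 0.75) while chance agreement is p_e = 0.5, so kappa is 0.5, noticeably lower than the raw agreement.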
Related papers
- Are Large Language Models Good Classifiers? A Study on Edit Intent Classification in Scientific Document Revisions [62.12545440385489]
Large language models (LLMs) have brought substantial advancements in text generation, but their potential for enhancing classification tasks remains underexplored.
We propose a framework for thoroughly investigating fine-tuning LLMs for classification, including both generation- and encoding-based approaches.
We instantiate this framework in edit intent classification (EIC), a challenging and underexplored classification task.
arXiv Detail & Related papers (2024-10-02T20:48:28Z)
- Disco-Bench: A Discourse-Aware Evaluation Benchmark for Language Modelling [70.23876429382969]
We propose a benchmark that can evaluate intra-sentence discourse properties across a diverse set of NLP tasks.
Disco-Bench consists of 9 document-level testsets in the literature domain, which contain rich discourse phenomena.
For linguistic analysis, we also design a diagnostic test suite that can examine whether the target models learn discourse knowledge.
arXiv Detail & Related papers (2023-07-16T15:18:25Z)
- Estimating Confidence of Predictions of Individual Classifiers and Their Ensembles for the Genre Classification Task [0.0]
Genre identification is a subclass of non-topical text classification.
Neural models based on pre-trained transformers, such as BERT or XLM-RoBERTa, demonstrate SOTA results in many NLP tasks.
arXiv Detail & Related papers (2022-06-15T09:59:05Z)
- Automated Speech Scoring System Under The Lens: Evaluating and interpreting the linguistic cues for language proficiency [26.70127591966917]
We utilize classical machine learning models to formulate a speech scoring task as both a classification and a regression problem.
First, we extract linguistic features under five categories (fluency, pronunciation, content, grammar and vocabulary, and acoustic) and train models to grade responses.
In comparison, we find that the regression-based models perform equivalent to or better than the classification approach.
arXiv Detail & Related papers (2021-11-30T06:28:58Z)
- A Multimodal Machine Learning Framework for Teacher Vocal Delivery Evaluation [21.07429789279818]
We present a novel machine learning approach that utilizes pairwise comparisons and a multimodal fusing algorithm to generate objective evaluation results of the teacher vocal delivery in terms of fluency and passion.
arXiv Detail & Related papers (2021-07-15T05:09:39Z)
- Sentiment analysis in tweets: an assessment study from classical to modern text representation models [59.107260266206445]
Short texts published on Twitter have earned significant attention as a rich source of information.
Their inherent characteristics, such as the informal, and noisy linguistic style, remain challenging to many natural language processing (NLP) tasks.
This study fulfils an assessment of existing language models in distinguishing the sentiment expressed in tweets by using a rich collection of 22 datasets.
arXiv Detail & Related papers (2021-05-29T21:05:28Z)
- Multitask Learning for Class-Imbalanced Discourse Classification [74.41900374452472]
We show that a multitask approach can improve 7% Micro F1-score upon current state-of-the-art benchmarks.
We also offer a comparative review of additional techniques proposed to address resource-poor problems in NLP.
arXiv Detail & Related papers (2021-01-02T07:13:41Z)
- Weakly-Supervised Aspect-Based Sentiment Analysis via Joint Aspect-Sentiment Topic Embedding [71.2260967797055]
We propose a weakly-supervised approach for aspect-based sentiment analysis.
We learn <sentiment, aspect> joint topic embeddings in the word embedding space.
We then use neural models to generalize the word-level discriminative information.
arXiv Detail & Related papers (2020-10-13T21:33:24Z)
- A Survey on Text Classification: From Shallow to Deep Learning [83.47804123133719]
The last decade has seen a surge of research in this area due to the unprecedented success of deep learning.
This paper fills the gap by reviewing the state-of-the-art approaches from 1961 to 2021.
We create a taxonomy for text classification according to the text involved and the models used for feature extraction and classification.
arXiv Detail & Related papers (2020-08-02T00:09:03Z)
- Finding Black Cat in a Coal Cellar -- Keyphrase Extraction & Keyphrase-Rubric Relationship Classification from Complex Assignments [5.067828201066184]
This paper aims to quantify the effectiveness of supervised and unsupervised approaches for the task for keyphrase extraction.
We find that (i) unsupervised MultiPartiteRank produces the best result for keyphrase extraction.
We also present a comprehensive analysis and derive useful observations for those interested in these tasks for the future.
arXiv Detail & Related papers (2020-04-03T13:18:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.