Related papers: Are We Really Making Much Progress in Text Classification? A Comparative Review

URL: http://arxiv.org/abs/2204.03954v6
Date: Sun, 19 Jan 2025 20:37:45 GMT
Title: Are We Really Making Much Progress in Text Classification? A Comparative Review
Authors: Lukas Galke, Ansgar Scherp, Andor Diera, Fabian Karl, Bao Xin Lin, Bhakti Khera, Tim Meuser, Tushar Singhal
Abstract summary: We analyze various methods for single-label and multi-label text classification across well-known datasets. We highlight the superiority of discriminative language models like BERT over generative models for supervised tasks.
Score: 5.33235750734179
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We analyze various methods for single-label and multi-label text classification across well-known datasets, categorizing them into bag-of-words, sequence-based, graph-based, and hierarchical approaches. Despite the surge in methods like graph-based models, encoder-only pre-trained language models, notably BERT, remain state-of-the-art. However, recent findings suggest simpler models like logistic regression and trigram-based SVMs outperform newer techniques. While decoder-only generative language models show promise in learning with limited data, they lag behind encoder-only models in performance. We emphasize the superiority of discriminative language models like BERT over generative models for supervised tasks. Additionally, we highlight the literature's lack of robustness in method comparisons, particularly concerning basic hyperparameter optimizations like learning rate in fine-tuning encoder-only language models. Data availability: The source code is available at https://github.com/drndr/multilabel-text-clf. All datasets used for our experiments are publicly available except the NYT dataset.

Related papers

READ: Reinforcement-based Adversarial Learning for Text Classification with Limited Labeled Data [7.152603583363887]
Pre-trained transformer models such as BERT have shown massive gains across many text classification tasks. This paper proposes a method that encapsulates reinforcement learning-based text generation and semi-supervised adversarial learning approaches. Our method READ, Reinforcement-based Adversarial learning, utilizes an unlabeled dataset to generate diverse synthetic text through reinforcement learning.
arXiv Detail & Related papers (2025-01-14T11:39:55Z)
Less is More: Making Smaller Language Models Competent Subgraph Retrievers for Multi-hop KGQA [51.3033125256716]
We model the subgraph retrieval task as a conditional generation task handled by small language models. Our base generative subgraph retrieval model, consisting of only 220M parameters, achieves competitive retrieval performance compared to state-of-the-art models. Our largest 3B model, when plugged with an LLM reader, sets new SOTA end-to-end performance on both the WebQSP and CWQ benchmarks.
arXiv Detail & Related papers (2024-10-08T15:22:36Z)
Leveraging Annotator Disagreement for Text Classification [3.6625157427847963]
It is common practice in text classification to only use one majority label for model training even if a dataset has been annotated by multiple annotators. This paper proposes three strategies to leverage annotator disagreement for text classification: a probability-based multi-label method, an ensemble system, and instruction tuning.
arXiv Detail & Related papers (2024-09-26T06:46:53Z)
Language Models for Text Classification: Is In-Context Learning Enough? [54.869097980761595]
Recent foundational language models have shown state-of-the-art performance in many NLP tasks in zero- and few-shot settings. An advantage of these models over more standard approaches is the ability to understand instructions written in natural language (prompts). This makes them suitable for addressing text classification problems for domains with limited amounts of annotated instances.
arXiv Detail & Related papers (2024-03-26T12:47:39Z)
XAI-CLASS: Explanation-Enhanced Text Classification with Extremely Weak Supervision [6.406111099707549]
XAI-CLASS is a novel explanation-enhanced weakly-supervised text classification method. It incorporates word saliency prediction as an auxiliary task. XAI-CLASS outperforms other weakly-supervised text classification methods significantly.
arXiv Detail & Related papers (2023-10-31T23:24:22Z)
Linear Classifier: An Often-Forgotten Baseline for Text Classification [12.792276278777532]
We argue for the importance of running a simple baseline like linear classifiers on bag-of-words features along with advanced methods. Advanced models such as BERT may only achieve the best results if properly applied.
arXiv Detail & Related papers (2023-06-12T13:39:54Z)
Evaluating Unsupervised Text Classification: Zero-shot and Similarity-based Approaches [0.6767885381740952]
Similarity-based approaches attempt to classify instances based on similarities between text document representations and class description representations. Zero-shot text classification approaches aim to generalize knowledge gained from a training task by assigning appropriate labels of unknown classes to text documents. This paper conducts a systematic evaluation of different similarity-based and zero-shot approaches for text classification of unseen classes.
arXiv Detail & Related papers (2022-11-29T15:14:47Z)
TabLLM: Few-shot Classification of Tabular Data with Large Language Models [66.03023402174138]
We study the application of large language models to zero-shot and few-shot classification. We evaluate several serialization methods including templates, table-to-text models, and large language models. This approach is also competitive with strong traditional baselines like gradient-boosted trees.
arXiv Detail & Related papers (2022-10-19T17:08:13Z)
BenchCLAMP: A Benchmark for Evaluating Language Models on Syntactic and Semantic Parsing [55.058258437125524]
We introduce BenchCLAMP, a Benchmark to evaluate Constrained LAnguage Model Parsing. We benchmark eight language models, including two GPT-3 variants available only through an API. Our experiments show that encoder-decoder pretrained language models can achieve similar performance or surpass state-of-the-art methods for syntactic and semantic parsing when the model output is constrained to be valid.
arXiv Detail & Related papers (2022-06-21T18:34:11Z)
Hierarchical Neural Network Approaches for Long Document Classification [3.6700088931938835]
We employ the pre-trained Universal Sentence Encoder (USE) and Bidirectional Encoder Representations from Transformers (BERT) in a hierarchical setup to capture better representations efficiently. Our proposed models are conceptually simple: we divide the input data into chunks and then pass these through base models of BERT and USE. We show that USE + CNN/LSTM performs better than its stand-alone baseline, whereas BERT + CNN/LSTM performs on par with its stand-alone counterpart.
arXiv Detail & Related papers (2022-01-18T07:17:40Z)
Hierarchical Heterogeneous Graph Representation Learning for Short Text Classification [60.233529926965836]
We propose a new method called SHINE, which is based on graph neural network (GNN) for short text classification. First, we model the short text dataset as a hierarchical heterogeneous graph consisting of word-level component graphs. Then, we dynamically learn a short document graph that facilitates effective label propagation among similar short texts.
arXiv Detail & Related papers (2021-10-30T05:33:05Z)
CLLD: Contrastive Learning with Label Distance for Text Classification [0.6299766708197883]
We propose Contrastive Learning with Label Distance (CLLD) for learning contrastive classes. CLLD ensures the flexibility within the subtle differences that lead to different label assignments. Our experiments suggest that the learned label distance relieves the adversarial nature of interclasses.
arXiv Detail & Related papers (2021-10-25T07:07:14Z)
Revisiting Self-Training for Few-Shot Learning of Language Model [61.173976954360334]
Unlabeled data carry rich task-relevant information and have proven useful for few-shot learning of language models. In this work, we revisit the self-training technique for language model fine-tuning and present a state-of-the-art prompt-based few-shot learner, SFLM.
arXiv Detail & Related papers (2021-10-04T08:51:36Z)
Multi-Label Image Classification with Contrastive Learning [57.47567461616912]
We show that a direct application of contrastive learning hardly improves performance in multi-label cases. We propose a novel framework for multi-label classification with contrastive learning in a fully supervised setting.
arXiv Detail & Related papers (2021-07-24T15:00:47Z)
Sentiment analysis in tweets: an assessment study from classical to modern text representation models [59.107260266206445]
Short texts published on Twitter have earned significant attention as a rich source of information. Their inherent characteristics, such as the informal and noisy linguistic style, remain challenging for many natural language processing (NLP) tasks. This study carries out an assessment of existing language models in distinguishing the sentiment expressed in tweets by using a rich collection of 22 datasets.
arXiv Detail & Related papers (2021-05-29T21:05:28Z)
Self-Training Pre-Trained Language Models for Zero- and Few-Shot Multi-Dialectal Arabic Sequence Labeling [7.310390479801139]
We self-train pre-trained language models in zero- and few-shot scenarios to improve performance on data-scarce varieties. Our work opens up opportunities for developing Dialectal Arabic (DA) models exploiting only Modern Standard Arabic (MSA) resources.
arXiv Detail & Related papers (2021-01-12T21:29:30Z)
Text Classification Using Label Names Only: A Language Model Self-Training Approach [80.63885282358204]
Current text classification methods typically require a good number of human-labeled documents as training data. We show that our model achieves around 90% accuracy on four benchmark datasets including topic and sentiment classification.
arXiv Detail & Related papers (2020-10-14T17:06:41Z)
Cooperative Bi-path Metric for Few-shot Learning [50.98891758059389]
We make two contributions to investigate the few-shot classification problem. We report a simple and effective baseline trained on base classes in the way of traditional supervised learning. We propose a cooperative bi-path metric for classification, which leverages the correlations between base classes and novel classes to further improve the accuracy.
arXiv Detail & Related papers (2020-08-10T11:28:52Z)
