A Survey on Text Classification: From Shallow to Deep Learning
- URL: http://arxiv.org/abs/2008.00364v6
- Date: Wed, 22 Dec 2021 11:35:08 GMT
- Title: A Survey on Text Classification: From Shallow to Deep Learning
- Authors: Qian Li, Hao Peng, Jianxin Li, Congying Xia, Renyu Yang, Lichao Sun, Philip S. Yu, Lifang He
- Abstract summary: The last decade has seen a surge of research in this area due to the unprecedented success of deep learning.
This paper fills the gap by reviewing the state-of-the-art approaches from 1961 to 2021.
We create a taxonomy for text classification according to the text involved and the models used for feature extraction and classification.
- Score: 83.47804123133719
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Text classification is the most fundamental and essential task in natural
language processing. The last decade has seen a surge of research in this area
due to the unprecedented success of deep learning. Numerous methods, datasets,
and evaluation metrics have been proposed in the literature, raising the need
for a comprehensive and updated survey. This paper fills the gap by reviewing
the state-of-the-art approaches from 1961 to 2021, focusing on models ranging from
traditional methods to deep learning. We create a taxonomy for text
classification according to the text involved and the models used for feature
extraction and classification. We then discuss each of these categories in
detail, dealing with both the technical developments and benchmark datasets
that support tests of predictions. A comprehensive comparison between different
techniques, as well as an identification of the pros and cons of various evaluation
metrics, is also provided in this survey. Finally, we conclude by summarizing
key implications, future research directions, and the challenges facing the
research area.
Related papers
- Text Classification using Graph Convolutional Networks: A Comprehensive Survey [11.1080224302799]
Graph convolutional network (GCN)-based approaches have gained considerable traction in this domain over the last decade.
This work aims to summarize and categorize various GCN-based Text Classification approaches with regard to the architecture and mode of supervision.
arXiv Detail & Related papers (2024-10-12T07:03:42Z)
- Are Large Language Models Good Classifiers? A Study on Edit Intent Classification in Scientific Document Revisions [62.12545440385489]
Large language models (LLMs) have brought substantial advancements in text generation, but their potential for enhancing classification tasks remains underexplored.
We propose a framework for thoroughly investigating fine-tuning LLMs for classification, including both generation- and encoding-based approaches.
We instantiate this framework in edit intent classification (EIC), a challenging and underexplored classification task.
arXiv Detail & Related papers (2024-10-02T20:48:28Z)
- Detecting Statements in Text: A Domain-Agnostic Few-Shot Solution [1.3654846342364308]
State-of-the-art approaches usually involve fine-tuning models on large annotated datasets, which are costly to produce.
We propose and release a qualitative and versatile few-shot learning methodology as a common paradigm for any claim-based textual classification task.
We illustrate this methodology in the context of three tasks: climate change contrarianism detection, topic/stance classification, and depression-related symptoms detection.
arXiv Detail & Related papers (2024-05-09T12:03:38Z)
- A Comprehensive Survey of Text Classification Techniques and Their Research Applications: Observational and Experimental Insights [2.1436706159840013]
This survey paper introduces a comprehensive taxonomy specifically designed for text classification based on research fields.
The taxonomy is structured into hierarchical levels: research field-based category, research field-based sub-category, methodology-based technique, methodology sub-technique, and research field applications.
arXiv Detail & Related papers (2024-01-11T08:17:42Z)
- Deep Learning Schema-based Event Extraction: Literature Review and Current Trends [60.29289298349322]
Event extraction technology based on deep learning has become a research hotspot.
This paper fills the gap by reviewing the state-of-the-art approaches, focusing on deep learning-based models.
arXiv Detail & Related papers (2021-07-05T16:32:45Z)
- Deep Learning for Scene Classification: A Survey [48.57123373347695]
Scene classification is a longstanding, fundamental and challenging problem in computer vision.
The rise of large-scale datasets and the renaissance of deep learning techniques have brought remarkable progress in the field of scene representation and classification.
This paper provides a comprehensive survey of recent achievements in scene classification using deep learning.
arXiv Detail & Related papers (2021-01-26T03:06:50Z)
- A Survey of Embedding Space Alignment Methods for Language and Knowledge Graphs [77.34726150561087]
We survey the current research landscape on word, sentence and knowledge graph embedding algorithms.
We provide a classification of the relevant alignment techniques and discuss benchmark datasets used in this field of research.
arXiv Detail & Related papers (2020-10-26T16:08:13Z)
- Deep Learning Based Text Classification: A Comprehensive Review [75.8403533775179]
We provide a review of more than 150 deep learning based models for text classification developed in recent years.
We also provide a summary of more than 40 popular datasets widely used for text classification.
arXiv Detail & Related papers (2020-04-06T02:00:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.