Deep Sequence Models for Text Classification Tasks
- URL: http://arxiv.org/abs/2207.08880v1
- Date: Mon, 18 Jul 2022 18:47:18 GMT
- Title: Deep Sequence Models for Text Classification Tasks
- Authors: Saheed Salahudeen Abdullahi, Sun Yiming, Shamsuddeen Hassan Muhammad,
Abdulrasheed Mustapha, Ahmad Muhammad Aminu, Abdulkadir Abdullahi, Musa
Bello, Saminu Mohammad Aliyu
- Abstract summary: Natural Language Processing (NLP) equips machines to understand diverse and complicated human languages.
Common text classification applications include information retrieval, news topic modeling, theme extraction, sentiment analysis, and spam detection.
Sequence models such as the RNN, GRU, and LSTM are a breakthrough for tasks with long-range dependencies.
Results were excellent, with most of the models performing in the range of 80% to 94%.
- Score: 0.007329200485567826
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The exponential growth of data generated on the Internet in the current
information age is a driving force for the digital economy. Extracting
information is the major source of value in accumulated big data. Machine
learning algorithms that depend on statistical analysis and hand-engineered
rules are overwhelmed by the vast complexities inherent in human languages.
Natural Language Processing (NLP) equips machines to understand these diverse
and complicated human languages. Text classification is an NLP task which
automatically identifies patterns based on predefined or undefined label
sets. Common text classification applications include information retrieval,
news topic modeling, theme extraction, sentiment analysis, and spam detection.
In text, some word sequences depend on the previous or next word sequences to
make full meaning; this is a challenging dependency task that requires the
machine to store important earlier information so that it can influence future
meaning. Sequence models such as the RNN, GRU, and LSTM are a breakthrough for
tasks with long-range dependencies. We therefore applied these models to
binary and multi-class classification. The results were excellent, with most
of the models performing in the range of 80% to 94%. However, these results
are not exhaustive, as we believe there is room for improvement if machines
are to compete with humans.
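The paper applies RNN, GRU, and LSTM models to binary and multi-class text classification but does not publish an implementation. The following is a minimal sketch of such a classifier, assuming PyTorch; the architecture, dimensions, vocabulary size, and toy batch are illustrative assumptions, not the authors' configuration.

```python
# Minimal sketch of an LSTM binary text classifier (PyTorch assumed; all
# hyperparameters here are illustrative placeholders, not the paper's settings).
import torch
import torch.nn as nn

class LSTMClassifier(nn.Module):
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=64, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        # The LSTM's gated cell state is what lets the model carry
        # "previous important information" across long ranges.
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, num_classes)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)   # (batch, seq_len, embed_dim)
        _, (hidden, _) = self.lstm(embedded)   # hidden: (1, batch, hidden_dim)
        return self.fc(hidden[-1])             # (batch, num_classes)

# Toy usage: a batch of two padded token-id sequences.
model = LSTMClassifier(vocab_size=10_000)
batch = torch.randint(1, 10_000, (2, 20))      # (batch=2, seq_len=20)
logits = model(batch)
loss = nn.CrossEntropyLoss()(logits, torch.tensor([0, 1]))
loss.backward()
```

For the GRU variant, nn.GRU returns a single hidden state rather than an (h, c) tuple, so the unpacking in forward changes accordingly; multi-class classification only requires raising num_classes.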
Related papers
- DataAgent: Evaluating Large Language Models' Ability to Answer Zero-Shot, Natural Language Queries [0.0]
We evaluate OpenAI's GPT-3.5 as a "Language Data Scientist" (LDS).
The model was tested on a diverse set of benchmark datasets to evaluate its performance across multiple standards.
arXiv Detail & Related papers (2024-03-29T22:59:34Z)
- Language Models for Text Classification: Is In-Context Learning Enough? [54.869097980761595]
Recent foundational language models have shown state-of-the-art performance in many NLP tasks in zero- and few-shot settings.
An advantage of these models over more standard approaches is the ability to understand instructions written in natural language (prompts).
This makes them suitable for addressing text classification problems for domains with limited amounts of annotated instances.
arXiv Detail & Related papers (2024-03-26T12:47:39Z)
- In-Context Language Learning: Architectures and Algorithms [73.93205821154605]
We study ICL through the lens of a new family of model problems we term in-context language learning (ICLL).
We evaluate a diverse set of neural sequence models on regular ICLL tasks.
arXiv Detail & Related papers (2024-01-23T18:59:21Z)
- The Imitation Game: Detecting Human and AI-Generated Texts in the Era of ChatGPT and BARD [3.2228025627337864]
We introduce a novel dataset of human-written and AI-generated texts in different genres.
We employ several machine learning models to classify the texts.
Results demonstrate the efficacy of these models in distinguishing between human and AI-generated text.
arXiv Detail & Related papers (2023-07-22T21:00:14Z)
- Unsupervised Sentiment Analysis of Plastic Surgery Social Media Posts [91.3755431537592]
The massive collection of user posts across social media platforms is primarily untapped for artificial intelligence (AI) use cases.
Natural language processing (NLP) is a subfield of AI that leverages bodies of documents, known as corpora, to train computers in human-like language understanding.
This study demonstrates that the applied results of unsupervised analysis allow a computer to predict either negative, positive, or neutral user sentiment towards plastic surgery.
arXiv Detail & Related papers (2023-07-05T20:16:20Z)
- Improving Classifier Training Efficiency for Automatic Cyberbullying Detection with Feature Density [58.64907136562178]
We study the effectiveness of Feature Density (FD) using different linguistically-backed feature preprocessing methods.
We hypothesise that estimating dataset complexity allows for the reduction of the number of required experiments.
The difference in linguistic complexity of datasets allows us to additionally discuss the efficacy of linguistically-backed word preprocessing.
arXiv Detail & Related papers (2021-11-02T15:48:28Z)
- Detecting Requirements Smells With Deep Learning: Experiences, Challenges and Future Work [9.44316959798363]
This work aims to improve the previous work by creating a manually labeled dataset and using ensemble learning, Deep Learning (DL), and techniques such as word embeddings and transfer learning to overcome the generalization problem.
The current findings show that the dataset is unbalanced and indicate which classes need more examples.
arXiv Detail & Related papers (2021-08-06T12:45:15Z)
- Sentiment analysis in tweets: an assessment study from classical to modern text representation models [59.107260266206445]
Short texts published on Twitter have earned significant attention as a rich source of information.
Their inherent characteristics, such as an informal and noisy linguistic style, remain challenging for many natural language processing (NLP) tasks.
This study presents an assessment of existing language models for distinguishing the sentiment expressed in tweets, using a rich collection of 22 datasets.
arXiv Detail & Related papers (2021-05-29T21:05:28Z)
- Text Classification Using Hybrid Machine Learning Algorithms on Big Data [0.0]
In this work, two supervised machine learning algorithms are combined with text mining techniques to produce a hybrid model.
The results show that the hybrid model achieved 96.76% accuracy, against 61.45% for the Naïve Bayes model and 69.21% for the SVM model; a hedged baseline sketch in this spirit appears after this list.
arXiv Detail & Related papers (2021-03-30T19:02:48Z)
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
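The hybrid-model entry above compares against Naïve Bayes and SVM baselines. As a point of reference only, here is a minimal sketch of such a baseline comparison, assuming scikit-learn; the TF-IDF features, classifier choices, and toy spam corpus are illustrative assumptions, not that paper's actual setup.

```python
# Minimal sketch of a Naive Bayes vs. SVM text-classification baseline
# (scikit-learn assumed; toy data and pipeline choices are illustrative,
# not the hybrid-model paper's actual setup).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

train_texts = ["free prize, claim now", "meeting moved to 3pm",
               "win cash instantly", "lunch with the team today"]
train_labels = [1, 0, 1, 0]  # 1 = spam, 0 = ham

for name, clf in [("Naive Bayes", MultinomialNB()), ("SVM", LinearSVC())]:
    # TF-IDF features feed each classifier; a "hybrid" could combine
    # the two, e.g. by voting over their predictions.
    pipeline = make_pipeline(TfidfVectorizer(), clf)
    pipeline.fit(train_texts, train_labels)
    print(name, pipeline.predict(["claim your free cash prize"]))
```

A hybrid in the spirit of that entry could, for instance, vote over the two pipelines' predictions; the combination strategy actually used in that paper is not specified here.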