Natural Language Processing Through Transfer Learning: A Case Study on Sentiment Analysis
- URL: http://arxiv.org/abs/2311.16965v1
- Date: Tue, 28 Nov 2023 17:12:06 GMT
- Title: Natural Language Processing Through Transfer Learning: A Case Study on Sentiment Analysis
- Authors: Aman Yadav, Abhishek Vichare
- Abstract summary: This paper explores the potential of transfer learning in natural language processing, focusing mainly on sentiment analysis.
The claim is that, compared to training models from scratch, transfer learning, using pre-trained BERT models, can increase sentiment classification accuracy.
- Score: 1.14219428942199
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Artificial intelligence and machine learning have significantly bolstered the
technological world. This paper explores the potential of transfer learning in
natural language processing, focusing mainly on sentiment analysis. Models
trained on large datasets can also be used where data are scarce. The claim is
that, compared to training models from scratch, transfer learning with
pre-trained BERT models can increase sentiment classification accuracy. The
study adopts a sophisticated experimental design that uses the IMDb dataset of
sentiment-labelled movie reviews. Pre-processing includes tokenization and
encoding of the text data, making it suitable for NLP models. The dataset is
applied to a BERT-based model, whose performance is measured using accuracy.
The reported accuracy is 100 per cent. Although perfect accuracy may
appear impressive, it might be the result of overfitting or a lack of
generalization. Further analysis is required to ensure the model's ability to
handle diverse and unseen data. The findings underscore the effectiveness of
transfer learning in NLP, showcasing its potential to excel in sentiment
analysis tasks. However, the research calls for a cautious interpretation of
perfect accuracy and emphasizes the need for additional measures to validate
the model's generalization.
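The abstract's caution about a 100 per cent score can be made concrete with a few sanity checks. The following is a minimal sketch, not the authors' procedure: the function names (`accuracy`, `leaked_fraction`, `majority_baseline`) and the toy reviews are hypothetical, and the checks assume that predictions and labels are already available as integer lists. It computes accuracy, measures how many test reviews also occur in the training set (one common cause of inflated scores), and reports the accuracy of always predicting the most frequent label.

```python
from collections import Counter
from typing import Sequence


def accuracy(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Fraction of predictions that equal the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if len(actual) == 0:
        raise ValueError("cannot compute accuracy on an empty set")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace so near-verbatim copies compare equal."""
    return " ".join(text.lower().split())


def leaked_fraction(train_texts: Sequence[str], test_texts: Sequence[str]) -> float:
    """Fraction of test texts that also appear (after normalization) in training."""
    if len(test_texts) == 0:
        raise ValueError("test_texts must not be empty")
    seen = {_normalize(t) for t in train_texts}
    return sum(_normalize(t) in seen for t in test_texts) / len(test_texts)


def majority_baseline(labels: Sequence[int]) -> float:
    """Accuracy obtained by always predicting the most common label."""
    if len(labels) == 0:
        raise ValueError("labels must not be empty")
    return Counter(labels).most_common(1)[0][1] / len(labels)


# Toy, made-up data for illustration only (not taken from IMDb or the paper).
train_texts = ["Great film!", "Terrible plot.", "Loved it"]
test_texts = ["great   film!", "Boring and slow."]
test_labels = [1, 0]
predictions = [1, 0]  # hypothetical model output

report = {
    "test_accuracy": accuracy(predictions, test_labels),
    "leaked_fraction": leaked_fraction(train_texts, test_texts),
    "majority_baseline": majority_baseline(test_labels),
}
print(report)
```

A perfect test accuracy combined with a non-zero leaked fraction, or with a majority baseline that is already close to 1, would suggest that the score says little about generalization to unseen reviews.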
Related papers
- Detecting AI Generated Text Based on NLP and Machine Learning Approaches [0.0]
Recent advances in natural language processing may enable AI models to generate writing that is identical to human written form in the future.
This might have profound ethical, legal, and social repercussions.
Our approach includes machine learning methods that can differentiate between electronically produced text and human-written text.
arXiv Detail & Related papers (2024-04-15T16:37:44Z)
- Improving Classification Performance With Human Feedback: Label a few, we label the rest [2.7386128680964408]
This paper focuses on understanding how a continuous feedback loop can refine models, thereby enhancing their accuracy, recall, and precision.
We benchmark this approach on the Financial Phrasebank, Banking, Craigslist, Trec, Amazon Reviews datasets to prove that with just a few labeled examples, we are able to surpass the accuracy of zero shot large language models.
arXiv Detail & Related papers (2024-01-17T19:13:05Z)
- Robust Machine Learning by Transforming and Augmenting Imperfect Training Data [6.928276018602774]
This thesis explores several data sensitivities of modern machine learning.
We first discuss how to prevent ML from codifying prior human discrimination measured in the training data.
We then discuss the problem of learning from data containing spurious features, which provide predictive fidelity during training but are unreliable upon deployment.
arXiv Detail & Related papers (2023-12-19T20:49:28Z)
- ASPEST: Bridging the Gap Between Active Learning and Selective Prediction [56.001808843574395]
Selective prediction aims to learn a reliable model that abstains from making predictions when uncertain.
Active learning aims to lower the overall labeling effort, and hence human dependence, by querying the most informative examples.
In this work, we introduce a new learning paradigm, active selective prediction, which aims to query more informative samples from the shifted target domain.
arXiv Detail & Related papers (2023-04-07T23:51:07Z)
- Self-Distillation for Further Pre-training of Transformers [83.84227016847096]
We propose self-distillation as a regularization for a further pre-training stage.
We empirically validate the efficacy of self-distillation on a variety of benchmark datasets for image and text classification tasks.
arXiv Detail & Related papers (2022-09-30T02:25:12Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence, predicting accuracy as the fraction of unlabeled examples.
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
- On the Transferability of Pre-trained Language Models: A Study from Artificial Datasets [74.11825654535895]
Pre-training language models (LMs) on large-scale unlabeled text data makes it much easier for the model to achieve exceptional downstream performance.
We study what specific traits in the pre-training data, other than the semantics, make a pre-trained LM superior to its counterparts trained from scratch on downstream tasks.
arXiv Detail & Related papers (2021-09-08T10:39:57Z)
- BERT Fine-Tuning for Sentiment Analysis on Indonesian Mobile Apps Reviews [1.5749416770494706]
This study examines the effectiveness of fine-tuning BERT for sentiment analysis using two different pre-trained models.
The dataset consists of Indonesian user reviews of the ten best apps of 2020 on Google Play.
Two training-data labeling approaches, score-based and lexicon-based, were also tested to determine the effectiveness of the model.
arXiv Detail & Related papers (2021-07-14T16:00:15Z)
- ALT-MAS: A Data-Efficient Framework for Active Testing of Machine Learning Algorithms [58.684954492439424]
We propose a novel framework to efficiently test a machine learning model using only a small amount of labeled test data.
The idea is to estimate the metrics of interest for a model-under-test using a Bayesian neural network (BNN).
arXiv Detail & Related papers (2021-04-11T12:14:04Z)
- Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
To tackle this, we propose to apply a dataset distillation strategy to compress the created dataset into several informative class-wise images.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.