BERT Fine-Tuning for Sentiment Analysis on Indonesian Mobile Apps Reviews
- URL: http://arxiv.org/abs/2107.06802v1
- Date: Wed, 14 Jul 2021 16:00:15 GMT
- Title: BERT Fine-Tuning for Sentiment Analysis on Indonesian Mobile Apps Reviews
- Authors: Kuncahyo Setyo Nugroho, Anantha Yullian Sukmadewa, Haftittah Wuswilahaken DW, Fitra Abdurrachman Bachtiar, Novanto Yudistira
- Abstract summary: This study examines the effectiveness of fine-tuning BERT for sentiment analysis using two different pre-trained models.
The dataset used is Indonesian user reviews of the ten best apps of 2020 on the Google Play site.
Two training-data labeling approaches, score-based and lexicon-based, were also tested to determine the effectiveness of the model.
- Score: 1.5749416770494706
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: User reviews play an essential role in the success of a mobile app. Because user
reviews are unstructured text, they are highly complex to process for sentiment analysis.
Previous approaches often ignore the context of reviews, and the relatively small amount of
data makes models prone to overfitting. BERT has been introduced as a transfer learning
approach: its pre-trained model has already been trained so that it provides a better
representation of context. This study examines the effectiveness of fine-tuning BERT for
sentiment analysis using two different pre-trained models. Besides a multilingual pre-trained
model, we use a pre-trained model trained only on Indonesian text. The dataset consists of
Indonesian user reviews of the ten best apps of 2020 on the Google Play site. We also perform
hyper-parameter tuning to find the optimal trained model. Two training-data labeling
approaches, score-based and lexicon-based, were tested to determine the effectiveness of the
model. The experimental results show that the pre-trained model trained on Indonesian achieves
better average accuracy on the lexicon-based data. Its highest accuracy is 84%, reached with
25 epochs and a training time of 24 minutes. These results are better than those of all the
machine learning baselines and the multilingual pre-trained model.
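
The two labeling schemes mentioned in the abstract can be illustrated with a short sketch. The star-rating thresholds, the neutral-handling rule, and the toy Indonesian word lists below are assumptions made for illustration, not details taken from the paper.

```python
# Hypothetical labeling helpers for Google Play reviews.
# The paper tests score-based and lexicon-based labeling; the exact
# rules below (thresholds, lexicon entries) are illustrative guesses.

def score_based_label(star_rating: int) -> str:
    """Map a 1-5 star rating to a sentiment label (assumed thresholds)."""
    if star_rating >= 4:
        return "positive"
    if star_rating <= 2:
        return "negative"
    return "neutral"

# A tiny stand-in for an Indonesian sentiment lexicon; a real study would
# use a full lexicon resource rather than this handful of words.
POSITIVE_WORDS = {"bagus", "mantap", "keren", "membantu", "suka"}
NEGATIVE_WORDS = {"jelek", "lambat", "error", "buruk", "kecewa"}

def lexicon_based_label(review_text: str) -> str:
    """Label a review by counting positive vs. negative lexicon hits."""
    tokens = review_text.lower().split()
    pos = sum(tok in POSITIVE_WORDS for tok in tokens)
    neg = sum(tok in NEGATIVE_WORDS for tok in tokens)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"

if __name__ == "__main__":
    print(score_based_label(5))                                       # positive
    print(lexicon_based_label("aplikasi bagus dan sangat membantu"))  # positive
```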
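
The fine-tuning step itself can be sketched with the Hugging Face Transformers library. The checkpoint name, toy data, batch size, and learning rate are placeholder assumptions (only the 25 epochs echoes the abstract); the paper does not confirm this exact configuration or tooling.

```python
# Minimal fine-tuning sketch with Hugging Face Transformers.
# Checkpoints and hyper-parameters are illustrative; swapping in
# "bert-base-multilingual-cased" would mirror the multilingual setup.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL_NAME = "indobenchmark/indobert-base-p1"  # an Indonesian-only checkpoint (assumed)

# Toy labeled reviews; in the study these come from the Google Play dataset.
reviews = [
    {"text": "aplikasi bagus dan sangat membantu", "label": 1},
    {"text": "sering error dan sangat lambat", "label": 0},
]

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

train_ds = Dataset.from_list(reviews).map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="bert-sentiment-id",
    num_train_epochs=25,              # the abstract's best model uses 25 epochs
    per_device_train_batch_size=16,   # assumed batch size
    learning_rate=2e-5,               # assumed learning rate
)

Trainer(model=model, args=args, train_dataset=train_ds).train()
```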
Related papers
- Reuse, Don't Retrain: A Recipe for Continued Pretraining of Language Models [29.367678364485794]
We show how to design efficacious data distributions and learning rate schedules for continued pretraining of language models.
We show an improvement of 9% in average model accuracy compared to the baseline of continued training on the pretraining set.
arXiv Detail & Related papers (2024-07-09T22:37:59Z)
- Text Quality-Based Pruning for Efficient Training of Language Models [66.66259229732121]
We propose a novel method for numerically evaluating text quality in large unlabelled NLP datasets.
By proposing the text quality metric, the paper establishes a framework to identify and eliminate low-quality text instances.
Experimental results over multiple models and datasets demonstrate the efficacy of this approach.
arXiv Detail & Related papers (2024-04-26T18:01:25Z)
- Natural Language Processing Through Transfer Learning: A Case Study on Sentiment Analysis [1.14219428942199]
This paper explores the potential of transfer learning in natural language processing focusing mainly on sentiment analysis.
The claim is that, compared to training models from scratch, transfer learning, using pre-trained BERT models, can increase sentiment classification accuracy.
arXiv Detail & Related papers (2023-11-28T17:12:06Z)
- Investigating Pre-trained Language Models on Cross-Domain Datasets, a Step Closer to General AI [0.8889304968879164]
We investigate the ability of pre-trained language models to generalize to different non-language tasks.
The four pre-trained models that we used, T5, BART, BERT, and GPT-2, achieve outstanding results.
arXiv Detail & Related papers (2023-06-21T11:55:17Z)
- Effective Robustness against Natural Distribution Shifts for Models with Different Training Data [113.21868839569]
"Effective robustness" measures the extra out-of-distribution robustness beyond what can be predicted from the in-distribution (ID) performance.
We propose a new evaluation metric to evaluate and compare the effective robustness of models trained on different data.
arXiv Detail & Related papers (2023-02-02T19:28:41Z)
- MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation [68.30497162547768]
We propose MoEBERT, which uses a Mixture-of-Experts structure to increase model capacity and inference speed.
We validate the efficiency and effectiveness of MoEBERT on natural language understanding and question answering tasks.
arXiv Detail & Related papers (2022-04-15T23:19:37Z)
- From Good to Best: Two-Stage Training for Cross-lingual Machine Reading Comprehension [51.953428342923885]
We develop a two-stage approach to enhance the model performance.
The first stage targets recall: we design a hard-learning (HL) algorithm to maximize the likelihood that the top-k predictions contain the accurate answer.
The second stage focuses on precision: an answer-aware contrastive learning mechanism is developed to learn the fine difference between the accurate answer and other candidates.
arXiv Detail & Related papers (2021-12-09T07:31:15Z)
- bert2BERT: Towards Reusable Pretrained Language Models [51.078081486422896]
We propose bert2BERT, which can effectively transfer the knowledge of an existing smaller pre-trained model to a large model.
bert2BERT saves about 45% and 47% of the computational cost of pre-training BERT_BASE and GPT_BASE, respectively, by reusing models of almost half their sizes.
arXiv Detail & Related papers (2021-10-14T04:05:25Z)
- How much pretraining data do language models need to learn syntax? [12.668478784932878]
Transformers-based pretrained language models achieve outstanding results in many well-known NLU benchmarks.
We study the impact of pretraining data size on the knowledge of the models using RoBERTa.
arXiv Detail & Related papers (2021-09-07T15:51:39Z)
- Few-shot learning through contextual data augmentation [74.20290390065475]
Machine translation models need to adapt to new data to maintain their performance over time.
We show that adaptation on the scale of one to five examples is possible.
Our model reports better accuracy scores than a reference system trained with, on average, 313 parallel examples.
arXiv Detail & Related papers (2021-03-31T09:05:43Z)
- SE3M: A Model for Software Effort Estimation Using Pre-trained Embedding Models [0.8287206589886881]
This paper proposes to evaluate the effectiveness of pre-trained embedding models.
Generic pre-trained models for both approaches went through a fine-tuning process.
Results were very promising, showing that pre-trained models can be used to estimate software effort based only on requirements texts.
arXiv Detail & Related papers (2020-06-30T14:15:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.