Uzbek Sentiment Analysis based on local Restaurant Reviews
- URL: http://arxiv.org/abs/2205.15930v1
- Date: Tue, 31 May 2022 16:21:00 GMT
- Title: Uzbek Sentiment Analysis based on local Restaurant Reviews
- Authors: Sanatbek Matlatipov, Hulkar Rahimboeva, Jaloliddin Rajabov, Elmurod
Kuriyozov
- Abstract summary: We present a work done on collecting restaurant reviews data as a sentiment analysis dataset for the Uzbek language.
The paper includes detailed information on how the data was collected, how it was pre-processed for better quality optimization, as well as experimental setups for the evaluation process.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Extracting useful information for sentiment analysis and classification
problems from a large amount of user-generated feedback, such as restaurant
reviews, is a crucial task in natural language processing: it not only serves
customer satisfaction by enabling personalized services, but can also influence
the further development of a company. In this paper, we present the work done on
collecting restaurant review data as a sentiment analysis dataset for the Uzbek
language, a member of the Turkic family that is heavily affected by the
low-resource constraint, and provide further analysis of the novel dataset by
evaluating it with different techniques, from logistic-regression-based models
and support vector machines to deep learning models such as recurrent neural
networks and convolutional neural networks. The paper includes detailed
information on how the data was collected, how it was pre-processed for better
quality, and the experimental setups used for evaluation. The overall evaluation
results indicate that applying pre-processing steps, such as stemming for
agglutinative languages, yields better results, with the best-performing model
achieving 91% accuracy.
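To make the classical baselines concrete, here is a minimal sketch of a TF-IDF pipeline with logistic regression and a linear SVM, preceded by a naive suffix-stripping step standing in for a proper Uzbek stemmer. The file name, column names, and suffix list are illustrative assumptions, not artifacts released with the paper.

```python
# Minimal, illustrative baseline in the spirit of the paper's classical models:
# a naive suffix-stripping step (standing in for proper Uzbek stemming),
# TF-IDF features, and logistic regression / linear SVM classifiers.
# The file name, column names, and suffix list below are assumptions for
# illustration only, not artifacts released with the paper.
import re

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# A few common Uzbek (Latin-script) suffixes; a real stemmer for an
# agglutinative language would need to be far more thorough.
SUFFIXES = ("larni", "lari", "ning", "dan", "gan", "lar", "ni", "da", "ga")


def naive_stem(text: str) -> str:
    """Lowercase, tokenize, and strip one known suffix per token."""
    stemmed = []
    for tok in re.findall(r"\w+", text.lower()):
        for suf in SUFFIXES:
            if tok.endswith(suf) and len(tok) - len(suf) >= 3:
                tok = tok[: -len(suf)]
                break
        stemmed.append(tok)
    return " ".join(stemmed)


# Hypothetical dataset file with columns "text" (review) and "label" (polarity).
df = pd.read_csv("uzbek_restaurant_reviews.csv")
X_train, X_test, y_train, y_test = train_test_split(
    df["text"].map(naive_stem), df["label"], test_size=0.2, random_state=42
)

for name, clf in [
    ("LogisticRegression", LogisticRegression(max_iter=1000)),
    ("LinearSVC", LinearSVC()),
]:
    model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), clf)
    model.fit(X_train, y_train)
    print(f"{name} accuracy: {accuracy_score(y_test, model.predict(X_test)):.3f}")
```

The 91% figure reported above comes from the authors' own preprocessing and model selection; this sketch only illustrates the general shape of such a pipeline.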
Related papers
- Leveraging Large Language Models for Mobile App Review Feature Extraction [4.879919005707447]
This study explores the hypothesis that encoder-only large language models can enhance feature extraction from mobile app reviews.
By leveraging crowdsourced annotations from an industrial context, we redefine feature extraction as a supervised token classification task.
Empirical evaluations demonstrate that this method improves the precision and recall of extracted features and enhances performance efficiency.
arXiv Detail & Related papers (2024-08-02T07:31:57Z)
- Noisy Self-Training with Synthetic Queries for Dense Retrieval [49.49928764695172]
We introduce a novel noisy self-training framework combined with synthetic queries.
Experimental results show that our method improves consistently over existing methods.
Our method is data efficient and outperforms competitive baselines.
arXiv Detail & Related papers (2023-11-27T06:19:50Z)
- TRIAGE: Characterizing and auditing training data for improved regression [80.11415390605215]
We introduce TRIAGE, a novel data characterization framework tailored to regression tasks and compatible with a broad class of regressors.
TRIAGE utilizes conformal predictive distributions to provide a model-agnostic scoring method, the TRIAGE score.
We show that TRIAGE's characterization is consistent and highlight its utility to improve performance via data sculpting/filtering, in multiple regression settings.
arXiv Detail & Related papers (2023-10-29T10:31:59Z)
- Convolutional Neural Networks for Sentiment Analysis on Weibo Data: A Natural Language Processing Approach [0.228438857884398]
This study addresses the complex task of sentiment analysis on a dataset of 119,988 original tweets from Weibo using a Convolutional Neural Network (CNN).
A CNN-based model was utilized, leveraging word embeddings for feature extraction, and trained to perform sentiment classification.
The model achieved a macro-average F1-score of approximately 0.73 on the test set, showing balanced performance across positive, neutral, and negative sentiments (a generic sketch of such a CNN text classifier appears after this list).
arXiv Detail & Related papers (2023-07-13T03:02:56Z)
- Transfer Learning for Low-Resource Sentiment Analysis [1.2891210250935146]
In this paper, the collection and annotation of a dataset are described for sentiment analysis of Central Kurdish.
We explore a few classical machine learning and neural network-based techniques for this task.
arXiv Detail & Related papers (2023-04-10T16:44:44Z)
- Incorporating Relevance Feedback for Information-Seeking Retrieval using Few-Shot Document Re-Ranking [56.80065604034095]
We introduce a kNN approach that re-ranks documents based on their similarity with the query and the documents the user considers relevant.
To evaluate our different integration strategies, we transform four existing information retrieval datasets into the relevance feedback scenario.
arXiv Detail & Related papers (2022-10-19T16:19:37Z)
- An Empirical Investigation of Commonsense Self-Supervision with Knowledge Graphs [67.23285413610243]
Self-supervision based on the information extracted from large knowledge graphs has been shown to improve the generalization of language models.
We study the effect of knowledge sampling strategies and sizes that can be used to generate synthetic data for adapting language models.
arXiv Detail & Related papers (2022-05-21T19:49:04Z)
- ASR in German: A Detailed Error Analysis [0.0]
This work presents a selection of ASR model architectures that are pretrained on the German language and evaluates them on a benchmark of diverse test datasets.
It identifies cross-architectural prediction errors, classifies them into categories, and traces the sources of errors per category back to the training data.
arXiv Detail & Related papers (2022-04-12T08:25:01Z)
- Improving Classifier Training Efficiency for Automatic Cyberbullying Detection with Feature Density [58.64907136562178]
We study the effectiveness of Feature Density (FD) using different linguistically-backed feature preprocessing methods.
We hypothesise that estimating dataset complexity allows for the reduction of the number of required experiments.
The difference in linguistic complexity of datasets allows us to additionally discuss the efficacy of linguistically-backed word preprocessing.
arXiv Detail & Related papers (2021-11-02T15:48:28Z)
- Cross-lingual Approach to Abstractive Summarization [0.0]
Cross-lingual model transfers are successfully applied in low-resource languages.
We used a pretrained English summarization model based on deep neural networks and sequence-to-sequence architecture.
We developed several models with different proportions of target language data for fine-tuning.
arXiv Detail & Related papers (2020-12-08T09:30:38Z)
- CDEvalSumm: An Empirical Study of Cross-Dataset Evaluation for Neural Summarization Systems [121.78477833009671]
We investigate the performance of different summarization models under a cross-dataset setting.
A comprehensive study of 11 representative summarization systems on 5 datasets from different domains reveals the effect of model architectures and generation ways.
arXiv Detail & Related papers (2020-10-11T02:19:15Z)
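The abstract above and the Weibo paper in the list both rely on convolutional models for sentiment classification. Below is a generic, hedged sketch of such a CNN text classifier in TensorFlow/Keras (word embeddings, 1D convolution, global max pooling); all layer sizes and hyperparameters are illustrative assumptions and are not taken from either paper.

```python
# Generic sketch of a CNN text classifier of the kind referenced in the abstract
# above and in the Weibo CNN paper: word embeddings -> 1D convolution ->
# global max pooling -> dense softmax output. All sizes and hyperparameters
# here are illustrative assumptions, not values from either paper.
import tensorflow as tf
from tensorflow.keras import layers

VOCAB_SIZE = 20_000   # assumed vocabulary size
MAX_LEN = 100         # assumed maximum review length in tokens
NUM_CLASSES = 3       # e.g. positive / neutral / negative

# Maps raw strings to padded integer token sequences.
vectorizer = layers.TextVectorization(
    max_tokens=VOCAB_SIZE, output_sequence_length=MAX_LEN
)

model = tf.keras.Sequential([
    layers.Embedding(VOCAB_SIZE, 128),
    layers.Conv1D(128, kernel_size=5, activation="relu"),
    layers.GlobalMaxPooling1D(),
    layers.Dropout(0.5),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Usage sketch, assuming `texts` is a list of review strings and `labels`
# is a list of integer class ids:
#   vectorizer.adapt(texts)
#   model.fit(vectorizer(tf.constant(texts)), tf.constant(labels),
#             epochs=5, validation_split=0.1)
```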
The related-papers list above is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.