Classifying YouTube Comments Based on Sentiment and Type of Sentence
- URL: http://arxiv.org/abs/2111.01908v1
- Date: Sun, 31 Oct 2021 18:08:10 GMT
- Title: Classifying YouTube Comments Based on Sentiment and Type of Sentence
- Authors: Rhitabrat Pokharel and Dixit Bhatta
- Abstract summary: We address the challenge of text extraction and classification from YouTube comments using well-known statistical measures and machine learning models.
The results show that our approach, which incorporates conventional methods, performs well on the classification task, validating its potential to help content creators increase viewer engagement on their channels.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: As a YouTube channel grows, each video can potentially collect enormous
amounts of comments that provide direct feedback from the viewers. These
comments are a major means of understanding viewer expectations and improving
channel engagement. However, the comments only represent a general collection
of user opinions about the channel and the content. Many comments are poorly
constructed, trivial, and have improper spellings and grammatical errors. As a
result, it is a tedious job to identify the comments that best interest the
content creators. In this paper, we extract and classify the raw comments into
different categories based on both sentiment and sentence types that will help
YouTubers find relevant comments for growing their viewership. Existing studies
have focused either on sentiment analysis (positive and negative) or
classification of sub-types within the same sentence types (e.g., types of
questions) on a text corpus. These have limited applicability to non-traditional
text corpora such as YouTube comments. We address this challenge of text extraction
and classification from YouTube comments using well-known statistical measures
and machine learning models. We evaluate each combination of statistical
measure and the machine learning model using cross validation and $F_1$ scores.
The results show that our approach, which incorporates conventional methods,
performs well on the classification task, validating its potential to help
content creators increase viewer engagement on their channels.
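The evaluation protocol named in the abstract (cross validation scored with $F_1$) can be sketched in plain Python. This is a minimal illustration, not the authors' code: the function names, the contiguous fold layout, the macro averaging, and the majority-class stand-in model are all assumptions made for the example, and the abstract does not say which statistical measures or classifiers were actually paired.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Return {label: F1}, using F1 = 2*TP / (2*TP + FP + FN) per label."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label gains a false positive
            fn[truth] += 1  # true label gains a false negative
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (an assumed averaging choice)."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) for k contiguous folds of near-equal size."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop


def cross_val_macro_f1(fit, texts, labels, k=5):
    """Average macro-F1 over k folds.

    `fit(train_texts, train_labels)` must return a `predict(text)` callable,
    so any feature-measure/classifier pairing can be plugged in.
    """
    fold_scores = []
    for train, test in k_fold_indices(len(texts), k):
        predict = fit([texts[i] for i in train], [labels[i] for i in train])
        preds = [predict(texts[i]) for i in test]
        fold_scores.append(macro_f1([labels[i] for i in test], preds))
    return sum(fold_scores) / len(fold_scores)


def fit_majority(train_texts, train_labels):
    """Hypothetical baseline: always predict the most common training label."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return lambda text: majority
```

To compare several pairings, one would call `cross_val_macro_f1` once per `fit` function on the same comments and labels and rank the averaged scores; in practice the data should be shuffled before it is split into contiguous folds.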
Related papers
- CineXDrama: Relevance Detection and Sentiment Analysis of Bangla YouTube Comments on Movie-Drama using Transformers: Insights from Interpretability Tool [0.0]
We propose a system that first assesses the relevance of comments and then analyzes the sentiment of those deemed relevant.
We introduce a dataset of 14,000 manually collected and preprocessed comments, annotated for relevance (relevant or irrelevant) and sentiment (positive or negative).
arXiv Detail & Related papers (2024-11-10T18:04:41Z)
- HOTVCOM: Generating Buzzworthy Comments for Videos [49.39846630199698]
This study introduces HotVCom, the largest Chinese video hot-comment dataset, comprising 94k diverse videos and 137 million comments.
We also present the ComHeat framework, which synergistically integrates visual, auditory, and textual data to generate influential hot-comments on the Chinese video dataset.
arXiv Detail & Related papers (2024-09-23T16:45:13Z)
- ViCo: Engaging Video Comment Generation with Human Preference Rewards [68.50351391812723]
We propose ViCo with three novel designs to tackle the challenges for generating engaging Video Comments.
To quantify the engagement of comments, we utilize the number of "likes" each comment receives as a proxy of human preference.
To automatically evaluate the engagement of comments, we train a reward model to align its judgment to the above proxy.
arXiv Detail & Related papers (2023-08-22T04:01:01Z)
- Models See Hallucinations: Evaluating the Factuality in Video Captioning [57.85548187177109]
We conduct a human evaluation of the factuality in video captioning and collect two annotated factuality datasets.
We find that 57.0% of the model-generated sentences have factual errors, indicating it is a severe problem in this field.
We propose a weakly-supervised, model-based factuality metric FactVC, which outperforms previous metrics on factuality evaluation of video captioning.
arXiv Detail & Related papers (2023-03-06T08:32:50Z)
- EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching [90.98122161162644]
Current metrics for video captioning are mostly based on the text-level comparison between reference and candidate captions.
We propose EMScore (Embedding Matching-based score), a novel reference-free metric for video captioning.
We exploit a well pre-trained vision-language model to extract visual and linguistic embeddings for computing EMScore.
arXiv Detail & Related papers (2021-11-17T06:02:43Z)
- Stay on Topic, Please: Aligning User Comments to the Content of a News Article [7.3203631241415055]
We propose a classification algorithm to categorize user comments posted to a news article based on their alignment to its content.
The alignment seeks to match user comments to an article based on similarity of content, entities in discussion, and topic.
We conduct a user study to evaluate human labeling performance to understand the difficulty of the classification task.
arXiv Detail & Related papers (2021-03-03T18:29:00Z)
- Understanding YouTube Communities via Subscription-based Channel Embeddings [0.0]
This paper presents new methods to discover and classify YouTube channels.
The methods use a self-supervised learning approach that leverages the public subscription pages of commenters.
We create a new dataset to analyze the amount of traffic going to different political content.
arXiv Detail & Related papers (2020-10-19T22:00:04Z)
- Cooking Is All About People: Comment Classification On Cookery Channels Using BERT and Classification Models (Malayalam-English Mix-Code) [0.0]
We have evaluated top-performing classification models for classifying comments that mix English and Malayalam in different combinations.
Results indicate that Multinomial Naive Bayes, K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Random Forest and Decision Trees offer a similar level of accuracy in comment classification.
arXiv Detail & Related papers (2020-06-15T19:07:06Z)
- A Unified Dual-view Model for Review Summarization and Sentiment Classification with Inconsistency Loss [51.448615489097236]
Acquiring accurate summarization and sentiment from user reviews is an essential component of modern e-commerce platforms.
We propose a novel dual-view model that jointly improves the performance of these two tasks.
Experiment results on four real-world datasets from different domains demonstrate the effectiveness of our model.
arXiv Detail & Related papers (2020-06-02T13:34:11Z)
- Mi YouTube es Su YouTube? Analyzing the Cultures using YouTube Thumbnails of Popular Videos [98.87558262467257]
This study explores cultural preferences among countries using the thumbnails of YouTube trending videos.
Experimental results indicate that users from similar cultures share interests in watching similar videos on YouTube.
arXiv Detail & Related papers (2020-01-27T20:15:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.