Hate, Obscenity, and Insults: Measuring the Exposure of Children to
Inappropriate Comments in YouTube
- URL: http://arxiv.org/abs/2103.09050v1
- Date: Wed, 3 Mar 2021 20:15:22 GMT
- Title: Hate, Obscenity, and Insults: Measuring the Exposure of Children to
Inappropriate Comments in YouTube
- Authors: Sultan Alshamrani, Ahmed Abusnaina, Mohammed Abuhamad, Daehun Nyang,
David Mohaisen
- Abstract summary: In this paper, we investigate the exposure of young users to inappropriate comments posted on YouTube videos targeting this demographic.
We collected a large-scale dataset of approximately four million records and studied the presence of five age-inappropriate categories and the amount of exposure to each category.
Using natural language processing and machine learning techniques, we constructed ensemble classifiers that achieved high accuracy in detecting inappropriate comments.
- Score: 8.688428251722911
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Social media has become an essential part of the daily routines of children
and adolescents. Moreover, enormous efforts have been made to ensure the
psychological and emotional well-being of young users as well as their safety
when interacting with various social media platforms. In this paper, we
investigate the exposure of those users to inappropriate comments posted on
YouTube videos targeting this demographic. We collected a large-scale dataset
of approximately four million records and studied the presence of five
age-inappropriate categories and the amount of exposure to each category. Using
natural language processing and machine learning techniques, we constructed
ensemble classifiers that achieved high accuracy in detecting inappropriate
comments. Our results show a large percentage of worrisome comments with
inappropriate content: we found 11% of the comments on children's videos to be
toxic, highlighting the importance of monitoring comments, particularly on
children's platforms.
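The abstract describes ensemble classifiers built with NLP and machine learning to flag inappropriate comments. The paper does not specify its pipeline here, so the following is only a minimal illustrative sketch of one common approach — TF-IDF features with a soft-voting ensemble of two simple models — using toy data invented for this example, not the authors' dataset or method.

```python
# Minimal sketch (assumption: NOT the paper's actual pipeline) of an
# ensemble classifier for flagging inappropriate comments: TF-IDF text
# features, then soft voting over logistic regression and naive Bayes.
from sklearn.ensemble import VotingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy training data (illustrative only): 1 = inappropriate, 0 = benign.
comments = [
    "you are stupid and ugly",
    "i hate you so much",
    "what a terrible idiot",
    "go away you loser",
    "what a lovely video",
    "great content keep it up",
    "my kids enjoyed this a lot",
    "thanks for sharing this",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Soft voting averages each model's predicted class probabilities.
ensemble = make_pipeline(
    TfidfVectorizer(),
    VotingClassifier(
        estimators=[
            ("lr", LogisticRegression(max_iter=1000)),
            ("nb", MultinomialNB()),
        ],
        voting="soft",
    ),
)
ensemble.fit(comments, labels)

print(ensemble.predict(["you are such an idiot", "wonderful video"]))
```

A real system would train on a large labeled corpus (the paper uses roughly four million records across five age-inappropriate categories) and evaluate per category; the soft-voting ensemble is one simple way to combine complementary classifiers.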
Related papers
- HOTVCOM: Generating Buzzworthy Comments for Videos [49.39846630199698]
This study introduces HotVCom, the largest Chinese video hot-comment dataset, comprising 94k diverse videos and 137 million comments.
We also present the ComHeat framework, which synergistically integrates visual, auditory, and textual data to generate influential hot-comments on the Chinese video dataset.
arXiv Detail & Related papers (2024-09-23T16:45:13Z) - More Skin, More Likes! Measuring Child Exposure and User Engagement on TikTok [0.0]
This study investigates children's exposure on TikTok, analyzing 432,178 comments across 5,896 videos from 115 user accounts featuring children.
arXiv Detail & Related papers (2024-08-10T19:44:12Z) - Security Advice for Parents and Children About Content Filtering and
Circumvention as Found on YouTube and TikTok [2.743215038883957]
We examine the advice available to parents and children regarding content filtering and circumvention as found on YouTube and TikTok.
Our results show that roughly three-quarters of these videos are accurate, while the remaining quarter contain factually incorrect advice.
We find that videos targeting children are more likely than videos targeting parents to be both incorrect and actionable, leaving children at increased risk of taking harmful action.
arXiv Detail & Related papers (2024-02-05T18:12:33Z) - ViCo: Engaging Video Comment Generation with Human Preference Rewards [68.50351391812723]
We propose ViCo with three novel designs to tackle the challenges for generating engaging Video Comments.
To quantify the engagement of comments, we utilize the number of "likes" each comment receives as a proxy of human preference.
To automatically evaluate the engagement of comments, we train a reward model to align its judgment to the above proxy.
arXiv Detail & Related papers (2023-08-22T04:01:01Z) - An Image is Worth a Thousand Toxic Words: A Metamorphic Testing
Framework for Content Moderation Software [64.367830425115]
Social media platforms are being increasingly misused to spread toxic content, including hate speech, malicious advertising, and pornography.
Despite tremendous efforts in developing and deploying content moderation methods, malicious users can evade moderation by embedding texts into images.
We propose a metamorphic testing framework for content moderation software.
arXiv Detail & Related papers (2023-08-18T20:33:06Z) - SQuARe: A Large-Scale Dataset of Sensitive Questions and Acceptable
Responses Created Through Human-Machine Collaboration [75.62448812759968]
This dataset is a large-scale Korean dataset of 49k sensitive questions with 42k acceptable and 46k non-acceptable responses.
The dataset was constructed leveraging HyperCLOVA in a human-in-the-loop manner based on real news headlines.
arXiv Detail & Related papers (2023-05-28T11:51:20Z) - Malicious or Benign? Towards Effective Content Moderation for Children's
Videos [1.0323063834827415]
This paper introduces our toolkit Malicious or Benign for promoting research on automated content moderation of children's videos.
We present 1) a customizable annotation tool for videos, 2) a new dataset with difficult to detect test cases of malicious content, and 3) a benchmark suite of state-of-the-art video classification models.
arXiv Detail & Related papers (2023-05-24T20:33:38Z) - Analyzing Norm Violations in Live-Stream Chat [49.120561596550395]
We present the first NLP study dedicated to detecting norm violations in conversations on live-streaming platforms.
We define norm violation categories in live-stream chats and annotate 4,583 moderated comments from Twitch.
Our results show that appropriate contextual information can boost moderation performance by 35%.
arXiv Detail & Related papers (2023-05-18T05:58:27Z) - Fighting Malicious Media Data: A Survey on Tampering Detection and
Deepfake Detection [115.83992775004043]
Recent advances in deep learning, particularly deep generative models, open the doors for producing perceptually convincing images and videos at a low cost.
This paper provides a comprehensive review of the current media tampering detection approaches, and discusses the challenges and trends in this field for future research.
arXiv Detail & Related papers (2022-12-12T02:54:08Z) - Privacy Concerns in Chatbot Interactions: When to Trust and When to
Worry [3.867363075280544]
We surveyed a representative sample of 491 British citizens.
Our results show that user concerns focus on the deletion of personal information and on the inappropriate use of their data.
We also find that individuals are concerned about losing control over their data after a conversation with conversational agents.
arXiv Detail & Related papers (2021-07-08T16:31:58Z) - Characterizing Abhorrent, Misinformative, and Mistargeted Content on
YouTube [1.9138099871648453]
We study the degree of problematic content on YouTube and the role of the recommendation algorithm in the dissemination of such content.
Our analysis reveals that young children are likely to encounter disturbing content when they randomly browse the platform.
We find that Incel activity is increasing over time and that platforms may play an active role in steering users towards extreme content.
arXiv Detail & Related papers (2021-05-20T15:10:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.