MetaHarm: Harmful YouTube Video Dataset Annotated by Domain Experts, GPT-4-Turbo, and Crowdworkers
- URL: http://arxiv.org/abs/2504.16304v1
- Date: Tue, 22 Apr 2025 22:45:16 GMT
- Title: MetaHarm: Harmful YouTube Video Dataset Annotated by Domain Experts, GPT-4-Turbo, and Crowdworkers
- Authors: Wonjeong Jo, Magdalena Wojcieszak
- Abstract summary: Short video platforms, such as YouTube, Instagram, or TikTok, expose users to harmful content. We present two large-scale datasets of multi-modal and multi-categorical online harm.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Short video platforms, such as YouTube, Instagram, or TikTok, are used by billions of users. These platforms expose users to harmful content, ranging from clickbait or physical harms to hate or misinformation. Yet, we lack a comprehensive understanding and measurement of online harm on short video platforms. Toward this end, we present two large-scale datasets of multi-modal and multi-categorical online harm: (1) 60,906 systematically selected potentially harmful YouTube videos and (2) 19,422 videos annotated by three labeling actors: trained domain experts, GPT-4-Turbo (using 14 image frames, 1 thumbnail, and text metadata), and crowdworkers (Amazon Mechanical Turk master workers). The annotated dataset includes both (a) binary classification (harmful vs. harmless) and (b) multi-label categorizations of six harm categories: Information, Hate and harassment, Addictive, Clickbait, Sexual, and Physical harms. Furthermore, the annotated dataset provides (1) ground truth data with videos annotated consistently across (a) all three actors and (b) the majority of the labeling actors, and (2) three data subsets labeled by individual actors. These datasets are expected to facilitate future work on online harm, aid in (multi-modal) classification efforts, and advance the identification and potential mitigation of harmful content on video platforms.
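As a minimal sketch of how the binary ground truth could be derived from the three labeling actors (the actor keys and voting rule are assumptions for illustration, not the released schema):

```python
# Minimal sketch, not the authors' release code: deriving a binary
# harmful/harmless ground-truth label from the three labeling actors.
# The actor keys ("expert", "gpt4_turbo", "mturk") are assumed names.
from collections import Counter

# The six harm categories from the paper's taxonomy.
HARM_CATEGORIES = [
    "Information", "Hate and harassment", "Addictive",
    "Clickbait", "Sexual", "Physical",
]

def majority_label(annotations: dict[str, bool]) -> bool:
    """Return the label shared by the majority of the three actors."""
    votes = Counter(annotations.values())
    return votes[True] > votes[False]

video = {"expert": True, "gpt4_turbo": True, "mturk": False}
print(majority_label(video))  # True: two of three actors voted harmful
```

Full agreement across all three actors would correspond to `len(set(annotations.values())) == 1`, matching the dataset's stricter consistently-annotated subset.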
Related papers
- Simple Visual Artifact Detection in Sora-Generated Videos [9.991747596111011]
This study investigates visual artifacts frequently found and reported in Sora-generated videos.
We propose a multi-label classification framework targeting four common artifact label types.
The best-performing model trained by ResNet-50 achieved an average multi-label classification accuracy of 94.14%.
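For orientation, a hedged PyTorch sketch of a multi-label classifier in this spirit (batch size, input size, and weights are illustrative assumptions, not the paper's configuration):

```python
# Hedged sketch of a multi-label artifact classifier: a ResNet-50
# backbone with one logit per artifact type and a sigmoid (BCE) loss.
import torch
import torch.nn as nn
from torchvision.models import resnet50

NUM_ARTIFACT_TYPES = 4  # the four common artifact label types

model = resnet50(weights=None)
model.fc = nn.Linear(model.fc.in_features, NUM_ARTIFACT_TYPES)

criterion = nn.BCEWithLogitsLoss()  # labels are independent, not exclusive
frames = torch.randn(8, 3, 224, 224)  # a batch of video frames
labels = torch.randint(0, 2, (8, NUM_ARTIFACT_TYPES)).float()
loss = criterion(model(frames), labels)
```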
arXiv Detail & Related papers (2025-04-30T05:41:43Z)
- Harmful YouTube Video Detection: A Taxonomy of Online Harm and MLLMs as Alternative Annotators [0.0]
This study advances measures and methods to detect harm in video content.
We develop a comprehensive taxonomy for online harm on video platforms, categorizing it into six categories: Information, Hate and harassment, Addictive, Clickbait, Sexual, and Physical harms.
We analyze 19,422 YouTube videos using 14 image frames, 1 thumbnail, and text metadata, comparing the accuracy of crowdworkers (Mturk) and GPT-4-Turbo.
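A rough sketch of such a multimodal annotation call through the OpenAI Python SDK (the prompt wording and the helper are assumptions in the spirit of the described setup, not the study's pipeline):

```python
# Illustrative sketch of multimodal annotation with GPT-4-Turbo via the
# OpenAI Python SDK; not the study's actual code.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def annotate(frame_urls: list[str], thumbnail_url: str, metadata: str) -> str:
    content = [{
        "type": "text",
        "text": "Label this YouTube video as harmful or harmless and list "
                "any of the six harm categories that apply.\n"
                f"Metadata: {metadata}",
    }]
    for url in [thumbnail_url, *frame_urls]:  # 1 thumbnail + 14 frames
        content.append({"type": "image_url", "image_url": {"url": url}})
    resp = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[{"role": "user", "content": content}],
    )
    return resp.choices[0].message.content
```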
arXiv Detail & Related papers (2024-11-06T23:48:30Z)
- MultiHateClip: A Multilingual Benchmark Dataset for Hateful Video Detection on YouTube and Bilibili [11.049937698021054]
This study presents MultiHateClip, a novel multilingual dataset created through hate lexicons and human annotation.
It aims to enhance the detection of hateful videos on platforms such as YouTube and Bilibili, including content in both English and Chinese languages.
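One simple way to read the lexicon-driven collection step, as an assumed sketch (the lexicon terms and metadata fields are placeholders, not MultiHateClip's actual lists):

```python
# Assumed sketch of lexicon-driven candidate selection.
import re

hate_lexicon = {"placeholder_term_a", "placeholder_term_b"}

def is_candidate(title: str, description: str) -> bool:
    tokens = set(re.findall(r"\w+", f"{title} {description}".lower()))
    return bool(tokens & hate_lexicon)  # any lexicon hit flags the video

videos_metadata = [
    {"title": "Some placeholder_term_a video", "description": "..."},
    {"title": "Cooking tutorial", "description": "pasta recipe"},
]
candidates = [v for v in videos_metadata
              if is_candidate(v["title"], v["description"])]
```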
arXiv Detail & Related papers (2024-07-28T08:19:09Z)
- Into the LAIONs Den: Investigating Hate in Multimodal Datasets [67.21783778038645]
This paper investigates the effect of scaling datasets on hateful content through a comparative audit of two datasets: LAION-400M and LAION-2B.
We found that hate content increased by nearly 12% with dataset scale, measured both qualitatively and quantitatively.
We also found that filtering dataset contents based on Not Safe For Work (NSFW) values calculated based on images alone does not exclude all the harmful content in alt-text.
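The gap can be illustrated with a small sketch (threshold and field names assumed): an image-only NSFW filter never reads the alt-text, so harmful captions paired with innocuous images pass through.

```python
# Sketch of the filtering gap: an image-only NSFW score ignores alt-text.
NSFW_THRESHOLD = 0.5

samples = [
    {"image_nsfw": 0.10, "alt_text": "a benign caption"},
    {"image_nsfw": 0.20, "alt_text": "a hateful caption"},
]

kept = [s for s in samples if s["image_nsfw"] < NSFW_THRESHOLD]
assert len(kept) == 2  # both survive: the score never inspects alt-text
```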
arXiv Detail & Related papers (2023-11-06T19:00:05Z)
- Micro-video Tagging via Jointly Modeling Social Influence and Tag Relation [56.23157334014773]
85.7% of micro-videos lack annotation.
Existing methods mostly focus on analyzing video content, neglecting users' social influence and tag relation.
We formulate micro-video tagging as a link prediction problem in a constructed heterogeneous network.
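A toy sketch of the link-prediction view (embedding sizes and the dot-product scoring rule are assumptions, not the paper's model): score a candidate (video, tag) edge from node embeddings learned over the heterogeneous network.

```python
# Toy sketch of tagging as link prediction over a heterogeneous network.
import torch

EMB_DIM = 64
video_emb = torch.nn.Embedding(num_embeddings=1000, embedding_dim=EMB_DIM)
tag_emb = torch.nn.Embedding(num_embeddings=200, embedding_dim=EMB_DIM)

def edge_score(video_id: int, tag_id: int) -> torch.Tensor:
    v = video_emb(torch.tensor(video_id))
    t = tag_emb(torch.tensor(tag_id))
    return torch.sigmoid(v @ t)  # probability that the tag fits the video

print(edge_score(3, 17))  # untrained embeddings, essentially random score
```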
arXiv Detail & Related papers (2023-03-15T02:13:34Z)
- Video Manipulations Beyond Faces: A Dataset with Human-Machine Analysis [60.13902294276283]
We present VideoSham, a dataset consisting of 826 videos (413 real and 413 manipulated).
Many of the existing deepfake datasets focus exclusively on two types of facial manipulations -- swapping with a different subject's face or altering the existing face.
Our analysis shows that state-of-the-art manipulation detection algorithms only work for a few specific attacks and do not scale well on VideoSham.
arXiv Detail & Related papers (2022-07-26T17:39:04Z)
- DisinfoMeme: A Multimodal Dataset for Detecting Meme Intentionally Spreading Out Disinformation [72.18912216025029]
We present DisinfoMeme to help detect disinformation memes.
The dataset contains memes mined from Reddit covering three current topics: the COVID-19 pandemic, the Black Lives Matter movement, and veganism/vegetarianism.
arXiv Detail & Related papers (2022-05-25T09:54:59Z)
- TeamX@DravidianLangTech-ACL2022: A Comparative Analysis for Troll-Based Meme Classification [21.32190107220764]
Harmful content online has raised concerns among social media platforms, government agencies, policymakers, and society as a whole.
Among the different forms of harmful content, trolling-based content posts messages that are provocative, offensive, or menacing, with the intent to mislead the audience.
This study provides a comparative analysis of troll-based meme classification using textual, visual, and multimodal content.
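As a hedged sketch of the multimodal condition in such a comparison (encoder dimensions are illustrative; the paper's exact models may differ), late fusion concatenates a text vector and an image vector before a single classification head:

```python
# Hedged sketch of late multimodal fusion for meme classification.
import torch
import torch.nn as nn

class LateFusionClassifier(nn.Module):
    def __init__(self, text_dim=768, image_dim=2048, n_classes=2):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(text_dim + image_dim, 256), nn.ReLU(),
            nn.Linear(256, n_classes),
        )

    def forward(self, text_feat, image_feat):
        # Concatenate the two modality vectors, then classify jointly.
        return self.head(torch.cat([text_feat, image_feat], dim=-1))

model = LateFusionClassifier()
logits = model(torch.randn(4, 768), torch.randn(4, 2048))
```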
arXiv Detail & Related papers (2022-05-09T16:19:28Z)
- Detecting Harmful Content On Online Platforms: What Platforms Need Vs. Where Research Efforts Go [44.774035806004214]
Harmful content on online platforms comes in many forms, including hate speech, offensive language, bullying and harassment, misinformation, spam, violence, graphic content, sexual abuse, self-harm, and many others.
Online platforms seek to moderate such content to limit societal harm, to comply with legislation, and to create a more inclusive environment for their users.
There is currently a dichotomy between what types of harmful content online platforms seek to curb, and what research efforts there are to automatically detect such content.
arXiv Detail & Related papers (2021-02-27T08:01:10Z)
- Trawling for Trolling: A Dataset [56.1778095945542]
We present a dataset that models trolling as a subcategory of offensive content.
The dataset has 12,490 samples split across five classes: Normal, Profanity, Trolling, Derogatory, and Hate Speech.
arXiv Detail & Related papers (2020-08-02T17:23:55Z)
- Transfer Learning for Hate Speech Detection in Social Media [14.759208309842178]
This paper uses a transfer learning technique to leverage two independent datasets jointly.
We build an interpretable two-dimensional visualization tool of the constructed hate speech representation -- dubbed the Map of Hate.
We show that the joint representation boosts prediction performances when only a limited amount of supervision is available.
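One common pattern for learning such a joint representation, sketched under assumptions (the encoder, dimensions, and heads are illustrative, not the paper's architecture): a shared encoder with one classification head per dataset.

```python
# Assumed sketch of a joint representation over two hate-speech datasets:
# a shared encoder with per-dataset heads, so limited supervision in one
# dataset benefits from the other.
import torch
import torch.nn as nn

shared_encoder = nn.Sequential(nn.Linear(300, 128), nn.ReLU())
head_a = nn.Linear(128, 2)  # label set of dataset A
head_b = nn.Linear(128, 3)  # label set of dataset B

def forward(x: torch.Tensor, dataset: str) -> torch.Tensor:
    z = shared_encoder(x)  # the joint space a 2-D map could visualize
    return head_a(z) if dataset == "A" else head_b(z)

logits = forward(torch.randn(4, 300), "A")
```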
arXiv Detail & Related papers (2019-06-10T08:00:58Z)