A Holistic Approach to Undesired Content Detection in the Real World
- URL: http://arxiv.org/abs/2208.03274v1
- Date: Fri, 5 Aug 2022 16:47:23 GMT
- Title: A Holistic Approach to Undesired Content Detection in the Real World
- Authors: Todor Markov, Chong Zhang, Sandhini Agarwal, Tyna Eloundou, Teddy Lee, Steven Adler, Angela Jiang, Lilian Weng
- Abstract summary: We present a holistic approach to building a robust natural language classification system for real-world content moderation.
The success of such a system relies on a chain of carefully designed and executed steps, including the design of content taxonomies and labeling instructions.
Our moderation system is trained to detect a broad set of categories of undesired content, including sexual content, hateful content, violence, self-harm, and harassment.
- Score: 4.626056557184189
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a holistic approach to building a robust and useful natural
language classification system for real-world content moderation. The success
of such a system relies on a chain of carefully designed and executed steps,
including the design of content taxonomies and labeling instructions, data
quality control, an active learning pipeline to capture rare events, and a
variety of methods to make the model robust and to avoid overfitting. Our
moderation system is trained to detect a broad set of categories of undesired
content, including sexual content, hateful content, violence, self-harm, and
harassment. This approach generalizes to a wide range of different content
taxonomies and can be used to create high-quality content classifiers that
outperform off-the-shelf models.
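The abstract mentions an active learning pipeline for capturing rare events but does not say how samples are chosen. As a minimal sketch only, the block below shows uncertainty sampling, one common active learning heuristic, which may differ from what the authors actually use. The names `select_for_labeling` and `toy_score`, the flagged-word list, and the 0.5 threshold are all hypothetical and exist purely for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def select_for_labeling(
    texts: Sequence[str],
    score_fn: Callable[[str], float],
    budget: int,
    threshold: float = 0.5,
) -> List[Tuple[str, float]]:
    """Return the `budget` texts whose scores lie closest to the threshold.

    Uncertainty sampling: items near the decision boundary are the ones the
    current model is least sure about, so they are sent to human labelers.
    """
    scored = [(text, score_fn(text)) for text in texts]
    # A smaller distance to the threshold means a less certain prediction.
    scored.sort(key=lambda pair: abs(pair[1] - threshold))
    return scored[:budget]


def toy_score(text: str) -> float:
    """Hypothetical stand-in for a trained classifier's probability output."""
    flagged = {"attack", "hurt"}  # assumption: illustrative keyword list
    words = text.lower().split()
    hits = sum(word in flagged for word in words)
    return min(1.0, hits / max(len(words), 1) * 2)


# Unlabeled pool; in practice this would be sampled production traffic.
pool = [
    "have a nice day",
    "they will attack and hurt people",
    "attack the problem step by step",
    "hurt",
]

picked = select_for_labeling(pool, toy_score, budget=2)
for text, score in picked:
    print(f"{score:.2f}  {text}")
```

In this toy pool the two ambiguous sentences are selected, while the clearly benign and clearly flagged items are skipped. A real pipeline would replace `toy_score` with the deployed model's predicted probability and retrain after each labeling round.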
Related papers
- ToVo: Toxicity Taxonomy via Voting [25.22398575368979]
We propose a dataset creation mechanism that integrates voting and chain-of-thought processes.
Our methodology ensures diverse classification metrics for each sample.
We utilize the dataset created through our proposed mechanism to train our model.
arXiv Detail & Related papers (2024-06-21T02:35:30Z)
- Labeling Comic Mischief Content in Online Videos with a Multimodal Hierarchical-Cross-Attention Model [10.666877191424792]
We propose a novel end-to-end multimodal system for the task of comic mischief detection.
We release a novel dataset for the targeted task consisting of three modalities: video, text (video captions and subtitles), and audio.
The results show that the proposed approach makes a significant improvement over robust baselines.
arXiv Detail & Related papers (2024-06-12T03:16:45Z)
- Contextualization Distillation from Large Language Model for Knowledge Graph Completion [51.126166442122546]
We introduce the Contextualization Distillation strategy, a plug-in-and-play approach compatible with both discriminative and generative KGC frameworks.
Our method begins by instructing large language models to transform compact, structural triplets into context-rich segments.
Comprehensive evaluations across diverse datasets and KGC techniques highlight the efficacy and adaptability of our approach.
arXiv Detail & Related papers (2024-01-28T08:56:49Z)
- CLAP: Isolating Content from Style through Contrastive Learning with Augmented Prompts [11.752632557524969]
We propose contrastive learning with data augmentation to disentangle content features from the original representations.
Our experiments across diverse datasets demonstrate significant improvements in zero-shot and few-shot classification tasks.
arXiv Detail & Related papers (2023-11-28T03:00:59Z)
- Recognizing Unseen Objects via Multimodal Intensive Knowledge Graph Propagation [68.13453771001522]
We propose a multimodal intensive ZSL framework that matches regions of images with corresponding semantic embeddings.
We conduct extensive experiments and evaluate our model on large-scale real-world data.
arXiv Detail & Related papers (2023-06-14T13:07:48Z)
- Safety and Fairness for Content Moderation in Generative Models [0.7992463811844456]
We provide a theoretical framework for conceptualizing responsible content moderation of text-to-image generative technologies.
We define and distinguish the concepts of safety, fairness, and metric equity, and enumerate example harms that can come in each domain.
We conclude with a summary of how the style of harms we demonstrate enables data-driven content moderation decisions.
arXiv Detail & Related papers (2023-06-09T01:37:32Z)
- Countering Malicious Content Moderation Evasion in Online Social Networks: Simulation and Detection of Word Camouflage [64.78260098263489]
Twisting and camouflaging keywords are among the most used techniques to evade platform content moderation systems.
This article contributes significantly to countering malicious information by developing multilingual tools to simulate and detect new methods of evasion of content.
arXiv Detail & Related papers (2022-12-27T16:08:49Z)
- Representation Learning for the Automatic Indexing of Sound Effects Libraries [79.68916470119743]
We show that a task-specific but dataset-independent representation can successfully address data issues such as class imbalance, inconsistent class labels, and insufficient dataset size.
Detailed experimental results show the impact of metric learning approaches and different cross-dataset training methods on representational effectiveness.
arXiv Detail & Related papers (2022-08-18T23:46:13Z)
- Generating More Pertinent Captions by Leveraging Semantics and Style on Multi-Source Datasets [56.018551958004814]
This paper addresses the task of generating fluent descriptions by training on a non-uniform combination of data sources.
Large-scale datasets with noisy image-text pairs provide a sub-optimal source of supervision.
We propose to leverage and separate semantics and descriptive style through the incorporation of a style token and keywords extracted through a retrieval component.
arXiv Detail & Related papers (2021-11-24T19:00:05Z)
- NewsEmbed: Modeling News through Pre-trained Document Representations [5.007237648361745]
We propose a novel approach to mine semantically-relevant fresh documents, and their topic labels, with little human supervision.
We show that the proposed approach can provide billions of high-quality organic training examples and can be naturally extended to a multilingual setting.
arXiv Detail & Related papers (2021-06-01T15:59:40Z)
- Automatic Validation of Textual Attribute Values in E-commerce Catalog by Learning with Limited Labeled Data [61.789797281676606]
We propose a novel meta-learning latent variable approach, called MetaBridge.
It can learn transferable knowledge from a subset of categories with limited labeled data.
It can capture the uncertainty of never-seen categories with unlabeled data.
arXiv Detail & Related papers (2020-06-15T21:31:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.