Leveraging Large-scale Multimedia Datasets to Refine Content Moderation
Models
- URL: http://arxiv.org/abs/2212.00668v1
- Date: Thu, 1 Dec 2022 17:19:13 GMT
- Title: Leveraging Large-scale Multimedia Datasets to Refine Content Moderation
Models
- Authors: Ioannis Sarridis, Christos Koutlis, Olga Papadopoulou, and Symeon
Papadopoulos
- Abstract summary: We propose a framework that leverages large-scale multimedia datasets to refine content moderation models.
The proposed method is evaluated on the Not Safe for Work (NSFW) and disturbing content detection tasks.
It significantly reduces human involvement, as 92.54% of the data are automatically annotated in the case of disturbing content.
- Score: 8.147198294451151
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: The sheer volume of online user-generated content has rendered content
moderation technologies essential in order to protect digital platform
audiences from content that may cause anxiety, worry, or concern. Despite the
efforts towards developing automated solutions to tackle this problem, creating
accurate models remains challenging due to the lack of adequate task-specific
training data. This limitation is closely tied to the fact that manually
annotating such data is a highly demanding procedure that can severely affect
the annotators' emotional well-being. In this paper, we
propose the CM-Refinery framework that leverages large-scale multimedia
datasets to automatically extend initial training datasets with hard examples
that can refine content moderation models, while significantly reducing the
involvement of human annotators. We apply our method on two model adaptation
strategies designed with respect to the different challenges observed while
collecting data, i.e., the lack of (i) task-specific negative data or (ii) both
positive and negative data. Additionally, we introduce a diversity criterion
applied to the data collection process that further enhances the generalization
performance of the refined models. The proposed method is evaluated on the Not
Safe for Work (NSFW) and disturbing content detection tasks on benchmark
datasets, achieving 1.32% and 1.94% accuracy improvements compared to the
state of the art, respectively. Finally, it significantly reduces human
involvement, as 92.54% of the data are automatically annotated in the case of
disturbing content, while no human intervention is required for the NSFW task.
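To make the refinement idea more concrete, the sketch below shows one way such a loop could be set up: the current model scores samples from a large unlabeled pool, samples near the decision boundary are treated as hard examples, and a simple embedding-distance check enforces a diversity criterion before the selected samples are annotated and added to the training set. This is an illustrative approximation only, not the authors' implementation; the confidence band, the distance threshold, and the model / embed interfaces are assumptions.

```python
# Illustrative sketch (not the authors' code) of hard-example mining with a
# diversity criterion, in the spirit of CM-Refinery. The confidence band, the
# embedding-distance threshold, and the model / embed interfaces are
# assumptions made for the example.
import numpy as np

def select_hard_diverse_examples(model, embed, unlabeled_pool,
                                 low=0.4, high=0.6, min_dist=0.3, budget=1000):
    """Pick samples the current model is unsure about (hard examples),
    while keeping the selection diverse in embedding space."""
    selected, selected_embs = [], []
    for sample in unlabeled_pool:
        p = model.predict_proba(sample)   # assumed: probability of the positive class
        if not (low <= p <= high):        # confident predictions are not "hard"
            continue
        e = np.asarray(embed(sample), dtype=float)   # assumed: feature vector
        e /= (np.linalg.norm(e) + 1e-12)
        # Diversity criterion: skip samples too similar to anything already chosen.
        if any(1.0 - float(np.dot(e, s)) < min_dist for s in selected_embs):
            continue
        selected.append(sample)
        selected_embs.append(e)
        if len(selected) >= budget:
            break
    return selected

# The selected hard examples would then be annotated (automatically where
# possible, by humans otherwise) and appended to the initial training set
# before the content moderation model is fine-tuned.
```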
Related papers
- Semi-Supervised Reward Modeling via Iterative Self-Training [52.48668920483908]
We propose Semi-Supervised Reward Modeling (SSRM), an approach that enhances RM training using unlabeled data.
We demonstrate that SSRM significantly improves reward models without incurring additional labeling costs.
Overall, SSRM substantially reduces the dependency on large volumes of human-annotated data, thereby decreasing the overall cost and time involved in training effective reward models.
arXiv Detail & Related papers (2024-09-10T22:57:58Z)
- A Simple Background Augmentation Method for Object Detection with Diffusion Model [53.32935683257045]
In computer vision, it is well-known that a lack of data diversity will impair model performance.
We propose a simple yet effective data augmentation approach by leveraging advancements in generative models.
Background augmentation, in particular, significantly improves the models' robustness and generalization capabilities.
arXiv Detail & Related papers (2024-08-01T07:40:00Z)
- Mimicking User Data: On Mitigating Fine-Tuning Risks in Closed Large Language Models [53.50543146583101]
Fine-tuning large language models on small datasets can enhance their performance on specific downstream tasks.
Malicious actors can subtly manipulate the structure of almost any task-specific dataset to foster significantly more dangerous model behaviors.
We propose a novel mitigation strategy that mixes in safety data which mimics the task format and prompting style of the user data.
arXiv Detail & Related papers (2024-06-12T18:33:11Z)
- Sexism Detection on a Data Diet [14.899608305188002]
We show how we can leverage influence scores to estimate the importance of a data point while training a model.
We evaluate the performance of models trained on data pruned with different pruning strategies on three out-of-domain datasets.
arXiv Detail & Related papers (2024-06-07T12:39:54Z)
- GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning [0.0]
GISTEmbed is a novel strategy that enhances in-batch negative selection during contrastive training through a guide model.
Benchmarked against the Massive Text Embedding Benchmark (MTEB), GISTEmbed showcases consistent performance improvements across various model sizes.
arXiv Detail & Related papers (2024-02-26T18:55:15Z)
- Zero-shot Retrieval: Augmenting Pre-trained Models with Search Engines [83.65380507372483]
Large pre-trained models can dramatically reduce the amount of task-specific data required to solve a problem, but they often fail to capture domain-specific nuances out of the box.
This paper shows how to leverage recent advances in NLP and multi-modal learning to augment a pre-trained model with search engine retrieval.
arXiv Detail & Related papers (2023-11-29T05:33:28Z)
- One-Shot Federated Learning with Classifier-Guided Diffusion Models [44.604485649167216]
One-shot federated learning (OSFL) has gained attention in recent years due to its low communication cost.
In this paper, we explore the novel opportunities that diffusion models bring to OSFL and propose FedCADO.
FedCADO generates data that complies with the clients' distributions and subsequently trains the aggregated model on the server.
arXiv Detail & Related papers (2023-11-15T11:11:25Z)
- Informative Data Mining for One-Shot Cross-Domain Semantic Segmentation [84.82153655786183]
We propose a novel framework called Informative Data Mining (IDM) to enable efficient one-shot domain adaptation for semantic segmentation.
IDM provides an uncertainty-based selection criterion to identify the most informative samples, which facilitates quick adaptation and reduces redundant training.
Our approach outperforms existing methods and achieves a new state-of-the-art one-shot performance of 56.7%/55.4% on the GTA5/SYNTHIA to Cityscapes adaptation tasks.
arXiv Detail & Related papers (2023-09-25T15:56:01Z)
- Cluster-level pseudo-labelling for source-free cross-domain facial expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER).
Our method exploits self-supervised pretraining to learn good feature representations from the target data.
We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
arXiv Detail & Related papers (2022-10-11T08:24:50Z)
- CausalAgents: A Robustness Benchmark for Motion Forecasting using Causal Relationships [8.679073301435265]
We construct a new benchmark for evaluating and improving model robustness by applying perturbations to existing data.
We use labels of the causal relationships between agents to perturb the data by deleting non-causal agents from the scene.
Under non-causal perturbations, we observe a 25-38% relative change in minADE as compared to the original.
arXiv Detail & Related papers (2022-07-07T21:28:23Z)
- Exploring the Efficacy of Automatically Generated Counterfactuals for Sentiment Analysis [17.811597734603144]
We propose an approach to automatically generating counterfactual data for data augmentation and explanation.
A comprehensive evaluation on several datasets, using a variety of state-of-the-art benchmarks, demonstrates how our approach can achieve significant improvements in model performance.
arXiv Detail & Related papers (2021-06-29T10:27:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.