Experimentation in Content Moderation using RWKV
- URL: http://arxiv.org/abs/2409.03939v1
- Date: Thu, 5 Sep 2024 23:17:18 GMT
- Title: Experimentation in Content Moderation using RWKV
- Authors: Umut Yildirim, Rohan Dutta, Burak Yildirim, Atharva Vaidya
- Abstract summary: This paper investigates the RWKV model's efficacy in content moderation through targeted experimentation.
We introduce a novel dataset specifically designed for distillation into smaller models.
We generated an extensive set of responses -- 558,958 for text and 83,625 for images -- to train and refine content moderation systems.
- Score: 0.7499722271664147
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: This paper investigates the RWKV model's efficacy in content moderation through targeted experimentation. We introduce a novel dataset specifically designed for distillation into smaller models, enhancing content moderation practices. This comprehensive dataset encompasses images, videos, sounds, and text data that present societal challenges. Leveraging advanced Large Language Models (LLMs), we generated an extensive set of responses -- 558,958 for text and 83,625 for images -- to train and refine content moderation systems. Our core experimentation involved fine-tuning the RWKV model, capitalizing on its CPU-efficient architecture to address large-scale content moderation tasks. By highlighting the dataset's potential for knowledge distillation, this study not only demonstrates RWKV's capability in improving the accuracy and efficiency of content moderation systems but also paves the way for developing more compact, resource-efficient models in this domain. Datasets and models can be found in HuggingFace: https://huggingface.co/modrwkv
Related papers
- Dataset Distillation via Committee Voting [21.018818924580877]
We introduce Committee Voting for Dataset Distillation (CV-DD).
CV-DD is a novel approach that leverages the collective wisdom of multiple models or experts to create high-quality distilled datasets.
arXiv Detail & Related papers (2025-01-13T18:59:48Z)
- Data-to-Model Distillation: Data-Efficient Learning Framework [14.44010988811002]
We propose a novel framework called Data-to-Model Distillation (D2M) to distill the real dataset's knowledge into the learnable parameters of a pre-trained generative model.
Our method effectively scales up to high-resolution 128x128 ImageNet-1K.
arXiv Detail & Related papers (2024-11-19T20:10:28Z)
- DocMamba: Efficient Document Pre-training with State Space Model [56.84200017560988]
We present DocMamba, a novel framework based on the state space model.
It is designed to reduce computational complexity to linear while preserving global modeling capabilities.
Experiments on the HRDoc dataset confirm DocMamba's potential for length extrapolation.
arXiv Detail & Related papers (2024-09-18T11:34:28Z)
- EDADepth: Enhanced Data Augmentation for Monocular Depth Estimation [4.477063987845632]
EDADepth is an enhanced data augmentation method to estimate monocular depth without using additional training data.
We employ the BEiT pre-trained semantic segmentation model for better extraction of text embeddings.
Our model achieves state-of-the-art results (SOTA) on the delta3 metric on NYUv2 and KITTI datasets.
arXiv Detail & Related papers (2024-09-10T03:25:24Z)
- Data-Juicer Sandbox: A Feedback-Driven Suite for Multimodal Data-Model Co-development [67.55944651679864]
We present a new sandbox suite tailored for integrated data-model co-development.
This sandbox provides a feedback-driven experimental platform, enabling cost-effective and guided refinement of both data and models.
arXiv Detail & Related papers (2024-07-16T14:40:07Z)
- RWKV-CLIP: A Robust Vision-Language Representation Learner [31.501759213619646]
Contrastive Language-Image Pre-training (CLIP) has significantly improved performance in various vision-language tasks.
We introduce a diverse description generation framework that can leverage Large Language Models (LLMs) to synthesize and refine content from web-based texts, synthetic captions, and detection tags.
We propose RWKV-CLIP, the first RWKV-driven vision-language representation learning model that combines the effective parallel training of transformers with the efficient inference of RNNs.
arXiv Detail & Related papers (2024-06-11T06:10:46Z)
- Self-supervised Dataset Distillation: A Good Compression Is All You Need [23.02066055996762]
We introduce SC-DD, a simple yet effective Self-supervised Compression framework for dataset distillation.
The proposed SC-DD outperforms all previous state-of-the-art supervised dataset distillation methods when employing larger models.
Experiments are conducted on CIFAR-100, Tiny-ImageNet and ImageNet-1K datasets to demonstrate the superiority of our proposed approach.
arXiv Detail & Related papers (2024-04-11T17:56:40Z)
- Distribution-Aware Data Expansion with Diffusion Models [55.979857976023695]
We propose DistDiff, a training-free data expansion framework based on the distribution-aware diffusion model.
DistDiff consistently enhances accuracy across a diverse range of datasets compared to models trained solely on original data.
arXiv Detail & Related papers (2024-03-11T14:07:53Z)
- Harnessing Diffusion Models for Visual Perception with Meta Prompts [68.78938846041767]
We propose a simple yet effective scheme to harness a diffusion model for visual perception tasks.
We introduce learnable embeddings (meta prompts) to the pre-trained diffusion models to extract proper features for perception.
Our approach achieves new performance records in depth estimation tasks on NYU depth V2 and KITTI, and in the semantic segmentation task on CityScapes.
arXiv Detail & Related papers (2023-12-22T14:40:55Z)
- StableLLaVA: Enhanced Visual Instruction Tuning with Synthesized Image-Dialogue Data [129.92449761766025]
We propose a novel data collection methodology that synchronously synthesizes images and dialogues for visual instruction tuning.
This approach harnesses the power of generative models, marrying the abilities of ChatGPT and text-to-image generative models.
Our research includes comprehensive experiments conducted on various datasets.
arXiv Detail & Related papers (2023-08-20T12:43:52Z)
- Lafite2: Few-shot Text-to-Image Generation [132.14211027057766]
We propose a novel method for pre-training a text-to-image generation model on image-only datasets.
It considers a retrieval-then-optimization procedure to synthesize pseudo text features.
It can be beneficial in a wide range of settings, including few-shot, semi-supervised, and fully supervised learning.
arXiv Detail & Related papers (2022-10-25T16:22:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.