Socio-Culturally Aware Evaluation Framework for LLM-Based Content Moderation
- URL: http://arxiv.org/abs/2412.13578v1
- Date: Wed, 18 Dec 2024 07:57:18 GMT
- Title: Socio-Culturally Aware Evaluation Framework for LLM-Based Content Moderation
- Authors: Shanu Kumar, Gauri Kholkar, Saish Mendke, Anubhav Sadana, Parag Agrawal, Sandipan Dandapat
- Abstract summary: We propose a socio-culturally aware evaluation framework for content moderation.
We introduce a scalable method for creating diverse datasets using persona-based generation.
- Score: 10.724258809442958
- License:
- Abstract: With the growth of social media and large language models, content moderation has become crucial. Many existing datasets lack adequate representation of different groups, resulting in unreliable assessments. To tackle this, we propose a socio-culturally aware evaluation framework for LLM-driven content moderation and introduce a scalable method for creating diverse datasets using persona-based generation. Our analysis reveals that these datasets provide broader perspectives and pose greater challenges for LLMs than diversity-focused generation methods without personas. This challenge is especially pronounced in smaller LLMs, emphasizing the difficulties they encounter in moderating such diverse content.
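As a rough illustration of how persona-based generation of a socio-culturally diverse moderation test set could look, here is a minimal Python sketch. It assumes an OpenAI-compatible chat client; the persona fields, prompt wording, and model name are illustrative assumptions, not the paper's actual pipeline.

```python
# Illustrative sketch only: persona-conditioned generation of moderation test items.
# Persona attributes, prompt text, and model name are assumptions for demonstration.
from dataclasses import dataclass
from itertools import product

from openai import OpenAI  # any OpenAI-compatible chat client

client = OpenAI()  # reads OPENAI_API_KEY from the environment


@dataclass
class Persona:
    region: str
    age_group: str
    community: str


# A small grid of personas; a real framework would use a much richer, curated set.
PERSONAS = [
    Persona(region, age, community)
    for region, age, community in product(
        ["South Asia", "Western Europe", "Latin America"],
        ["18-25", "40-60"],
        ["religious minority", "LGBTQ+", "rural"],
    )
]


def generate_example(persona: Persona, topic: str, label: str) -> str:
    """Ask the LLM to write a short post from the persona's perspective that a
    moderator should judge as the requested label ("benign" or "policy-violating")."""
    prompt = (
        f"You are a {persona.age_group}-year-old member of a {persona.community} "
        f"community in {persona.region}. Write a short social media post about "
        f"'{topic}' that a content moderator should judge as {label}. "
        "Reflect your cultural context and vocabulary."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
        temperature=1.0,
    )
    return response.choices[0].message.content


if __name__ == "__main__":
    post = generate_example(PERSONAS[0], topic="immigration policy", label="benign")
    print(post)
```

Sweeping over the persona grid and both labels yields a dataset whose items carry distinct cultural framings, which is the property the framework uses to stress-test moderation models.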
Related papers
- Towards Safer Social Media Platforms: Scalable and Performant Few-Shot Harmful Content Moderation Using Large Language Models [9.42299478071576]
Harmful content on social media platforms poses significant risks to users and society.
Current approaches rely on human moderators, supervised classifiers, and large volumes of training data.
We use Large Language Models (LLMs) to perform few-shot, dynamic content moderation via in-context learning.
arXiv Detail & Related papers (2025-01-23T00:19:14Z)
- Beyond Binary: Towards Fine-Grained LLM-Generated Text Detection via Role Recognition and Involvement Measurement [51.601916604301685]
Large language models (LLMs) generate content that can undermine trust in online discourse.
Current methods often focus on binary classification, failing to address the complexities of real-world scenarios like human-LLM collaboration.
To move beyond binary classification and address these challenges, we propose a new paradigm for detecting LLM-generated content.
arXiv Detail & Related papers (2024-10-18T08:14:10Z)
- Hate Personified: Investigating the role of LLMs in content moderation [64.26243779985393]
For subjective tasks such as hate detection, where people perceive hate differently, it is unclear how well large language models (LLMs) represent diverse groups.
By including additional context in prompts, we analyze LLMs' sensitivity to geographical priming, persona attributes, and numerical information to assess how well the needs of various groups are reflected.
arXiv Detail & Related papers (2024-10-03T16:43:17Z)
- Social Debiasing for Fair Multi-modal LLMs [55.8071045346024]
Multi-modal Large Language Models (MLLMs) have advanced significantly, offering powerful vision-language understanding capabilities.
However, these models often inherit severe social biases from their training datasets, leading to unfair predictions based on attributes like race and gender.
This paper addresses the issue of social biases in MLLMs by i) introducing a comprehensive Counterfactual dataset with Multiple Social Concepts (CMSC) and ii) proposing an Anti-Stereotype Debiasing strategy (ASD).
arXiv Detail & Related papers (2024-08-13T02:08:32Z)
- CulturalTeaming: AI-Assisted Interactive Red-Teaming for Challenging LLMs' (Lack of) Multicultural Knowledge [69.82940934994333]
We introduce CulturalTeaming, an interactive red-teaming system that leverages human-AI collaboration to build challenging evaluation datasets.
Our study reveals that CulturalTeaming's various modes of AI assistance support annotators in creating cultural questions.
CULTURALBENCH-V0.1 is a compact yet high-quality evaluation dataset built from users' red-teaming attempts.
arXiv Detail & Related papers (2024-04-10T00:25:09Z)
- Recent Advances in Hate Speech Moderation: Multimodality and the Role of Large Models [52.24001776263608]
This comprehensive survey delves into recent strides in hate speech (HS) moderation.
We highlight the burgeoning role of large language models (LLMs) and large multimodal models (LMMs).
We identify existing gaps in research, particularly in the context of underrepresented languages and cultures.
arXiv Detail & Related papers (2024-01-30T03:51:44Z)
- Incorporating Visual Experts to Resolve the Information Loss in Multimodal Large Language Models [121.83413400686139]
This paper proposes to improve the visual perception ability of MLLMs through a mixture-of-experts knowledge enhancement mechanism.
We introduce a novel method that incorporates multi-task encoders and visual tools into the existing MLLMs training and inference pipeline.
arXiv Detail & Related papers (2024-01-06T02:02:34Z)
- A Group Fairness Lens for Large Language Models [34.0579082699443]
Large language models can perpetuate biases and unfairness when deployed in social media contexts.
We propose evaluating LLM biases from a group fairness lens using a novel hierarchical schema characterizing diverse social groups.
We pioneer GF-Think, a novel chain-of-thought method to mitigate LLM biases from a group fairness perspective.
arXiv Detail & Related papers (2023-12-24T13:25:15Z)
- How Far Can We Extract Diverse Perspectives from Large Language Models? [16.16678226707335]
We show that large language models (LLMs) can generate diverse perspectives on subjective topics.
We propose a criteria-based prompting technique to ground diverse opinions.
Our methods, applied to various tasks, show that LLMs can indeed produce diverse opinions according to the degree of task subjectivity.
arXiv Detail & Related papers (2023-11-16T11:23:38Z)
- Retrieving Multimodal Information for Augmented Generation: A Survey [35.33076940985081]
We review methods that assist and augment generative models by retrieving multimodal knowledge.
Such methods offer a promising solution to important concerns such as factuality, reasoning, interpretability, and robustness.
arXiv Detail & Related papers (2023-03-20T05:07:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.