StereoDetect: Detecting Stereotypes and Anti-stereotypes the Correct Way Using Social Psychological Underpinnings
- URL: http://arxiv.org/abs/2504.03352v3
- Date: Mon, 27 Oct 2025 11:20:02 GMT
- Title: StereoDetect: Detecting Stereotypes and Anti-stereotypes the Correct Way Using Social Psychological Underpinnings
- Authors: Kaustubh Shivshankar Shejole, Pushpak Bhattacharyya
- Abstract summary: Stereotype and anti-stereotype detection is a problem that requires social knowledge. We propose a five-tuple definition and provide precise terminology disentangling stereotypes, anti-stereotypes, stereotypical bias, and general bias. We show that sub-10B language models and GPT-4o frequently misclassify anti-stereotypes and fail to recognize neutral overgeneralizations.
- Score: 47.02959423049043
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Stereotypes are known to have very harmful effects, making their detection critically important. However, current research predominantly focuses on detecting and evaluating stereotypical biases, leaving the study of stereotypes themselves in its early stages. Our study revealed that many works have failed to clearly distinguish between stereotypes and stereotypical biases, which has significantly slowed progress in this area. Stereotype and anti-stereotype detection is a problem that requires social knowledge; hence, it is one of the most difficult areas in Responsible AI. This work investigates this task, where we propose a five-tuple definition and provide precise terminology disentangling stereotypes, anti-stereotypes, stereotypical bias, and general bias. We provide a conceptual framework grounded in social psychology for reliable detection. We identify key shortcomings in existing benchmarks for stereotype and anti-stereotype detection. To address these gaps, we developed StereoDetect, a well-curated, definition-aligned benchmark dataset designed for this task. We show that sub-10B language models and GPT-4o frequently misclassify anti-stereotypes and fail to recognize neutral overgeneralizations. We demonstrate StereoDetect's effectiveness through multiple qualitative and quantitative comparisons with existing benchmarks and models fine-tuned on them. The dataset and code are available at https://github.com/KaustubhShejole/StereoDetect.
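The abstract's central evaluation finding is a per-class breakdown: models confuse anti-stereotypes with stereotypes and miss neutral overgeneralizations. A minimal sketch of that kind of per-class error analysis is below; the `Example` fields and the label set are assumptions for illustration, not the paper's actual five-tuple definition.

```python
from dataclasses import dataclass
from collections import Counter

# Hypothetical label set implied by the abstract; the paper's actual
# five-tuple definition and class names may differ.
LABELS = {"stereotype", "anti-stereotype", "neutral", "unrelated"}

@dataclass
class Example:
    sentence: str
    target_group: str  # assumed field: the social group the sentence is about
    label: str

def per_class_error(gold: list, predicted: list) -> dict:
    """Fraction of examples of each gold class that a model mislabels --
    the kind of breakdown used to show that models systematically
    misclassify anti-stereotypes."""
    errors, totals = Counter(), Counter()
    for ex, pred in zip(gold, predicted):
        totals[ex.label] += 1
        if pred != ex.label:
            errors[ex.label] += 1
    return {lbl: errors[lbl] / totals[lbl] for lbl in totals}
```

A model that labels everything "stereotype" would score zero error on stereotypes but 100% error on anti-stereotypes, which is exactly the failure mode such a breakdown exposes.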
Related papers
- Stereotype Detection as a Catalyst for Enhanced Bias Detection: A Multi-Task Learning Approach [36.64093052736432]
Bias and stereotypes in language models can cause harm, especially in sensitive areas like content moderation and decision-making. This paper addresses bias and stereotype detection by exploring how jointly learning these tasks enhances model performance. We introduce StereoBias, a unique dataset labeled for bias and stereotype detection across five categories: religion, gender, socio-economic status, race, profession, and others.
arXiv Detail & Related papers (2025-07-02T13:46:00Z) - A Survey on Stereotype Detection in Natural Language Processing [46.27245894098319]
Stereotypes influence social perceptions and can escalate into discrimination and violence. This work presents a survey of existing research, analyzing definitions from psychology, sociology, and philosophy. Findings emphasize stereotype detection as a potential early-monitoring tool to prevent bias escalation and the rise of hate speech.
arXiv Detail & Related papers (2025-05-23T09:03:56Z) - Mono2Stereo: A Benchmark and Empirical Study for Stereo Conversion [88.67015254278859]
We introduce the Mono2Stereo dataset, providing high-quality training data and a benchmark to support in-depth exploration of stereo conversion. We conduct an empirical study that yields two primary findings. 1) The differences between the left and right views are subtle, yet existing metrics consider overall pixels, failing to concentrate on regions critical to stereo effects. 2) We introduce a new evaluation metric, Stereo Intersection-over-Union, which harmonizes disparity and achieves a high correlation with human judgments on stereo effect.
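The abstract describes a region-focused metric rather than a whole-image pixel comparison. One plausible reading of the Intersection-over-Union idea, computed over pixels whose disparity exceeds a threshold, is sketched below; the paper's actual Stereo Intersection-over-Union formulation may differ.

```python
def stereo_iou(disp_pred, disp_gt, threshold=1.0):
    """Illustrative IoU over the regions that matter for stereo effect:
    pixels whose disparity meets a threshold. Flat (near-zero disparity)
    regions are excluded, unlike a plain per-pixel error."""
    pred = {i for i, d in enumerate(disp_pred) if d >= threshold}
    gt = {i for i, d in enumerate(disp_gt) if d >= threshold}
    if not pred and not gt:
        return 1.0  # no salient disparity anywhere: trivially in agreement
    return len(pred & gt) / len(pred | gt)
```

The point of the design is that a prediction can match most pixels of a frame yet still miss the few high-disparity regions that create the stereo effect, and a region-based IoU penalizes exactly that.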
arXiv Detail & Related papers (2025-03-28T09:25:58Z) - Blind Men and the Elephant: Diverse Perspectives on Gender Stereotypes in Benchmark Datasets [12.798832545154271]
This paper examines the inconsistencies between intrinsic stereotype benchmarks. Using StereoSet and CrowS-Pairs as case studies, we investigated how data distribution affects benchmark results.
arXiv Detail & Related papers (2025-01-02T09:40:31Z) - Biased or Flawed? Mitigating Stereotypes in Generative Language Models by Addressing Task-Specific Flaws [12.559028963968247]
Generative language models often reflect and amplify societal biases in their outputs. We propose a targeted stereotype mitigation framework that implicitly mitigates observed stereotypes in generative models. We reduce stereotypical outputs by over 60% across multiple dimensions.
arXiv Detail & Related papers (2024-12-16T03:29:08Z) - Stereotype Detection in LLMs: A Multiclass, Explainable, and Benchmark-Driven Approach [4.908389661988191]
This paper introduces the Multi-Grain Stereotype (MGS) dataset, consisting of 51,867 instances across gender, race, profession, religion, and other stereotypes.
We evaluate various machine learning approaches to establish baselines and fine-tune language models of different architectures and sizes.
We employ explainable AI (XAI) tools, including SHAP, LIME, and BertViz, to assess whether the model's learned patterns align with human intuitions about stereotypes.
arXiv Detail & Related papers (2024-04-02T09:31:32Z) - Quantifying Stereotypes in Language [6.697298321551588]
We quantify stereotypes in language by annotating a dataset.
We use the pre-trained language models (PLMs) to learn this dataset to predict stereotypes of sentences.
We discuss stereotypes about common social issues such as hate speech, sexism, sentiments, and disadvantaged and advantaged groups.
arXiv Detail & Related papers (2024-01-28T01:07:21Z) - Towards Auditing Large Language Models: Improving Text-based Stereotype Detection [5.3634450268516565]
This work introduces i) the Multi-Grain Stereotype dataset, which includes 52,751 instances of gender, race, profession, and religion stereotypical text.
We design several experiments to rigorously test the proposed model trained on the novel dataset.
Experiments show that training the model in a multi-class setting can outperform the one-vs-all binary counterpart.
arXiv Detail & Related papers (2023-11-23T17:47:14Z) - Will the Prince Get True Love's Kiss? On the Model Sensitivity to Gender Perturbation over Fairytale Texts [80.21033860436081]
We investigate how models respond to gender stereotype perturbations through counterfactual data augmentation. Our results show that models exhibit slight performance drops when faced with gender perturbations in the test set. When fine-tuned on counterfactual training data, models become more robust to anti-stereotypical narratives.
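Counterfactual data augmentation for gender typically means generating a parallel copy of each training example with gendered terms swapped. A minimal sketch is below; the swap table is an illustrative assumption, and real pipelines also handle names, pronoun case ambiguity (her → his/him), and grammatical agreement.

```python
# Illustrative swap lexicon -- a real system would use a much larger,
# curated list and disambiguate pronouns in context.
SWAPS = {"he": "she", "she": "he", "his": "her", "her": "his",
         "prince": "princess", "princess": "prince"}

def gender_counterfactual(sentence: str) -> str:
    """Produce the gender-swapped counterfactual of a sentence,
    preserving the original capitalization of each token."""
    out = []
    for tok in sentence.split():
        swapped = SWAPS.get(tok.lower(), tok.lower())
        out.append(swapped.capitalize() if tok[0].isupper() else swapped)
    return " ".join(out)
```

Fine-tuning on both the original and the swapped copy is what the abstract credits for the improved robustness to anti-stereotypical narratives.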
arXiv Detail & Related papers (2023-10-16T22:25:09Z) - Easily Accessible Text-to-Image Generation Amplifies Demographic Stereotypes at Large Scale [61.555788332182395]
We investigate the potential for machine learning models to amplify dangerous and complex stereotypes.
We find a broad range of ordinary prompts produce stereotypes, including prompts simply mentioning traits, descriptors, occupations, or objects.
arXiv Detail & Related papers (2022-11-07T18:31:07Z) - Reinforcement Guided Multi-Task Learning Framework for Low-Resource Stereotype Detection [3.7223111129285096]
"Stereotype Detection" datasets mainly adopt a diagnostic approach toward large Pre-trained Language Models.
Annotating a reliable dataset requires a precise understanding of the subtle nuances of how stereotypes manifest in text.
We present a multi-task model that leverages the abundance of data-rich neighboring tasks to improve the empirical performance on "Stereotype Detection".
arXiv Detail & Related papers (2022-03-27T17:16:11Z) - Pedestrian Detection: Domain Generalization, CNNs, Transformers and Beyond [82.37430109152383]
We show that current pedestrian detectors poorly handle even small domain shifts in cross-dataset evaluation.
We attribute the limited generalization to two main factors, the method and the current sources of data.
We propose a progressive fine-tuning strategy which improves generalization.
arXiv Detail & Related papers (2022-01-10T06:00:26Z) - Understanding and Countering Stereotypes: A Computational Approach to the Stereotype Content Model [4.916009028580767]
We present a computational approach to interpreting stereotypes in text through the Stereotype Content Model (SCM).
The SCM proposes that stereotypes can be understood along two primary dimensions: warmth and competence.
It is known that countering stereotypes with anti-stereotypical examples is one of the most effective ways to reduce biased thinking.
arXiv Detail & Related papers (2021-06-04T16:53:37Z) - UnQovering Stereotyping Biases via Underspecified Questions [68.81749777034409]
We present UNQOVER, a framework to probe and quantify biases through underspecified questions.
We show that a naive use of model scores can lead to incorrect bias estimates due to two forms of reasoning errors.
We use this metric to analyze four important classes of stereotypes: gender, nationality, ethnicity, and religion.
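UNQOVER's core device is the underspecified question: a context that gives no evidence either way, so any model preference for one subject reveals bias. A simplified sketch of generating such probes, including the both-orderings trick used to cancel positional reasoning errors, is below; the template wording is illustrative, not the paper's exact templates.

```python
from itertools import permutations

def underspecified_questions(subjects, activity, attribute):
    """Generate UNQOVER-style probes: pair two subjects in a context
    that cannot answer the question, and emit both subject orderings so
    a position preference can be separated from a genuine bias."""
    qs = []
    for a, b in permutations(subjects, 2):
        context = f"{a} and {b} {activity}."
        question = f"Who {attribute}?"
        qs.append((context, question))
    return qs
```

Aggregating the model's answer scores over both orderings (and over negated questions) is what turns these probes into the bias metric the abstract describes.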
arXiv Detail & Related papers (2020-10-06T01:49:52Z) - Generalizable Pedestrian Detection: The Elephant In The Room [82.37430109152383]
We find that existing state-of-the-art pedestrian detectors, though they perform quite well when trained and tested on the same dataset, generalize poorly in cross-dataset evaluation.
We illustrate that diverse and dense datasets, collected by crawling the web, serve as an efficient source of pre-training for pedestrian detection.
arXiv Detail & Related papers (2020-03-19T14:14:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.