CO-STAR: Conceptualisation of Stereotypes for Analysis and Reasoning
- URL: http://arxiv.org/abs/2112.00819v1
- Date: Wed, 1 Dec 2021 20:39:04 GMT
- Title: CO-STAR: Conceptualisation of Stereotypes for Analysis and Reasoning
- Authors: Teyun Kwon, Anandha Gopalan
- Abstract summary: We build on existing literature and present CO-STAR, a novel framework which encodes the underlying concepts of implied stereotypes.
We also introduce the CO-STAR training data set, which contains just over 12K structured annotations of implied stereotypes and stereotype conceptualisations.
The CO-STAR models are, however, limited in their ability to understand more complex and subtly worded stereotypes.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Warning: this paper contains material which may be offensive or upsetting.
While much recent work has focused on the detection of hate speech and
overtly offensive content, very little research has explored the subtler but
equally harmful language of implied stereotypes. This is a
challenging domain, made even more so by the fact that humans often struggle to
understand and reason about stereotypes. We build on existing literature and
present CO-STAR (COnceptualisation of STereotypes for Analysis and Reasoning),
a novel framework which encodes the underlying concepts of implied stereotypes.
We also introduce the CO-STAR training data set, which contains just over 12K
structured annotations of implied stereotypes and stereotype
conceptualisations, and achieve state-of-the-art results after training and
manual evaluation. The CO-STAR models are, however, limited in their ability to
understand more complex and subtly worded stereotypes, and our research
motivates future work in developing models with more sophisticated methods for
encoding common-sense knowledge.
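For concreteness, here is a minimal sketch of how one CO-STAR-style structured annotation might be represented and serialised for language-model fine-tuning; the field names and toy values are illustrative assumptions, not the paper's published schema.

```python
# Hypothetical CO-STAR-style annotation record; the schema and example
# values are assumptions for illustration, not the paper's exact format.
from dataclasses import dataclass

@dataclass
class CoStarAnnotation:
    post: str               # input text carrying an implied stereotype
    stereotype: str         # the implied stereotype, stated explicitly
    conceptualisation: str  # the underlying concept the stereotype encodes

    def to_training_example(self) -> str:
        """Serialise the annotation as one sequence for LM fine-tuning."""
        return (f"Post: {self.post} | "
                f"Stereotype: {self.stereotype} | "
                f"Conceptualisation: {self.conceptualisation}")

# Toy example, invented purely for illustration
example = CoStarAnnotation(
    post="Why do they even bother applying?",
    stereotype="members of the targeted group are assumed to be unqualified",
    conceptualisation="presumed incompetence of a social group",
)
print(example.to_training_example())
```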
Related papers
- Incorporating Human Explanations for Robust Hate Speech Detection [17.354241456219945]
We develop a three-stage analysis to evaluate whether LMs faithfully assess hate speech.
First, we observe the need for modeling contextually grounded stereotype intents to capture implicit semantic meaning.
Next, we design a new task, Stereotype Intent Entailment (SIE), which encourages a model to contextually understand stereotype presence.
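As a rough illustration of an entailment-style setup like SIE (not the authors' model), one could score a stereotype-intent hypothesis against a post with an off-the-shelf NLI model; the model choice and example texts below are assumptions.

```python
# Sketch: scoring a stereotype-intent hypothesis with a generic NLI model.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-large-mnli")
model = AutoModelForSequenceClassification.from_pretrained("roberta-large-mnli")

premise = "We need more people like him in tech."  # hypothetical post
hypothesis = "The post implies a stereotype about who belongs in tech."

inputs = tokenizer(premise, hypothesis, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
probs = logits.softmax(dim=-1).squeeze()

# roberta-large-mnli labels: CONTRADICTION, NEUTRAL, ENTAILMENT
for idx, label in model.config.id2label.items():
    print(f"{label}: {probs[idx]:.3f}")
```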
arXiv Detail & Related papers (2024-11-09T15:29:04Z)
- Information Theoretic Text-to-Image Alignment [49.396917351264655]
We present a novel method that relies on an information-theoretic alignment measure to steer image generation.
Our method is on par with or superior to the state of the art, yet requires nothing but a pre-trained denoising network to estimate mutual information (MI).
arXiv Detail & Related papers (2024-05-31T12:20:02Z)
- Stereotype Detection in LLMs: A Multiclass, Explainable, and Benchmark-Driven Approach [4.908389661988191]
This paper introduces the Multi-Grain Stereotype (MGS) dataset, consisting of 51,867 instances across gender, race, profession, religion, and other stereotypes.
We evaluate various machine learning approaches to establish baselines and fine-tune language models of different architectures and sizes.
We employ explainable AI (XAI) tools, including SHAP, LIME, and BertViz, to assess whether the model's learned patterns align with human intuitions about stereotypes.
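A minimal sketch of that XAI step, using a generic sentiment model as a stand-in for the paper's fine-tuned stereotype classifiers: SHAP attributes a transformer classifier's prediction to individual input tokens.

```python
# Token-level attributions for a text classifier with SHAP. The sentiment
# model is a stand-in assumption, not the paper's stereotype classifier.
import shap
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",
    top_k=None,  # return scores for every class, as SHAP expects
)

explainer = shap.Explainer(classifier)
shap_values = explainer(["The new colleague is surprisingly articulate."])

# Per-token contributions towards each class score
print(shap_values)
```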
arXiv Detail & Related papers (2024-04-02T09:31:32Z)
- Interpreting Pretrained Language Models via Concept Bottlenecks [55.47515772358389]
Pretrained language models (PLMs) have made significant strides in various natural language processing tasks.
The lack of interpretability due to their "black-box" nature poses challenges for responsible implementation.
We propose a novel approach to interpreting PLMs by employing high-level, meaningful concepts that are easily understandable for humans.
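The generic concept-bottleneck pattern routes the prediction through an interpretable concept layer; the PyTorch sketch below uses assumed layer shapes and is not the paper's exact architecture.

```python
# Generic concept-bottleneck head; shapes and design are assumptions.
import torch
import torch.nn as nn

class ConceptBottleneckHead(nn.Module):
    """Map a PLM's pooled representation to named concept scores,
    then predict the task label from those concepts alone."""

    def __init__(self, hidden_dim: int, n_concepts: int, n_labels: int):
        super().__init__()
        self.to_concepts = nn.Linear(hidden_dim, n_concepts)  # one score per concept
        self.to_labels = nn.Linear(n_concepts, n_labels)      # label sees only concepts

    def forward(self, pooled: torch.Tensor):
        concepts = torch.sigmoid(self.to_concepts(pooled))  # interpretable 0-1 activations
        return concepts, self.to_labels(concepts)

head = ConceptBottleneckHead(hidden_dim=768, n_concepts=8, n_labels=2)
concepts, label_logits = head(torch.randn(1, 768))
print(concepts.shape, label_logits.shape)  # (1, 8), (1, 2)
```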
arXiv Detail & Related papers (2023-11-08T20:41:18Z)
- Language Agents for Detecting Implicit Stereotypes in Text-to-image Models at Scale [45.64096601242646]
We introduce a novel agent architecture tailored for stereotype detection in text-to-image models.
We build the stereotype-relevant benchmark based on multiple open-text datasets.
We find that these models often display serious stereotypes when it comes to certain prompts about personal characteristics.
arXiv Detail & Related papers (2023-10-18T08:16:29Z)
- Foundational Models Defining a New Era in Vision: A Survey and Outlook [151.49434496615427]
Vision systems that see and reason about the compositional nature of visual scenes are fundamental to understanding our world.
The models learned to bridge the gap between such modalities coupled with large-scale training data facilitate contextual reasoning, generalization, and prompt capabilities at test time.
The output of such models can be modified through human-provided prompts without retraining, e.g., segmenting a particular object by providing a bounding box, having interactive dialogues by asking questions about an image or video scene, or manipulating the robot's behavior through language instructions.
arXiv Detail & Related papers (2023-07-25T17:59:18Z)
- Counteracts: Testing Stereotypical Representation in Pre-trained Language Models [4.211128681972148]
We use counterexamples to examine the internal stereotypical knowledge in pre-trained language models (PLMs).
We evaluate 7 PLMs on 9 types of cloze-style prompts with different information and base knowledge.
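A minimal sketch of cloze-style probing with a masked LM; the template below is an illustrative assumption, not one of the paper's nine prompt types.

```python
# Cloze-style probing: inspect what a masked LM fills into a template.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

template = "the new engineer said that [MASK] would finish the project."
for prediction in fill_mask(template, top_k=5):
    print(f"{prediction['token_str']:>10}  {prediction['score']:.3f}")
```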
arXiv Detail & Related papers (2023-01-11T07:52:59Z)
- Easily Accessible Text-to-Image Generation Amplifies Demographic Stereotypes at Large Scale [61.555788332182395]
We investigate the potential for machine learning models to amplify dangerous and complex stereotypes.
We find a broad range of ordinary prompts produce stereotypes, including prompts simply mentioning traits, descriptors, occupations, or objects.
arXiv Detail & Related papers (2022-11-07T18:31:07Z)
- Understanding and Countering Stereotypes: A Computational Approach to the Stereotype Content Model [4.916009028580767]
We present a computational approach to interpreting stereotypes in text through the Stereotype Content Model (SCM).
The SCM proposes that stereotypes can be understood along two primary dimensions: warmth and competence.
It is known that countering stereotypes with anti-stereotypical examples is one of the most effective ways to reduce biased thinking.
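One crude way to operationalise the two SCM dimensions, sketched under our own assumptions with hand-picked seed words (not the paper's method), is to measure a text's embedding similarity to warmth and competence seed terms.

```python
# Crude SCM-style scoring via embedding similarity to seed words; the seed
# lists and model choice are assumptions, not the paper's method.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

WARMTH_SEEDS = ["friendly", "warm", "trustworthy", "sincere", "kind"]
COMPETENCE_SEEDS = ["competent", "intelligent", "capable", "skilled", "efficient"]

def dimension_score(text: str, seeds: list[str]) -> float:
    """Mean cosine similarity between the text and a dimension's seed words."""
    text_emb = model.encode(text, convert_to_tensor=True)
    seed_embs = model.encode(seeds, convert_to_tensor=True)
    return util.cos_sim(text_emb, seed_embs).mean().item()

text = "They are cold but ruthlessly efficient."
print("warmth:    ", round(dimension_score(text, WARMTH_SEEDS), 3))
print("competence:", round(dimension_score(text, COMPETENCE_SEEDS), 3))
```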
arXiv Detail & Related papers (2021-06-04T16:53:37Z)
- A Minimalist Dataset for Systematic Generalization of Perception, Syntax, and Semantics [131.93113552146195]
We present a new dataset, Handwritten arithmetic with INTegers (HINT), to examine machines' capability of learning generalizable concepts.
In HINT, machines are tasked with learning how concepts are perceived from raw signals such as images.
We undertake extensive experiments with various sequence-to-sequence models, including RNNs, Transformers, and GPT-3.
arXiv Detail & Related papers (2021-03-02T01:32:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.