AUTALIC: A Dataset for Anti-AUTistic Ableist Language In Context
- URL: http://arxiv.org/abs/2410.16520v1
- Date: Mon, 21 Oct 2024 21:21:29 GMT
- Title: AUTALIC: A Dataset for Anti-AUTistic Ableist Language In Context
- Authors: Naba Rizvi, Harper Strickland, Daniel Gitelman, Tristan Cooper, Alexis Morales-Flores, Michael Golden, Aekta Kallepalli, Akshat Alurkar, Haaset Owens, Saleha Ahmedi, Isha Khirwadkar, Imani Munyaka, Nedjma Ousidhoum
- Abstract summary: AUTALIC is the first benchmark dataset dedicated to the detection of anti-autistic ableist language in context.
The dataset comprises 2,400 autism-related sentences collected from Reddit, accompanied by surrounding context, and is annotated by trained experts with backgrounds in neurodiversity.
- Score: 1.3334268990558924
- Abstract: As our understanding of autism and ableism continues to increase, so does our understanding of ableist language towards autistic people. Such language poses a significant challenge in NLP research due to its subtle and context-dependent nature. Yet, detecting anti-autistic ableist language remains underexplored, with existing NLP tools often failing to capture its nuanced expressions. We present AUTALIC, the first benchmark dataset dedicated to the detection of anti-autistic ableist language in context, addressing a significant gap in the field. The dataset comprises 2,400 autism-related sentences collected from Reddit, accompanied by surrounding context, and is annotated by trained experts with backgrounds in neurodiversity. Our comprehensive evaluation reveals that current language models, including state-of-the-art LLMs, struggle to reliably identify anti-autistic ableism and align with human judgments, underscoring their limitations in this domain. We publicly release AUTALIC along with the individual annotations, which serve as a valuable resource for researchers working on ableism, neurodiversity, and disagreement in annotation tasks. This dataset serves as a crucial step towards developing more inclusive and context-aware NLP systems that better reflect diverse perspectives.
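Since the abstract centers on evaluating language models against context-dependent binary judgments, a minimal sketch of such an evaluation loop follows. This is not the authors' released code: the record format, the label phrasing, and the choice of an off-the-shelf NLI zero-shot model are assumptions made for illustration only.

```python
# Minimal sketch (not the authors' code) of probing a zero-shot classifier on
# AUTALIC-style items, where each target sentence is judged together with its
# surrounding context. Record format, labels, and checkpoint are assumptions.
from transformers import pipeline
from sklearn.metrics import f1_score

# Hypothetical items: surrounding context, target sentence, binary gold label
# (1 = anti-autistic ableist, 0 = not ableist).
items = [
    {"context": "placeholder surrounding context", "sentence": "placeholder target sentence", "label": 0},
    {"context": "placeholder surrounding context", "sentence": "placeholder target sentence", "label": 1},
]

# NLI-based zero-shot classification; any comparable checkpoint could be swapped in.
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
candidate_labels = ["anti-autistic ableist language", "not ableist language"]

predictions = []
for item in items:
    # Present the context followed by the target sentence, mirroring the
    # dataset's emphasis on context-dependent judgments.
    text = f"{item['context']} {item['sentence']}"
    result = classifier(text, candidate_labels=candidate_labels)
    predictions.append(1 if result["labels"][0] == candidate_labels[0] else 0)

gold = [item["label"] for item in items]
print("F1 against gold labels:", f1_score(gold, predictions))
```

Substituting the released AUTALIC examples for the placeholder items and comparing model outputs against the individual expert annotations would reproduce the kind of human-alignment analysis the abstract describes.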
Related papers
- Negation Blindness in Large Language Models: Unveiling the NO Syndrome in Image Generation [63.064204206220936]
Foundational Large Language Models (LLMs) have changed the way we perceive technology.
They have been shown to excel in tasks ranging from poem writing to coding to essay generation and puzzle solving.
With the incorporation of image generation capability, they have become more comprehensive and versatile AI tools.
Currently identified flaws include hallucination, biases, and bypassing restricted commands to generate harmful content.
arXiv Detail & Related papers (2024-08-27T14:40:16Z)
- Exploiting ChatGPT for Diagnosing Autism-Associated Language Disorders and Identifying Distinct Features [13.006406004068117]
This study explored the application of ChatGPT, a state-of-the-art large language model, to enhance diagnostic accuracy and to profile specific linguistic features indicative of autism.
We showed that ChatGPT substantially outperformed conventional supervised learning models, achieving over 13% improvement in both accuracy and F1 score in a zero-shot learning configuration.
Our findings advocate for adopting sophisticated AI tools like ChatGPT in clinical settings to assess and diagnose developmental disorders.
arXiv Detail & Related papers (2024-05-03T01:04:28Z)
- A Survey on Lexical Ambiguity Detection and Word Sense Disambiguation [0.0]
This paper explores techniques that focus on understanding and resolving ambiguity in language within the field of natural language processing (NLP).
It outlines diverse approaches ranging from deep learning techniques to leveraging lexical resources and knowledge graphs like WordNet.
The research identifies persistent challenges in the field, such as the scarcity of sense-annotated corpora and the complexity of informal clinical texts.
arXiv Detail & Related papers (2024-03-24T12:58:48Z)
- Hear Me, See Me, Understand Me: Audio-Visual Autism Behavior Recognition [47.550391816383794]
We introduce a novel problem of audio-visual autism behavior recognition.
Social behavior recognition is an essential aspect previously omitted in AI-assisted autism screening research.
We will release our dataset, code, and pre-trained models.
arXiv Detail & Related papers (2024-03-22T22:52:35Z)
- Reliability Analysis of Psychological Concept Extraction and Classification in User-penned Text [9.26840677406494]
We use the LoST dataset to capture nuanced textual cues that suggest the presence of low self-esteem in the posts of Reddit users.
Our findings suggest the need to shift the focus of PLMs from Trigger and Consequences to a more comprehensive explanation.
arXiv Detail & Related papers (2024-01-12T17:19:14Z)
- We're Afraid Language Models Aren't Modeling Ambiguity [136.8068419824318]
Managing ambiguity is a key part of human language understanding.
We characterize ambiguity in a sentence by its effect on entailment relations with another sentence.
We show that a multilabel NLI model can flag political claims in the wild that are misleading due to ambiguity.
arXiv Detail & Related papers (2023-04-27T17:57:58Z)
- Information-Restricted Neural Language Models Reveal Different Brain Regions' Sensitivity to Semantics, Syntax and Context [87.31930367845125]
We trained a lexical language model, GloVe, and a supra-lexical language model, GPT-2, on a text corpus.
We then assessed to what extent these information-restricted models were able to predict the time-courses of fMRI signal of humans listening to naturalistic text.
Our analyses show that, while most brain regions involved in language are sensitive to both syntactic and semantic variables, the relative magnitudes of these effects vary considerably across these regions.
arXiv Detail & Related papers (2023-02-28T08:16:18Z)
- Unpacking the Interdependent Systems of Discrimination: Ableist Bias in NLP Systems through an Intersectional Lens [20.35460711907179]
We report on various analyses based on word predictions of a large-scale BERT language model.
Statistically significant results demonstrate that people with disabilities can be disadvantaged by the model's predictions.
Findings also explore overlapping forms of discrimination related to interconnected gender and race identities.
arXiv Detail & Related papers (2021-10-01T16:40:58Z)
- CogAlign: Learning to Align Textual Neural Representations to Cognitive Language Processing Signals [60.921888445317705]
We propose a CogAlign approach to integrate cognitive language processing signals into natural language processing models.
We show that CogAlign achieves significant improvements with multiple cognitive features over state-of-the-art models on public datasets.
arXiv Detail & Related papers (2021-06-10T07:10:25Z)
- AM2iCo: Evaluating Word Meaning in Context across Low-Resource Languages with Adversarial Examples [51.048234591165155]
We present AM2iCo, Adversarial and Multilingual Meaning in Context.
It aims to faithfully assess the ability of state-of-the-art (SotA) representation models to understand the identity of word meaning in cross-lingual contexts.
Results reveal that current SotA pretrained encoders substantially lag behind human performance.
arXiv Detail & Related papers (2021-04-17T20:23:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.