Contrastive Clustering: Toward Unsupervised Bias Reduction for Emotion and Sentiment Classification
- URL: http://arxiv.org/abs/2111.07448v1
- Date: Sun, 14 Nov 2021 20:58:04 GMT
- Title: Contrastive Clustering: Toward Unsupervised Bias Reduction for Emotion and Sentiment Classification
- Authors: Jared Mowery
- Abstract summary: This study assesses the impact of bias on COVID-19 topics.
It demonstrates an automatic algorithm for reducing bias when applied to COVID-19 social media texts.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Background: When neural network emotion and sentiment classifiers are used in
public health informatics studies, biases present in the classifiers could
produce inadvertently misleading results.
Objective: This study assesses the impact of bias on COVID-19 topics, and
demonstrates an automatic algorithm for reducing bias when applied to COVID-19
social media texts. This could help public health informatics studies produce
more timely results during crises, with a reduced risk of misleading results.
Methods: Emotion and sentiment classifiers were applied to COVID-19 data
before and after debiasing the classifiers using unsupervised contrastive
clustering. Contrastive clustering approximates the degree to which tokens
exhibit a causal versus correlational relationship with emotion or sentiment,
by contrasting the tokens' relative salience to topics versus emotions or
sentiments.
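The salience-contrast idea can be sketched as follows. This is an illustrative approximation, not the paper's implementation: the per-token salience scores, the ratio test, and the threshold are all hypothetical stand-ins (in practice such scores might come from attention weights or gradient attribution).

```python
# Sketch: tokens whose salience toward topics dominates their salience
# toward the emotion label are flagged as correlational (bias prone)
# rather than causal. All scores and the threshold are hypothetical.

def classify_tokens(topic_salience, emotion_salience, threshold=1.0):
    """Label each token 'correlational' or 'causal' by contrasting
    its relative salience to topics versus emotions."""
    labels = {}
    for token, topic_score in topic_salience.items():
        ratio = topic_score / max(emotion_salience.get(token, 0.0), 1e-9)
        labels[token] = "correlational" if ratio > threshold else "causal"
    return labels

# Hypothetical attribution scores for three tokens:
topic_scores = {"lockdown": 0.8, "furious": 0.2, "vaccine": 0.9}
emotion_scores = {"lockdown": 0.3, "furious": 0.95, "vaccine": 0.1}
print(classify_tokens(topic_scores, emotion_scores))
```

Here "lockdown" and "vaccine" are topic-salient and would be flagged as correlational, while "furious" is emotion-salient and kept as causal.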
Results: Contrastive clustering distinguishes correlation from causation for
tokens with an F1 score of 0.753. Masking bias-prone tokens from the classifier
input decreases the classifier's overall F1 score by 0.02 (anger) and 0.033
(negative sentiment), but improves the F1 score for sentences annotated as
bias-prone by 0.155 (anger) and 0.103 (negative sentiment). Averaging across topics,
debiasing reduces anger estimates by 14.4% and negative sentiment estimates by
8.0%.
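The masking step itself is simple to sketch. A minimal illustration, assuming the bias-prone token set has already been produced by contrastive clustering; the token list and mask symbol are invented for the example:

```python
# Sketch of the masking step: bias-prone tokens are replaced with a
# mask symbol before the sentence reaches the emotion/sentiment
# classifier. Token set and mask symbol are illustrative.

def mask_bias_prone(tokens, bias_prone, mask_token="[MASK]"):
    """Hide bias-prone tokens from the classifier input."""
    return [mask_token if tok.lower() in bias_prone else tok for tok in tokens]

sentence = "The lockdown made everyone furious".split()
print(mask_bias_prone(sentence, {"lockdown", "vaccine"}))
# -> ['The', '[MASK]', 'made', 'everyone', 'furious']
```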
Conclusions: Contrastive clustering reduces algorithmic bias in emotion and
sentiment classification for social media text pertaining to the COVID-19
pandemic. Public health informatics studies should account for bias, due to its
prevalence across a range of topics. Further research is needed to improve bias
reduction techniques and to explore the adverse impact of bias on public health
informatics analyses.
Related papers
- Large-scale digital phenotyping: identifying depression and anxiety indicators in a general UK population with over 10,000 participants [2.2909783327197393]
We conducted a cross-sectional analysis of data from 10,129 participants recruited from a UK-based general population.
Participants shared wearable (Fitbit) data and self-reported questionnaires on depression (PHQ-8), anxiety (GAD-7), and mood via a study app.
We observed significant associations between the severity of depression and anxiety with several factors, including mood, age, gender, BMI, sleep patterns, physical activity, and heart rate.
arXiv Detail & Related papers (2024-09-24T16:05:17Z)
- Contrastive Learning with Negative Sampling Correction [52.990001829393506]
We propose a novel contrastive learning method named Positive-Unlabeled Contrastive Learning (PUCL)
PUCL treats the generated negative samples as unlabeled samples and uses information from positive samples to correct bias in contrastive loss.
PUCL can be applied to general contrastive learning problems and outperforms state-of-the-art methods on various image and graph classification tasks.
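PUCL's exact loss is defined in the paper; the generic positive-unlabeled correction such methods build on can be sketched as below. This is the standard non-negative PU risk idea (subtract the positive contribution expected among unlabeled samples, clipped at zero), not PUCL's specific formulation; all inputs are illustrative.

```python
# Generic non-negative PU risk sketch: the negative-class risk measured
# on unlabeled samples is corrected by subtracting the contribution
# expected from positives hiding among them, clipped at zero.

def nn_pu_risk(loss_pos_as_pos, loss_pos_as_neg, loss_unl_as_neg, prior):
    """prior = assumed fraction of positives in the unlabeled set."""
    mean = lambda xs: sum(xs) / len(xs)
    positive_risk = prior * mean(loss_pos_as_pos)
    corrected_negative_risk = mean(loss_unl_as_neg) - prior * mean(loss_pos_as_neg)
    return positive_risk + max(0.0, corrected_negative_risk)

print(nn_pu_risk([0.2, 0.4], [0.4, 0.4], [0.6, 0.8], prior=0.5))
```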
arXiv Detail & Related papers (2024-01-13T11:18:18Z)
- Signal Is Harder To Learn Than Bias: Debiasing with Focal Loss [10.031357641396616]
Neural networks are notorious for learning unwanted associations, also known as biases, instead of the underlying decision rule.
We propose Signal is Harder, a variational-autoencoder-based method that simultaneously trains a biased and unbiased classifier.
We propose a perturbation scheme in the latent space for visualizing the bias that helps practitioners become aware of the sources of spurious correlations.
arXiv Detail & Related papers (2023-05-31T09:09:59Z)
- Race Bias Analysis of Bona Fide Errors in face anti-spoofing [0.0]
We present a systematic study of race bias in face anti-spoofing with three key characteristics.
The focus is on analysing potential bias in the bona fide errors, where significant ethical and legal issues lie.
We demonstrate the proposed bias analysis process on a VQ-VAE based face anti-spoofing algorithm.
arXiv Detail & Related papers (2022-10-11T11:49:24Z)
- D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies human-in-the-loop AI approach for auditing and mitigating social biases.
A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network.
For each interaction, say weakening/deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset.
arXiv Detail & Related papers (2022-08-10T03:41:48Z)
- Toward Understanding Bias Correlations for Mitigation in NLP [34.956581421295]
This work aims to provide a first systematic study toward understanding bias correlations in mitigation.
We examine bias mitigation in two common NLP tasks -- toxicity detection and word embeddings.
Our findings suggest that biases are correlated and present scenarios in which independent debiasing approaches may be insufficient.
arXiv Detail & Related papers (2022-05-24T22:48:47Z)
- Neural Contrastive Clustering: Fully Unsupervised Bias Reduction for Sentiment Classification [0.0]
Correlation bias in sentiment classification often arises in conversations about controversial topics.
This study uses adversarial learning to contrast clusters based on sentiment classification labels, with clusters produced by unsupervised topic modeling.
This discourages the neural network from learning topic-related features that produce biased classification results.
arXiv Detail & Related papers (2022-04-22T02:34:41Z)
- Balancing out Bias: Achieving Fairness Through Training Reweighting [58.201275105195485]
Bias in natural language processing arises from models learning characteristics of the author such as gender and race.
Existing methods for mitigating and measuring bias do not directly account for correlations between author demographics and linguistic variables.
This paper introduces a very simple but highly effective method for countering bias using instance reweighting.
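One common way to realize instance reweighting is to weight each example inversely to the frequency of its (demographic, label) pair, so every combination contributes equally during training. The sketch below illustrates that general idea with invented data; the paper's actual weighting scheme may differ in detail.

```python
# Illustrative instance-reweighting sketch: rare (demographic, label)
# pairs get proportionally larger weights, balancing their influence.
from collections import Counter

def balancing_weights(demographics, labels):
    pair_counts = Counter(zip(demographics, labels))
    total, n_pairs = len(labels), len(pair_counts)
    return [total / (n_pairs * pair_counts[(d, y)])
            for d, y in zip(demographics, labels)]

# Group "a" dominates label 1; its instances are down-weighted.
weights = balancing_weights(["a", "a", "a", "b"], [1, 1, 0, 0])
print(weights)
```

The weights sum to the dataset size, so the overall loss scale is preserved while rare combinations are amplified.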
arXiv Detail & Related papers (2021-09-16T23:40:28Z)
- Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
- Correct block-design experiments mitigate temporal correlation bias in EEG classification [68.85562949901077]
We show that the main claim in [1] is drastically overstated and that its other analyses are seriously flawed by incorrect methodological choices.
We investigate the influence of EEG temporal correlation on classification accuracy by testing the same models in two additional experimental settings.
arXiv Detail & Related papers (2020-11-25T22:25:21Z)
- Towards Controllable Biases in Language Generation [87.89632038677912]
We develop a method to induce societal biases in generated text when input prompts contain mentions of specific demographic groups.
We analyze two scenarios: 1) inducing negative biases for one demographic and positive biases for another demographic, and 2) equalizing biases between demographics.
arXiv Detail & Related papers (2020-05-01T08:25:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.