Misclassification in Automated Content Analysis Causes Bias in
Regression. Can We Fix It? Yes We Can!
- URL: http://arxiv.org/abs/2307.06483v2
- Date: Sun, 10 Dec 2023 21:21:21 GMT
- Title: Misclassification in Automated Content Analysis Causes Bias in
Regression. Can We Fix It? Yes We Can!
- Authors: Nathan TeBlunthuis, Valerie Hase, Chung-Hong Chan
- Abstract summary: We show in a systematic literature review that communication scholars largely ignore misclassification bias.
Existing statistical methods can use "gold standard" validation data, such as that created by human annotators, to correct misclassification bias.
We introduce and test such methods, including a new method we design and implement in the R package misclassificationmodels.
- Score: 0.30693357740321775
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Automated classifiers (ACs), often built via supervised machine learning
(SML), can categorize large, statistically powerful samples of data ranging
from text to images and video, and have become widely popular measurement
devices in communication science and related fields. Despite this popularity,
even highly accurate classifiers make errors that cause misclassification bias
and misleading results in downstream analyses, unless such analyses account for
these errors. As we show in a systematic literature review of SML applications,
communication scholars largely ignore misclassification bias. In principle,
existing statistical methods can use "gold standard" validation data, such as
that created by human annotators, to correct misclassification bias and produce
consistent estimates. We introduce and test such methods, including a new
method we design and implement in the R package misclassificationmodels, via
Monte Carlo simulations designed to reveal each method's limitations, which we
also release. Based on our results, we recommend our new error correction
method as it is versatile and efficient. In sum, automated classifiers, even
those below common accuracy standards or making systematic misclassifications,
can be useful for measurement with careful study design and appropriate error
correction methods.
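The abstract describes the mechanics in prose only, so below is a minimal sketch of the core idea: simulate a regression on a classifier-coded binary variable, observe the attenuated "naive" slope, and then correct it using a small human-annotated ("gold standard") validation subsample via a textbook regression-calibration step. This is not the authors' misclassificationmodels package or their new method; the variable names (w, what, w_cal), the sensitivity/specificity values, the 40% prevalence, and the 10% validation share are all illustrative assumptions.
```r
## Minimal illustrative sketch (NOT the misclassificationmodels package):
## how nondifferential classifier error biases a regression slope, and how a
## gold-standard validation subsample can correct it via regression calibration.
## All names and parameter values below are assumptions chosen for illustration.

set.seed(42)

n        <- 10000   # documents in the full sample
beta0    <- 1.0     # true intercept
beta1    <- 2.0     # true effect of the latent content category on the outcome
sens     <- 0.85    # classifier sensitivity, P(classified 1 | truly 1)
spec     <- 0.90    # classifier specificity, P(classified 0 | truly 0)
val_frac <- 0.10    # share of documents hand-coded as the gold standard

## Latent "true" content category and a continuous downstream outcome
w <- rbinom(n, 1, 0.4)
y <- beta0 + beta1 * w + rnorm(n)

## Automated classifier output; errors depend only on w, not on y
## (nondifferential misclassification)
what <- ifelse(w == 1, rbinom(n, 1, sens), rbinom(n, 1, 1 - spec))

## Naive analysis: treat the classifier output as if it were error-free
naive <- lm(y ~ what)

## Gold-standard validation data: a random human-annotated subsample
val <- sample.int(n, size = round(val_frac * n))

## Estimate P(truly 1 | classified 1) and P(truly 1 | classified 0)
p1 <- mean(w[val][what[val] == 1])
p0 <- mean(w[val][what[val] == 0])

## Regression calibration: replace the noisy code with its calibrated expectation
w_cal     <- ifelse(what == 1, p1, p0)
corrected <- lm(y ~ w_cal)

cat("true slope:      ", beta1, "\n")
cat("naive slope:     ", round(coef(naive)["what"], 3), "\n")
cat("corrected slope: ", round(coef(corrected)["w_cal"], 3), "\n")
```
With these illustrative settings the naive slope is shrunk toward zero, while the corrected slope recovers the true value up to sampling noise. More general settings (misclassified outcomes, additional covariates, systematic errors) call for dedicated error-correction methods such as those the abstract describes.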
Related papers
- A Systematic Review of Machine Learning Approaches for Detecting Deceptive Activities on Social Media: Methods, Challenges, and Biases [0.037693031068634524]
This systematic review evaluates studies that apply machine learning (ML) and deep learning (DL) models to detect fake news, spam, and fake accounts on social media.
arXiv Detail & Related papers (2024-10-26T23:55:50Z)
- Subtle Errors Matter: Preference Learning via Error-injected Self-editing [59.405145971637204]
We propose a novel preference learning framework called eRror-Injected Self-Editing (RISE)
RISE injects predefined subtle errors into partial tokens of correct solutions to construct hard pairs for error mitigation.
Experiments validate the effectiveness of RISE, with preference learning on Qwen2-7B-Instruct yielding notable improvements of 3.0% on GSM8K and 7.9% on MATH.
arXiv Detail & Related papers (2024-10-09T07:43:38Z)
- Understanding and Mitigating Classification Errors Through Interpretable Token Patterns [58.91023283103762]
Characterizing errors in easily interpretable terms gives insight into whether a classifier is prone to making systematic errors.
We propose to discover those patterns of tokens that distinguish correct and erroneous predictions.
We show that our method, Premise, performs well in practice.
arXiv Detail & Related papers (2023-11-18T00:24:26Z)
- Probabilistic Safety Regions Via Finite Families of Scalable Classifiers [2.431537995108158]
Supervised classification recognizes patterns in the data to separate classes of behaviours.
Canonical solutions contain misclassification errors that are intrinsic to the approximate, numerical nature of machine learning.
We introduce the concept of probabilistic safety region to describe a subset of the input space in which the number of misclassified instances is probabilistically controlled.
arXiv Detail & Related papers (2023-09-08T22:40:19Z)
- Class-wise and reduced calibration methods [0.0]
First, we show how a reduced calibration method transforms the original problem into a simpler one.
Second, we propose class-wise calibration methods that build on a phenomenon called neural collapse.
Applying the two methods together results in class-wise reduced calibration algorithms, which are powerful tools for reducing the prediction and per-class calibration errors.
arXiv Detail & Related papers (2022-10-07T17:13:17Z)
- Understanding Factual Errors in Summarization: Errors, Summarizers, Datasets, Error Detectors [105.12462629663757]
In this work, we aggregate factuality error annotations from nine existing datasets and stratify them according to the underlying summarization model.
We compare performance of state-of-the-art factuality metrics, including recent ChatGPT-based metrics, on this stratified benchmark and show that their performance varies significantly across different types of summarization models.
arXiv Detail & Related papers (2022-05-25T15:26:48Z)
- Regularized Classification-Aware Quantization [39.04839665081476]
We present a class of algorithms that learn distributed quantization schemes for binary classification tasks.
Our method is called Regularized Classification-Aware Quantization.
arXiv Detail & Related papers (2021-07-12T21:27:48Z)
- Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning [78.83598532168256]
Marginal-likelihood based model-selection is rarely used in deep learning due to estimation difficulties.
Our work shows that marginal likelihoods can improve generalization and be useful when validation data is unavailable.
arXiv Detail & Related papers (2021-04-11T09:50:24Z)
- Defuse: Harnessing Unrestricted Adversarial Examples for Debugging Models Beyond Test Accuracy [11.265020351747916]
Defuse is a method to automatically discover and correct model errors beyond those available in test data.
We propose an algorithm inspired by adversarial machine learning techniques that uses a generative model to find naturally occurring instances misclassified by a model.
Defuse corrects the error after fine-tuning while maintaining generalization on the test set.
arXiv Detail & Related papers (2021-02-11T18:08:42Z)
- Understanding Classifier Mistakes with Generative Models [88.20470690631372]
Deep neural networks are effective on supervised learning tasks, but have been shown to be brittle.
In this paper, we leverage generative models to identify and characterize instances where classifiers fail to generalize.
Our approach is agnostic to class labels from the training set which makes it applicable to models trained in a semi-supervised way.
arXiv Detail & Related papers (2020-10-05T22:13:21Z)
- Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.