Modeling Subjective Assessments of Guilt in Newspaper Crime Narratives
- URL: http://arxiv.org/abs/2006.09589v2
- Date: Wed, 14 Oct 2020 22:38:44 GMT
- Title: Modeling Subjective Assessments of Guilt in Newspaper Crime Narratives
- Authors: Elisa Kreiss, Zijian Wang, Christopher Potts
- Abstract summary: SuspectGuilt is a corpus of annotated crime stories from English-language newspapers in the U.S.
For SuspectGuilt, annotators read short crime articles and provided text-level ratings concerning the guilt of the main suspect.
We use SuspectGuilt to train and assess predictive models, and show that these models benefit from genre pretraining and joint supervision.
- Score: 9.589175887215585
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Crime reporting is a prevalent form of journalism with the power to shape
public perceptions and social policies. How does the language of these reports
act on readers? We seek to address this question with the SuspectGuilt Corpus
of annotated crime stories from English-language newspapers in the U.S. For
SuspectGuilt, annotators read short crime articles and provided text-level
ratings concerning the guilt of the main suspect as well as span-level
annotations indicating which parts of the story they felt most influenced their
ratings. SuspectGuilt thus provides a rich picture of how linguistic choices
affect subjective guilt judgments. In addition, we use SuspectGuilt to train
and assess predictive models, and show that these models benefit from genre
pretraining and joint supervision from the text-level ratings and span-level
annotations. Such models might be used as tools for understanding the societal
effects of crime reporting.
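The joint supervision described above can be sketched as a weighted combination of a text-level rating objective and a span-level annotation objective. The squared-error loss forms and the `alpha` mixing weight below are illustrative assumptions, not the paper's exact formulation:

```python
# Sketch of joint supervision: combine a text-level guilt-rating loss with a
# span-level relevance loss. The MSE forms and the alpha weighting are
# illustrative assumptions, not the paper's exact setup.

def mse(preds, targets):
    """Mean squared error over paired predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)

def joint_loss(rating_pred, rating_gold, span_preds, span_golds, alpha=0.5):
    """Weighted sum of the text-level and span-level objectives."""
    rating_loss = (rating_pred - rating_gold) ** 2   # one guilt rating per story
    span_loss = mse(span_preds, span_golds)          # per-token relevance scores
    return alpha * rating_loss + (1 - alpha) * span_loss

# Example: a story rated 0.8 for guilt, with three tokens of which annotators
# highlighted the first two as influential.
loss = joint_loss(0.7, 0.8, [0.9, 0.8, 0.2], [1.0, 1.0, 0.0], alpha=0.5)
```

Training on both signals at once lets the span annotations regularize the rating predictor, which is one plausible reading of why joint supervision helps.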
Related papers
- "Kelly is a Warm Person, Joseph is a Role Model": Gender Biases in LLM-Generated Reference Letters [97.11173801187816]
Large Language Models (LLMs) have recently emerged as an effective tool to assist individuals in writing various types of content.
This paper critically examines gender biases in LLM-generated reference letters.
arXiv Detail & Related papers (2023-10-13T16:12:57Z)
- Garbage in, garbage out: Zero-shot detection of crime using Large Language Models [1.113911383207731]
We show that when video is (manually) converted to high quality textual descriptions, large language models are capable of detecting and classifying crimes.
Existing automated video-to-text approaches are unable to generate video descriptions of sufficient quality to support reasoning.
arXiv Detail & Related papers (2023-07-04T01:29:15Z)
- Using Natural Language Explanations to Rescale Human Judgments [81.66697572357477]
We propose a method to rescale ordinal annotations and explanations using large language models (LLMs).
We feed annotators' Likert ratings and corresponding explanations into an LLM and prompt it to produce a numeric score anchored in a scoring rubric.
Our method rescales the raw judgments without impacting agreement and brings the scores closer to human judgments grounded in the same scoring rubric.
arXiv Detail & Related papers (2023-05-24T06:19:14Z)
- If it Bleeds, it Leads: A Computational Approach to Covering Crime in Los Angeles [79.4098551457605]
We present a machine-in-the-loop system that covers individual crimes by learning the structure of prototypical coverage archetypes from classic news articles on crime.
We hope our work can lead to systems that use these components together to form the skeletons of news articles covering crime.
arXiv Detail & Related papers (2022-06-14T19:06:13Z)
- American Hate Crime Trends Prediction with Event Extraction [0.0]
The FBI's Uniform Crime Reporting (UCR) Program collects hate crime data and releases a statistical report each year.
Recent research mainly focuses on hate speech detection in social media text or on empirical studies of the impact of a confirmed crime.
This paper proposes a framework that first uses text mining techniques to extract hate crime events from New York Times articles, then uses the results to help predict American national-level and state-level hate crime trends.
arXiv Detail & Related papers (2021-11-09T04:30:20Z)
- The effect of differential victim crime reporting on predictive policing systems [84.86615754515252]
We show how differential victim crime reporting rates can lead to outcome disparities in common crime hot spot prediction models.
Our results suggest that differential crime reporting rates can lead to a displacement of predicted hotspots from high crime but low reporting areas to high or medium crime and high reporting areas.
arXiv Detail & Related papers (2021-01-30T01:57:22Z)
- Results of a Single Blind Literary Taste Test with Short Anonymized Novel Fragments [4.695687634290403]
It is an open question to what extent perceptions of literary quality are derived from text-intrinsic versus social factors.
We report the results of a pilot study to gauge the effect of textual features on literary ratings of Dutch-language novels.
We find moderate to strong correlations between the questionnaire ratings and the survey ratings, with predictions falling closer to the survey ratings.
arXiv Detail & Related papers (2020-11-03T11:10:17Z)
- Viable Threat on News Reading: Generating Biased News Using Natural Language Models [49.90665530780664]
We show that publicly available language models can reliably generate biased news content from an original input news article.
We also show that a large number of high-quality biased news articles can be generated using controllable text generation.
arXiv Detail & Related papers (2020-10-05T16:55:39Z)
- Examining Racial Bias in an Online Abuse Corpus with Structural Topic Modeling [0.30458514384586405]
We use structural topic modeling to examine racial bias in social media posts.
We augment the abusive language dataset by adding an additional feature indicating the predicted probability of the tweet being written in African-American English.
arXiv Detail & Related papers (2020-05-26T21:02:43Z)
- Multilingual Twitter Corpus and Baselines for Evaluating Demographic Bias in Hate Speech Recognition [46.57105755981092]
We publish a multilingual Twitter corpus for the task of hate speech detection.
The corpus covers five languages: English, Italian, Polish, Portuguese and Spanish.
We evaluate the inferred demographic labels with a crowdsourcing platform.
arXiv Detail & Related papers (2020-02-24T16:45:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.