A Data-driven Investigation of Euphemistic Language: Comparing the usage of "slave" and "servant" in 19th century US newspapers
- URL: http://arxiv.org/abs/2503.15057v1
- Date: Wed, 19 Mar 2025 09:49:22 GMT
- Title: A Data-driven Investigation of Euphemistic Language: Comparing the usage of "slave" and "servant" in 19th century US newspapers
- Authors: Jaihyun Park, Ryan Cordell
- Abstract summary: This study investigates the usage of "slave" and "servant" in the 19th century US newspapers using computational methods. We found that "slave" is associated with socio-economic, legal, and administrative words. "servant" is linked to religious words in the Northern newspapers while Southern newspapers associated "servant" with domestic and familial words.
- Score: 4.063328359314906
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This study investigates the usage of "slave" and "servant" in 19th-century US newspapers using computational methods. While both terms were used to refer to enslaved African Americans, they were used in distinct ways. In the Chronicling America corpus, we included possible OCR errors by using FastText embedding and excluded text reprints to account for the text reprint culture of the 19th century. Word2vec embedding was used to find words semantically close to "slave" and "servant", and the log-odds ratio was calculated to identify over-represented discourse words in the Southern and Northern newspapers. We found that "slave" is associated with socio-economic, legal, and administrative words; "servant", however, is linked to religious words in the Northern newspapers, while Southern newspapers associated "servant" with domestic and familial words. We further found that slave discourse words in Southern newspapers are more prevalent in Northern newspapers while servant discourse words from each side are prevalent in their own region. This study contributes to the understanding of how newspapers created different discourses around enslaved African Americans in the 19th century US.
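The abstract names a log-odds ratio as the measure of over-represented words but does not say which variant was used. The following is a minimal sketch, assuming a simple additively smoothed log-odds ratio between two word-count tables; the function name `log_odds_ratio`, the smoothing constant `alpha`, and the toy token lists are illustrative assumptions, not the authors' implementation or data.

```python
import math
from collections import Counter


def log_odds_ratio(counts_a, counts_b, alpha=0.5):
    """Smoothed log-odds ratio of each word in corpus A relative to corpus B.

    Positive scores mean the word is over-represented in A, negative in B.
    `alpha` is an additive smoothing constant (an assumption of this sketch)
    so that words missing from one corpus do not produce log(0).
    Assumes the combined vocabulary contains more than one word.
    """
    vocab = set(counts_a) | set(counts_b)
    total_a = sum(counts_a.values()) + alpha * len(vocab)
    total_b = sum(counts_b.values()) + alpha * len(vocab)
    scores = {}
    for word in vocab:
        a = counts_a.get(word, 0) + alpha
        b = counts_b.get(word, 0) + alpha
        # Odds of the word versus every other word, within each corpus.
        odds_a = a / (total_a - a)
        odds_b = b / (total_b - b)
        scores[word] = math.log(odds_a) - math.log(odds_b)
    return scores


# Made-up toy tokens, for illustration only (not taken from the paper).
south = Counter("servant family house servant home slave sale".split())
north = Counter("servant lord faith servant god slave sale".split())

scores = log_odds_ratio(south, north)
for word, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{word:8s} {score:+.3f}")
```

Words found only in the first table score positive, words found only in the second score negative, and words used equally often in both score near zero. Other variants, such as a log-odds ratio with an informative Dirichlet prior, would weight rare words differently.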
Related papers
- Bridging Dictionary: AI-Generated Dictionary of Partisan Language Use [21.15400893251543]
Bridging Dictionary is an interactive tool designed to illuminate how words are perceived by people with different political views.
The Bridging Dictionary includes a static, printable document featuring 796 terms with summaries generated by a large language model.
Users can explore selected words, visualizing their frequency, sentiment, summaries, and examples across political divides.
arXiv Detail & Related papers (2024-07-12T19:44:40Z) - A Quantitative Discourse Analysis of Asian Workers in the US Historical Newspapers [4.8002841809407695]
We present computational text analysis on how Asian workers are represented in historical newspapers in the United States.
We found that the word "coolie" was semantically different in some states, with different discourses around the term.
We also found that then-Confederate newspapers and then-Union newspapers formed distinctive discourses by measuring over-represented words.
arXiv Detail & Related papers (2024-02-04T17:32:52Z) - A ripple in time: a discontinuity in American history [49.84018914962972]
We suggest a novel approach to discover temporal (related and unrelated to language dilation) and personality (authorship attribution) aspects in historical datasets. We exemplify our approach on the State of the Union addresses given by the past 42 US presidents.
arXiv Detail & Related papers (2023-12-02T17:24:17Z) - Quantifying the redundancy between prosody and text [67.07817268372743]
We use large language models to estimate how much information is redundant between prosody and the words themselves.
We find a high degree of redundancy between the information carried by the words and prosodic information across several prosodic features.
Still, we observe that prosodic features cannot be fully predicted from text, suggesting that prosody carries information above and beyond the words.
arXiv Detail & Related papers (2023-11-28T21:15:24Z) - Neighboring Words Affect Human Interpretation of Saliency Explanations [65.29015910991261]
Word-level saliency explanations are often used to communicate feature-attribution in text-based models.
Recent studies found that superficial factors such as word length can distort human interpretation of the communicated saliency scores.
We investigate how the marking of a word's neighboring words affects the explainee's perception of the word's importance in the context of a saliency explanation.
arXiv Detail & Related papers (2023-05-04T09:50:25Z) - CoSyn: Detecting Implicit Hate Speech in Online Conversations Using a Context Synergized Hyperbolic Network [52.85130555886915]
CoSyn is a context-synergized neural network that explicitly incorporates user- and conversational context for detecting implicit hate speech in online conversations.
We show that CoSyn outperforms all our baselines in detecting implicit hate speech with absolute improvements in the range of 1.24% - 57.8%.
arXiv Detail & Related papers (2023-03-02T17:30:43Z) - Keywords and Instances: A Hierarchical Contrastive Learning Framework Unifying Hybrid Granularities for Text Generation [60.62039705180484]
We propose a hierarchical contrastive learning mechanism, which can unify the semantic meaning of hybrid granularities in the input text.
Experiments demonstrate that our model outperforms competitive baselines on paraphrasing, dialogue generation, and storytelling tasks.
arXiv Detail & Related papers (2022-05-26T13:26:03Z) - Both the validity of the cultural tightness index and the association with creativity and order are spurious -- a comment on Jackson et al [77.34726150561087]
Jackson et al. generate a linguistic index of cultural tightness based on the Google Books Ngram corpus.
We show here that the methods used by Jackson et al. are neither suitable for testing the validity of the index nor for establishing possible relationships with creativity/order.
arXiv Detail & Related papers (2022-01-26T08:32:44Z) - Regional Negative Bias in Word Embeddings Predicts Racial Animus--but only via Name Frequency [2.247786323899963]
We show that anti-black WEAT estimates from geo-tagged social media data strongly correlate with several measures of racial animus.
We also show that every one of these correlations is explained by the frequency of Black names in the underlying corpora relative to White names.
arXiv Detail & Related papers (2022-01-20T20:52:12Z) - From Plenipotentiary to Puddingless: Users and Uses of New Words in Early English Letters [0.0]
We study neologism use in two samples of early English correspondence, from 1640--1660 and 1760--1780.
In both samples, neologisms most frequently occur in letters written between close friends.
In the seventeenth-century sample, we observe the influence of the English Civil War, while the eighteenth-century sample appears to reflect the changing functions of letter-writing.
arXiv Detail & Related papers (2021-03-17T21:45:06Z) - Abolitionist Networks: Modeling Language Change in Nineteenth-Century Activist Newspapers [14.98054985758998]
Two newspapers edited by women -- THE PROVINCIAL FREEMAN and THE LILY -- led a large number of semantic changes in our corpus.
This paper supplements recent qualitative work on the role of women in abolition's vanguard.
arXiv Detail & Related papers (2021-03-12T21:26:30Z) - Towards Debiasing Sentence Representations [109.70181221796469]
We show that Sent-Debias is effective in removing biases, and at the same time, preserves performance on sentence-level downstream tasks.
We hope that our work will inspire future research on characterizing and removing social biases from widely adopted sentence representations for fairer NLP.
arXiv Detail & Related papers (2020-07-16T04:22:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.