Toxicity Begets Toxicity: Unraveling Conversational Chains in Political Podcasts
- URL: http://arxiv.org/abs/2501.12640v2
- Date: Fri, 29 Aug 2025 10:39:09 GMT
- Authors: Naquee Rizwan, Nayandeep Deb, Sarthak Roy, Vishwajeet Singh Solanki, Kiran Garimella, Animesh Mukherjee
- Abstract summary: This work seeks to fill that gap by curating a dataset of political podcast transcripts and analyzing them with a focus on conversational structure. Specifically, we investigate how toxicity surfaces and intensifies through sequences of replies within these dialogues, shedding light on the organic patterns by which harmful language can escalate across conversational turns.
- Score: 5.573483199335299
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Tackling toxic behavior in digital communication continues to be a pressing concern for both academics and industry professionals. While significant research has explored toxicity on platforms like social networks and discussion boards, podcasts, despite their rapid rise in popularity, remain relatively understudied in this context. This work seeks to fill that gap by curating a dataset of political podcast transcripts and analyzing them with a focus on conversational structure. Specifically, we investigate how toxicity surfaces and intensifies through sequences of replies within these dialogues, shedding light on the organic patterns by which harmful language can escalate across conversational turns. Warning: Contains potentially abusive/toxic content.
Related papers
- Identifying Constructive Conflict in Online Discussions through Controversial yet Toxicity Resilient Posts [41.130462443875736]
We operationalize controversiality to identify challenging dialogues and toxicity resilience to capture respectful conversations. We also find that political posts are often controversial and tend to attract more toxic responses. These findings suggest the potential for framing the tone of posts to encourage constructive political discussions.
arXiv Detail & Related papers (2025-09-22T18:30:41Z)
- Aligning Spoken Dialogue Models from User Interactions [55.192134724622235]
We propose a novel preference alignment framework to improve spoken dialogue models on realtime conversations from user interactions. We create a dataset of more than 150,000 preference pairs from raw multi-turn speech conversations annotated with AI feedback. Our findings shed light on the importance of a well-calibrated balance among various dynamics, crucial for natural real-time speech dialogue systems.
arXiv Detail & Related papers (2025-06-26T16:45:20Z)
- ToxicTone: A Mandarin Audio Dataset Annotated for Toxicity and Toxic Utterance Tonality [35.517662288248225]
ToxicTone is the largest public dataset of its kind. Our data is sourced from diverse real-world audio and organized into 13 topical categories. We propose a multimodal detection framework that integrates acoustic, linguistic, and emotional features.
arXiv Detail & Related papers (2025-05-21T17:25:27Z)
- Talking Point based Ideological Discourse Analysis in News Events [62.18747509565779]
We propose a framework motivated by the theory of ideological discourse analysis to analyze news articles related to real-world events.
Our framework represents the news articles using a relational structure - talking points, which captures the interaction between entities, their roles, and media frames along with a topic of discussion.
We evaluate our framework's ability to generate these perspectives through automated tasks - ideology and partisan classification tasks, supplemented by human validation.
arXiv Detail & Related papers (2025-04-10T02:52:34Z)
- MoonCast: High-Quality Zero-Shot Podcast Generation [81.29927724674602]
MoonCast is a solution for high-quality zero-shot podcast generation.
It aims to synthesize natural podcast-style speech from text-only sources.
Experiments demonstrate that MoonCast outperforms baselines.
arXiv Detail & Related papers (2025-03-18T15:25:08Z)
- SafeSpeech: A Comprehensive and Interactive Tool for Analysing Sexist and Abusive Language in Conversations [7.4815142964548205]
SafeSpeech is a comprehensive platform for toxic content detection and analysis.
It bridges message-level and conversation-level insights.
The platform integrates fine-tuned classifiers and large language models.
Evaluations on benchmark datasets, including EDOS, OffensEval, and HatEval, demonstrate the reproduction of state-of-the-art performance.
arXiv Detail & Related papers (2025-03-09T09:31:17Z)
- WavChat: A Survey of Spoken Dialogue Models [66.82775211793547]
Recent advancements in spoken dialogue models, exemplified by systems like GPT-4o, have captured significant attention in the speech domain.
These advanced spoken dialogue models not only comprehend audio, music, and other speech-related features, but also capture stylistic and timbral characteristics in speech.
Despite the progress in spoken dialogue systems, there is a lack of comprehensive surveys that systematically organize and analyze these systems.
arXiv Detail & Related papers (2024-11-15T04:16:45Z)
- Characterizing Online Toxicity During the 2022 Mpox Outbreak: A Computational Analysis of Topical and Network Dynamics [0.9831489366502301]
The 2022 Mpox outbreak, initially termed "Monkeypox" but subsequently renamed to mitigate associated stigmas and societal concerns, serves as a poignant backdrop to this issue.
We collected more than 1.6 million unique tweets and analyzed them from five dimensions, including context, extent, content, speaker, and intent.
We identified five high-level topic categories in the toxic online discourse on Twitter, including disease (46.6%), health policy and healthcare (19.3%), homophobia (23.9%), politics.
We found that retweets of toxic content were widespread, while influential users rarely engaged with or countered this toxicity through retweets.
arXiv Detail & Related papers (2024-08-21T19:31:01Z)
- Analyzing Toxicity in Deep Conversations: A Reddit Case Study [0.0]
This work employs a tree-based approach to understand how users behave concerning toxicity in public conversation settings.
We collect both the posts and the comment sections of the top 100 posts from 8 Reddit communities that allow profanity, totaling over 1 million responses.
We find that toxic comments increase the likelihood of subsequent toxic comments being produced in online conversations.
arXiv Detail & Related papers (2024-04-11T16:10:44Z)
- Comprehensive Assessment of Toxicity in ChatGPT [49.71090497696024]
We evaluate the toxicity in ChatGPT by utilizing instruction-tuning datasets.
Prompts in creative writing tasks can be 2x more likely to elicit toxic responses.
Certain deliberately toxic prompts, designed in earlier studies, no longer yield harmful responses.
arXiv Detail & Related papers (2023-11-03T14:37:53Z)
- Dynamic Causal Disentanglement Model for Dialogue Emotion Detection [77.96255121683011]
We propose a Dynamic Causal Disentanglement Model based on hidden variable separation.
This model effectively decomposes the content of dialogues and investigates the temporal accumulation of emotions.
Specifically, we propose a dynamic temporal disentanglement model to infer the propagation of utterances and hidden variables.
arXiv Detail & Related papers (2023-09-13T12:58:09Z)
- Understanding Multi-Turn Toxic Behaviors in Open-Domain Chatbots [8.763670548363443]
A new attack, toxicbot, is developed to generate toxic responses in a multi-turn conversation.
toxicbot can be used by both industry and researchers to develop methods for detecting and mitigating toxic responses in conversational dialogue.
arXiv Detail & Related papers (2023-07-14T03:58:42Z)
- SpokenWOZ: A Large-Scale Speech-Text Benchmark for Spoken Task-Oriented Dialogue Agents [72.42049370297849]
SpokenWOZ is a large-scale speech-text dataset for spoken TOD.
Cross-turn slot and reasoning slot detection are new challenges for SpokenWOZ.
arXiv Detail & Related papers (2023-05-22T13:47:51Z)
- CoSyn: Detecting Implicit Hate Speech in Online Conversations Using a Context Synergized Hyperbolic Network [52.85130555886915]
CoSyn is a context-synergized neural network that explicitly incorporates user- and conversational context for detecting implicit hate speech in online conversations.
We show that CoSyn outperforms all our baselines in detecting implicit hate speech with absolute improvements in the range of 1.24% - 57.8%.
arXiv Detail & Related papers (2023-03-02T17:30:43Z)
- Beyond Plain Toxic: Detection of Inappropriate Statements on Flammable Topics for the Russian Language [76.58220021791955]
We present two text collections labelled according to a binary notion of inappropriateness and a multinomial notion of sensitive topic.
To objectivise the notion of inappropriateness, we define it in a data-driven way through crowdsourcing.
arXiv Detail & Related papers (2022-03-04T15:59:06Z)
- Handling Bias in Toxic Speech Detection: A Survey [26.176340438312376]
We look at proposed methods for evaluating and mitigating bias in toxic speech detection.
A case study introduces the concept of bias shift due to knowledge-based bias mitigation.
The survey concludes with an overview of the critical challenges, research gaps, and future directions.
arXiv Detail & Related papers (2022-01-26T10:38:36Z)
- Revisiting Contextual Toxicity Detection in Conversations [28.465019968374413]
We show that toxicity labelling by humans is in general influenced by the conversational structure, polarity and topic of the context.
We propose to bring these findings into computational detection models by introducing (a) neural architectures for contextual toxicity detection.
We have also demonstrated that such models can benefit from synthetic data, especially in the social media domain.
arXiv Detail & Related papers (2021-11-24T11:50:37Z)
- Toxicity Detection can be Sensitive to the Conversational Context [64.28043776806213]
We construct and publicly release a dataset of 10,000 posts with two kinds of toxicity labels.
We introduce a new task, context sensitivity estimation, which aims to identify posts whose perceived toxicity changes if the context is also considered.
arXiv Detail & Related papers (2021-11-19T13:57:26Z)
- Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection [75.54119209776894]
We investigate the effect of annotator identities (who) and beliefs (why) on toxic language annotations.
We consider posts with three characteristics: anti-Black language, African American English dialect, and vulgarity.
Our results show strong associations between annotator identity and beliefs and their ratings of toxicity.
arXiv Detail & Related papers (2021-11-15T18:58:20Z)
- Speech Toxicity Analysis: A New Spoken Language Processing Task [32.297717021285344]
Toxic speech, also known as hate speech, is regarded as one of the crucial issues plaguing online social media today.
We propose a new Spoken Language Processing task of detecting toxicity from spoken speech.
We introduce DeToxy, the first publicly available toxicity annotated dataset for English speech, sourced from various openly available speech databases.
arXiv Detail & Related papers (2021-10-14T17:51:04Z)
- Mitigating Biases in Toxic Language Detection through Invariant Rationalization [70.36701068616367]
Biases toward some attributes, including gender, race, and dialect, exist in most training datasets for toxicity detection.
We propose to use invariant rationalization (InvRat), a game-theoretic framework consisting of a rationale generator and a predictor, to rule out the spurious correlation of certain syntactic patterns.
Our method yields a lower false positive rate in both lexical and dialectal attributes than previous debiasing methods.
arXiv Detail & Related papers (2021-06-14T08:49:52Z)
- Modeling Language Usage and Listener Engagement in Podcasts [3.8966039534272916]
We investigate how various factors -- vocabulary diversity, distinctiveness, emotion, and syntax -- correlate with engagement.
We build models with different textual representations, and show that the identified features are highly predictive of engagement.
Our analysis tests popular wisdom about stylistic elements in high-engagement podcasts, corroborating some aspects, and adding new perspectives on others.
arXiv Detail & Related papers (2021-06-11T20:40:15Z)
- The Structure of Toxic Conversations on Twitter [10.983958397797847]
We study the relationship between structure and toxicity in conversations on Twitter.
At the individual level, we find that toxicity is spread across many low to moderately toxic users.
At the group level, we find that toxic conversations tend to have larger, wider, and deeper reply trees.
arXiv Detail & Related papers (2021-05-25T01:18:02Z)
- Detecting Inappropriate Messages on Sensitive Topics that Could Harm a Company's Reputation [64.22895450493729]
A calm discussion of turtles or fishing less often fuels inappropriate toxic dialogues than a discussion of politics or sexual minorities.
We define a set of sensitive topics that can yield inappropriate and toxic messages and describe the methodology of collecting and labeling a dataset for appropriateness.
arXiv Detail & Related papers (2021-03-09T10:50:30Z)
- Challenges in Automated Debiasing for Toxic Language Detection [81.04406231100323]
Biased associations have been a challenge in the development of classifiers for detecting toxic language.
We investigate recently introduced debiasing methods for text classification datasets and models, as applied to toxic language detection.
Our focus is on lexical (e.g., swear words, slurs, identity mentions) and dialectal markers (specifically African American English).
arXiv Detail & Related papers (2021-01-29T22:03:17Z)