Which Argumentative Aspects of Hate Speech in Social Media can be reliably identified?
- URL: http://arxiv.org/abs/2306.02978v1
- Date: Mon, 5 Jun 2023 15:50:57 GMT
- Title: Which Argumentative Aspects of Hate Speech in Social Media can be reliably identified?
- Authors: Damián Furman, Pablo Torres, José A. Rodríguez, Diego Letzen, Vanina Martínez, Laura Alonso Alemany
- Abstract summary: It is unclear which aspects of argumentation can be reliably identified and integrated in language models.
We show that some components can be identified with reasonable reliability.
We propose adaptations of those categories that can be more reliably reproduced.
- Score: 2.7647400328727256
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: With the increasing diversity of use cases of large language models, a more
informative treatment of texts seems necessary. An argumentative analysis could
foster a more reasoned usage of chatbots, text completion mechanisms or other
applications. However, it is unclear which aspects of argumentation can be
reliably identified and integrated in language models. In this paper, we
present an empirical assessment of the reliability with which different
argumentative aspects can be automatically identified in hate speech in social
media. We have enriched the Hateval corpus (Basile et al. 2019) with a manual
annotation of some argumentative components, adapted from Wagemans (2016)'s
Periodic Table of Arguments. We show that some components can be identified
with reasonable reliability. For those that present a high error ratio, we
analyze the patterns of disagreement between expert annotators and errors in
automatic procedures, and we propose adaptations of those categories that can
be more reliably reproduced.
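As an illustration of the reliability assessment described above, the following is a minimal sketch (not the authors' evaluation code) of how agreement on a single argumentative component could be quantified with Cohen's kappa; the component, the example labels, and the use of scikit-learn are assumptions made for the example.

```python
# Minimal sketch (assumed setup, not the paper's code): chance-corrected
# agreement between two expert annotators on one argumentative component.
from sklearn.metrics import cohen_kappa_score, confusion_matrix

# Hypothetical binary labels for the same tweets:
# 1 = the component (e.g. a premise of a given type) is present, 0 = absent.
annotator_a = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
annotator_b = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]

kappa = cohen_kappa_score(annotator_a, annotator_b)
print(f"Cohen's kappa: {kappa:.2f}")

# The confusion matrix exposes systematic patterns of disagreement, the kind
# of analysis the paper applies to categories with a high error ratio.
print(confusion_matrix(annotator_a, annotator_b))
```

Categories that show low agreement under such an analysis are the ones for which the paper proposes adapted definitions that can be reproduced more reliably.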
Related papers
- Evaluating Text Classification Robustness to Part-of-Speech Adversarial Examples [0.6445605125467574]
Adversarial examples are inputs that are designed to trick the decision making process, and are intended to be imperceptible to humans.
For text-based classification systems, changes to the input, a string of text, are always perceptible.
To improve the quality of text-based adversarial examples, we need to know what elements of the input text are worth focusing on.
arXiv Detail & Related papers (2024-08-15T18:33:54Z)
- Consolidating Strategies for Countering Hate Speech Using Persuasive Dialogues [3.8979646385036175]
We explore controllable strategies for generating counter-arguments to hateful comments in online conversations.
Using automatic and human evaluations, we determine the best combination of features that generate fluent, argumentative, and logically sound arguments.
We share the computational models developed for automatically annotating text with such features, along with a silver-standard annotated version of an existing hate speech dialog corpus.
arXiv Detail & Related papers (2024-01-15T16:31:18Z)
- Dialogue Quality and Emotion Annotations for Customer Support Conversations [7.218791626731783]
This paper presents a holistic annotation approach for emotion and conversational quality in the context of bilingual customer support conversations.
It provides a unique and valuable resource for the development of text classification models.
arXiv Detail & Related papers (2023-11-23T10:56:14Z)
- AI, write an essay for me: A large-scale comparison of human-written versus ChatGPT-generated essays [66.36541161082856]
ChatGPT and similar generative AI models have attracted hundreds of millions of users.
This study compares human-written versus ChatGPT-generated argumentative student essays.
arXiv Detail & Related papers (2023-04-24T12:58:28Z)
- Models See Hallucinations: Evaluating the Factuality in Video Captioning [57.85548187177109]
We conduct a human evaluation of the factuality in video captioning and collect two annotated factuality datasets.
We find that 57.0% of the model-generated sentences have factual errors, showing that this is a severe problem in the field.
We propose a weakly-supervised, model-based factuality metric FactVC, which outperforms previous metrics on factuality evaluation of video captioning.
arXiv Detail & Related papers (2023-03-06T08:32:50Z)
- Shapley Head Pruning: Identifying and Removing Interference in Multilingual Transformers [54.4919139401528]
We show that it is possible to reduce interference by identifying and pruning language-specific parameters.
We show that removing identified attention heads from a fixed model improves performance for a target language on both sentence classification and structural prediction.
arXiv Detail & Related papers (2022-10-11T18:11:37Z)
- Polling Latent Opinions: A Method for Computational Sociolinguistics Using Transformer Language Models [4.874780144224057]
We use the capacity for memorization and extrapolation of Transformer Language Models to learn the linguistic behaviors of a subgroup within larger corpora of Yelp reviews.
We show that, even in cases where a specific keyphrase is limited or not present at all in the training corpora, the GPT model is able to accurately generate large volumes of text with the correct sentiment.
arXiv Detail & Related papers (2022-04-15T14:33:58Z)
- Author Clustering and Topic Estimation for Short Texts [69.54017251622211]
We propose a novel model that expands on Latent Dirichlet Allocation by modeling strong dependence among the words in the same document.
We also simultaneously cluster users, removing the need for post-hoc cluster estimation.
Our method performs as well as, or better than, traditional approaches to problems arising in short texts.
arXiv Detail & Related papers (2021-06-15T20:55:55Z)
- Curious Case of Language Generation Evaluation Metrics: A Cautionary Tale [52.663117551150954]
A few popular metrics remain as the de facto metrics to evaluate tasks such as image captioning and machine translation.
This is partly due to ease of use, and partly because researchers expect to see them and know how to interpret them.
In this paper, we urge the community to consider more carefully how they automatically evaluate their models.
arXiv Detail & Related papers (2020-10-26T13:57:20Z)
- Fast and Robust Unsupervised Contextual Biasing for Speech Recognition [16.557586847398778]
We propose an alternative approach that does not entail an explicit contextual language model.
We derive the bias score for every word in the system vocabulary from the training corpus.
We show significant improvement in recognition accuracy when the relevant context is available.
arXiv Detail & Related papers (2020-05-04T17:29:59Z)
- Aspect-Controlled Neural Argument Generation [65.91772010586605]
We train a language model for argument generation that can be controlled on a fine-grained level to generate sentence-level arguments for a given topic, stance, and aspect.
Our evaluation shows that our generation model is able to generate high-quality, aspect-specific arguments.
These arguments can be used to improve the performance of stance detection models via data augmentation and to generate counter-arguments.
arXiv Detail & Related papers (2020-04-30T20:17:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.