Vocabulary-based Method for Quantifying Controversy in Social Media
- URL: http://arxiv.org/abs/2001.09899v1
- Date: Tue, 14 Jan 2020 17:43:21 GMT
- Title: Vocabulary-based Method for Quantifying Controversy in Social Media
- Authors: Juan Manuel Ortiz de Zarate and Esteban Feuerstein
- Abstract summary: We develop a method for controversy detection based primarily on the jargon used by the communities in social media.
Our method dispenses with the use of domain-specific knowledge, is language-agnostic, efficient and easy to apply.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Identifying controversial topics is not only interesting from a social point
of view; it also enables the application of methods to avoid information
segregation, creating better discussion contexts and reaching agreements in the
best cases. In this paper we develop a systematic method for controversy
detection based primarily on the jargon used by the communities in social
media. Our method dispenses with the use of domain-specific knowledge, is
language-agnostic, efficient and easy to apply. We perform an extensive set of
experiments across many languages, regions and contexts, taking controversial
and non-controversial topics. We find that our vocabulary-based measure
performs better than state-of-the-art measures that are based only on the
community graph structure. Moreover, we show that it is possible to detect
polarization through text analysis.
Related papers
- FREDSum: A Dialogue Summarization Corpus for French Political Debates [26.76383031532945]
We present a dataset of French political debates for the purpose of enhancing resources for multi-lingual dialogue summarization.
Our dataset consists of manually transcribed and annotated political debates, covering a range of topics and perspectives.
Date: 2023-12-08T05:42:04Z
- Towards Open Vocabulary Learning: A Survey [146.90188069113213]
Deep neural networks have made impressive advancements in various core tasks like segmentation, tracking, and detection.
Recently, open vocabulary settings were proposed due to the rapid progress of vision language pre-training.
This paper provides a thorough review of open vocabulary learning, summarizing and analyzing recent developments in the field.
Date: 2023-06-28T02:33:06Z
- A Human Word Association based model for topic detection in social networks [1.8749305679160366]
This paper introduces a topic detection framework for social networks based on the concept of imitating the mental ability of word association.
The performance of this framework is evaluated using the FA-CUP dataset, a benchmark in the field of topic detection.
Date: 2023-01-30T17:10:34Z
- Contextual information integration for stance detection via cross-attention [59.662413798388485]
Stance detection deals with identifying an author's stance towards a target.
Most existing stance detection models are limited because they do not consider relevant contextual information.
We propose an approach to integrate contextual information as text.
Date: 2022-11-03T15:04:29Z
- Keywords and Instances: A Hierarchical Contrastive Learning Framework Unifying Hybrid Granularities for Text Generation [59.01297461453444]
We propose a hierarchical contrastive learning mechanism that unifies the semantic meaning of hybrid granularities in the input text.
Experiments demonstrate that our model outperforms competitive baselines on paraphrasing, dialogue generation, and storytelling tasks.
Date: 2022-05-26T13:26:03Z
- Towards Identifying Social Bias in Dialog Systems: Frame, Datasets, and Benchmarks [95.29345070102045]
In this paper, we focus our investigation on social bias detection of dialog safety problems.
We first propose a novel Dial-Bias Frame for analyzing the social bias in conversations pragmatically.
We introduce the CDail-Bias dataset, the first well-annotated Chinese social bias dialog dataset.
Date: 2022-02-16T11:59:29Z
- Retrieval-Free Knowledge-Grounded Dialogue Response Generation with Adapters [52.725200145600624]
We propose KnowExpert to bypass the retrieval process by injecting prior knowledge into the pre-trained language models with lightweight adapters.
Experimental results show that KnowExpert performs comparably with the retrieval-based baselines.
Date: 2021-05-13T12:33:23Z
- Term-community-based topic detection with variable resolution [0.0]
Network-based procedures for topic detection in huge text collections offer an intuitive alternative to probabilistic topic models.
We present a method that is especially designed with the requirements of domain experts in mind.
We demonstrate the application of our method with a widely used corpus of general news articles and show the results of detailed social-sciences expert evaluations.
Date: 2021-03-25T01:29:39Z
- Detecting and Classifying Malevolent Dialogue Responses: Taxonomy, Data and Methodology [68.8836704199096]
Corpus-based conversational interfaces are able to generate more diverse and natural responses than template-based or retrieval-based agents.
With the increased generative capacity of corpus-based conversational agents comes the need to classify and filter out malevolent responses.
Previous studies on the topic of recognizing and classifying inappropriate content are mostly focused on a certain category of malevolence.
Date: 2020-08-21T22:43:27Z
- Discovering and Categorising Language Biases in Reddit [5.670038395203354]
This paper proposes a data-driven approach to automatically discover language biases encoded in the vocabulary of online discourse communities on Reddit.
We use word embeddings to transform text into high-dimensional dense vectors and capture semantic relations between words.
We successfully discover gender bias, religion bias, and ethnic bias in different Reddit communities.
Date: 2020-08-06T16:42:10Z
This list is automatically generated from the titles and abstracts of the papers on this site.