A Benchmark for Cross-Domain Argumentative Stance Classification on Social Media
- URL: http://arxiv.org/abs/2410.08900v2
- Date: Fri, 15 Nov 2024 23:18:53 GMT
- Title: A Benchmark for Cross-Domain Argumentative Stance Classification on Social Media
- Authors: Jiaqing Yuan, Ruijie Xi, Munindar P. Singh
- Abstract summary: Argumentative stance classification plays a key role in identifying authors' viewpoints on specific topics.
Existing benchmarks often come from a single domain or focus on a limited set of topics.
We propose leveraging platform rules, readily available expert-curated content, and large language models to bypass the need for human annotation.
- Score: 12.479554210753664
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Argumentative stance classification plays a key role in identifying authors' viewpoints on specific topics. However, generating diverse pairs of argumentative sentences across various domains is challenging. Existing benchmarks often come from a single domain or focus on a limited set of topics. Additionally, manual annotation for accurate labeling is time-consuming and labor-intensive. To address these challenges, we propose leveraging platform rules, readily available expert-curated content, and large language models to bypass the need for human annotation. Our approach produces a multidomain benchmark comprising 4,498 topical claims and 30,961 arguments from three sources, spanning 21 domains. We benchmark the dataset in fully supervised, zero-shot, and few-shot settings, shedding light on the strengths and limitations of different methodologies. We release the dataset and code in this study at hidden for anonymity.
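The zero-shot setting mentioned in the abstract can be sketched as prompting a large language model with a claim–argument pair and mapping its completion onto a fixed label set. This is a minimal illustrative sketch, not the authors' exact setup: the prompt wording, the `support`/`oppose` label set, and the function names are assumptions, and the model call itself is left abstract.

```python
from typing import Optional

# Illustrative binary label set; the actual benchmark's labels may differ.
LABELS = ("support", "oppose")

def build_stance_prompt(claim: str, argument: str) -> str:
    """Compose a zero-shot prompt asking an LLM for the argument's stance."""
    return (
        f"Answer with exactly one word from {list(LABELS)}.\n"
        f"Claim: {claim}\n"
        f"Argument: {argument}\n"
        "Stance:"
    )

def parse_stance(completion: str) -> Optional[str]:
    """Map a raw model completion onto the label set; None if unrecognized."""
    text = completion.strip().lower()
    for label in LABELS:
        if label in text:
            return label
    return None
```

In a few-shot variant, labeled claim–argument examples would simply be prepended to the same prompt before the test pair.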
Related papers
- Can Humans Identify Domains? [17.579694463517363]
Textual domain is a crucial property within the Natural Language Processing (NLP) community due to its effects on downstream model performance.
We investigate the core notion of domains via human proficiency in identifying related intrinsic textual properties.
We find that despite the ubiquity of domains in NLP, there is little human consensus on how to define them.
arXiv Detail & Related papers (2024-04-02T09:49:07Z) - Structured Latent Embeddings for Recognizing Unseen Classes in Unseen Domains [108.11746235308046]
We propose a novel approach that learns domain-agnostic structured latent embeddings by projecting images from different domains.
Our experiments on the challenging DomainNet and DomainNet-LS benchmarks show the superiority of our approach over existing methods.
arXiv Detail & Related papers (2021-07-12T17:57:46Z) - Author Clustering and Topic Estimation for Short Texts [69.54017251622211]
We propose a novel model that expands on the Latent Dirichlet Allocation by modeling strong dependence among the words in the same document.
We also simultaneously cluster users, removing the need for post-hoc cluster estimation.
Our method performs as well as, or better than, traditional approaches to problems arising in short text.
arXiv Detail & Related papers (2021-06-15T20:55:55Z) - Cross-Domain Label-Adaptive Stance Detection [32.800766653254634]
Stance detection concerns the classification of a writer's viewpoint towards a target.
In this paper, we perform an in-depth analysis of 16 stance detection datasets.
We propose an end-to-end unsupervised framework for out-of-domain prediction of unseen, user-defined labels.
arXiv Detail & Related papers (2021-04-15T14:04:29Z) - Learning to Select Context in a Hierarchical and Global Perspective for Open-domain Dialogue Generation [15.01710843286394]
We propose a novel model with hierarchical self-attention mechanism and distant supervision to detect relevant words and utterances in short and long distances.
Our model significantly outperforms other baselines in terms of fluency, coherence, and informativeness.
arXiv Detail & Related papers (2021-02-18T11:56:42Z) - WikiAsp: A Dataset for Multi-domain Aspect-based Summarization [69.13865812754058]
We propose WikiAsp, a large-scale dataset for multi-domain aspect-based summarization.
Specifically, we build the dataset using Wikipedia articles from 20 different domains, using the section titles and boundaries of each article as a proxy for aspect annotation.
Results highlight key challenges that existing summarization models face in this setting, such as proper pronoun handling of quoted sources and consistent explanation of time-sensitive events.
arXiv Detail & Related papers (2020-11-16T10:02:52Z) - A Review of Single-Source Deep Unsupervised Visual Domain Adaptation [81.07994783143533]
Large-scale labeled training datasets have enabled deep neural networks to excel across a wide range of benchmark vision tasks.
In many applications, it is prohibitively expensive and time-consuming to obtain large quantities of labeled data.
To cope with limited labeled training data, many have attempted to directly apply models trained on a large-scale labeled source domain to another sparsely labeled or unlabeled target domain.
arXiv Detail & Related papers (2020-09-01T00:06:50Z) - Text Recognition in Real Scenarios with a Few Labeled Samples [55.07859517380136]
Scene text recognition (STR) remains an active research topic in computer vision.
This paper proposes a few-shot adversarial sequence domain adaptation (FASDA) approach for sequence-level domain adaptation.
Our approach can maximize the character-level confusion between the source domain and the target domain.
arXiv Detail & Related papers (2020-06-22T13:03:01Z) - Unsupervised Domain Clusters in Pretrained Language Models [61.832234606157286]
We show that massive pre-trained language models implicitly learn sentence representations that cluster by domains without supervision.
We propose domain data selection methods based on such models.
We evaluate our data selection methods for neural machine translation across five diverse domains.
arXiv Detail & Related papers (2020-04-05T06:22:16Z)
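The data-selection idea in the entry above (pick training sentences whose representations lie near the target domain in embedding space) can be sketched in a few lines. This is a toy stand-in under stated assumptions: the paper uses pretrained-LM sentence representations, whereas the bag-of-words `embed` below is only a placeholder, and all function names are hypothetical.

```python
import math
from collections import Counter
from typing import List

def embed(sentence: str) -> Counter:
    # Toy stand-in for a pretrained-LM sentence embedding:
    # a sparse bag-of-words vector keyed by lowercase tokens.
    return Counter(sentence.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def centroid(vectors: List[Counter]) -> Counter:
    """Sum of sparse vectors; serves as an (unnormalized) domain centroid."""
    total: Counter = Counter()
    for v in vectors:
        total.update(v)
    return total

def select_in_domain(pool: List[str], seed: List[str], k: int) -> List[str]:
    """Select the k pool sentences closest to the seed-domain centroid."""
    c = centroid([embed(s) for s in seed])
    return sorted(pool, key=lambda s: cosine(embed(s), c), reverse=True)[:k]
```

With a medical-domain seed set, generic sports sentences in the pool score low against the centroid and are filtered out, which is the selection effect the paper exploits for machine translation data.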
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.