Related papers: Language Independent Stance Detection: Social Interaction-based Embeddings and Large Language Models

URL: http://arxiv.org/abs/2210.05715v2
Date: Thu, 27 Feb 2025 09:17:32 GMT
Title: Language Independent Stance Detection: Social Interaction-based Embeddings and Large Language Models
Authors: Joseba Fernandez de Landa, Rodrigo Agerri
Abstract summary: This paper aims to take on the stance detection task by placing the emphasis not so much on the text itself but on the interaction data available on social networks. We propose a new method to leverage social information such as friends and retweets by generating Relational Embeddings. Our experiments on seven publicly available datasets and four different languages show that combining our relational embeddings with discriminative textual methods helps to substantially improve performance.
Score: 4.899818550820576
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: The large majority of the research performed on stance detection has been focused on developing more or less sophisticated text classification systems, even when many benchmarks are based on social network data such as Twitter. This paper aims to take on the stance detection task by placing the emphasis not so much on the text itself but on the interaction data available on social networks. More specifically, we propose a new method to leverage social information such as friends and retweets by generating Relational Embeddings, namely, dense vector representations of interaction pairs. Our experiments on seven publicly available datasets and four different languages (Basque, Catalan, Italian, and Spanish) show that combining our relational embeddings with discriminative textual methods helps to substantially improve performance, obtaining state-of-the-art results for six out of seven evaluation settings, outperforming strong baselines based on Large Language Models, or other popular interaction-based approaches such as DeepWalk or node2vec.
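The idea in the abstract, representing interaction pairs as dense vectors and combining them with textual features, can be illustrated with a toy sketch. This is not the authors' method: the paper learns its Relational Embeddings, whereas the sketch below substitutes fixed pseudo-random vectors per account and simple averaging. All function names, the example pairs, and the `dim` parameter are hypothetical and exist only for illustration.

```python
import random
from collections import defaultdict


def account_vector(account, dim=8):
    """Return a deterministic pseudo-random dense vector for an account.

    Assumption: this stands in for a learned embedding; it is NOT trained.
    """
    rng = random.Random(account)  # seeding with a string is deterministic
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def user_embeddings(pairs, dim=8):
    """Average the vectors of the accounts each user interacted with.

    `pairs` is an iterable of (user, target_account) interaction pairs,
    e.g. "user retweeted target" or "user follows target".
    """
    targets = defaultdict(list)
    for user, target in pairs:
        targets[user].append(target)

    embeddings = {}
    for user, accounts in targets.items():
        vectors = [account_vector(a, dim) for a in accounts]
        # Column-wise mean over all interaction vectors of this user.
        embeddings[user] = [sum(col) / len(vectors) for col in zip(*vectors)]
    return embeddings


def combine(text_features, relational, dim=8):
    """Concatenate text features with a user's relational vector.

    Users without any recorded interaction get a zero vector.
    """
    if relational is None:
        relational = [0.0] * dim
    return list(text_features) + list(relational)


# Hypothetical interaction pairs: u1 and u2 share the same targets.
pairs = [("u1", "a"), ("u1", "b"), ("u2", "a"), ("u2", "b"), ("u3", "c")]
emb = user_embeddings(pairs, dim=8)

# Hypothetical two-dimensional text feature vector for a post by u1.
features = combine([0.2, 0.7], emb.get("u1"), dim=8)
```

In this sketch, users who interact with the same accounts end up with identical vectors, which is the intuition behind using interaction data as a language-independent signal. The concatenated `features` list could then be passed to any classifier.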

Related papers

A Multi-Task Benchmark for Abusive Language Detection in Low-Resource Settings [8.361945776819528]
This work presents a large-scale human-annotated benchmark dataset for abusive language detection in Tigrinya social media. The dataset comprises 13,717 YouTube comments annotated by nine native speakers, collected from 7,373 videos with a total of over 1.2 billion views across 51 channels. Our experiments reveal that small, specialized multi-task models outperform the current frontier models in the low-resource setting.
arXiv Detail & Related papers (2025-05-17T18:52:47Z)
Evaluating and explaining training strategies for zero-shot cross-lingual news sentiment analysis [8.770572911942635]
We introduce novel evaluation datasets in several less-resourced languages. We experiment with a range of approaches including the use of machine translation. We show that language similarity is not in itself sufficient for predicting the success of cross-lingual transfer.
arXiv Detail & Related papers (2024-09-30T07:59:41Z)
MultiSocial: Multilingual Benchmark of Machine-Generated Text Detection of Social-Media Texts [0.6053347262128919]
The MultiSocial dataset contains 472,097 texts, of which about 58k are human-written. We use this benchmark to compare existing detection methods in zero-shot as well as fine-tuned form. Our results indicate that fine-tuned detectors can be trained on social-media texts without difficulty.
arXiv Detail & Related papers (2024-06-18T12:26:09Z)
A Novel Cartography-Based Curriculum Learning Method Applied on RoNLI: The First Romanian Natural Language Inference Corpus [71.77214818319054]
Natural language inference is a proxy for natural language understanding. There is no publicly available NLI corpus for the Romanian language. We introduce the first Romanian NLI corpus (RoNLI) comprising 58K training sentence pairs.
arXiv Detail & Related papers (2024-05-20T08:41:15Z)
Text Intimacy Analysis using Ensembles of Multilingual Transformers [0.0]
We present our work on the SemEval shared task 9 on predicting the level of intimacy for the given text. The dataset consists of tweets in ten languages, out of which only six are available in the training dataset. We show that an ensemble of multilingual models along with a language-specific monolingual model has the best performance.
arXiv Detail & Related papers (2023-12-05T09:04:22Z)
Beyond Contrastive Learning: A Variational Generative Model for Multilingual Retrieval [109.62363167257664]
We propose a generative model for learning multilingual text embeddings. Our model operates on parallel data in $N$ languages. We evaluate this method on a suite of tasks including semantic similarity, bitext mining, and cross-lingual question retrieval.
arXiv Detail & Related papers (2022-12-21T02:41:40Z)
Co-Located Human-Human Interaction Analysis using Nonverbal Cues: A Survey [71.43956423427397]
We aim to identify the nonverbal cues and computational methodologies resulting in effective performance. This survey differs from its counterparts by involving the widest spectrum of social phenomena and interaction settings. Some major observations are: the most often used nonverbal cue, computational method, interaction environment, and sensing approach are speaking activity, support vector machines, and meetings composed of 3-4 persons equipped with microphones and cameras, respectively.
arXiv Detail & Related papers (2022-07-20T13:37:57Z)
A New Generation of Perspective API: Efficient Multilingual Character-level Transformers [66.9176610388952]
We present the fundamentals behind the next version of the Perspective API from Google Jigsaw. At the heart of the approach is a single multilingual token-free Charformer model. We demonstrate that by forgoing static vocabularies, we gain flexibility across a variety of settings.
arXiv Detail & Related papers (2022-02-22T20:55:31Z)
IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and Languages [87.5457337866383]
We introduce the Image-Grounded Language Understanding Evaluation benchmark. IGLUE brings together visual question answering, cross-modal retrieval, grounded reasoning, and grounded entailment tasks across 20 diverse languages. We find that translate-test transfer is superior to zero-shot transfer and that few-shot learning is hard to harness for many tasks.
arXiv Detail & Related papers (2022-01-27T18:53:22Z)
Few-Shot Cross-Lingual Stance Detection with Sentiment-Based Pre-Training [32.800766653254634]
We present the most comprehensive study of cross-lingual stance detection to date. We use 15 diverse datasets in 12 languages from 6 language families. For our experiments, we build on pattern-exploiting training, proposing the addition of a novel label encoder.
arXiv Detail & Related papers (2021-09-13T15:20:06Z)
Sentiment analysis in tweets: an assessment study from classical to modern text representation models [59.107260266206445]
Short texts published on Twitter have earned significant attention as a rich source of information. Their inherent characteristics, such as an informal and noisy linguistic style, remain challenging for many natural language processing (NLP) tasks. This study assesses existing language models in distinguishing the sentiment expressed in tweets by using a rich collection of 22 datasets.
arXiv Detail & Related papers (2021-05-29T21:05:28Z)
Semi-automatic Generation of Multilingual Datasets for Stance Detection in Twitter [9.359018642178917]
This paper presents a method to obtain multilingual datasets for stance detection in Twitter. We leverage user-based information to semi-automatically label large amounts of tweets.
arXiv Detail & Related papers (2021-01-28T13:05:09Z)
Be More with Less: Hypergraph Attention Networks for Inductive Text Classification [56.98218530073927]
Graph neural networks (GNNs) have received increasing attention in the research community and demonstrated their promising results on this canonical task. Despite the success, their performance could be largely jeopardized in practice since they are unable to capture high-order interaction between words. We propose a principled model -- hypergraph attention networks (HyperGAT) which can obtain more expressive power with less computational consumption for text representation learning.
arXiv Detail & Related papers (2020-11-01T00:21:59Z)
Language and Visual Entity Relationship Graph for Agent Navigation [54.059606864535304]
Vision-and-Language Navigation (VLN) requires an agent to navigate in a real-world environment following natural language instructions. We propose a novel Language and Visual Entity Relationship Graph for modelling the inter-modal relationships between text and vision. Experiments show that by taking advantage of the relationships we are able to improve over state-of-the-art.
arXiv Detail & Related papers (2020-10-19T08:25:55Z)
TopicBERT: A Transformer transfer learning based memory-graph approach for multimodal streaming social media topic detection [8.338441212378587]
Social networks, with their bursty short messages and large-scale data spread across a vast variety of topics, are of research interest to many researchers. These properties of social networks, known as the 5 Vs of big data, have led to many unique and enlightening algorithms and techniques applied to large social networking datasets and data streams.
arXiv Detail & Related papers (2020-08-16T10:39:50Z)
Human Trajectory Forecasting in Crowds: A Deep Learning Perspective [89.4600982169]
We present an in-depth analysis of existing deep learning-based methods for modelling social interactions. We propose two knowledge-based data-driven methods to effectively capture these social interactions. We develop a large scale interaction-centric benchmark TrajNet++, a significant yet missing component in the field of human trajectory forecasting.
arXiv Detail & Related papers (2020-07-07T17:19:56Z)

This list is automatically generated from the titles and abstracts of the papers on this site.