Language Models Learn Metadata: Political Stance Detection Case Study
- URL: http://arxiv.org/abs/2409.13756v1
- Date: Sun, 15 Sep 2024 14:57:41 GMT
- Title: Language Models Learn Metadata: Political Stance Detection Case Study
- Authors: Stanley Cao, Felix Drinkall,
- Abstract summary: This paper investigates the optimal way to incorporate metadata into a political stance detection task.
We show that our simple baseline, using only party membership information, surpasses the current state-of-the-art.
- Score: 1.2277343096128712
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Stance detection is a crucial NLP task with numerous applications in social science, from analyzing online discussions to assessing political campaigns. This paper investigates the optimal way to incorporate metadata into a political stance detection task. We demonstrate that previous methods combining metadata with language-based data for political stance detection have not fully utilized the metadata information; our simple baseline, using only party membership information, surpasses the current state-of-the-art. We then show that prepending metadata (e.g., party and policy) to political speeches performs best, outperforming all baselines, indicating that complex metadata inclusion systems may not learn the task optimally.
Related papers
- The Impact of Persona-based Political Perspectives on Hateful Content Detection [4.04666623219944]
Politically diverse language models require computational resources often inaccessible to many researchers and organizations.
Recent work has established that persona-based prompting can introduce political diversity in model outputs without additional training.
We investigate whether such prompting strategies can achieve results comparable to political pretraining for downstream tasks.
arXiv Detail & Related papers (2025-02-01T09:53:17Z) - AgoraSpeech: A multi-annotated comprehensive dataset of political discourse through the lens of humans and AI [1.3060410279656598]
AgoraSpeech is a meticulously curated, high-quality dataset of 171 political speeches from six parties during the Greek national elections in 2023.
The dataset includes annotations (per paragraph) for six natural language processing (NLP) tasks: text classification, topic identification, sentiment analysis, named entity recognition, polarization and populism detection.
arXiv Detail & Related papers (2025-01-09T18:17:59Z) - LLMs for Generalizable Language-Conditioned Policy Learning under Minimal Data Requirements [50.544186914115045]
This paper presents TEDUO, a novel training pipeline for offline language-conditioned policy learning.
TEDUO operates on easy-to-obtain, unlabeled datasets and is suited for the so-called in-the-wild evaluation, wherein the agent encounters previously unseen goals and states.
arXiv Detail & Related papers (2024-12-09T18:43:56Z) - Political-LLM: Large Language Models in Political Science [159.95299889946637]
Large language models (LLMs) have been widely adopted in political science tasks.
Political-LLM aims to advance the comprehensive understanding of integrating LLMs into computational political science.
arXiv Detail & Related papers (2024-12-09T08:47:50Z) - When is Off-Policy Evaluation (Reward Modeling) Useful in Contextual Bandits? A Data-Centric Perspective [64.73162159837956]
evaluating the value of a hypothetical target policy with only a logged dataset is important but challenging.
We propose DataCOPE, a data-centric framework for evaluating a target policy given a dataset.
Our empirical analysis of DataCOPE in the logged contextual bandit settings using healthcare datasets confirms its ability to evaluate both machine-learning and human expert policies.
arXiv Detail & Related papers (2023-11-23T17:13:37Z) - Inducing Political Bias Allows Language Models Anticipate Partisan
Reactions to Controversies [5.958974943807783]
This study addresses the challenge of understanding political bias in digitized discourse using Large Language Models (LLMs)
We present a comprehensive analytical framework, consisting of Partisan Bias Divergence Assessment and Partisan Class Tendency Prediction.
Our findings reveal the model's effectiveness in capturing emotional and moral nuances, albeit with some challenges in stance detection.
arXiv Detail & Related papers (2023-11-16T08:57:53Z) - Machine Learning and Statistical Approaches to Measuring Similarity of
Political Parties [0.4893345190925177]
Mapping political party systems to metric policy spaces is one of the major methodological problems in political science.
We consider how advances in natural language processing, including large transformer-based language models, can be applied to solve that issue.
arXiv Detail & Related papers (2023-06-05T17:53:41Z) - Panning for gold: Lessons learned from the platform-agnostic automated
detection of political content in textual data [48.7576911714538]
We discuss how these techniques can be used to detect political content across different platforms.
We compare the performance of three groups of detection techniques relying on dictionaries, supervised machine learning, or neural networks.
Our results show the limited impact of preprocessing on model performance, with the best results for less noisy data being achieved by neural network- and machine-learning-based models.
arXiv Detail & Related papers (2022-07-01T15:23:23Z) - Supervised Off-Policy Ranking [145.3039527243585]
Off-policy evaluation (OPE) leverages data generated by other policies to evaluate a target policy.
We propose supervised off-policy ranking that learns a policy scoring model by correctly ranking training policies with known performance.
Our method outperforms strong baseline OPE methods in terms of both rank correlation and performance gap between the truly best and the best of the ranked top three policies.
arXiv Detail & Related papers (2021-07-03T07:01:23Z) - Policy Evaluation Networks [50.53250641051648]
We introduce a scalable, differentiable fingerprinting mechanism that retains essential policy information in a concise embedding.
Our empirical results demonstrate that combining these three elements can produce policies that outperform those that generated the training data.
arXiv Detail & Related papers (2020-02-26T23:00:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.