Language Models Learn Metadata: Political Stance Detection Case Study
- URL: http://arxiv.org/abs/2409.13756v1
- Date: Sun, 15 Sep 2024 14:57:41 GMT
- Title: Language Models Learn Metadata: Political Stance Detection Case Study
- Authors: Stanley Cao, Felix Drinkall
- Abstract summary: This paper investigates the optimal way to incorporate metadata into a political stance detection task.
We show that our simple baseline, using only party membership information, surpasses the current state-of-the-art.
- Score: 1.2277343096128712
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Stance detection is a crucial NLP task with numerous applications in social science, from analyzing online discussions to assessing political campaigns. This paper investigates the optimal way to incorporate metadata into a political stance detection task. We demonstrate that previous methods combining metadata with language-based data for political stance detection have not fully utilized the metadata information; our simple baseline, using only party membership information, surpasses the current state-of-the-art. We then show that prepending metadata (e.g., party and policy) to political speeches performs best, outperforming all baselines, indicating that complex metadata inclusion systems may not learn the task optimally.
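As a concrete illustration of the winning approach, the sketch below prepends party and policy metadata to a speech before classification, and shows the party-only baseline as a simple lookup. The field markers, function names, and example values are assumptions for illustration, not the authors' released code.
```python
# Hedged sketch of metadata prepending for political stance detection.
# The [PARTY]/[POLICY]/[SPEECH] markers are illustrative assumptions.

def prepend_metadata(speech: str, party: str, policy: str) -> str:
    """Serialize metadata into the input text before it reaches a classifier."""
    return f"[PARTY] {party} [POLICY] {policy} [SPEECH] {speech}"

def party_baseline(party: str, party_to_stance: dict[str, str]) -> str:
    """The party-only baseline from the abstract reduces to a lookup."""
    return party_to_stance[party]

example = prepend_metadata(
    "We must invest in renewable energy now.",
    party="Green", policy="Energy Transition Act",
)
print(example)
```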
Related papers
- The Impact of Persona-based Political Perspectives on Hateful Content Detection [4.04666623219944]
Building politically diverse language models requires computational resources that are often inaccessible to many researchers and organizations.
Recent work has established that persona-based prompting can introduce political diversity in model outputs without additional training.
We investigate whether such prompting strategies can achieve results comparable to political pretraining for downstream tasks.
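A hedged sketch of what persona-based prompting might look like for a downstream task such as hateful content detection; the personas, prompt template, and `query_model` hook are illustrative assumptions, not the paper's setup.
```python
# Persona-based prompting: political diversity comes from the prompt,
# not from additional training. All strings below are assumptions.

PERSONAS = [
    "a progressive community organizer",
    "a fiscally conservative small-business owner",
    "a centrist policy analyst",
]

def build_prompt(persona: str, text: str) -> str:
    return (
        f"You are {persona}. "
        f"Is the following content hateful? Answer yes or no.\n\n{text}"
    )

def detect_with_personas(text: str, query_model) -> list[str]:
    """Collect one judgment per persona from any LLM callable."""
    return [query_model(build_prompt(p, text)) for p in PERSONAS]
```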
arXiv Detail & Related papers (2025-02-01T09:53:17Z)
- AgoraSpeech: A multi-annotated comprehensive dataset of political discourse through the lens of humans and AI [1.3060410279656598]
AgoraSpeech is a meticulously curated, high-quality dataset of 171 political speeches from six parties during the Greek national elections in 2023.
The dataset includes annotations (per paragraph) for six natural language processing (NLP) tasks: text classification, topic identification, sentiment analysis, named entity recognition, polarization and populism detection.
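To make the annotation scheme concrete, here is one way a single annotated paragraph could be represented in Python; the field names are assumptions for illustration, not the dataset's actual schema.
```python
# Hypothetical record structure covering the six annotation tasks above.
from dataclasses import dataclass, field

@dataclass
class AnnotatedParagraph:
    text: str
    party: str
    classification: str          # text classification label
    topic: str                   # topic identification
    sentiment: str               # e.g. positive / neutral / negative
    entities: list[tuple[str, str]] = field(default_factory=list)  # NER spans
    polarization: bool = False
    populism: bool = False

p = AnnotatedParagraph(
    text="Our opponents have betrayed the people of this country.",
    party="Example Party", classification="attack", topic="governance",
    sentiment="negative", entities=[("this country", "LOC")],
    polarization=True, populism=True,
)
```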
arXiv Detail & Related papers (2025-01-09T18:17:59Z)
- LLMs for Generalizable Language-Conditioned Policy Learning under Minimal Data Requirements [50.544186914115045]
This paper presents TEDUO, a novel training pipeline for offline language-conditioned policy learning.
TEDUO operates on easy-to-obtain, unlabeled datasets and is suited for the so-called in-the-wild evaluation, wherein the agent encounters previously unseen goals and states.
arXiv Detail & Related papers (2024-12-09T18:43:56Z)
- Political-LLM: Large Language Models in Political Science [159.95299889946637]
Large language models (LLMs) have been widely adopted in political science tasks.
Political-LLM aims to advance the comprehensive understanding of integrating LLMs into computational political science.
arXiv Detail & Related papers (2024-12-09T08:47:50Z)
- Automated Detection and Analysis of Data Practices Using A Real-World Corpus [20.4572759138767]
We propose an automated approach to identify and visualize data practices within privacy policies at different levels of detail.
Our approach accurately matches data practice descriptions with policy excerpts, facilitating the presentation of simplified privacy information to users.
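A minimal bag-of-words sketch of the matching step, pairing a data practice description with its most similar policy excerpt by cosine similarity; the real system is far richer, and the toy texts are assumptions.
```python
# Toy cosine-similarity matcher between practice descriptions and excerpts.
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def best_excerpt(practice: str, excerpts: list[str]) -> str:
    p = Counter(practice.lower().split())
    return max(excerpts, key=lambda e: cosine(p, Counter(e.lower().split())))

excerpts = [
    "we share your email address with advertising partners",
    "cookies are used to keep you logged in",
]
print(best_excerpt("we share your contact data with third parties", excerpts))
```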
arXiv Detail & Related papers (2024-02-16T18:51:40Z)
- Language-Conditioned Semantic Search-Based Policy for Robotic Manipulation Tasks [2.1332830068386217]
We propose a language-conditioned semantic search-based method to produce an online search-based policy.
Our approach surpasses the performance of the baselines on the CALVIN benchmark and exhibits strong zero-shot adaptation capabilities.
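A toy sketch of the search-based idea: embed the instruction, retrieve the most similar stored demonstration, and reuse its actions. The hash-based `embed` and the demonstration bank are stand-in assumptions for a pretrained sentence encoder and real trajectories.
```python
# Nearest-neighbor retrieval of a demonstration by instruction embedding.
import hashlib
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Stable toy embedding; a real system would use a sentence encoder."""
    seed = int(hashlib.md5(text.encode()).hexdigest(), 16) % (2**32)
    v = np.random.default_rng(seed).standard_normal(dim)
    return v / np.linalg.norm(v)

demos = {  # instruction -> recorded action sequence (illustrative)
    "open the drawer": ["reach(drawer)", "grasp(handle)", "pull()"],
    "push the red block": ["reach(red_block)", "push()"],
}
keys = list(demos)
bank = np.stack([embed(k) for k in keys])

def policy(instruction: str) -> list[str]:
    scores = bank @ embed(instruction)  # cosine similarity (unit vectors)
    return demos[keys[int(np.argmax(scores))]]
```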
arXiv Detail & Related papers (2023-12-10T16:17:00Z)
- When is Off-Policy Evaluation (Reward Modeling) Useful in Contextual Bandits? A Data-Centric Perspective [64.73162159837956]
Evaluating the value of a hypothetical target policy with only a logged dataset is important but challenging.
We propose DataCOPE, a data-centric framework for evaluating a target policy given a dataset.
Our empirical analysis of DataCOPE in the logged contextual bandit settings using healthcare datasets confirms its ability to evaluate both machine-learning and human expert policies.
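DataCOPE's internals are not given in this summary, so as background the sketch below shows the classic inverse propensity scoring (IPS) estimator that contextual-bandit off-policy evaluation builds on: reweight logged rewards by how likely the target policy was to take each logged action.
```python
# Standard IPS estimator for contextual bandits (background, not DataCOPE).
import numpy as np

def ips_estimate(contexts, actions, rewards, logging_probs, target_policy):
    """target_policy(context, action) -> action probability under the target.

    Each logged reward is reweighted by the ratio of target to logging
    propensities, giving an unbiased estimate of the target policy's value.
    """
    weights = np.array([
        target_policy(x, a) / p
        for x, a, p in zip(contexts, actions, logging_probs)
    ])
    return float(np.mean(weights * np.asarray(rewards)))
```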
arXiv Detail & Related papers (2023-11-23T17:13:37Z)
- Inducing Political Bias Allows Language Models Anticipate Partisan Reactions to Controversies [5.958974943807783]
This study addresses the challenge of understanding political bias in digitized discourse using Large Language Models (LLMs).
We present a comprehensive analytical framework, consisting of Partisan Bias Divergence Assessment and Partisan Class Tendency Prediction.
Our findings reveal the model's effectiveness in capturing emotional and moral nuances, albeit with some challenges in stance detection.
arXiv Detail & Related papers (2023-11-16T08:57:53Z)
- Making Metadata More FAIR Using Large Language Models [2.61630828688114]
This work presents a Natural Language Processing (NLP) informed application, called FAIRMetaText, that compares metadata.
Specifically, FAIRMetaText analyzes the natural language descriptions of metadata and provides a mathematical similarity measure between two terms.
This software can drastically reduce the human effort of sifting through natural language metadata, as demonstrated on several experimental datasets on the same topic.
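As an illustrative stand-in for a similarity measure over natural language metadata descriptions (not FAIRMetaText's actual measure), token-level Jaccard similarity already gives a mathematical score between two free-text terms.
```python
# Toy term-similarity score over metadata descriptions (assumed examples).
def jaccard(a: str, b: str) -> float:
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

print(jaccard("age of the patient in years", "patient age in years"))
```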
arXiv Detail & Related papers (2023-07-24T19:14:38Z)
- Machine Learning and Statistical Approaches to Measuring Similarity of Political Parties [0.4893345190925177]
Mapping political party systems to metric policy spaces is one of the major methodological problems in political science.
We consider how advances in natural language processing, including large transformer-based language models, can be applied to solve that issue.
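One plausible sketch of that mapping: encode each party's manifesto with a text encoder and take pairwise cosine distances as the metric policy space. The `encode` placeholder and manifesto snippets are assumptions standing in for a transformer sentence encoder and real documents.
```python
# Party-similarity matrix from (placeholder) manifesto embeddings.
import numpy as np

def encode(text: str, dim: int = 32) -> np.ndarray:
    """Placeholder; a real pipeline would use a transformer encoder."""
    v = np.random.default_rng(sum(map(ord, text))).standard_normal(dim)
    return v / np.linalg.norm(v)

manifestos = {
    "Party A": "tax cuts and free markets",
    "Party B": "green investment and welfare",
    "Party C": "national sovereignty first",
}
names = list(manifestos)
E = np.stack([encode(t) for t in manifestos.values()])
distance = 1.0 - E @ E.T  # pairwise cosine distances over parties
print(names)
print(distance.round(2))
```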
arXiv Detail & Related papers (2023-06-05T17:53:41Z)
- Panning for gold: Lessons learned from the platform-agnostic automated detection of political content in textual data [48.7576911714538]
We discuss how these techniques can be used to detect political content across different platforms.
We compare the performance of three groups of detection techniques relying on dictionaries, supervised machine learning, or neural networks.
Our results show the limited impact of preprocessing on model performance, with the best results for less noisy data being achieved by neural network- and machine-learning-based models.
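The simplest of the three technique families can be sketched directly: a dictionary detector is keyword membership, while the supervised families replace the fixed term set with a learned decision boundary. The terms below are illustrative assumptions, not the study's lexicon.
```python
# Dictionary-based political content detection (toy lexicon).
POLITICAL_TERMS = {"election", "parliament", "minister", "ballot", "policy"}

def dictionary_detector(text: str) -> bool:
    """Flag text containing any lexicon term; no training required."""
    return any(tok in POLITICAL_TERMS for tok in text.lower().split())

# Supervised alternatives would instead fit a classifier (e.g. logistic
# regression or a neural network) over text features.
print(dictionary_detector("The minister faced parliament today"))  # True
```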
arXiv Detail & Related papers (2022-07-01T15:23:23Z)
- Exploring Continuous Integrate-and-Fire for Adaptive Simultaneous Speech Translation [75.86581380817464]
A SimulST system generally includes two components: the pre-decision that aggregates the speech information and the policy that decides to read or write.
This paper proposes to model the adaptive policy by adapting the Continuous Integrate-and-Fire (CIF) mechanism.
Compared with monotonic multihead attention (MMA), our method has the advantage of simpler computation, superior quality at low latency, and better generalization to long utterances.
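A compact sketch of the Continuous Integrate-and-Fire mechanism being adapted: each encoder frame carries a weight, weights accumulate, and a token-level representation fires whenever the accumulator crosses 1.0. The frames and weights here are toy inputs rather than model outputs.
```python
# Minimal CIF: integrate weighted frames, fire at the threshold.
import numpy as np

def cif(frames: np.ndarray, alphas: np.ndarray, threshold: float = 1.0):
    """frames: (T, d) encoder states; alphas: (T,) nonnegative weights."""
    fired, acc, integrated = [], 0.0, np.zeros(frames.shape[1])
    for h, a in zip(frames, alphas):
        if acc + a < threshold:
            acc += a
            integrated += a * h
        else:
            spill = threshold - acc            # portion closing this token
            fired.append(integrated + spill * h)
            acc = a - spill                    # remainder opens the next token
            integrated = acc * h
    return np.stack(fired) if fired else np.empty((0, frames.shape[1]))

out = cif(np.random.randn(6, 4), np.array([0.4, 0.5, 0.3, 0.6, 0.2, 0.9]))
print(out.shape)  # (fired tokens, feature dim)
```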
arXiv Detail & Related papers (2022-03-22T23:33:18Z)
- Supervised Off-Policy Ranking [145.3039527243585]
Off-policy evaluation (OPE) leverages data generated by other policies to evaluate a target policy.
We propose supervised off-policy ranking that learns a policy scoring model by correctly ranking training policies with known performance.
Our method outperforms strong baseline OPE methods in terms of both rank correlation and performance gap between the truly best and the best of the ranked top three policies.
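The pairwise intuition behind learning a policy scoring model can be sketched as a ranking loss over policies with known performance; the feature vectors and scorer below are illustrative assumptions, not the paper's policy representation.
```python
# Pairwise logistic ranking loss over policies with known performance.
import numpy as np

def pairwise_ranking_loss(scores: np.ndarray, perf: np.ndarray) -> float:
    """Penalize every pair where the better policy is scored lower."""
    loss, n = 0.0, 0
    for i in range(len(perf)):
        for j in range(len(perf)):
            if perf[i] > perf[j]:
                loss += np.log1p(np.exp(scores[j] - scores[i]))
                n += 1
    return loss / max(n, 1)

scores = np.array([0.2, 1.5, 0.7])  # model outputs for three policies
perf = np.array([0.1, 0.9, 0.5])    # known performance of those policies
print(pairwise_ranking_loss(scores, perf))
```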
arXiv Detail & Related papers (2021-07-03T07:01:23Z)
- Meta-Learning with Fewer Tasks through Task Interpolation [67.03769747726666]
Current meta-learning algorithms require a large number of meta-training tasks, which may not be accessible in real-world scenarios.
Our proposed meta-learning with task interpolation (MLTI) effectively generates additional tasks by randomly sampling a pair of tasks and interpolating the corresponding features and labels.
Empirically, in our experiments on eight datasets from diverse domains, we find that the proposed general MLTI framework is compatible with representative meta-learning algorithms and consistently outperforms other state-of-the-art strategies.
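A sketch of the core interpolation step as described: sample two tasks and mixup-interpolate their features and one-hot labels into an additional task. The shapes and the Beta-sampled mixing weight are assumptions for illustration.
```python
# Mixup-style interpolation between two sampled tasks (MLTI-like step).
import numpy as np

def interpolate_tasks(x1, y1, x2, y2, alpha: float = 0.5):
    """x*: (N, d) features; y*: (N, C) one-hot labels for two tasks."""
    lam = np.random.beta(alpha, alpha)  # mixing weight in [0, 1]
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2

rng = np.random.default_rng(0)
x1, x2 = rng.standard_normal((8, 16)), rng.standard_normal((8, 16))
y1 = np.eye(4)[rng.integers(0, 4, 8)]
y2 = np.eye(4)[rng.integers(0, 4, 8)]
x_new, y_new = interpolate_tasks(x1, y1, x2, y2)  # one synthetic task
```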
arXiv Detail & Related papers (2021-06-04T20:15:34Z)
- Policy Evaluation Networks [50.53250641051648]
We introduce a scalable, differentiable fingerprinting mechanism that retains essential policy information in a concise embedding.
Our empirical results demonstrate that combining these three elements can produce policies that outperform those that generated the training data.
arXiv Detail & Related papers (2020-02-26T23:00:27Z)