Analyzing German Parliamentary Speeches: A Machine Learning Approach for Topic and Sentiment Classification
- URL: http://arxiv.org/abs/2508.03181v1
- Date: Tue, 05 Aug 2025 07:44:42 GMT
- Title: Analyzing German Parliamentary Speeches: A Machine Learning Approach for Topic and Sentiment Classification
- Authors: Lukas Pätz, Moritz Beyer, Jannik Späth, Lasse Bohlen, Patrick Zschech, Mathias Kraus, Julian Rosenberger,
- Abstract summary: This study investigates political discourse in the German parliament, the Bundestag, by analyzing approximately 28,000 parliamentary speeches.<n>Two machine learning models for topic and sentiment classification were developed and trained on a manually labeled dataset.<n>The models showed strong classification performance, achieving an area under the receiver operating characteristic curve (AUROC) of 0.94 for topic classification.
- Score: 3.3595341706248876
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This study investigates political discourse in the German parliament, the Bundestag, by analyzing approximately 28,000 parliamentary speeches from the last five years. Two machine learning models for topic and sentiment classification were developed and trained on a manually labeled dataset. The models showed strong classification performance, achieving an area under the receiver operating characteristic curve (AUROC) of 0.94 for topic classification (average across topics) and 0.89 for sentiment classification. Both models were applied to assess topic trends and sentiment distributions across political parties and over time. The analysis reveals remarkable relationships between parties and their role in parliament. In particular, a change in style can be observed for parties moving from government to opposition. While ideological positions matter, governing responsibilities also shape discourse. The analysis directly addresses key questions about the evolution of topics, sentiment dynamics, and party-specific discourse strategies in the Bundestag.
Related papers
- SpeechRole: A Large-Scale Dataset and Benchmark for Evaluating Speech Role-Playing Agents [52.29009595100625]
Role-playing agents have emerged as a promising paradigm for achieving personalized interaction and emotional resonance.<n>Existing research primarily focuses on the textual modality, neglecting the critical dimension of speech in realistic interactive scenarios.<n>We construct SpeechRole-Data, a large-scale, high-quality dataset that comprises 98 diverse roles and 112k speech-based single-turn and multi-turn conversations.
arXiv Detail & Related papers (2025-08-04T03:18:36Z) - Talking Point based Ideological Discourse Analysis in News Events [62.18747509565779]
We propose a framework motivated by the theory of ideological discourse analysis to analyze news articles related to real-world events.<n>Our framework represents the news articles using a relational structure - talking points, which captures the interaction between entities, their roles, and media frames along with a topic of discussion.<n>We evaluate our framework's ability to generate these perspectives through automated tasks - ideology and partisan classification tasks, supplemented by human validation.
arXiv Detail & Related papers (2025-04-10T02:52:34Z) - Beyond Partisan Leaning: A Comparative Analysis of Political Bias in Large Language Models [6.549047699071195]
This study adopts a persona-free, topic-specific approach to evaluate political behavior in large language models.<n>We analyze responses from 43 large language models developed in the U.S., Europe, China, and the Middle East.<n>Findings show most models lean center-left or left ideologically and vary in their nonpartisan engagement patterns.
arXiv Detail & Related papers (2024-12-21T19:42:40Z) - SpeakGer: A meta-data enriched speech corpus of German state and federal parliaments [0.12277343096128711]
We provide the SpeakGer data set, consisting of German parliament debates from all 16 federal states of Germany as well as the German Bundestag from 1947-2023.
This data set includes rich meta data in form of information on both reactions from the audience towards the speech as well as information about the speaker's party, their age, their constituency and their party's political alignment.
arXiv Detail & Related papers (2024-10-23T14:00:48Z) - A Structural Text-Based Scaling Model for Analyzing Political Discourse [0.0]
We introduce the Structural Text-Based Scaling model to infer ideological positions of speakers for latent topics from text data.
Applying STBS to U.S. Senate speeches, we identify immigration and gun violence as the most polarizing topics between the two major parties in Congress.
arXiv Detail & Related papers (2024-10-14T14:05:35Z) - Speaker attribution in German parliamentary debates with QLoRA-adapted
large language models [0.0]
We study the potential of the large language model family Llama 2 to automate speaker attribution in German parliamentary debates from 2017-2021.
Our results shed light on the capabilities of large language models in automating speaker attribution, revealing a promising avenue for computational analysis of political discourse and the development of semantic role labeling systems.
arXiv Detail & Related papers (2023-09-18T16:06:16Z) - The ParlaSent Multilingual Training Dataset for Sentiment Identification in Parliamentary Proceedings [0.0]
The paper presents a new training dataset of sentences in 7 languages, manually annotated for sentiment.
The paper additionally introduces the first domain-specific multilingual transformer language model for political science applications.
arXiv Detail & Related papers (2023-09-18T14:01:06Z) - The Face of Populism: Examining Differences in Facial Emotional Expressions of Political Leaders Using Machine Learning [50.24983453990065]
We use a deep-learning approach to process a sample of 220 YouTube videos of political leaders from 15 different countries.<n>We observe statistically significant differences in the average score of negative emotions between groups of leaders with varying degrees of populist rhetoric.
arXiv Detail & Related papers (2023-04-19T18:32:49Z) - Computational Assessment of Hyperpartisanship in News Titles [55.92100606666497]
We first adopt a human-guided machine learning framework to develop a new dataset for hyperpartisan news title detection.
Overall the Right media tends to use proportionally more hyperpartisan titles.
We identify three major topics including foreign issues, political systems, and societal issues that are suggestive of hyperpartisanship in news titles.
arXiv Detail & Related papers (2023-01-16T05:56:58Z) - A Spanish dataset for Targeted Sentiment Analysis of political headlines [0.0]
This work addresses the task of Targeted Sentiment Analysis for the domain of news headlines, published by the main outlets during the 2019 Argentinean Presidential Elections.
We present a polarity dataset of 1,976 headlines mentioning candidates in the 2019 elections at the target level.
Preliminary experiments with state-of-the-art classification algorithms based on pre-trained linguistic models suggest that target information is helpful for this task.
arXiv Detail & Related papers (2022-08-30T01:30:30Z) - Sentiment analysis in tweets: an assessment study from classical to
modern text representation models [59.107260266206445]
Short texts published on Twitter have earned significant attention as a rich source of information.
Their inherent characteristics, such as the informal, and noisy linguistic style, remain challenging to many natural language processing (NLP) tasks.
This study fulfils an assessment of existing language models in distinguishing the sentiment expressed in tweets by using a rich collection of 22 datasets.
arXiv Detail & Related papers (2021-05-29T21:05:28Z) - A computational model implementing subjectivity with the 'Room Theory'.
The case of detecting Emotion from Text [68.8204255655161]
This work introduces a new method to consider subjectivity and general context dependency in text analysis.
By using similarity measure between words, we are able to extract the relative relevance of the elements in the benchmark.
This method could be applied to all the cases where evaluating subjectivity is relevant to understand the relative value or meaning of a text.
arXiv Detail & Related papers (2020-05-12T21:26:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.