Related papers: Who is we? Disambiguating the referents of first person plural pronouns in parliamentary debates

URL: http://arxiv.org/abs/2205.14182v1
Date: Fri, 27 May 2022 18:18:04 GMT
Title: Who is we? Disambiguating the referents of first person plural pronouns in parliamentary debates
Authors: Ines Rehbein, Josef Ruppenhofer and Julian Bernauer
Abstract summary: We present an annotation schema for disambiguating pronoun references and use our schema to create an annotated corpus of debates from the German Bundestag. We then use our corpus to learn to automatically resolve pronoun referents in parliamentary debates.
Score: 9.09904590211839
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: This paper investigates the use of first person plural pronouns as a rhetorical device in political speeches. We present an annotation schema for disambiguating pronoun references and use our schema to create an annotated corpus of debates from the German Bundestag. We then use our corpus to learn to automatically resolve pronoun referents in parliamentary debates. We explore the use of data augmentation with weak supervision to further expand our corpus and report preliminary results.

Related papers

Identifying Speaker Information in Feed-Forward Layers of Self-Supervised Speech Transformers [50.9040167152168]
We analyze neurons associated with k-means clusters of self-supervised features and i-vectors. Our analysis reveals that these clusters correspond to broad phonetic and gender classes. By protecting these neurons during pruning, we can significantly preserve performance on speaker-related tasks.
arXiv Detail & Related papers (2025-06-26T18:54:26Z)
Mention Attention for Pronoun Translation [5.896961355859321]
We introduce an additional mention attention module in the decoder to pay extra attention to source mentions but not non-mention tokens. Our mention attention module not only extracts features from source mentions, but also considers target-side context which benefits pronoun translation. We conduct experiments on the WMT17 English-German translation task, and evaluate our models on general translation and pronoun translation.
arXiv Detail & Related papers (2024-12-19T13:19:19Z)
The Knesset Corpus: An Annotated Corpus of Hebrew Parliamentary Proceedings [3.2405928866433067]
We present the Knesset Corpus, a corpus of Hebrew parliamentary proceedings from 1998 to 2022. We show that the corpus can be used to examine historical developments in the style of political discussions. We also investigate some differences between the styles of men and women speakers.
arXiv Detail & Related papers (2024-05-28T12:23:39Z)
KamerRaad: Enhancing Information Retrieval in Belgian National Politics through Hierarchical Summarization and Conversational Interfaces [55.00702535694059]
KamerRaad is an AI tool that leverages large language models to help citizens interactively engage with Belgian political information. The tool extracts and concisely summarizes key excerpts from parliamentary proceedings, followed by the potential for interaction based on generative AI.
arXiv Detail & Related papers (2024-04-22T15:01:39Z)
Audio-Visual Neural Syntax Acquisition [91.14892278795892]
We study phrase structure induction from visually-grounded speech. We present the Audio-Visual Neural Syntax Learner (AV-NSL) that learns phrase structure by listening to audio and looking at images, without ever being exposed to text.
arXiv Detail & Related papers (2023-10-11T16:54:57Z)
Dialogs Re-enacted Across Languages [2.5425323889482336]
We present a protocol for collecting closely matched pairs of utterances across languages. This report is intended for: people using this corpus, people extending this corpus, and people designing similar collections of bilingual dialog data.
arXiv Detail & Related papers (2022-11-18T17:08:12Z)
BasqueParl: A Bilingual Corpus of Basque Parliamentary Transcriptions [3.4447242282168777]
We release the first version of a newly compiled corpus from Basque parliamentary transcripts. The corpus is characterized by heavy Basque-Spanish code-switching, and represents an interesting resource to study political discourse in contrasting languages such as Basque and Spanish.
arXiv Detail & Related papers (2022-05-03T14:02:24Z)
DEBACER: a method for slicing moderated debates [55.705662163385966]
Partitioning debates into blocks with the same subject is essential for understanding. We propose a new algorithm, DEBACER, which partitions moderated debates.
arXiv Detail & Related papers (2021-12-10T10:39:07Z)
Explaining Latent Representations with a Corpus of Examples [72.50996504722293]
We propose SimplEx: a user-centred method that provides example-based explanations with reference to a freely selected set of examples. SimplEx uses the corpus to improve the user's understanding of the latent space with post-hoc explanations. We show that SimplEx empowers the user by highlighting relevant patterns in the corpus that explain model representations.
arXiv Detail & Related papers (2021-10-28T17:59:06Z)
Exophoric Pronoun Resolution in Dialogues with Topic Regularization [84.23706744602217]
Resolving pronouns to their referents has long been studied as a fundamental natural language understanding problem. Previous works on pronoun coreference resolution (PCR) mostly focus on resolving pronouns to mentions in text while ignoring the exophoric scenario. We propose to jointly leverage the local context and global topics of dialogues to solve the out-of-text PCR problem.
arXiv Detail & Related papers (2021-09-10T11:08:31Z)
Filling the Gap of Utterance-aware and Speaker-aware Representation for Multi-turn Dialogue [76.88174667929665]
A multi-turn dialogue is composed of multiple utterances from two or more different speaker roles. In existing retrieval-based multi-turn dialogue modeling, the pre-trained language models (PrLMs) used as encoders represent the dialogues coarsely. We propose a novel model to fill such a gap by modeling the effective utterance-aware and speaker-aware representations entailed in a dialogue history.
arXiv Detail & Related papers (2020-09-14T15:07:19Z)
The Discussion Tracker Corpus of Collaborative Argumentation [2.800857580710507]
The Discussion Tracker corpus was collected in American high school English classes. The corpus consists of 29 multi-party discussions of English literature transcribed from 985 minutes of audio.
arXiv Detail & Related papers (2020-05-22T18:27:28Z)

This list is automatically generated from the titles and abstracts of the papers on this site.