Optimizing text representations to capture (dis)similarity between political parties
- URL: http://arxiv.org/abs/2210.11989v1
- Date: Fri, 21 Oct 2022 14:24:57 GMT
- Title: Optimizing text representations to capture (dis)similarity between political parties
- Authors: Tanise Ceron, Nico Blokker, Sebastian Padó
- Abstract summary: We look at the problem of modeling pairwise similarities between political parties.
Our research question is what level of structural information is necessary to create robust text representations.
We evaluate our models on the manifestos of German parties for the 2021 federal election.
- Score: 1.2891210250935146
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Even though fine-tuned neural language models have been pivotal in enabling
"deep" automatic text analysis, optimizing text representations for specific
applications remains a crucial bottleneck. In this study, we look at this
problem in the context of a task from computational social science, namely
modeling pairwise similarities between political parties. Our research question
is what level of structural information is necessary to create robust text
representations, contrasting a strongly informed approach (which uses both claim
span and claim category annotations) with approaches that forgo one or both
types of annotation, relying instead on document structure-based heuristics.
Evaluating our models on the manifestos of German parties for the 2021 federal
election, we
find that heuristics that maximize within-party over between-party similarity
along with a normalization step lead to reliable party similarity prediction,
without the need for manual annotation.
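To make the approach above concrete, here is a minimal sketch in Python of the kind of pipeline the abstract describes. It is not the authors' implementation: random vectors stand in for encoder output, and the cosine metric, mean-pooled segment similarities, and per-row z-score normalization are illustrative assumptions, used only to show (a) a within-party vs. between-party selection criterion for competing representations and (b) a normalization step over the resulting party-similarity matrix.

```python
# Minimal sketch (not the authors' code): party (dis)similarity from text embeddings.
import numpy as np

rng = np.random.default_rng(0)
parties = ["CDU/CSU", "SPD", "Gruene", "FDP", "Linke", "AfD"]
dim, segs_per_party = 384, 20

# Stand-in for encoder output: a handful of segment embeddings per manifesto.
segments = {p: rng.normal(size=(segs_per_party, dim)) for p in parties}

def cos_matrix(A, B):
    """Pairwise cosine similarities between the rows of A and B."""
    A = A / np.linalg.norm(A, axis=1, keepdims=True)
    B = B / np.linalg.norm(B, axis=1, keepdims=True)
    return A @ B.T

def within_between_gap(seg_by_party):
    """Mean within-party similarity minus mean between-party similarity.
    Used here as the criterion for choosing among candidate representations
    (diagonal entries are included, which is fine for a rough criterion)."""
    names = list(seg_by_party)
    within = [cos_matrix(seg_by_party[p], seg_by_party[p]).mean() for p in names]
    between = [cos_matrix(seg_by_party[p], seg_by_party[q]).mean()
               for i, p in enumerate(names) for q in names[i + 1:]]
    return float(np.mean(within) - np.mean(between))

# Party-level similarity: average segment-to-segment similarity between parties.
sim = np.array([[cos_matrix(segments[a], segments[b]).mean() for b in parties]
                for a in parties])

# Normalization step (assumed here: z-score each row so a party's similarities are
# read relative to its own baseline level of similarity to the other parties).
norm_sim = (sim - sim.mean(axis=1, keepdims=True)) / sim.std(axis=1, keepdims=True)

print("selection criterion:", round(within_between_gap(segments), 3))
print(np.round(norm_sim, 2))
```

In the paper's setting, the segment embeddings would instead come from a fine-tuned language model applied to annotated claims or document-structure-based segments; representations that push the selection criterion higher separate parties more reliably.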
Related papers
- How Well Do Text Embedding Models Understand Syntax? [50.440590035493074]
The ability of text embedding models to generalize across a wide range of syntactic contexts remains under-explored.
Our findings reveal that existing text embedding models have not sufficiently addressed these syntactic understanding challenges.
We propose strategies to augment the generalization ability of text embedding models in diverse syntactic scenarios.
arXiv Detail & Related papers (2023-11-14T08:51:00Z)
- Multilingual estimation of political-party positioning: From label aggregation to long-input Transformers [3.651047982634467]
We implement and compare two approaches to automatic scaling analysis of political-party manifestos.
We find that the task can be efficiently solved by state-of-the-art models, with label aggregation producing the best results.
arXiv Detail & Related papers (2023-10-19T08:34:48Z)
- SenteCon: Leveraging Lexicons to Learn Human-Interpretable Language Representations [51.08119762844217]
SenteCon is a method for introducing human interpretability in deep language representations.
We show that SenteCon provides high-level interpretability at little to no cost to predictive performance on downstream tasks.
arXiv Detail & Related papers (2023-05-24T05:06:28Z)
- An Inclusive Notion of Text [69.36678873492373]
We argue that clarity on the notion of text is crucial for reproducible and generalizable NLP.
We introduce a two-tier taxonomy of linguistic and non-linguistic elements that are available in textual sources and can be used in NLP modeling.
arXiv Detail & Related papers (2022-11-10T14:26:43Z)
- Improve Discourse Dependency Parsing with Contextualized Representations [28.916249926065273]
We propose to take advantage of transformers to encode contextualized representations of units of different levels.
Motivated by the observation of writing patterns commonly shared across articles, we propose a novel method that treats discourse relation identification as a sequence labelling task.
arXiv Detail & Related papers (2022-05-04T14:35:38Z)
- Two-stream Hierarchical Similarity Reasoning for Image-text Matching [66.43071159630006]
A hierarchical similarity reasoning module is proposed to automatically extract context information.
Previous approaches only consider learning single-stream similarity alignment.
A two-stream architecture is developed to decompose image-text matching into image-to-text level and text-to-image level similarity computation.
arXiv Detail & Related papers (2022-03-10T12:56:10Z)
- Contrastive Learning for Neural Topic Model [14.65513836956786]
Adversarial topic models (ATMs) can successfully capture semantic patterns of a document by differentiating it from a dissimilar sample.
We propose a novel approach to re-formulate the discriminative goal as an optimization problem, and design a novel sampling method (a generic contrastive-loss sketch appears after this list).
Experimental results show that our framework outperforms other state-of-the-art neural topic models in three common benchmark datasets.
arXiv Detail & Related papers (2021-10-25T09:46:26Z)
- Electoral Programs of German Parties 2021: A Computational Analysis Of Their Comprehensibility and Likeability Based On SentiArt [0.0]
We analyze the electoral programs of six German parties issued before the parliamentary elections of 2021.
Using novel indices of the readability and emotion potential of texts computed via SentiArt, our data shed light on the similarities and differences of the programs.
They reveal that the programs of the SPD and CDU have the best chances of being comprehensible and likeable.
arXiv Detail & Related papers (2021-09-26T05:27:14Z)
- Author Clustering and Topic Estimation for Short Texts [69.54017251622211]
We propose a novel model that expands on the Latent Dirichlet Allocation by modeling strong dependence among the words in the same document.
We also simultaneously cluster users, removing the need for post-hoc cluster estimation.
Our method performs as well as or better than traditional approaches to problems arising in short text.
arXiv Detail & Related papers (2021-06-15T20:55:55Z)
- Neural Deepfake Detection with Factual Structure of Text [78.30080218908849]
We propose a graph-based model for deepfake detection of text.
Our approach represents the factual structure of a given document as an entity graph.
Our model can distinguish the difference in the factual structure between machine-generated text and human-written text.
arXiv Detail & Related papers (2020-10-15T02:35:31Z)
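As referenced in the "Contrastive Learning for Neural Topic Model" entry above, the sketch below shows a generic contrastive (InfoNCE-style) loss in Python/PyTorch. It is a common building block for contrastive objectives, written under standard assumptions; it does not reproduce that paper's specific discriminative formulation or its sampling method.

```python
# Generic InfoNCE-style contrastive loss; an illustrative sketch, not the paper's objective.
import torch
import torch.nn.functional as F

def info_nce(anchor, positive, negatives, temperature=0.1):
    """anchor, positive: (d,) tensors; negatives: (k, d) tensor.
    Returns the InfoNCE loss for a single anchor."""
    pos = F.cosine_similarity(anchor, positive, dim=0) / temperature
    neg = F.cosine_similarity(anchor.unsqueeze(0), negatives, dim=1) / temperature
    logits = torch.cat([pos.unsqueeze(0), neg])   # positive candidate first
    target = torch.zeros(1, dtype=torch.long)     # index of the positive
    return F.cross_entropy(logits.unsqueeze(0), target)

# Toy usage with random stand-in representations.
d, k = 64, 8
loss = info_nce(torch.randn(d), torch.randn(d), torch.randn(k, d))
print(float(loss))
```

The temperature sharpens or smooths the distribution over the positive and negative candidates; 0.1 is an arbitrary illustrative value.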