Toward a digital twin of U.S. Congress
- URL: http://arxiv.org/abs/2505.00006v1
- Date: Fri, 04 Apr 2025 21:33:36 GMT
- Title: Toward a digital twin of U.S. Congress
- Authors: Hayden Helm, Tianyi Chen, Harvey McGuinness, Paige Lee, Brandon Duderstadt, Carey E. Priebe
- Abstract summary: We introduce and provide descriptions of a daily-updated dataset that contains every Tweet from every U.S. congressperson during their respective terms. We demonstrate that a modern language model equipped with congressperson-specific subsets of this data is capable of producing Tweets that are largely indistinguishable from actual Tweets posted by their physical counterparts.
- Score: 31.41179786444486
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper we provide evidence that a virtual model of U.S. congresspersons based on a collection of language models satisfies the definition of a digital twin. In particular, we introduce and provide high-level descriptions of a daily-updated dataset that contains every Tweet from every U.S. congressperson during their respective terms. We demonstrate that a modern language model equipped with congressperson-specific subsets of this data is capable of producing Tweets that are largely indistinguishable from actual Tweets posted by their physical counterparts. We illustrate how generated Tweets can be used to predict roll-call vote behaviors and to quantify the likelihood of congresspersons crossing party lines, thereby assisting stakeholders in allocating resources and potentially impacting real-world legislative dynamics. We conclude with a discussion of the limitations and important extensions of our analysis.
Related papers
- Persona-driven Simulation of Voting Behavior in the European Parliament with Large Language Models [1.7990260056064977]
We analyze whether zero-shot persona prompting with limited information can accurately predict individual voting decisions. We find that we can simulate voting behavior of Members of the European Parliament reasonably well with a weighted F1 score of approximately 0.793.
arXiv Detail & Related papers (2025-06-13T14:02:21Z)
- The study of short texts in digital politics: Document aggregation for topic modeling [0.0]
We investigate the effects of aggregating short documents into larger ones based on natural units that partition the corpus. We analyze one million tweets by U.S. state legislators from April 2016 to September 2020. For documents aggregated at the account level, topics are more associated with individual states than when using individual tweets.
arXiv Detail & Related papers (2025-03-07T01:05:46Z)
- Political Actor Agent: Simulating Legislative System for Roll Call Votes Prediction with Large Language Models [9.0463587094323]
Political Actor Agent (PAA) is a novel framework that utilizes Large Language Models to overcome limitations. By employing role-playing architectures and simulating the legislative system, PAA provides a scalable and interpretable paradigm for predicting roll-call votes. We conducted comprehensive experiments using voting records from the 117-118th U.S. House of Representatives, validating the superior performance and interpretability of PAA.
arXiv Detail & Related papers (2024-12-10T03:06:28Z)
- Unsupervised Sentiment Analysis of Plastic Surgery Social Media Posts [91.3755431537592]
The massive collection of user posts across social media platforms is primarily untapped for artificial intelligence (AI) use cases.
Natural language processing (NLP) is a subfield of AI that leverages bodies of documents, known as corpora, to train computers in human-like language understanding.
This study demonstrates that the applied results of unsupervised analysis allow a computer to predict either negative, positive, or neutral user sentiment towards plastic surgery.
arXiv Detail & Related papers (2023-07-05T20:16:20Z)
- Design and analysis of tweet-based election models for the 2021 Mexican legislative election [55.41644538483948]
We use a dataset of 15 million election-related tweets in the six months preceding election day.
We find that models using data with geographical attributes determine the results of the election with better precision and accuracy than conventional polling methods.
arXiv Detail & Related papers (2023-01-02T12:40:05Z)
- Artificial intelligence-driven digital twin of a modern house demonstrated in virtual reality [0.0]
A digital twin is a virtual representation of a physical asset enabled through data and simulators for real-time prediction, optimization, monitoring, controlling, and improved decision-making.
Recently, the concept of capability level has been introduced to address this issue.
Based on its capability, the concept states that a digital twin can be categorized on a scale from zero to five, referred to as standalone, descriptive, diagnostic, predictive, prescriptive, and autonomous, respectively.
arXiv Detail & Related papers (2022-12-14T08:48:37Z)
- Cross-Lingual Dialogue Dataset Creation via Outline-Based Generation [70.81596088969378]
Cross-lingual Outline-based Dialogue dataset (termed COD) enables natural language understanding.
COD enables dialogue state tracking, and end-to-end dialogue modelling and evaluation in 4 diverse languages.
arXiv Detail & Related papers (2022-01-31T18:11:21Z)
- Prediction of Political Leanings of Chinese Speaking Twitter Users [0.0]
It first collects data by scraping tweets of famous political figures and their related users.
It then divides the political spectrum into two groups: the group that shows approval of the Chinese Communist Party and the group that does not.
It produces a classification model with high accuracy for understanding users' political stances from their tweets on Twitter.
arXiv Detail & Related papers (2021-10-12T03:18:10Z)
- Exploiting BERT For Multimodal Target Sentiment Classification Through Input Space Translation [75.82110684355979]
We introduce a two-stream model that translates images in input space using an object-aware transformer.
We then leverage the translation to construct an auxiliary sentence that provides multimodal information to a language model.
We achieve state-of-the-art performance on two multimodal Twitter datasets.
arXiv Detail & Related papers (2021-08-03T18:02:38Z)
- Sentiment analysis in tweets: an assessment study from classical to modern text representation models [59.107260266206445]
Short texts published on Twitter have earned significant attention as a rich source of information.
Their inherent characteristics, such as an informal and noisy linguistic style, remain challenging for many natural language processing (NLP) tasks.
This study assesses existing language models in distinguishing the sentiment expressed in tweets by using a rich collection of 22 datasets.
arXiv Detail & Related papers (2021-05-29T21:05:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.