Natural Language Processing for Policymaking
- URL: http://arxiv.org/abs/2302.03490v1
- Date: Tue, 7 Feb 2023 14:34:39 GMT
- Title: Natural Language Processing for Policymaking
- Authors: Zhijing Jin, Rada Mihalcea
- Abstract summary: Natural language processing (NLP) uses computational tools to parse text into key information needed for policymaking.
We introduce common methods of NLP, including text classification, topic modeling, event extraction, and text scaling.
We highlight some potential limitations and ethical concerns when using NLP for policymaking.
- Score: 34.93331735602826
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Language is the medium for many political activities, from campaigns to news
reports. Natural language processing (NLP) uses computational tools to parse
text into key information that is needed for policymaking. In this chapter, we
introduce common methods of NLP, including text classification, topic modeling,
event extraction, and text scaling. We then overview how these methods can be
used for policymaking through four major applications including data collection
for evidence-based policymaking, interpretation of political decisions, policy
communication, and investigation of policy effects. Finally, we highlight some
potential limitations and ethical concerns when using NLP for policymaking.
This text is from Chapter 7 (pages 141-162) of the Handbook of Computational
Social Science for Policy (2023). Open Access on Springer:
https://doi.org/10.1007/978-3-031-16624-2
Related papers
- Strategies for political-statement segmentation and labelling in unstructured text [2.5338097608867542]
A large corpus of manifestos with by-statement political-stance labels has been created by the participants of the MARPOR project.
We propose and test a range of unified split-and-label frameworks that can be used to jointly segment and classify statements from raw textual data.
We show that our approaches achieve competitive accuracy when applied to raw text of political manifestos, and then demonstrate the research potential of our method by applying it to the records of the UK House of Commons.
arXiv Detail & Related papers (2025-03-10T10:56:06Z)
- AgoraSpeech: A multi-annotated comprehensive dataset of political discourse through the lens of humans and AI [1.3060410279656598]
AgoraSpeech is a meticulously curated, high-quality dataset of 171 political speeches from six parties during the Greek national elections in 2023.
The dataset includes annotations (per paragraph) for six natural language processing (NLP) tasks: text classification, topic identification, sentiment analysis, named entity recognition, polarization and populism detection.
arXiv Detail & Related papers (2025-01-09T18:17:59Z)
- Political-LLM: Large Language Models in Political Science [159.95299889946637]
Large language models (LLMs) have been widely adopted in political science tasks.
Political-LLM aims to advance the comprehensive understanding of integrating LLMs into computational political science.
arXiv Detail & Related papers (2024-12-09T08:47:50Z)
- Language Models Learn Metadata: Political Stance Detection Case Study [1.2277343096128712]
This paper investigates the optimal way to incorporate metadata into a political stance detection task.
We show that our simple baseline, using only party membership information, surpasses the current state-of-the-art.
arXiv Detail & Related papers (2024-09-15T14:57:41Z)
- A Novel Cartography-Based Curriculum Learning Method Applied on RoNLI: The First Romanian Natural Language Inference Corpus [71.77214818319054]
Natural language inference is a proxy for natural language understanding.
There is no publicly available NLI corpus for the Romanian language.
We introduce the first Romanian NLI corpus (RoNLI) comprising 58K training sentence pairs.
arXiv Detail & Related papers (2024-05-20T08:41:15Z)
- On Policy Reuse: An Expressive Language for Representing and Executing General Policies that Call Other Policies [14.591568801450496]
A simple but powerful language has been introduced in terms of rules defined over a set of numerical features.
We consider three extensions to this language aimed at making policies and sketches more flexible and reusable.
The expressive power of the resulting language for policies and sketches is illustrated through a number of examples.
arXiv Detail & Related papers (2024-03-25T14:48:54Z)
- Changes in Policy Preferences in German Tweets during the COVID Pandemic [4.663960015139793]
We present a novel data set of tweets with fine-grained political preference annotations.
A text classification model trained on this data is used to extract political opinions.
Results indicate that in response to the COVID pandemic, expression of political opinions increased.
arXiv Detail & Related papers (2023-07-31T16:07:28Z)
- PLUE: Language Understanding Evaluation Benchmark for Privacy Policies in English [77.79102359580702]
We introduce the Privacy Policy Language Understanding Evaluation (PLUE) benchmark, a multi-task benchmark for evaluating privacy policy language understanding.
We also collect a large corpus of privacy policies to enable privacy policy domain-specific language model pre-training.
We demonstrate that domain-specific continual pre-training offers performance improvements across all tasks.
arXiv Detail & Related papers (2022-12-20T05:58:32Z)
- An Inclusive Notion of Text [69.36678873492373]
We argue that clarity on the notion of text is crucial for reproducible and generalizable NLP.
We introduce a two-tier taxonomy of linguistic and non-linguistic elements that are available in textual sources and can be used in NLP modeling.
arXiv Detail & Related papers (2022-11-10T14:26:43Z)
- PoliGraph: Automated Privacy Policy Analysis using Knowledge Graphs (Journal Version) [7.10483762466065]
We view and analyze, for the first time, the entire text of a privacy policy in an integrated way.
We develop PoliGrapher, an NLP tool to automatically extract PoliGraph from the text using linguistic analysis.
Using a public dataset for evaluation, we show that PoliGrapher identifies 40% more collection statements than prior state-of-the-art, with 97% precision.
arXiv Detail & Related papers (2022-10-13T05:16:22Z)
- PolicyQA: A Reading Comprehension Dataset for Privacy Policies [77.79102359580702]
We present PolicyQA, a dataset that contains 25,017 reading comprehension style examples curated from an existing corpus of 115 website privacy policies.
We evaluate two existing neural QA models and perform rigorous analysis to reveal the advantages and challenges offered by PolicyQA.
arXiv Detail & Related papers (2020-10-06T09:04:58Z)
- Policy Evaluation Networks [50.53250641051648]
We introduce a scalable, differentiable fingerprinting mechanism that retains essential policy information in a concise embedding.
Our empirical results demonstrate that combining these three elements can produce policies that outperform those that generated the training data.
arXiv Detail & Related papers (2020-02-26T23:00:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.