Related papers: Understanding peacefulness through the world news

URL: http://arxiv.org/abs/2106.00306v2
Date: Thu, 3 Jun 2021 14:17:03 GMT
Title: Understanding peacefulness through the world news
Authors: Vasiliki Voukelatou, Ioanna Miliou, Fosca Giannotti, Luca Pappalardo
Abstract summary: We exploit information extracted from the Global Data on Events, Location, and Tone (GDELT) digital news database to capture peacefulness through the Global Peace Index (GPI). Applying predictive machine learning models, we demonstrate that news media attention from GDELT can be used as a proxy for measuring GPI at a monthly level.
Score: 1.6975704972827304
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Peacefulness is a principal dimension of well-being for all humankind and is the way out of inequity and every single form of violence. Thus, its measurement has lately drawn the attention of researchers and policy-makers. In recent years, novel digital data streams have drastically changed research in this field. In the current study, we exploit information extracted from the Global Data on Events, Location, and Tone (GDELT) digital news database to capture peacefulness through the Global Peace Index (GPI). Applying predictive machine learning models, we demonstrate that news media attention from GDELT can be used as a proxy for measuring GPI at a monthly level. Additionally, we use the SHAP methodology to obtain the most important variables that drive the predictions. This analysis highlights each country's profile and provides explanations for the predictions overall, and particularly for the errors and the events that drive these errors. We believe that digital data exploited by Social Good researchers, policy-makers, and peace-builders, with data science tools as powerful as machine learning, could contribute to maximizing the societal benefits and minimizing the risks to peacefulness.

Related papers

Forecasting Future International Events: A Reliable Dataset for Text-Based Event Modeling [37.508538729757404]
WorldREP is a novel dataset designed to address limitations by leveraging the advanced reasoning capabilities of large language models (LLMs). Our dataset features high-quality scoring labels generated through advanced prompt modeling and rigorously validated by domain experts in political science. We publicly release our dataset along with the full automation source code for data collection, labeling, and benchmarking, aiming to support and advance research in text-based event prediction.
arXiv Detail & Related papers (2024-11-21T11:44:23Z)
Predicting Country Instability Using Bayesian Deep Learning and Random Forest [0.0]
Country instability is a global issue, with unpredictably high levels of instability thwarting socio-economic growth and possibly causing a slew of negative consequences. The Global Database of Activities, Voice, and Tone (GDELT Project) records broadcast, print, and web news in over 100 languages every second of every day. The main goal of our research is to investigate how, when our data grows more voluminous and fine-grained, we can conduct a more complex methodological analysis of political conflict.
arXiv Detail & Related papers (2024-11-11T00:23:03Z)
dsld: A Socially Relevant Tool for Teaching Statistics [3.314894584156197]
Data Science Looks At Discrimination (dsld) is an R and Python package designed to provide users with a comprehensive toolkit of statistical and graphical methods for assessing possible discrimination related to protected groups. Our software offers techniques for discrimination analysis by identifying and mitigating confounding variables, along with methods for reducing bias in predictive models. The inclusion of an 80-page Quarto book further supports users, from statistics educators to legal professionals, in effectively applying these analytical tools to real world scenarios.
arXiv Detail & Related papers (2024-11-06T19:50:00Z)
Data-Centric AI in the Age of Large Language Models [51.20451986068925]
This position paper proposes a data-centric viewpoint of AI research, focusing on large language models (LLMs). We make the key observation that data is instrumental in the developmental (e.g., pretraining and fine-tuning) and inferential stages (e.g., in-context learning) of LLMs. We identify four specific scenarios centered around data, covering data-centric benchmarks and data curation, data attribution, knowledge transfer, and inference contextualization.
arXiv Detail & Related papers (2024-06-20T16:34:07Z)
On Responsible Machine Learning Datasets with Fairness, Privacy, and Regulatory Norms [56.119374302685934]
There have been severe concerns over the trustworthiness of AI technologies. Machine and deep learning algorithms depend heavily on the data used during their development. We propose a framework to evaluate the datasets through a responsible rubric.
arXiv Detail & Related papers (2023-10-24T14:01:53Z)
Predicting Temperature of Major Cities Using Machine Learning and Deep Learning [0.0]
We use the database made by the University of Dayton, which consists of temperature changes in major cities, to predict the temperature of different cities at any time in the future. This document contains our methodology for making such predictions.
arXiv Detail & Related papers (2023-09-23T10:23:00Z)
Privacy-Preserving Graph Machine Learning from Data to Computation: A Survey [67.7834898542701]
We focus on reviewing privacy-preserving techniques of graph machine learning. We first review methods for generating privacy-preserving graph data. Then we describe methods for transmitting privacy-preserved information.
arXiv Detail & Related papers (2023-07-10T04:30:23Z)
PADME-SoSci: A Platform for Analytics and Distributed Machine Learning for the Social Sciences [4.294774517325059]
PADME is a distributed analytics tool that federates model implementation and training. It enables the analysis of data across locations while still allowing the model to be trained as if all data were in a single location.
arXiv Detail & Related papers (2023-03-27T15:32:35Z)
Citation Trajectory Prediction via Publication Influence Representation Using Temporal Knowledge Graph [52.07771598974385]
Existing approaches mainly rely on mining temporal and graph data from academic articles. Our framework is composed of three modules: difference-preserved graph embedding, fine-grained influence representation, and learning-based trajectory calculation. Experiments are conducted on both the APS academic dataset and our contributed AIPatent dataset.
arXiv Detail & Related papers (2022-10-02T07:43:26Z)
A Survey of Learning on Small Data: Generalization, Optimization, and Challenge [101.27154181792567]
Learning on small data that approximates the generalization ability of big data is one of the ultimate purposes of AI. This survey follows the active sampling theory under a PAC framework to analyze the generalization error and label complexity of learning on small data. Multiple data applications that may benefit from efficient small data representation are surveyed.
arXiv Detail & Related papers (2022-07-29T02:34:19Z)
Forecasting Future World Events with Neural Networks [68.43460909545063]
Autocast is a dataset containing thousands of forecasting questions and an accompanying news corpus. The news corpus is organized by date, allowing us to precisely simulate the conditions under which humans made past forecasts. We test language models on our forecasting task and find that performance is far below a human expert baseline.
arXiv Detail & Related papers (2022-06-30T17:59:14Z)
On the limits of algorithmic prediction across the globe [4.392517231156947]
We show that state-of-the-art machine learning models trained on data from the United States can predict achievement with high accuracy and generalize to other developed countries with comparable accuracy. Training the same model on national data yields high accuracy in every country, which highlights the value of local data collection.
arXiv Detail & Related papers (2021-03-28T19:53:18Z)

This list is automatically generated from the titles and abstracts of the papers on this site.