Identifying and Investigating Global News Coverage of Critical Events Such as Disasters and Terrorist Attacks
- URL: http://arxiv.org/abs/2506.12925v1
- Date: Sun, 15 Jun 2025 17:50:08 GMT
- Title: Identifying and Investigating Global News Coverage of Critical Events Such as Disasters and Terrorist Attacks
- Authors: Erica Cai, Xi Chen, Reagan Grey Keeney, Ethan Zuckerman, Brendan O'Connor, Przemyslaw A. Grabowicz
- Abstract summary: We introduce an AI-powered method for identifying news articles based on an event FINGERPRINT. The method achieves state-of-the-art performance and scales to massive databases of tens of millions of news articles. We use FAME to identify 27,441 articles that cover natural disaster and terrorist attack events that happened in 2020.
- Score: 7.356090574846918
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Comparative studies of news coverage are challenging to conduct because methods to identify news articles about the same event in different languages require expertise that is difficult to scale. We introduce an AI-powered method for identifying news articles based on an event FINGERPRINT, which is a minimal set of metadata required to identify critical events. Our event coverage identification method, FINGERPRINT TO ARTICLE MATCHING FOR EVENTS (FAME), efficiently identifies news articles about critical world events, specifically terrorist attacks and several types of natural disasters. FAME does not require training data and is able to automatically and efficiently identify news articles that discuss an event given its fingerprint: time, location, and class (such as storm or flood). The method achieves state-of-the-art performance and scales to massive databases of tens of millions of news articles and hundreds of events happening globally. We use FAME to identify 27,441 articles that cover 470 natural disaster and terrorist attack events that happened in 2020. To this end, we use a massive database of news articles in three languages from MediaCloud, and three widely used, expert-curated databases of critical events: EM-DAT, USGS, and GTD. Our case study reveals patterns consistent with prior literature: coverage of disasters and terrorist attacks correlates with death counts, with the GDP of the country where the event occurred, and with trade volume between the reporting country and the country where the event occurred. We share our NLP annotations and cross-country media attention data to support the efforts of researchers and media monitoring organizations.
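To make the fingerprint idea concrete, the following is a minimal, hypothetical sketch of how an event fingerprint (time, location, class) could be used to filter candidate articles. The `EventFingerprint` dataclass, the keyword cues, and the `matches` function are illustrative assumptions only; the actual FAME pipeline does not rely on a fixed keyword list and operates over NLP annotations of massive multilingual news databases.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical, simplified representation of an event "fingerprint":
# the minimal metadata (time, location, event class) described in the abstract.
@dataclass
class EventFingerprint:
    event_date: date
    country: str          # e.g. "Philippines"
    locality: str         # e.g. "Manila"
    event_class: str      # e.g. "storm", "flood", "terrorist attack"

# Illustrative keyword cues per event class; an assumption for this sketch,
# not part of the published method.
CLASS_CUES = {
    "storm": ["storm", "typhoon", "hurricane", "cyclone"],
    "flood": ["flood", "flooding", "inundation"],
    "terrorist attack": ["attack", "bombing", "gunmen", "explosion"],
}

def matches(article_text: str, article_date: date,
            fp: EventFingerprint, window_days: int = 7) -> bool:
    """Coarse filter: an article is a candidate match if it was published
    shortly after the event, mentions the location, and contains cues for
    the event class."""
    text = article_text.lower()
    in_window = timedelta(0) <= (article_date - fp.event_date) <= timedelta(days=window_days)
    mentions_place = fp.country.lower() in text or fp.locality.lower() in text
    mentions_class = any(cue in text for cue in CLASS_CUES[fp.event_class])
    return in_window and mentions_place and mentions_class

# Example usage with made-up data.
fp = EventFingerprint(date(2020, 11, 1), "Philippines", "Manila", "storm")
print(matches("Typhoon Goni slammed into the Philippines ...", date(2020, 11, 2), fp))
```

The point of the sketch is only to show why a fingerprint is "minimal": a date, a place, and an event class are enough to narrow tens of millions of articles down to a small candidate set before finer-grained matching.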
Related papers
- CrisisSense-LLM: Instruction Fine-Tuned Large Language Model for Multi-label Social Media Text Classification in Disaster Informatics [49.2719253711215]
This study introduces a novel approach to disaster text classification by enhancing a pre-trained Large Language Model (LLM). Our methodology involves creating a comprehensive instruction dataset from disaster-related tweets, which is then used to fine-tune an open-source LLM. This fine-tuned model can classify multiple aspects of disaster-related information simultaneously, such as the type of event, informativeness, and involvement of human aid.
arXiv Detail & Related papers (2024-06-16T23:01:10Z)
- A Novel Method for News Article Event-Based Embedding [8.183446952097528]
We propose a novel lightweight method that optimizes news embedding generation by focusing on entities and themes mentioned in articles.
We leveraged over 850,000 news articles and 1,000,000 events from the GDELT project to test and evaluate our method.
Our experiments demonstrate that our approach can improve upon and outperform state-of-the-art methods on shared event detection tasks.
arXiv Detail & Related papers (2024-05-20T20:55:07Z)
- CrisisMatch: Semi-Supervised Few-Shot Learning for Fine-Grained Disaster Tweet Classification [51.58605842457186]
We present a fine-grained disaster tweet classification model under the semi-supervised, few-shot learning setting.
Our model, CrisisMatch, effectively classifies tweets into fine-grained classes of interest using only a small amount of labeled data and large amounts of unlabeled data.
arXiv Detail & Related papers (2023-10-23T07:01:09Z)
- ManiTweet: A New Benchmark for Identifying Manipulation of News on Social Media [74.93847489218008]
We present a novel task, identifying manipulation of news on social media, which aims to detect manipulation in social media posts and identify manipulated or inserted information. To study this task, we have proposed a data collection schema and curated a dataset called ManiTweet, consisting of 3.6K pairs of tweets and corresponding articles. Our analysis demonstrates that this task is highly challenging, with large language models (LLMs) yielding unsatisfactory performance.
arXiv Detail & Related papers (2023-05-23T16:40:07Z)
- SumREN: Summarizing Reported Speech about Events in News [51.82314543729287]
We propose the novel task of summarizing the reactions of different speakers, as expressed by their reported statements, to a given event.
We create a new multi-document summarization benchmark, SUMREN, comprising 745 summaries of reported statements from various public figures.
arXiv Detail & Related papers (2022-12-02T12:51:39Z)
- Cross-Lingual and Cross-Domain Crisis Classification for Low-Resource Scenarios [4.147346416230273]
We study the task of automatically classifying messages related to crisis events by leveraging cross-language and cross-domain labeled data.
Our goal is to make use of labeled data from high-resource languages to classify messages from other (low-resource) languages and/or of new (previously unseen) types of crisis situations.
Our empirical findings show that it is indeed possible to leverage data from crisis events in English to classify the same type of event in other languages, such as Spanish and Italian.
arXiv Detail & Related papers (2022-09-05T20:57:23Z)
- NewsEdits: A News Article Revision Dataset and a Document-Level Reasoning Challenge [122.37011526554403]
NewsEdits is the first publicly available dataset of news revision histories.
It contains 1.2 million articles with 4.6 million versions from over 22 English- and French-language newspaper sources.
arXiv Detail & Related papers (2022-06-14T18:47:13Z)
- Unsupervised Key Event Detection from Massive Text Corpora [42.31889135421941]
We propose a new task, key event detection at the intermediate level, aiming to detect key events from a news corpus.
This task can bridge event understanding and structuring and is inherently challenging because of the thematic and temporal closeness of key events.
We develop an unsupervised key event detection framework, EvMine, that extracts temporally frequent peak phrases using a novel ttf-itf score (a rough sketch of such a score appears after this list).
arXiv Detail & Related papers (2022-06-08T20:31:02Z)
- COfEE: A Comprehensive Ontology for Event Extraction from text, with an online annotation tool [3.8995911009078816]
Event Extraction (EE) seeks to derive information about specific incidents and their actors from the text.
EE is useful in many domains such as building a knowledge base, information retrieval, summarization and online monitoring systems.
COfEE consists of two hierarchy levels (event types and event sub-types) that include new categories relating to environmental issues, cyberspace, criminal activity and natural disasters.
arXiv Detail & Related papers (2021-07-21T19:43:22Z)
- Event-Related Bias Removal for Real-time Disaster Events [67.2965372987723]
Social media has become an important tool to share information about crisis events such as natural disasters and mass attacks.
Detecting actionable posts that contain useful information requires rapid analysis of huge volumes of data in real time.
We train an adversarial neural model to remove latent event-specific biases and improve performance on tweet importance classification (a gradient-reversal sketch of this idea appears after this list).
arXiv Detail & Related papers (2020-11-02T02:03:07Z)
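The EvMine entry above mentions a "ttf-itf" score for surfacing temporally frequent peak phrases. The paper's exact formula is not reproduced here; the sketch below is only a plausible tf-idf-style analogue (a phrase's frequency within one time window, discounted by how many windows it appears in overall), and the function name and formula are assumptions.

```python
import math
from collections import Counter

def ttf_itf(phrase_counts_by_day: list, phrase: str, day: int) -> float:
    """Rough tf-idf-style analogue of a 'ttf-itf' score (assumed form):
    temporal term frequency of a phrase on one day, discounted by how many
    days the phrase appears on overall."""
    day_counts = phrase_counts_by_day[day]
    total = sum(day_counts.values()) or 1
    ttf = day_counts[phrase] / total                          # frequency within the day
    days_with_phrase = sum(1 for c in phrase_counts_by_day if c[phrase] > 0)
    itf = math.log(len(phrase_counts_by_day) / (1 + days_with_phrase))
    return ttf * itf

# Toy example: "earthquake" spikes on day 1 only, so it scores highest there.
days = [Counter({"weather": 5}),
        Counter({"earthquake": 8, "weather": 2}),
        Counter({"weather": 4})]
for d in range(3):
    print(d, round(ttf_itf(days, "earthquake", d), 4))
```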
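The Event-Related Bias Removal entry above trains an adversarial model to strip event-specific signal from tweet representations. One common way to realize this kind of adversarial debiasing (not necessarily that paper's exact architecture) is a gradient reversal layer between a shared encoder and an event-discriminator head; the PyTorch sketch below illustrates that pattern with hypothetical module names and sizes.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; flips the gradient sign on backward,
    so the encoder is trained to *confuse* the event discriminator."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

class DebiasedClassifier(nn.Module):
    # All sizes are placeholders for illustration.
    def __init__(self, vocab_size=10000, dim=128, n_labels=2, n_events=20, lam=0.1):
        super().__init__()
        self.lam = lam
        self.encoder = nn.Sequential(nn.EmbeddingBag(vocab_size, dim), nn.ReLU())
        self.task_head = nn.Linear(dim, n_labels)     # main task: e.g. informative vs. not
        self.event_head = nn.Linear(dim, n_events)    # adversary: which event did the tweet come from?

    def forward(self, token_ids):
        h = self.encoder(token_ids)
        task_logits = self.task_head(h)
        event_logits = self.event_head(GradReverse.apply(h, self.lam))
        return task_logits, event_logits

# Toy forward pass; in training one would sum the task loss and the adversarial
# event loss, with the reversed gradient pushing event cues out of the encoder.
model = DebiasedClassifier()
tokens = torch.randint(0, 10000, (4, 12))   # batch of 4 tweets, 12 token ids each
task_logits, event_logits = model(tokens)
print(task_logits.shape, event_logits.shape)
```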
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.