Near-real-time Earthquake-induced Fatality Estimation using Crowdsourced
Data and Large-Language Models
- URL: http://arxiv.org/abs/2312.03755v1
- Date: Mon, 4 Dec 2023 17:09:58 GMT
- Title: Near-real-time Earthquake-induced Fatality Estimation using Crowdsourced
Data and Large-Language Models
- Authors: Chenguang Wang, Davis Engler, Xuechun Li, James Hou, David J. Wald,
Kishor Jaiswal, Susu Xu
- Abstract summary: We introduce an end-to-end framework to significantly improve the timeliness and accuracy of global earthquake-induced loss forecasting.
Our framework integrates a hierarchical casualty extraction model built upon large language models, prompt design, and few-shot learning.
We test the framework in real-time on a series of global earthquake events in 2021 and 2022 and show that our framework streamlines casualty data retrieval, achieving speed and accuracy comparable to manual methods by USGS.
- Score: 5.031939163610801
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: When a damaging earthquake occurs, immediate information about casualties is
critical for time-sensitive decision-making by emergency response and aid
agencies in the first hours and days. Systems such as Prompt Assessment of
Global Earthquakes for Response (PAGER) by the U.S. Geological Survey (USGS)
were developed to provide a forecast within about 30 minutes of any significant
earthquake globally. Traditional systems for estimating human loss in disasters
often depend on manually collected early casualty reports from global media, a
process that's labor-intensive and slow with notable time delays. Recently,
some systems have employed keyword matching and topic modeling to extract
relevant information from social media. However, these methods struggle with
the complex semantics in multilingual texts and the challenge of interpreting
ever-changing, often conflicting reports of death and injury numbers from
various unverified sources on social media platforms. In this work, we
introduce an end-to-end framework to significantly improve the timeliness and
accuracy of global earthquake-induced human loss forecasting using
multi-lingual, crowdsourced social media. Our framework integrates (1) a
hierarchical casualty extraction model built upon large language models, prompt
design, and few-shot learning to retrieve quantitative human loss claims from
social media, (2) a physical constraint-aware, dynamic-truth discovery model
that discovers the truthful human loss from massive noisy and potentially
conflicting human loss claims, and (3) a Bayesian updating loss projection
model that dynamically updates the final loss estimation using discovered
truths. We test the framework in real-time on a series of global earthquake
events in 2021 and 2022 and show that our framework streamlines casualty data
retrieval, achieving speed and accuracy comparable to manual methods by USGS.
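As a rough illustration of the third component (Bayesian updating of a loss estimate as new truths arrive), the following is a minimal sketch only. It assumes a conjugate normal model on log10 fatalities; the function name `update_log_fatality`, the prior, the observation variance, and the report values are all hypothetical and are not taken from the paper, whose actual projection model may differ substantially.

```python
import math


def update_log_fatality(prior_mean, prior_var, reported_deaths, obs_var):
    """Conjugate normal update of log10(fatalities) from one reported count.

    Assumption (not from the paper): log10 of the reported count is a noisy
    observation of the true log10 fatality count with variance obs_var.
    """
    # Clamp to 1 so that a report of zero deaths maps to log10 = 0.
    obs = math.log10(max(reported_deaths, 1))
    # Posterior precision is the sum of prior and observation precisions.
    post_var = 1.0 / (1.0 / prior_var + 1.0 / obs_var)
    # Posterior mean is the precision-weighted average of prior and observation.
    post_mean = post_var * (prior_mean / prior_var + obs / obs_var)
    return post_mean, post_var


# Hypothetical prior: about 100 deaths (log10 = 2) with wide uncertainty.
mean, var = 2.0, 1.0
# Hypothetical sequence of discovered death counts arriving over time.
for report in [10, 100, 1000]:
    mean, var = update_log_fatality(mean, var, report, obs_var=0.25)
    print(f"report={report:5d}  estimate={10 ** mean:8.1f}  log10 variance={var:.3f}")
```

Each update shrinks the variance and pulls the estimate toward the newest report, which is the general behavior a dynamically updated loss projection relies on.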
Related papers
- Turkey's Earthquakes: Damage Prediction and Feature Significance Using A Multivariate Analysis [1.9461727843485295]
This research contributes to the reduction of fatalities in future seismic events in Turkey.
We tested various machine-learning architectures to forecast death tolls and fatalities per affected population.
Our findings indicate that the Random Forest model provides the most reliable predictions.
arXiv Detail & Related papers (2024-10-29T10:29:06Z)
- Time Series Foundation Models and Deep Learning Architectures for Earthquake Temporal and Spatial Nowcasting [1.4854797901022863]
Existing literature on earthquake nowcasting lacks comprehensive evaluations of pre-trained foundation models.
We introduce two innovative approaches called MultiFoundationQuake and GNNCoder.
We formulate earthquake nowcasting as a time series forecasting problem for the next 14 days within 0.1-degree spatial bins in Southern California.
arXiv Detail & Related papers (2024-08-21T20:57:03Z)
- CrisisSense-LLM: Instruction Fine-Tuned Large Language Model for Multi-label Social Media Text Classification in Disaster Informatics [49.2719253711215]
This study introduces a novel approach to disaster text classification by enhancing a pre-trained Large Language Model (LLM).
Our methodology involves creating a comprehensive instruction dataset from disaster-related tweets, which is then used to fine-tune an open-source LLM.
This fine-tuned model can classify multiple aspects of disaster-related information simultaneously, such as the type of event, informativeness, and involvement of human aid.
arXiv Detail & Related papers (2024-06-16T23:01:10Z)
- Learning Traffic Crashes as Language: Datasets, Benchmarks, and What-if Causal Analyses [76.59021017301127]
We propose a large-scale traffic crash language dataset, named CrashEvent, summarizing 19,340 real-world crash reports.
We formulate crash event feature learning as a novel text reasoning problem and fine-tune various large language models (LLMs) to predict detailed accident outcomes.
Our experimental results show that our LLM-based approach not only predicts the severity of accidents but also classifies different types of accidents and predicts injury outcomes.
arXiv Detail & Related papers (2024-06-16T03:10:16Z)
- Data-Driven Prediction of Seismic Intensity Distributions Featuring Hybrid Classification-Regression Models [21.327960186900885]
This study develops linear regression models capable of predicting seismic intensity distributions based on earthquake parameters.
The dataset comprises seismic intensity data from earthquakes that occurred in the vicinity of Japan between 1997 and 2020.
The proposed model can predict even abnormal seismic intensity distributions, a task at which conventional GMPEs often struggle.
arXiv Detail & Related papers (2024-02-03T13:39:22Z)
- CrisisMatch: Semi-Supervised Few-Shot Learning for Fine-Grained Disaster Tweet Classification [51.58605842457186]
We present a fine-grained disaster tweet classification model under the semi-supervised, few-shot learning setting.
Our model, CrisisMatch, effectively classifies tweets into fine-grained classes of interest using few labeled data and large amounts of unlabeled data.
arXiv Detail & Related papers (2023-10-23T07:01:09Z)
- SurvivalGAN: Generating Time-to-Event Data for Survival Analysis [121.84429525403694]
Imbalances in censoring and time horizons cause generative models to experience three new failure modes specific to survival analysis.
We propose SurvivalGAN, a generative model that handles survival data by addressing the imbalance in the censoring and event horizons.
We evaluate this method via extensive experiments on medical datasets.
arXiv Detail & Related papers (2023-02-24T17:03:51Z)
- Classification of structural building damage grades from multi-temporal photogrammetric point clouds using a machine learning model trained on virtual laser scanning data [58.720142291102135]
We present a novel approach to automatically assess multi-class building damage from real-world point clouds.
We use a machine learning model trained on virtual laser scanning (VLS) data.
The model yields high multi-target classification accuracies (overall accuracy: 92.0% - 95.1%).
arXiv Detail & Related papers (2023-02-24T12:04:46Z)
- A CNN-BiLSTM Model with Attention Mechanism for Earthquake Prediction [0.0]
This paper proposes a novel prediction method based on an attention mechanism (AM), convolutional neural network (CNN), and bi-directional long short-term memory (BiLSTM) models.
It can predict the number and maximum magnitude of earthquakes in each area of mainland China based on the earthquake catalog of the region.
arXiv Detail & Related papers (2021-12-26T20:16:20Z)
- A Machine learning approach for rapid disaster response based on multi-modal data. The case of housing & shelter needs [0.0]
One of the most immediate needs of people affected by a disaster is finding shelter.
This paper proposes a machine learning workflow that aims to fuse and rapidly analyse multimodal data.
Based on a database of 19 characteristics for more than 200 disasters worldwide, a fusion approach at the decision level was used.
arXiv Detail & Related papers (2021-07-29T18:22:34Z)
- Taxonomizing local versus global structure in neural network loss landscapes [60.206524503782006]
We show that the best test accuracy is obtained when the loss landscape is globally well-connected.
We also show that globally poorly-connected landscapes can arise when models are small or when they are trained on lower-quality data.
arXiv Detail & Related papers (2021-07-23T13:37:14Z)