CReMa: Crisis Response through Computational Identification and Matching of Cross-Lingual Requests and Offers Shared on Social Media
- URL: http://arxiv.org/abs/2405.11897v1
- Date: Mon, 20 May 2024 09:30:03 GMT
- Title: CReMa: Crisis Response through Computational Identification and Matching of Cross-Lingual Requests and Offers Shared on Social Media
- Authors: Rabindra Lamsal, Maria Rodriguez Read, Shanika Karunasekera, Muhammad Imran
- Abstract summary: This study addresses the challenge of efficiently identifying and matching assistance requests and offers on social media platforms during emergencies.
We propose CReMa, a systematic approach that integrates textual, temporal, and spatial features for multi-lingual request-offer matching.
We introduce a novel multi-lingual dataset that simulates scenarios of help-seeking and offering assistance on social media across the 16 most commonly used languages in Australia.
- Score: 5.384787836425144
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: During times of crisis, social media platforms play a vital role in facilitating communication and coordinating resources. Amidst chaos and uncertainty, communities often rely on these platforms to share urgent pleas for help, extend support, and organize relief efforts. However, the sheer volume of conversations during such periods, which can escalate to unprecedented levels, necessitates the automated identification and matching of requests and offers to streamline relief operations. This study addresses the challenge of efficiently identifying and matching assistance requests and offers on social media platforms during emergencies. We propose CReMa (Crisis Response Matcher), a systematic approach that integrates textual, temporal, and spatial features for multi-lingual request-offer matching. By leveraging CrisisTransformers, a set of pre-trained models specific to crises, and a cross-lingual embedding space, our methodology enhances the identification and matching tasks while outperforming strong baselines such as RoBERTa, MPNet, and BERTweet in classification tasks, and Universal Sentence Encoder and Sentence Transformers in crisis embedding generation tasks. We introduce a novel multi-lingual dataset that simulates scenarios of help-seeking and offering assistance on social media across the 16 most commonly used languages in Australia. We conduct comprehensive cross-lingual experiments across these 16 languages, while also examining trade-offs between multiple vector search strategies and accuracy. Additionally, we analyze a million-scale geotagged global dataset to understand patterns in seeking help and offering assistance on social media. Overall, these contributions advance the field of crisis informatics and provide benchmarks for future research in the area.
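The abstract describes matching requests to offers by combining textual, temporal, and spatial features. A minimal sketch of that general idea follows; it is not the authors' implementation. The `Post` class, the weighted-sum scoring, the weights, the decay scales, and the three-dimensional toy embeddings are all assumptions made for illustration. In practice the embeddings would come from a cross-lingual sentence encoder.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Post:
    """Hypothetical container for a social media post (illustration only)."""
    text: str
    embedding: Sequence[float]  # toy stand-in for a sentence embedding
    hours: float                # posting time, in hours since a reference point
    lat: float
    lon: float


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two latitude/longitude points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def match_score(request: Post, offer: Post,
                w_text: float = 0.6, w_time: float = 0.2, w_space: float = 0.2,
                time_scale_h: float = 24.0, dist_scale_km: float = 50.0) -> float:
    """Weighted sum of text similarity and exponentially decaying
    time and distance proximity. Weights and scales are assumed values."""
    text_sim = cosine(request.embedding, offer.embedding)
    time_sim = math.exp(-abs(request.hours - offer.hours) / time_scale_h)
    dist_km = haversine_km(request.lat, request.lon, offer.lat, offer.lon)
    space_sim = math.exp(-dist_km / dist_scale_km)
    return w_text * text_sim + w_time * time_sim + w_space * space_sim


def best_offer(request: Post, offers: Sequence[Post]) -> Post:
    """Return the offer with the highest combined score for the request."""
    return max(offers, key=lambda offer: match_score(request, offer))


# Toy data: one request near Melbourne, one nearby offer, one distant offer.
request = Post("Need drinking water", [1.0, 0.0, 0.2], 0.0, -37.81, 144.96)
offers = [
    Post("Bottled water available", [0.9, 0.1, 0.1], 2.0, -37.80, 144.97),
    Post("Spare room for evacuees", [0.1, 1.0, 0.0], 30.0, -33.87, 151.21),
]

print(best_offer(request, offers).text)
```

The exhaustive `max` over all offers is only practical for small candidate sets. At larger scale, an approximate vector index could pre-select textual candidates before the temporal and spatial terms are applied, which is one form of the search-strategy versus accuracy trade-off the abstract mentions.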
Related papers
- CLARINET: Augmenting Language Models to Ask Clarification Questions for Retrieval [52.134133938779776]
We present CLARINET, a system that asks informative clarification questions by choosing questions whose answers would maximize certainty in the correct candidate.
Our approach works by augmenting a large language model (LLM) to condition on a retrieval distribution, finetuning end-to-end to generate the question that would have maximized the rank of the true candidate at each turn.
arXiv Detail & Related papers (2024-04-28T18:21:31Z)
- Against The Achilles' Heel: A Survey on Red Teaming for Generative Models [60.21722603260243]
The field of red teaming is experiencing fast-paced growth, which highlights the need for a comprehensive organization covering the entire pipeline.
Our extensive survey, which examines over 120 papers, introduces a taxonomy of fine-grained attack strategies grounded in the inherent capabilities of language models.
We have developed the searcher framework that unifies various automatic red teaming approaches.
arXiv Detail & Related papers (2024-03-31T09:50:39Z)
- Semantically Enriched Cross-Lingual Sentence Embeddings for Crisis-related Social Media Texts [3.690904966341072]
Tasks such as semantic search and clustering on crisis-related social media texts enhance our comprehension of crisis discourse.
Pre-trained language models have advanced performance in crisis informatics, but their contextual embeddings lack semantic meaningfulness.
We propose multi-lingual sentence encoders that embed crisis-related social media texts for over 50 languages.
arXiv Detail & Related papers (2024-03-25T10:44:38Z)
- CrisisTransformers: Pre-trained language models and sentence encoders for crisis-related social media texts [3.690904966341072]
Social media platforms play an essential role in crisis communication, but analyzing crisis-related social media texts is challenging due to their informal nature.
This study introduces CrisisTransformers, an ensemble of pre-trained language models and sentence encoders trained on an extensive corpus of over 15 billion word tokens from tweets.
arXiv Detail & Related papers (2023-09-11T14:36:16Z)
- Coping with low data availability for social media crisis message categorisation [3.0255457622022495]
This thesis focuses on addressing the challenge of low data availability when categorising crisis messages for emergency response.
It first presents domain adaptation as a solution for this problem, which involves learning a categorisation model from annotated data from past crisis events.
In many-to-many adaptation, where the model is trained on multiple past events and adapted to multiple ongoing events, a multi-task learning approach is proposed.
arXiv Detail & Related papers (2023-05-26T19:08:24Z)
- CrisisLTLSum: A Benchmark for Local Crisis Event Timeline Extraction and Summarization [62.77066949111921]
This paper presents CrisisLTLSum, the largest dataset of local crisis event timelines available to date.
CrisisLTLSum contains 1,000 crisis event timelines across four domains: wildfires, local fires, traffic, and storms.
Our initial experiments indicate a significant gap between the performance of strong baselines and human performance on both tasks.
arXiv Detail & Related papers (2022-10-25T17:32:40Z)
- Cross-Lingual and Cross-Domain Crisis Classification for Low-Resource Scenarios [4.147346416230273]
We study the task of automatically classifying messages related to crisis events by leveraging cross-language and cross-domain labeled data.
Our goal is to make use of labeled data from high-resource languages to classify messages from other (low-resource) languages and/or of new (previously unseen) types of crisis situations.
Our empirical findings show that it is indeed possible to leverage data from crisis events in English to classify the same type of event in other languages, such as Spanish and Italian.
arXiv Detail & Related papers (2022-09-05T20:57:23Z)
- Cross-Lingual Query-Based Summarization of Crisis-Related Social Media: An Abstractive Approach Using Transformers [3.042890194004583]
This work proposes a cross-lingual method for retrieving and summarizing crisis-relevant information from social media postings.
We describe a uniform way of expressing various information needs through structured queries and a way of creating summaries.
arXiv Detail & Related papers (2022-04-21T16:07:52Z)
- BERTuit: Understanding Spanish language in Twitter through a native transformer [70.77033762320572]
We present BERTuit, the largest transformer proposed so far for the Spanish language, pre-trained on a massive dataset of 230M Spanish tweets.
Our motivation is to provide a powerful resource to better understand Spanish Twitter and to be used on applications focused on this social network.
arXiv Detail & Related papers (2022-04-07T14:28:51Z)
- Clustering of Social Media Messages for Humanitarian Aid Response during Crisis [47.187609203210705]
We show that recent advances in Deep Learning and Natural Language Processing outperform prior approaches for the task of classifying informativeness.
We extend these methods to two sub-tasks of informativeness and find that the Deep Learning methods are effective here as well.
arXiv Detail & Related papers (2020-07-23T02:18:05Z)
- Multimodal Categorization of Crisis Events in Social Media [81.07061295887172]
We present a new multimodal fusion method that leverages both images and texts as input.
In particular, we introduce a cross-attention module that can filter uninformative and misleading components from weak modalities.
We show that our method outperforms the unimodal approaches and strong multimodal baselines by a large margin on three crisis-related tasks.
arXiv Detail & Related papers (2020-04-10T06:31:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.