ADSumm: Annotated Ground-truth Summary Datasets for Disaster Tweet Summarization
- URL: http://arxiv.org/abs/2405.06551v1
- Date: Fri, 10 May 2024 15:49:01 GMT
- Title: ADSumm: Annotated Ground-truth Summary Datasets for Disaster Tweet Summarization
- Authors: Piyush Kumar Garg, Roshni Chakraborty, Sourav Kumar Dandapat
- Abstract summary: Existing disaster tweet summarization approaches provide a summary of disaster events to aid government agencies, humanitarian organizations, etc.
In this paper, we present ADSumm, which adds annotated ground-truth summaries for eight disaster events.
Our experimental analysis shows that the newly added datasets improve the performance of the supervised summarization approaches by 8-28% in terms of ROUGE-N F1-score.
- Score: 8.371475703337106
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Online social media platforms, such as Twitter, provide valuable information during disaster events. Existing disaster tweet summarization approaches provide a summary of these events to aid government agencies, humanitarian organizations, etc., in ensuring an effective disaster response. The literature contains two types of approaches for disaster summarization: supervised and unsupervised. Although supervised approaches are typically more effective, they require a sizable number of disaster event summaries for training and testing. However, few disaster summary datasets are available for training and evaluation. This motivates us to add more datasets to make supervised learning approaches more efficient. In this paper, we present ADSumm, which adds annotated ground-truth summaries for eight disaster events, covering both natural and man-made disasters from seven different countries. Our experimental analysis shows that the newly added datasets improve the performance of the supervised summarization approaches by 8-28% in terms of ROUGE-N F1-score. Moreover, in the newly annotated datasets, we have added a category label for each input tweet, which helps to ensure that the summary covers the different categories well. Additionally, we have added two other features, relevance label and key-phrase, which provide information about the quality of a tweet and an explanation for including the tweet in the summary, respectively. For ground-truth summary creation, we describe the adopted annotation procedure in detail, which has not been done in the existing literature. Experimental analysis shows that the quality of the ground-truth summaries is very good in terms of Coverage, Relevance and Diversity.
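The abstract reports gains in ROUGE-N F1-score without spelling out the metric. As a rough illustration, the sketch below computes ROUGE-N F1 from clipped n-gram overlap between a candidate summary and a reference summary. It assumes plain whitespace tokenization and lowercasing, with no stemming or stopword handling, so its values may differ from those of the official ROUGE toolkit or from the paper's own evaluation setup. The function names and the two example sentences are hypothetical and are not taken from the ADSumm datasets.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(candidate, reference, n=1):
    """Simplified ROUGE-N F1 (assumption: lowercase + whitespace tokens only)."""
    cand = ngram_counts(candidate.lower().split(), n)
    ref = ngram_counts(reference.lower().split(), n)
    # Clipped overlap: each n-gram counts at most as often as it occurs in both.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical sentences, for illustration only.
system_summary = "roads blocked near the river"
ground_truth = "roads blocked near the bridge"

rouge_1 = rouge_n_f1(system_summary, ground_truth, n=1)
rouge_2 = rouge_n_f1(system_summary, ground_truth, n=2)
print(f"ROUGE-1 F1: {rouge_1:.2f}, ROUGE-2 F1: {rouge_2:.2f}")
```

Here four of the five unigrams and three of the four bigrams match, so precision and recall are equal for each n and the F1-score equals that shared value.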
Related papers
- CrisisSense-LLM: Instruction Fine-Tuned Large Language Model for Multi-label Social Media Text Classification in Disaster Informatics [49.2719253711215]
This study introduces a novel approach to disaster text classification by enhancing a pre-trained Large Language Model (LLM).
Our methodology involves creating a comprehensive instruction dataset from disaster-related tweets, which is then used to fine-tune an open-source LLM.
This fine-tuned model can classify multiple aspects of disaster-related information simultaneously, such as the type of event, informativeness, and involvement of human aid.
arXiv Detail & Related papers (2024-06-16T23:01:10Z)
- Multi-Query Focused Disaster Summarization via Instruction-Based Prompting [3.6199702611839792]
CrisisFACTS aims to advance disaster summarization based on multi-stream fact-finding.
Here, participants are asked to develop systems that can extract key facts from several disaster-related events.
This paper describes our method to tackle this challenging task.
arXiv Detail & Related papers (2024-02-14T08:22:58Z)
- CrisisMatch: Semi-Supervised Few-Shot Learning for Fine-Grained Disaster Tweet Classification [51.58605842457186]
We present a fine-grained disaster tweet classification model under the semi-supervised, few-shot learning setting.
Our model, CrisisMatch, effectively classifies tweets into fine-grained classes of interest using few labeled data and large amounts of unlabeled data.
arXiv Detail & Related papers (2023-10-23T07:01:09Z)
- DeCrisisMB: Debiased Semi-Supervised Learning for Crisis Tweet Classification via Memory Bank [52.20298962359658]
In crisis events, people often use social media platforms such as Twitter to disseminate information about the situation, warnings, advice, and support.
Fully-supervised approaches require annotating vast amounts of data and are impractical due to limited response time.
Semi-supervised models can be biased, performing moderately well for certain classes while performing extremely poorly for others.
We propose a simple but effective debiasing method, DeCrisisMB, that utilizes a Memory Bank to store and perform equal sampling for generated pseudo-labels from each class at each training iteration.
arXiv Detail & Related papers (2023-10-23T05:25:51Z)
- IKDSumm: Incorporating Key-phrases into BERT for extractive Disaster Tweet Summarization [5.299958874647294]
We propose a disaster-specific tweet summarization framework, IKDSumm.
IKDSumm identifies the crucial and important information from each tweet related to a disaster through key-phrases of that tweet.
We utilize these key-phrases to automatically generate a summary of the tweets.
arXiv Detail & Related papers (2023-05-19T11:05:55Z)
- PORTRAIT: a hybrid aPproach tO cReate extractive ground-TRuth summAry for dIsaster evenT [5.386050544766801]
Disaster summarization approaches provide an overview of the important information posted during disaster events on social media platforms, such as Twitter.
We propose a hybrid (semi-automated) approach (PORTRAIT) where we partly automate the ground-truth summary generation procedure.
We validate the effectiveness of PORTRAIT on 5 disaster events through quantitative and qualitative comparisons of ground-truth summaries generated by existing intuitive approaches, a semi-automated approach, and PORTRAIT.
arXiv Detail & Related papers (2023-05-19T09:07:52Z)
- SumREN: Summarizing Reported Speech about Events in News [51.82314543729287]
We propose the novel task of summarizing the reactions of different speakers, as expressed by their reported statements, to a given event.
We create a new multi-document summarization benchmark, SUMREN, comprising 745 summaries of reported statements from various public figures.
arXiv Detail & Related papers (2022-12-02T12:51:39Z)
- Event-Related Bias Removal for Real-time Disaster Events [67.2965372987723]
Social media has become an important tool to share information about crisis events such as natural disasters and mass attacks.
Detecting actionable posts that contain useful information requires rapid analysis of huge volumes of data in real time.
We train an adversarial neural model to remove latent event-specific biases and improve the performance on tweet importance classification.
arXiv Detail & Related papers (2020-11-02T02:03:07Z)
- Few-Shot Learning for Opinion Summarization [117.70510762845338]
Opinion summarization is the automatic creation of text reflecting subjective information expressed in multiple documents.
In this work, we show that even a handful of summaries is sufficient to bootstrap generation of the summary text.
Our approach substantially outperforms previous extractive and abstractive methods in automatic and human evaluation.
arXiv Detail & Related papers (2020-04-30T15:37:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.