Zero-Shot Classification of Crisis Tweets Using Instruction-Finetuned Large Language Models
- URL: http://arxiv.org/abs/2410.00182v1
- Date: Mon, 30 Sep 2024 19:33:58 GMT
- Title: Zero-Shot Classification of Crisis Tweets Using Instruction-Finetuned Large Language Models
- Authors: Emma McDaniel, Samuel Scheele, Jeff Liu,
- Abstract summary: We assess three commercial large language models in zero-shot classification of short social media posts.
The models are asked to perform two classification tasks: 1) identify if the post is informative in a humanitarian context; and 2) rank and provide probabilities for the post in relation to 16 possible humanitarian classes.
Results are evaluated using macro, weighted, and binary F1-scores.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Social media posts are frequently identified as a valuable source of open-source intelligence for disaster response, and pre-LLM NLP techniques have been evaluated on datasets of crisis tweets. We assess three commercial large language models (OpenAI GPT-4o, Gemini 1.5-flash-001 and Anthropic Claude-3-5 Sonnet) capabilities in zero-shot classification of short social media posts. In one prompt, the models are asked to perform two classification tasks: 1) identify if the post is informative in a humanitarian context; and 2) rank and provide probabilities for the post in relation to 16 possible humanitarian classes. The posts being classified are from the consolidated crisis tweet dataset, CrisisBench. Results are evaluated using macro, weighted, and binary F1-scores. The informative classification task, generally performed better without extra information, while for the humanitarian label classification providing the event that occurred during which the tweet was mined, resulted in better performance. Further, we found that the models have significantly varying performance by dataset, which raises questions about dataset quality.
Related papers
- ThangDLU at #SMM4H 2024: Encoder-decoder models for classifying text data on social disorders in children and adolescents [49.00494558898933]
This paper describes our participation in Task 3 and Task 5 of the #SMM4H (Social Media Mining for Health) 2024 Workshop.
Task 3 is a multi-class classification task centered on tweets discussing the impact of outdoor environments on symptoms of social anxiety.
Task 5 involves a binary classification task focusing on tweets reporting medical disorders in children.
We applied transfer learning from pre-trained encoder-decoder models such as BART-base and T5-small to identify the labels of a set of given tweets.
arXiv Detail & Related papers (2024-04-30T17:06:20Z) - CrisisMatch: Semi-Supervised Few-Shot Learning for Fine-Grained Disaster
Tweet Classification [51.58605842457186]
We present a fine-grained disaster tweet classification model under the semi-supervised, few-shot learning setting.
Our model, CrisisMatch, effectively classifies tweets into fine-grained classes of interest using few labeled data and large amounts of unlabeled data.
arXiv Detail & Related papers (2023-10-23T07:01:09Z) - Unsupervised Sentiment Analysis of Plastic Surgery Social Media Posts [91.3755431537592]
The massive collection of user posts across social media platforms is primarily untapped for artificial intelligence (AI) use cases.
Natural language processing (NLP) is a subfield of AI that leverages bodies of documents, known as corpora, to train computers in human-like language understanding.
This study demonstrates that the applied results of unsupervised analysis allow a computer to predict either negative, positive, or neutral user sentiment towards plastic surgery.
arXiv Detail & Related papers (2023-07-05T20:16:20Z) - Coping with low data availability for social media crisis message
categorisation [3.0255457622022495]
This thesis focuses on addressing the challenge of low data availability when categorising crisis messages for emergency response.
It first presents domain adaptation as a solution for this problem, which involves learning a categorisation model from annotated data from past crisis events.
In many-to-many adaptation, where the model is trained on multiple past events and adapted to multiple ongoing events, a multi-task learning approach is proposed.
arXiv Detail & Related papers (2023-05-26T19:08:24Z) - AnnoLLM: Making Large Language Models to Be Better Crowdsourced Annotators [98.11286353828525]
GPT-3.5 series models have demonstrated remarkable few-shot and zero-shot ability across various NLP tasks.
We propose AnnoLLM, which adopts a two-step approach, explain-then-annotate.
We build the first conversation-based information retrieval dataset employing AnnoLLM.
arXiv Detail & Related papers (2023-03-29T17:03:21Z) - Enhancing Crisis-Related Tweet Classification with Entity-Masked
Language Modeling and Multi-Task Learning [0.30458514384586394]
We propose a combination of entity-masked language modeling and hierarchical multi-label classification as a multi-task learning problem.
We evaluate our method on tweets from the TREC-IS dataset and show an absolute performance gain w.r.t. F1-score of up to 10% for actionable information types.
arXiv Detail & Related papers (2022-11-21T13:54:10Z) - Distant finetuning with discourse relations for stance classification [55.131676584455306]
We propose a new method to extract data with silver labels from raw text to finetune a model for stance classification.
We also propose a 3-stage training framework where the noisy level in the data used for finetuning decreases over different stages.
Our approach ranks 1st among 26 competing teams in the stance classification track of the NLPCC 2021 shared task Argumentative Text Understanding for AI Debater.
arXiv Detail & Related papers (2022-04-27T04:24:35Z) - ZeroBERTo -- Leveraging Zero-Shot Text Classification by Topic Modeling [57.80052276304937]
This paper proposes a new model, ZeroBERTo, which leverages an unsupervised clustering step to obtain a compressed data representation before the classification task.
We show that ZeroBERTo has better performance for long inputs and shorter execution time, outperforming XLM-R by about 12% in the F1 score in the FolhaUOL dataset.
arXiv Detail & Related papers (2022-01-04T20:08:17Z) - I-AID: Identifying Actionable Information from Disaster-related Tweets [0.0]
Social media plays a significant role in disaster management by providing valuable data about affected people, donations and help requests.
We propose I-AID, a multimodel approach to automatically categorize tweets into multi-label information types.
Our results indicate that I-AID outperforms state-of-the-art approaches in terms of weighted average F1 score by +6% and +4% on the TREC-IS dataset and COVID-19 Tweets, respectively.
arXiv Detail & Related papers (2020-08-04T19:07:50Z) - CrisisBench: Benchmarking Crisis-related Social Media Datasets for
Humanitarian Information Processing [13.11283003017537]
We consolidate eight human-annotated datasets and provide 166.1k and 141.5k tweets for textitinformativeness and textithumanitarian classification tasks.
We provide benchmarks for both binary and multiclass classification tasks using several deep learning architecrures including, CNN, fastText, and transformers.
arXiv Detail & Related papers (2020-04-14T19:51:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.