Leveraging Event Specific and Chunk Span features to Extract COVID
Events from tweets
- URL: http://arxiv.org/abs/2012.10052v1
- Date: Fri, 18 Dec 2020 04:49:32 GMT
- Title: Leveraging Event Specific and Chunk Span features to Extract COVID
Events from tweets
- Authors: Ayush Kaushal and Tejas Vaidhya
- Abstract summary: We describe our system entry for WNUT 2020 Shared Task-3.
The task was aimed at automating the extraction of a variety of COVID-19 related events from Twitter.
The system ranks 1st on the leaderboard with an F1 of 0.6598, without using any ensembles or additional datasets.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Twitter has acted as an important source of information during disasters and
pandemics, especially during the times of COVID-19. In this paper, we describe
our system entry for WNUT 2020 Shared Task-3. The task was aimed at automating
the extraction of a variety of COVID-19 related events from Twitter, such as
individuals who recently contracted the virus, people with symptoms who were
denied testing, and believed remedies against the infection. The system consists
of separate multi-task models for slot-filling subtasks and
sentence-classification subtasks while leveraging the useful sentence-level
information for the corresponding event. The system uses COVID-Twitter-BERT
with attention-weighted pooling of candidate slot-chunk features to capture the
useful information chunks. The system ranks 1st on the leaderboard with an F1 of
0.6598, without using any ensembles or additional datasets. The code and
trained models are available at this https URL.
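The attention-weighted pooling of candidate slot-chunk features can be made concrete with a short sketch. Below is a minimal PyTorch illustration, not the authors' released code: the class name ChunkAttentionPooler, the chunk_mask input, and the two-way chunk classifier are assumptions made for illustration, and the 1024-dimensional hidden size simply matches COVID-Twitter-BERT (a BERT-large model).

```python
import torch
import torch.nn as nn

class ChunkAttentionPooler(nn.Module):
    """Attention-weighted pooling over a candidate slot chunk, scored
    together with the sentence-level [CLS] state (illustrative sketch)."""

    def __init__(self, hidden_size: int = 1024):
        super().__init__()
        self.attn = nn.Linear(hidden_size, 1)            # one attention logit per token
        self.classifier = nn.Linear(2 * hidden_size, 2)  # [chunk ; CLS] -> slot yes/no

    def forward(self, hidden_states, chunk_mask, cls_state):
        # hidden_states: (batch, seq_len, hidden) token states from the encoder
        # chunk_mask:    (batch, seq_len), 1.0 on tokens of the candidate chunk
        # cls_state:     (batch, hidden) sentence-level [CLS] representation
        logits = self.attn(hidden_states).squeeze(-1)       # (batch, seq_len)
        logits = logits.masked_fill(chunk_mask == 0, -1e9)  # keep attention inside the chunk
        weights = torch.softmax(logits, dim=-1)             # (batch, seq_len)
        chunk_vec = torch.bmm(weights.unsqueeze(1), hidden_states).squeeze(1)
        return self.classifier(torch.cat([chunk_vec, cls_state], dim=-1))

# Toy usage with random tensors standing in for COVID-Twitter-BERT output.
batch, seq_len, hidden = 2, 16, 1024
pooler = ChunkAttentionPooler(hidden)
states = torch.randn(batch, seq_len, hidden)
mask = torch.zeros(batch, seq_len)
mask[:, 3:6] = 1.0                          # tokens 3-5 form the candidate slot chunk
scores = pooler(states, mask, states[:, 0])  # token 0 plays the role of [CLS]
print(scores.shape)                          # torch.Size([2, 2])
```

Masking non-chunk positions before the softmax confines the attention weights to the candidate span, so the pooled vector summarizes only that chunk, while concatenating the [CLS] state supplies the sentence-level information mentioned above.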
Related papers
- Grounding Partially-Defined Events in Multimodal Data [61.0063273919745]
We introduce a multimodal formulation for partially-defined events and cast the extraction of these events as a three-stage span retrieval task.
We propose a benchmark for this task, MultiVENT-G, that consists of 14.5 hours of densely annotated current event videos and 1,168 text documents, containing 22.8K labeled event-centric entities.
Results illustrate the challenges that abstract event understanding poses and demonstrate promise in event-centric video-language systems.
arXiv Detail & Related papers (2024-10-07T17:59:48Z) - ThangDLU at #SMM4H 2024: Encoder-decoder models for classifying text data on social disorders in children and adolescents [49.00494558898933]
This paper describes our participation in Task 3 and Task 5 of the #SMM4H (Social Media Mining for Health) 2024 Workshop.
Task 3 is a multi-class classification task centered on tweets discussing the impact of outdoor environments on symptoms of social anxiety.
Task 5 involves a binary classification task focusing on tweets reporting medical disorders in children.
We applied transfer learning from pre-trained encoder-decoder models such as BART-base and T5-small to identify the labels of a set of given tweets.
arXiv Detail & Related papers (2024-04-30T17:06:20Z) - CrisisMatch: Semi-Supervised Few-Shot Learning for Fine-Grained Disaster
Tweet Classification [51.58605842457186]
We present a fine-grained disaster tweet classification model under the semi-supervised, few-shot learning setting.
Our model, CrisisMatch, effectively classifies tweets into fine-grained classes of interest using few labeled data and large amounts of unlabeled data.
arXiv Detail & Related papers (2023-10-23T07:01:09Z) - On the Exploitability of Instruction Tuning [103.8077787502381]
In this work, we investigate how an adversary can exploit instruction tuning to change a model's behavior.
We propose AutoPoison, an automated data poisoning pipeline.
Our results show that AutoPoison allows an adversary to change a model's behavior by poisoning only a small fraction of data.
arXiv Detail & Related papers (2023-06-28T17:54:04Z) - Task Compass: Scaling Multi-task Pre-training with Task Prefix [122.49242976184617]
Existing studies show that multi-task learning with large-scale supervised tasks suffers from negative effects across tasks.
We propose a task prefix guided multi-task pre-training framework to explore the relationships among tasks.
Our model can not only serve as the strong foundation backbone for a wide range of tasks but also be feasible as a probing tool for analyzing task relationships.
arXiv Detail & Related papers (2022-10-12T15:02:04Z) - Unifying Event Detection and Captioning as Sequence Generation via
Pre-Training [53.613265415703815]
We propose a unified pre-training and fine-tuning framework to enhance the inter-task association between event detection and captioning.
Our model outperforms the state-of-the-art methods, and can be further boosted when pre-trained on extra large-scale video-text data.
arXiv Detail & Related papers (2022-07-18T14:18:13Z) - DiPD: Disruptive event Prediction Dataset from Twitter [0.0]
Riots and protests, if they go out of control, can cause havoc in a country.
This dataset collects tweets of past or ongoing events known to have caused disruption.
It contains 94,855 records of unique events and 168,706 records of unique non-events.
arXiv Detail & Related papers (2021-11-25T13:16:21Z) - NIT COVID-19 at WNUT-2020 Task 2: Deep Learning Model RoBERTa for
Identify Informative COVID-19 English Tweets [0.0]
This paper presents the model submitted by the NIT_COVID-19 team for identifying informative COVID-19 English tweets at WNUT-2020 Task 2.
The performance achieved by the proposed model on the shared task is 89.14% in the F1-score metric.
arXiv Detail & Related papers (2020-11-11T05:20:39Z) - TEST_POSITIVE at W-NUT 2020 Shared Task-3: Joint Event Multi-task
Learning for Slot Filling in Noisy Text [26.270447944466557]
We propose the Joint Event Multi-task Learning (JOELIN) model for extracting COVID-19 events from Twitter.
Through a unified global learning framework, we make use of all the training data across different events to learn and fine-tune the language model.
We implement a type-aware post-processing procedure using named entity recognition (NER) to further filter the predictions.
arXiv Detail & Related papers (2020-09-29T19:08:45Z) - Characterizing drug mentions in COVID-19 Twitter Chatter [1.2400116527089997]
In this work, we mined a large Twitter dataset of 424 million tweets of COVID-19 chatter to identify discourse around drug mentions.
While seemingly a straightforward task, due to the informal nature of language use on Twitter, we demonstrate the need for machine learning alongside traditional automated methods to aid in this task.
We are able to recover almost 15% additional data, making misspelling handling a necessary pre-processing step when dealing with social media data.
arXiv Detail & Related papers (2020-07-20T16:56:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.