5q032e@SMM4H'22: Transformer-based classification of premise in tweets related to COVID-19
- URL: http://arxiv.org/abs/2209.03851v2
- Date: Sun, 15 Oct 2023 08:42:33 GMT
- Title: 5q032e@SMM4H'22: Transformer-based classification of premise in tweets related to COVID-19
- Authors: Vadim Porvatov, Natalia Semenova
- Abstract summary: We propose a predictive model based on the transformer architecture to classify the presence of premise in Twitter texts.
Our experiments on a Twitter dataset showed that RoBERTa outperforms the other transformer models on the premise prediction task.
- Score: 2.3931689873603603
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automation of social network data assessment is one of the classic
challenges of natural language processing. During the COVID-19 pandemic, mining
people's stances from public messages has become crucial for understanding
attitudes towards health orders. In this paper, the authors propose a
predictive model based on the transformer architecture to classify the
presence of premise in Twitter texts. This work was completed as part of the
Social Media Mining for Health (SMM4H) Workshop 2022. We explored modern
transformer-based classifiers in order to construct a pipeline that
efficiently captures tweet semantics. Our experiments on a Twitter dataset
showed that RoBERTa outperforms the other transformer models on the premise
prediction task. The model achieved competitive performance, with a ROC AUC of
0.807 and an F1 score of 0.7648.
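As a concrete illustration of the pipeline described above, here is a minimal sketch of fine-tuning RoBERTa for binary premise classification with the Hugging Face transformers library. The checkpoint name, hyperparameters, and toy examples are illustrative assumptions, not the paper's exact setup.

```python
# Minimal sketch: fine-tuning RoBERTa for binary premise classification.
# Assumes the generic "roberta-base" checkpoint and a toy in-memory dataset;
# the paper's actual data splits and hyperparameters are not reproduced here.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=2)

# Toy examples: 1 = tweet contains a premise, 0 = it does not.
texts = ["Masks reduce transmission, so mandates are justified.",
         "I just hate wearing a mask all day."]
labels = torch.tensor([1, 0])

batch = tokenizer(texts, padding=True, truncation=True,
                  max_length=128, return_tensors="pt")

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for epoch in range(3):  # a few epochs usually suffice for fine-tuning
    optimizer.zero_grad()
    out = model(**batch, labels=labels)  # cross-entropy computed internally
    out.loss.backward()
    optimizer.step()

model.eval()
with torch.no_grad():
    probs = torch.softmax(model(**batch).logits, dim=-1)  # class probabilities
print(probs)
```

In practice the model would be trained on the SMM4H premise dataset with a held-out split and evaluated with ROC AUC and F1, the metrics reported above.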
Related papers
- ThangDLU at #SMM4H 2024: Encoder-decoder models for classifying text data on social disorders in children and adolescents [49.00494558898933]
This paper describes our participation in Task 3 and Task 5 of the #SMM4H (Social Media Mining for Health) 2024 Workshop.
Task 3 is a multi-class classification task centered on tweets discussing the impact of outdoor environments on symptoms of social anxiety.
Task 5 involves a binary classification task focusing on tweets reporting medical disorders in children.
We applied transfer learning from pre-trained encoder-decoder models such as BART-base and T5-small to identify the labels of a set of given tweets.
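As a hedged sketch of how an encoder-decoder model can classify text by generating the label as a string (the "t5-small" checkpoint, the task prefix, and the label verbalizations are assumptions for illustration, not the authors' exact configuration):

```python
# Sketch: classifying a tweet with an encoder-decoder model (T5) by
# generating the label as text. The checkpoint, prompt prefix, and label
# strings are illustrative assumptions.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

tweet = "Being outdoors in crowded parks makes my anxiety so much worse."
inputs = tokenizer("classify: " + tweet, return_tensors="pt")

# During fine-tuning the targets would be label strings such as
# "positive" / "negative"; at inference the generated text is mapped
# back to a class label.
ids = model.generate(**inputs, max_new_tokens=4)
print(tokenizer.decode(ids[0], skip_special_tokens=True))
```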
arXiv Detail & Related papers (2024-04-30T17:06:20Z)
- Text Augmentations with R-drop for Classification of Tweets Self Reporting Covid-19 [28.91836510067532]
This paper presents models created for the Social Media Mining for Health 2023 shared task.
Our approach involves a classification model that incorporates diverse textual augmentations.
Our system achieves an F1 score of 0.877 on the test set.
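R-drop itself is a simple consistency regularizer: the same batch is passed through the model twice with dropout active, and a symmetric KL term penalizes disagreement between the two predictive distributions. A minimal, model-agnostic sketch follows; the weighting coefficient alpha is an illustrative assumption, as it is typically tuned per task.

```python
# Sketch of an R-drop style consistency loss: two stochastic forward
# passes (the model must be in train mode so the dropout masks differ)
# plus a symmetric KL penalty. The alpha weight is an assumption.
import torch.nn.functional as F

def r_drop_loss(model, batch, labels, alpha=1.0):
    out1 = model(**batch, labels=labels)  # first pass, dropout mask A
    out2 = model(**batch, labels=labels)  # second pass, dropout mask B
    ce = 0.5 * (out1.loss + out2.loss)    # average cross-entropy

    p = F.log_softmax(out1.logits, dim=-1)
    q = F.log_softmax(out2.logits, dim=-1)
    # Symmetric KL divergence between the two predictive distributions.
    kl = 0.5 * (F.kl_div(p, q, log_target=True, reduction="batchmean")
                + F.kl_div(q, p, log_target=True, reduction="batchmean"))
    return ce + alpha * kl
```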
arXiv Detail & Related papers (2023-11-06T14:18:16Z)
- Understanding writing style in social media with a supervised contrastively pre-trained transformer [57.48690310135374]
Online Social Networks serve as fertile ground for harmful behavior, ranging from hate speech to the dissemination of disinformation.
We introduce the Style Transformer for Authorship Representations (STAR), trained on a large corpus derived from public sources of 4.5 x 10^6 authored texts.
Using a support base of 8 documents of 512 tokens, we can discern authors from sets of up to 1616 authors with at least 80% accuracy.
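The supervised contrastive pre-training signal behind STAR can be sketched as follows: embeddings of texts by the same author are pulled together and those by different authors pushed apart. This is a generic SupCon-style loss for illustration, not the paper's exact objective; the temperature value is an assumption.

```python
# Sketch of a supervised contrastive (SupCon-style) loss over author
# labels. Generic illustration, not the exact STAR objective.
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(embeddings, author_ids, temperature=0.1):
    z = F.normalize(embeddings, dim=-1)        # (N, d) unit vectors
    sim = z @ z.T / temperature                # pairwise similarities
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float("-inf"))  # drop self-pairs

    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos = (author_ids.unsqueeze(0) == author_ids.unsqueeze(1)) & ~self_mask
    # Average log-probability over each anchor's positive pairs.
    masked_lp = log_prob.masked_fill(~pos, 0.0)  # keep positives only
    loss = -masked_lp.sum(1) / pos.sum(1).clamp(min=1)
    return loss.mean()
```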
arXiv Detail & Related papers (2023-10-17T09:01:17Z)
- How to Estimate Model Transferability of Pre-Trained Speech Models? [84.11085139766108]
"Score-based assessment" framework for estimating transferability of pre-trained speech models.
We leverage upon two representation theories, Bayesian likelihood estimation and optimal transport, to generate rank scores for the PSM candidates.
Our framework efficiently computes transferability scores without actual fine-tuning of candidate models or layers.
arXiv Detail & Related papers (2023-06-01T04:52:26Z)
- TEDB System Description to a Shared Task on Euphemism Detection 2022 [0.0]
We considered Transformer-based models, which are the current state-of-the-art methods for text classification.
Our best result, an F1 score of 0.816, uses a TimeLMs-pretrained RoBERTa model fine-tuned for euphemism detection as a feature extractor.
arXiv Detail & Related papers (2023-01-16T20:37:56Z)
- BJTU-WeChat's Systems for the WMT22 Chat Translation Task [66.81525961469494]
This paper introduces the joint submission of the Beijing Jiaotong University and WeChat AI to the WMT'22 chat translation task for English-German.
Based on the Transformer, we apply several effective variants.
Our systems achieve 0.810 and 0.946 COMET scores.
arXiv Detail & Related papers (2022-11-28T02:35:04Z)
- Transformers for prompt-level EMA non-response prediction [62.41658786277712]
Ecological Momentary Assessments (EMAs) are an important psychological data source for measuring cognitive states, affect, behavior, and environmental factors.
Non-response, in which participants fail to respond to EMA prompts, is an endemic problem.
The ability to accurately predict non-response could be utilized to improve EMA delivery and develop compliance interventions.
arXiv Detail & Related papers (2021-11-01T18:38:47Z)
- Understanding Transformers for Bot Detection in Twitter [0.0]
We focus on bot detection in Twitter, a key task to mitigate and counteract the automatic spreading of disinformation and bias in social media.
We investigate the use of pre-trained language models to detect whether a tweet was generated by a bot or a human account based exclusively on the tweet's content.
We observe that fine-tuning generative transformers on a bot detection task produces higher accuracies.
arXiv Detail & Related papers (2021-04-13T13:32:55Z)
- Utilizing Self-supervised Representations for MOS Prediction [51.09985767946843]
Existing evaluations usually require clean references or parallel ground truth data.
Subjective tests, on the other hand, do not need any additional clean or parallel data and correlate better with human perception.
We develop an automatic evaluation approach that correlates well with human perception while not requiring ground truth data.
arXiv Detail & Related papers (2021-04-07T09:44:36Z)
- COVID-19 Tweets Analysis through Transformer Language Models [0.0]
In this study, we perform an in-depth, fine-grained sentiment analysis of COVID-19-related tweets.
A trained transformer model is able to correctly predict, with high accuracy, the tone of a tweet.
We then leverage this model for predicting tones for 200,000 tweets on COVID-19.
arXiv Detail & Related papers (2021-02-27T12:06:33Z)
- COVID-Twitter-BERT: A Natural Language Processing Model to Analyse COVID-19 Content on Twitter [0.0]
We release COVID-Twitter-BERT (CT-BERT), a transformer-based model, pretrained on a large corpus of Twitter messages on the topic of COVID-19.
Our model shows a 10-30% marginal improvement compared to its base model, BERT-Large, on five different classification datasets.
CT-BERT is optimised to be used on COVID-19 content, in particular social media posts from Twitter.
arXiv Detail & Related papers (2020-05-15T12:40:46Z)
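To experiment with a domain-specific checkpoint such as CT-BERT in a pipeline like the one sketched earlier, the model can be loaded by name from the Hugging Face hub. The identifier below is an assumption based on the public CT-BERT release and should be verified against the official repository.

```python
# Sketch: loading a COVID-Twitter pretrained checkpoint for classification.
# The model id is assumed from the public CT-BERT release; verify before use.
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "digitalepidemiologylab/covid-twitter-bert-v2"  # assumed id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(
    model_id, num_labels=2)  # classification head starts untrained

batch = tokenizer(["Vaccines are now available at local pharmacies."],
                  padding=True, truncation=True, return_tensors="pt")
logits = model(**batch).logits  # fine-tune before trusting these scores
```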