Improving Health Mentioning Classification of Tweets using Contrastive
Adversarial Training
- URL: http://arxiv.org/abs/2203.01895v1
- Date: Thu, 3 Mar 2022 18:20:51 GMT
- Title: Improving Health Mentioning Classification of Tweets using Contrastive
Adversarial Training
- Authors: Pervaiz Iqbal Khan, Shoaib Ahmed Siddiqui, Imran Razzak, Andreas
Dengel, and Sheraz Ahmed
- Abstract summary: We learn word representations from their surrounding words and utilize emojis in the text to help improve the classification results.
We generate adversarial examples by perturbing the embeddings of the model and then train the model on pairs of clean and adversarial examples.
Experiments show an improvement of 1.0% over the BERT-Large baseline, 0.6% over the RoBERTa-Large baseline, and 5.8% over the state of the art in terms of F1 score.
- Score: 6.586675643422952
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Health mentioning classification (HMC) classifies an input text as a
health mention or not. Figurative and non-health mentions of disease words make
the classification task challenging. Learning the context of the input text is
key to this problem. The idea is to learn a word's representation from its
surrounding words and to utilize emojis in the text to help improve the
classification results. In this paper, we improve the word representation of
the input text using adversarial training that acts as a regularizer during
fine-tuning of the model. We generate adversarial examples by perturbing the
embeddings of the model and then train the model on a pair of clean and
adversarial examples. Additionally, we utilize a contrastive loss that pulls a
pair of clean and perturbed examples close to each other and pushes other
examples away in the representation space. We train and evaluate the method on
an extended version of the publicly available PHM2017 dataset. Experiments show
an improvement of 1.0% over the BERT-Large baseline, 0.6% over the
RoBERTa-Large baseline, and 5.8% over the state of the art in terms of F1 score.
Furthermore, we provide a brief analysis of the results by utilizing the power
of explainable AI.
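To make the training scheme concrete, below is a minimal PyTorch-style sketch of the pipeline the abstract describes: perturb the token embeddings along the loss gradient to obtain an adversarial example, fine-tune on the clean/adversarial pair, and add a contrastive term that pulls each clean/perturbed pair together while pushing other examples apart. All names here (model.embed, classify_from_embeddings, epsilon, tau, lam) are illustrative assumptions, not the authors' released code.

```python
# Hedged sketch of contrastive adversarial fine-tuning, under the
# assumptions stated above; not the paper's actual implementation.
import torch
import torch.nn.functional as F

def embedding_perturbation(emb, loss, epsilon=1e-2):
    """FGM-style step: gradient of the loss w.r.t. the token embeddings,
    rescaled to norm epsilon. `emb` must participate in the loss graph."""
    grad = torch.autograd.grad(loss, emb, retain_graph=True)[0]
    return epsilon * grad / (grad.norm(dim=-1, keepdim=True) + 1e-12)

def contrastive_loss(z_clean, z_adv, tau=0.1):
    """NT-Xent-style loss: each example's positive is its own perturbed
    counterpart; every other example in the batch acts as a negative."""
    z = F.normalize(torch.cat([z_clean, z_adv]), dim=-1)
    n = z_clean.size(0)
    sim = z @ z.t() / tau
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float("-inf"))   # exclude self-similarity
    # positive of example i is i + n, and vice versa
    targets = torch.cat([torch.arange(n, 2 * n, device=z.device),
                         torch.arange(0, n, device=z.device)])
    return F.cross_entropy(sim, targets)

def training_step(model, batch, labels, epsilon=1e-2, tau=0.1, lam=1.0):
    # `model.embed` and `model.classify_from_embeddings` are hypothetical
    # helpers standing in for a BERT/RoBERTa encoder with a pooled output.
    emb = model.embed(batch)                          # (B, L, d) embeddings
    logits_clean, z_clean = model.classify_from_embeddings(emb)
    loss_clean = F.cross_entropy(logits_clean, labels)

    # adversarial pass on perturbed embeddings
    delta = embedding_perturbation(emb, loss_clean, epsilon)
    logits_adv, z_adv = model.classify_from_embeddings(emb + delta)
    loss_adv = F.cross_entropy(logits_adv, labels)

    # joint objective: clean + adversarial + contrastive regularizer
    return loss_clean + loss_adv + lam * contrastive_loss(z_clean, z_adv, tau)
```

The sketch fixes only the structure implied by the abstract: one clean forward pass, one pass on perturbed embeddings, and a joint objective; actual perturbation norms and loss weights would follow the paper's hyperparameters.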
Related papers
- Contrastive Multi-graph Learning with Neighbor Hierarchical Sifting for Semi-supervised Text Classification [16.75801747622402]
We propose a novel method of contrastive multi-graph learning with neighbor hierarchical sifting for semi-supervised text classification.
Specifically, we exploit core features to form a multi-relational text graph, enhancing semantic connections among texts.
Experiments on the ThuCNews, SogouNews, 20 Newsgroups, and Ohsumed datasets achieve 95.86%, 97.52%, 87.43%, and 70.65%, respectively, demonstrating competitive results in semi-supervised text classification.
arXiv Detail & Related papers (2024-11-25T08:35:55Z) - BERT Goes Off-Topic: Investigating the Domain Transfer Challenge using
Genre Classification [0.27195102129095]
We show that classification tasks still suffer from a performance gap when the underlying distribution of topics changes.
We quantify this phenomenon empirically with a large corpus and a large set of topics.
We suggest and successfully test a possible remedy: after augmenting the training dataset with topically-controlled synthetic texts, the F1 score improves by up to 50% for some topics.
arXiv Detail & Related papers (2023-11-27T18:53:31Z) - Word-Level Explanations for Analyzing Bias in Text-to-Image Models [72.71184730702086]
Text-to-image (T2I) models can generate images that underrepresent minorities based on race and sex.
This paper investigates which word in the input prompt is responsible for bias in generated images.
arXiv Detail & Related papers (2023-06-03T21:39:07Z) - Fine-Grained Human Feedback Gives Better Rewards for Language Model
Training [108.25635150124539]
Language models (LMs) often exhibit undesirable text generation behaviors, including generating false, toxic, or irrelevant outputs.
We introduce Fine-Grained RLHF, a framework that enables training and learning from reward functions that are fine-grained in two respects.
arXiv Detail & Related papers (2023-06-02T17:11:37Z) - Verifying the Robustness of Automatic Credibility Assessment [50.55687778699995]
We show that meaning-preserving changes in input text can mislead the models.
We also introduce BODEGA: a benchmark for testing both victim models and attack methods on misinformation detection tasks.
Our experimental results show that modern large language models are often more vulnerable to attacks than previous, smaller solutions.
arXiv Detail & Related papers (2023-03-14T16:11:47Z) - Enabling Classifiers to Make Judgements Explicitly Aligned with Human
Values [73.82043713141142]
Many NLP classification tasks, such as sexism/racism detection or toxicity detection, are based on human values.
We introduce a framework for value-aligned classification that performs prediction based on explicitly written human values in the command.
arXiv Detail & Related papers (2022-10-14T09:10:49Z) - A Novel Approach to Train Diverse Types of Language Models for Health
Mention Classification of Tweets [7.490229412640516]
We propose a novel approach to train language models for health mention classification of tweets that involves adversarial training.
We generate adversarial examples by adding perturbation to the representations of transformer models for tweet examples.
We evaluate the proposed method on an extended version of the PHM2017 dataset.
arXiv Detail & Related papers (2022-04-13T12:38:15Z) - Negative Sample is Negative in Its Own Way: Tailoring Negative Sentences
for Image-Text Retrieval [19.161248757493386]
We propose our TAiloring neGative Sentences with Discrimination and Correction (TAGS-DC) to generate synthetic sentences automatically as negative samples.
To maintain sample difficulty during training, we mutually improve the retrieval and generation through parameter sharing.
In experiments, we verify the effectiveness of our model on MS-COCO and Flickr30K compared with current state-of-the-art models.
arXiv Detail & Related papers (2021-11-05T09:36:41Z) - Perturbing Inputs for Fragile Interpretations in Deep Natural Language
Processing [18.91129968022831]
Interpretability methods need to be robust for trustworthy NLP applications in high-stakes areas such as medicine or finance.
Our paper demonstrates how interpretations can be manipulated by making simple word perturbations on an input text.
arXiv Detail & Related papers (2021-08-11T02:07:21Z) - Weakly-Supervised Aspect-Based Sentiment Analysis via Joint
Aspect-Sentiment Topic Embedding [71.2260967797055]
We propose a weakly-supervised approach for aspect-based sentiment analysis.
We learn <sentiment, aspect> joint topic embeddings in the word embedding space.
We then use neural models to generalize the word-level discriminative information.
arXiv Detail & Related papers (2020-10-13T21:33:24Z) - Visually Grounded Compound PCFGs [65.04669567781634]
Exploiting visual groundings for language understanding has recently been drawing much attention.
We study visually grounded grammar induction and learn a constituency parser from both unlabeled text and its visual captions.
arXiv Detail & Related papers (2020-09-25T19:07:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.