A Unique Training Strategy to Enhance Language Models Capabilities for Health Mention Detection from Social Media Content
- URL: http://arxiv.org/abs/2310.19057v1
- Date: Sun, 29 Oct 2023 16:08:33 GMT
- Authors: Pervaiz Iqbal Khan, Muhammad Nabeel Asim, Andreas Dengel, Sheraz Ahmed
- Abstract summary: The extraction of health-related content from social media is useful for the development of diverse types of applications. However, language models struggle with such content, primarily because of the non-standardized writing style commonly employed by social media users. This paper trains language models so that they learn to derive generalized patterns, a goal achieved through the incorporation of random weighted perturbation and contrastive learning strategies. A meta predictor is also proposed that reaps the benefits of 5 different language models for discriminating social media posts into non-health and health-related classes.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: An ever-increasing amount of social media content requires advanced
AI-based computer programs capable of extracting useful information.
Specifically, the extraction of health-related content from social media is
useful for the development of diverse types of applications, including disease
spread monitoring, mortality rate prediction, and assessing the impact of
diverse types of drugs on diverse types of diseases. Language models are
competent at extracting the syntax and semantics of text. However, they
struggle to extract similar patterns from social media texts. The primary
reason for this shortfall lies in the non-standardized writing style commonly
employed by social media users. To obtain a language model competent at
extracting useful patterns from social media text, the key goal of this paper
is to train language models in such a way that they learn to derive
generalized patterns. This goal is achieved through the incorporation of
random weighted perturbation and contrastive learning strategies. On top of
this unique training strategy, a meta predictor is proposed that reaps the
benefits of 5 different language models for discriminating social media posts
into non-health and health-related classes. Comprehensive experimentation
across 3 public benchmark datasets reveals that the proposed training strategy
improves the performance of the language models by up to 3.87% F1-score
compared to their performance with traditional training. Furthermore, the
proposed meta predictor outperforms existing health mention classification
predictors across all 3 benchmark datasets.
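The recipe described in the abstract has three ingredients: perturbing model weights during training, a contrastive objective, and an ensemble step over several models. The sketch below is a minimal NumPy illustration of these ideas, not the authors' implementation: the function names (`perturb_weights`, `supervised_contrastive_loss`, `meta_predict`), the noise scale, and the temperature are all assumptions, and the paper's trained meta predictor is replaced here by plain probability averaging for simplicity.

```python
import numpy as np

def perturb_weights(weights, rel_scale=0.01, seed=0):
    """Random weighted perturbation (sketch): add Gaussian noise whose
    standard deviation is proportional to each parameter's magnitude."""
    rng = np.random.default_rng(seed)
    return {name: w + rel_scale * np.abs(w) * rng.standard_normal(w.shape)
            for name, w in weights.items()}

def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss (SupCon-style): embeddings of posts with
    the same label are pulled together, different labels pushed apart."""
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = np.exp(z @ z.T / temperature)  # pairwise exp-scaled cosine similarity
    n = len(labels)
    total = 0.0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        denom = sum(sim[i, j] for j in range(n) if j != i)
        total += -np.mean([np.log(sim[i, j] / denom) for j in positives])
    return total / n

def meta_predict(model_probs):
    """Meta predictor stand-in: average the health-class probabilities of
    several base models and threshold at 0.5 (the paper trains a meta
    model over the base predictions; averaging is used here only to
    illustrate the ensemble step)."""
    avg = np.mean(np.asarray(model_probs), axis=0)
    return (avg >= 0.5).astype(int)
```

Under this formulation, the contrastive term is smallest when same-class posts cluster tightly in embedding space, which is the generalization behavior the training strategy aims for.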
Related papers
- MetaKP: On-Demand Keyphrase Generation [52.48698290354449]
We introduce on-demand keyphrase generation, a novel paradigm that requires keyphrases that conform to specific high-level goals or intents.
We present MetaKP, a large-scale benchmark comprising four datasets, 7500 documents, and 3760 goals across news and biomedical domains with human-annotated keyphrases.
We demonstrate the potential of our method to serve as a general NLP infrastructure, exemplified by its application in epidemic event detection from social media.
arXiv Detail & Related papers (2024-06-28T19:02:59Z)
- Medical Vision-Language Pre-Training for Brain Abnormalities [96.1408455065347]
We show how to automatically collect medical image-text aligned data for pretraining from public resources such as PubMed.
In particular, we present a pipeline that streamlines the pre-training process by initially collecting a large brain image-text dataset.
We also investigate the unique challenge of mapping subfigures to subcaptions in the medical domain.
arXiv Detail & Related papers (2024-04-27T05:03:42Z)
- Large Language Model Augmented Exercise Retrieval for Personalized Language Learning [2.946562343070891]
We find that vector similarity approaches poorly capture the relationship between exercise content and the language that learners use to express what they want to learn.
We leverage the generative capabilities of large language models to bridge the gap by synthesizing hypothetical exercises based on the learner's input.
Our approach, which we call mHyER, overcomes three challenges: (1) lack of relevance labels for training, (2) unrestricted learner input content, and (3) low semantic similarity between input and retrieval candidates.
arXiv Detail & Related papers (2024-02-08T20:35:31Z)
- Entity Recognition from Colloquial Text [0.0]
We focus on the healthcare domain and investigate the problem of symptom recognition from colloquial texts.
The best-performing models trained using these strategies outperform the state-of-the-art specialized symptom recognizer by a large margin.
We present design principles for training strategies for effective entity recognition in colloquial texts.
arXiv Detail & Related papers (2024-01-09T23:52:32Z)
- Text generation for dataset augmentation in security classification tasks [55.70844429868403]
This study evaluates the application of natural language text generators to fill this data gap in multiple security-related text classification tasks.
We find substantial benefits for GPT-3 data augmentation strategies in situations with severe limitations on known positive-class samples.
arXiv Detail & Related papers (2023-10-22T22:25:14Z)
- A Predictive Model of Digital Information Engagement: Forecasting User Engagement With English Words by Incorporating Cognitive Biases, Computational Linguistics and Natural Language Processing [3.09766013093045]
This study introduces and empirically tests a novel predictive model for digital information engagement (IE)
The READ model integrates key cognitive biases with computational linguistics and natural language processing to develop a multidimensional perspective on information engagement.
The READ model's potential extends across various domains, including business, education, government, and healthcare.
arXiv Detail & Related papers (2023-07-26T20:58:47Z)
- Multilingual Conceptual Coverage in Text-to-Image Models [98.80343331645626]
"Conceptual Coverage Across Languages" (CoCo-CroLa) is a technique for benchmarking the degree to which any generative text-to-image system provides multilingual parity to its training language in terms of tangible nouns.
For each model we can assess "conceptual coverage" of a given target language relative to a source language by comparing the population of images generated for a series of tangible nouns in the source language to the population of images generated for each noun under translation in the target language.
arXiv Detail & Related papers (2023-06-02T17:59:09Z)
- Multi-Modal Perceiver Language Model for Outcome Prediction in Emergency Department [0.03088120935391119]
We are interested in outcome prediction and patient triage in hospital emergency department based on text information in chief complaints and vital signs recorded at triage.
We adapt Perceiver - a modality-agnostic transformer-based model that has shown promising results in several applications.
In the experimental analysis, we show that multi-modality improves the prediction performance compared with models trained solely on text or vital signs.
arXiv Detail & Related papers (2023-04-03T06:32:00Z)
- Combining Contrastive Learning and Knowledge Graph Embeddings to develop medical word embeddings for the Italian language [0.0]
This paper attempts to improve available embeddings in the uncovered niche of the Italian medical domain.
The main objective is to improve the accuracy of semantic similarity between medical terms.
Since the Italian language lacks medical texts and controlled vocabularies, we have developed a specific solution.
arXiv Detail & Related papers (2022-11-09T17:12:28Z)
- Leveraging Pre-trained Language Model for Speech Sentiment Analysis [58.78839114092951]
We explore the use of pre-trained language models to learn sentiment information of written texts for speech sentiment analysis.
We propose a pseudo label-based semi-supervised training strategy using a language model on an end-to-end speech sentiment approach.
arXiv Detail & Related papers (2021-06-11T20:15:21Z)
- Named Entity Recognition for Social Media Texts with Semantic Augmentation [70.44281443975554]
Existing approaches for named entity recognition suffer from data sparsity problems when conducted on short and informal texts.
We propose a neural-based approach to NER for social media texts where both local (from running text) and augmented semantics are taken into account.
arXiv Detail & Related papers (2020-10-29T10:06:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.