Entity Recognition from Colloquial Text
- URL: http://arxiv.org/abs/2401.04853v1
- Date: Tue, 9 Jan 2024 23:52:32 GMT
- Title: Entity Recognition from Colloquial Text
- Authors: Tamara Babaian, Jennifer Xu
- Abstract summary: We focus on the healthcare domain and investigate the problem of symptom recognition from colloquial texts.
The best-performing models trained using these strategies outperform the state-of-the-art specialized symptom recognizer by a large margin.
We present design principles for training strategies for effective entity recognition in colloquial texts.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Extraction of concepts and entities of interest from non-formal texts such as
social media posts and informal communication is an important capability for
decision support systems in many domains, including healthcare, customer
relationship management, and others. Despite the recent advances in training
large language models for a variety of natural language processing tasks, the
developed models and techniques have mainly focused on formal texts and do not
perform as well on colloquial data, which is characterized by a number of
distinct challenges. In our research, we focus on the healthcare domain and
investigate the problem of symptom recognition from colloquial texts by
designing and evaluating several training strategies for BERT-based model
fine-tuning. These strategies are distinguished by the choice of the base
model, the training corpora, and the application of term perturbations to the
training data. The best-performing models trained using these strategies
outperform the state-of-the-art specialized symptom recognizer by a large
margin. Through a series of experiments, we have found specific patterns of
model behavior associated with the training strategies we designed. We present
design principles for training strategies for effective entity recognition in
colloquial texts based on our findings.
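The abstract describes the training strategies only at a high level; as a concrete illustration of one of them, the sketch below fine-tunes a BERT-based token classifier for symptom recognition and applies term perturbations that swap clinical terms for colloquial variants. The term map, BIO label scheme, and example sentence are hypothetical stand-ins, not the authors' actual data or configuration.

```python
# A minimal sketch of term-perturbation augmentation for BERT-based
# symptom recognition. PERTURBATIONS, LABELS, and the example sentence
# are hypothetical placeholders, not the paper's actual resources.
import random
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Hypothetical colloquial variants of clinical symptom terms.
PERTURBATIONS = {
    "emesis": ["throwing up", "puking"],
    "dyspnea": ["short of breath", "winded"],
    "fatigue": ["wiped out", "drained"],
}
LABELS = ["O", "B-SYMPTOM", "I-SYMPTOM"]

def perturb(tokens, labels, p=0.3):
    """Randomly swap symptom terms for colloquial variants, expanding
    BIO labels to cover multi-word replacements."""
    out_toks, out_labs = [], []
    for tok, lab in zip(tokens, labels):
        variants = PERTURBATIONS.get(tok.lower())
        if variants and lab == "B-SYMPTOM" and random.random() < p:
            words = random.choice(variants).split()
            out_toks.extend(words)
            out_labs.extend(["B-SYMPTOM"] + ["I-SYMPTOM"] * (len(words) - 1))
        else:
            out_toks.append(tok)
            out_labs.append(lab)
    return out_toks, out_labs

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(LABELS)
)

tokens = ["I", "have", "emesis", "and", "fatigue"]
labels = ["O", "O", "B-SYMPTOM", "O", "B-SYMPTOM"]
print(perturb(tokens, labels, p=1.0))
```

From here, standard fine-tuning on the token-aligned labels would proceed as usual; the perturbations simply expose the model to colloquial surface forms of the same entities.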
Related papers
- Harnessing the Intrinsic Knowledge of Pretrained Language Models for Challenging Text Classification Settings
This thesis explores three challenging settings in text classification by leveraging the intrinsic knowledge of pretrained language models (PLMs).
We develop models that utilize features based on contextualized word representations from PLMs, achieving performance that rivals or surpasses human accuracy.
Lastly, we tackle the sensitivity of large language models to in-context learning prompts by selecting effective demonstrations.
arXiv Detail & Related papers (2024-08-28T09:07:30Z)
- Analyzing Persuasive Strategies in Meme Texts: A Fusion of Language Models with Paraphrase Enrichment
This paper describes our approach to hierarchical multi-label detection of persuasion techniques in meme texts.
The scope of the study encompasses enhancing model performance through innovative training techniques and data augmentation strategies.
arXiv Detail & Related papers (2024-07-01T20:25:20Z)
- A Unique Training Strategy to Enhance Language Models Capabilities for Health Mention Detection from Social Media Content
The extraction of health-related content from social media is useful for the development of diverse types of applications.
Existing models, however, fall short on such content, primarily because of the non-standardized writing style commonly employed by social media users.
The key goal is achieved through the incorporation of random weighted perturbation and contrastive learning strategies.
A meta predictor is proposed that combines five different language models to classify social media posts as health-related or not; a rough sketch of the perturbation idea follows this entry.
arXiv Detail & Related papers (2023-10-29T16:08:33Z)
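The "random weighted perturbation" named in the entry above is, in spirit, a noise-injection regularizer. A minimal sketch under that assumption: scale Gaussian noise by per-element random weights and add it to token embeddings during training. The module and noise scale are illustrative, not the paper's exact formulation.

```python
# Sketch: random weighted perturbation of token embeddings (an assumed
# reading of the strategy above); max_scale is an illustrative choice.
import torch
import torch.nn as nn

class PerturbedEmbedding(nn.Module):
    """Wraps an embedding layer; in training mode, adds Gaussian noise
    scaled by per-element random weights in [0, max_scale)."""
    def __init__(self, embedding: nn.Embedding, max_scale: float = 0.1):
        super().__init__()
        self.embedding = embedding
        self.max_scale = max_scale

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        embeds = self.embedding(input_ids)
        if self.training:
            weights = torch.rand_like(embeds) * self.max_scale
            embeds = embeds + weights * torch.randn_like(embeds)
        return embeds

base = nn.Embedding(30522, 768)          # BERT-sized vocab and hidden dims
layer = PerturbedEmbedding(base)
layer.train()
ids = torch.randint(0, 30522, (2, 16))   # dummy batch of token ids
print(layer(ids).shape)                  # torch.Size([2, 16, 768])
```

Two views of the same input perturbed this way could then serve as positive pairs for the contrastive objective the entry also mentions.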
- Visually Grounded Continual Language Learning with Selective Specialization
A desirable trait of an artificial agent acting in the visual world is to continually learn a sequence of language-informed tasks.
Selective specialization, i.e., a careful selection of model components to specialize in each task, is a strategy for controlling the trade-off between adapting to new tasks and retaining previously learned ones.
arXiv Detail & Related papers (2023-10-24T07:35:23Z)
- Self-training Strategies for Sentiment Analysis: An Empirical Study
Self-training is an economical and efficient technique for developing sentiment analysis models.
We compare several self-training strategies that incorporate large language models; a minimal sketch of the basic self-training loop follows this entry.
arXiv Detail & Related papers (2023-09-15T21:42:46Z)
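A minimal version of the self-training loop compared in the entry above: fit on labeled data, pseudo-label unlabeled examples that clear a confidence threshold, and refit. The toy dataset, classifier, and threshold are illustrative assumptions rather than the study's protocol.

```python
# Sketch: confidence-thresholded self-training for sentiment analysis.
# The data, classifier, and 0.6 threshold are illustrative assumptions.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

labeled = ["great movie", "terrible plot", "loved it", "awful acting"]
y = np.array([1, 0, 1, 0])
unlabeled = ["fantastic film", "boring and dreadful", "really enjoyable"]

vec = TfidfVectorizer().fit(labeled + unlabeled)
X = vec.transform(labeled).toarray()
U = vec.transform(unlabeled).toarray()

clf = LogisticRegression().fit(X, y)
for _ in range(3):                       # a few self-training rounds
    probs = clf.predict_proba(U)
    keep = probs.max(axis=1) >= 0.6      # pseudo-label confident examples
    if not keep.any():
        break
    X_aug = np.vstack([X, U[keep]])
    y_aug = np.concatenate([y, probs[keep].argmax(axis=1)])
    clf = LogisticRegression().fit(X_aug, y_aug)

print(clf.predict(vec.transform(["dreadful", "fantastic"]).toarray()))
```

An LLM-based variant would replace the confidence filter or the pseudo-labeler itself with model-generated labels, which is the kind of intervention the study compares.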
- Foundational Models Defining a New Era in Vision: A Survey and Outlook
Vision systems that see and reason about the compositional nature of visual scenes are fundamental to understanding our world.
The models learned to bridge the gap between such modalities coupled with large-scale training data facilitate contextual reasoning, generalization, and prompt capabilities at test time.
The output of such models can be modified through human-provided prompts without retraining, e.g., segmenting a particular object by providing a bounding box, holding an interactive dialogue by asking questions about an image or video scene, or steering a robot's behavior through language instructions.
arXiv Detail & Related papers (2023-07-25T17:59:18Z)
- Learning Symbolic Rules over Abstract Meaning Representations for Textual Reinforcement Learning
We propose a modular, NEuroSymbolic Textual Agent (NESTA) that combines a generic semantic generalization with a rule induction system to learn interpretable rules as policies.
Our experiments show that the proposed NESTA method outperforms deep reinforcement learning-based techniques by achieving better generalization to unseen test games and learning from fewer training interactions.
arXiv Detail & Related papers (2023-07-05T23:21:05Z)
- Foundation Models for Decision Making: Problems, Methods, and Opportunities
Foundation models pretrained on diverse data at scale have demonstrated extraordinary capabilities in a wide range of vision and language tasks.
New paradigms are emerging for training foundation models to interact with other agents and perform long-term reasoning.
Research at the intersection of foundation models and decision making holds tremendous promise for creating powerful new systems.
arXiv Detail & Related papers (2023-03-07T18:44:07Z)
- Context-Aware Language Modeling for Goal-Oriented Dialogue Systems
We formulate goal-oriented dialogue as a partially observed Markov decision process.
We derive a simple and effective method to finetune language models in a goal-aware way.
We evaluate our method on a practical flight-booking task using AirDialogue.
arXiv Detail & Related papers (2022-04-18T17:23:11Z)
- On Learning Text Style Transfer with Direct Rewards
The lack of parallel corpora makes it impossible to directly train supervised models for the text style transfer task.
We leverage semantic similarity metrics originally used for fine-tuning neural machine translation models; a rough sketch of a similarity-based reward follows this entry.
Our model provides significant gains in both automatic and human evaluation over strong baselines.
arXiv Detail & Related papers (2020-10-24T04:30:02Z)
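The direct-reward idea in the preceding entry can be approximated with an off-the-shelf sentence encoder: score a candidate rewrite by its semantic similarity to the source and use that score as a content-preservation reward. The encoder name and reward form below are assumptions for illustration, not the paper's exact metric.

```python
# Sketch: a semantic-similarity reward for style transfer, assuming the
# sentence-transformers package; the model choice and reward form are
# illustrative, not the paper's metric.
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")

def content_reward(source: str, candidate: str) -> float:
    """Cosine similarity between source and candidate embeddings,
    used as a content-preservation reward in [-1, 1]."""
    a, b = encoder.encode([source, candidate])
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

src = "the service was slow and the staff was rude"
cand = "the service was quick and the staff was friendly"  # candidate rewrite
print(round(content_reward(src, cand), 3))
# In REINFORCE-style training, this reward would weight the log-likelihood
# of the sampled rewrite alongside a style-classifier reward.
```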
- Concept Learners for Few-Shot Learning
We propose COMET, a meta-learning method that improves generalization ability by learning to learn along human-interpretable concept dimensions.
We evaluate our model on few-shot tasks from diverse domains, including fine-grained image classification, document categorization and cell type annotation.
arXiv Detail & Related papers (2020-07-14T22:04:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.