Accurate Medical Named Entity Recognition Through Specialized NLP Models
- URL: http://arxiv.org/abs/2412.08255v1
- Date: Wed, 11 Dec 2024 10:06:57 GMT
- Title: Accurate Medical Named Entity Recognition Through Specialized NLP Models
- Authors: Jiacheng Hu, Runyuan Bao, Yang Lin, Hanchao Zhang, Yanlin Xiang,
- Abstract summary: The study evaluated the effect of BioBERT in medical text processing for the task of medical named entity recognition.
The results showed that BioBERT achieved the best performance in both precision and F1 score, verifying its applicability and superiority in the medical field.
- Abstract: This study evaluated the effect of BioBERT in medical text processing for the task of medical named entity recognition. Through comparative experiments with models such as BERT, ClinicalBERT, SciBERT, and BlueBERT, the results showed that BioBERT achieved the best performance in both precision and F1 score, verifying its applicability and superiority in the medical field. BioBERT enhances its ability to understand professional terms and complex medical texts through pre-training on biomedical data, providing a powerful tool for medical information extraction and clinical decision support. The study also explored the privacy and compliance challenges of BioBERT when processing medical data, and proposed future research directions for combining other medical-specific models to improve generalization and robustness. With the development of deep learning technology, the potential of BioBERT in application fields such as intelligent medicine, personalized treatment, and disease prediction will be further expanded. Future research can focus on the real-time performance and interpretability of the model to promote its widespread application in the medical field.
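Medical NER models such as BioBERT are typically fine-tuned to emit BIO-tagged token labels, which must then be merged into entity spans before evaluation or downstream use. A minimal sketch of that standard post-processing step, assuming an i2b2-style tag set (`PROBLEM`, `TREATMENT`) for illustration; the tags and example tokens are not taken from the paper:

```python
def bio_to_entities(tokens, tags):
    """Merge BIO-tagged tokens into (entity_text, label) spans."""
    entities = []
    current_tokens, current_label = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new entity, closing any open one.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_label))
            current_tokens, current_label = [token], tag[2:]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # Continue the current entity only if the label matches.
            current_tokens.append(token)
        else:
            # "O" or an inconsistent I- tag closes the current entity.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_label))
            current_tokens, current_label = [], None
    if current_tokens:
        entities.append((" ".join(current_tokens), current_label))
    return entities

tokens = ["Patient", "denies", "chest", "pain", "and", "aspirin", "allergy"]
tags   = ["O", "O", "B-PROBLEM", "I-PROBLEM", "O", "B-TREATMENT", "O"]
print(bio_to_entities(tokens, tags))
# [('chest pain', 'PROBLEM'), ('aspirin', 'TREATMENT')]
```

Entity-level precision and F1, the metrics compared across BERT variants in the study, are computed over exactly these merged spans rather than over individual token tags.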
Related papers
- MedBioLM: Optimizing Medical and Biological QA with Fine-Tuned Large Language Models and Retrieval-Augmented Generation [0.0]
We introduce MedBioLM, a domain-adapted biomedical question-answering model.
By integrating fine-tuning and retrieval-augmented generation (RAG), MedBioLM dynamically incorporates domain-specific knowledge.
Fine-tuning significantly improves accuracy on benchmark datasets, while RAG enhances factual consistency.
arXiv Detail & Related papers (2025-02-05T08:58:35Z) - AI-assisted Knowledge Discovery in Biomedical Literature to Support Decision-making in Precision Oncology [2.8353535592739534]
We evaluate the potential contributions of specific natural language processing solutions to support knowledge discovery from biomedical literature.
Two models from the Bidirectional Encoder Representations from Transformers (BERT) family, two Large Language Models, and PubTator 3.0 were tested for their ability to support the named entity recognition (NER) and the relation extraction (RE) tasks.
arXiv Detail & Related papers (2024-12-12T03:24:49Z) - STLLaVA-Med: Self-Training Large Language and Vision Assistant for Medical Question-Answering [58.79671189792399]
STLLaVA-Med is designed to train a policy model capable of auto-generating medical visual instruction data.
We validate the efficacy and data efficiency of STLLaVA-Med across three major medical Visual Question Answering (VQA) benchmarks.
arXiv Detail & Related papers (2024-06-28T15:01:23Z) - MedKP: Medical Dialogue with Knowledge Enhancement and Clinical Pathway Encoding [48.348511646407026]
We introduce the Medical dialogue with Knowledge enhancement and clinical Pathway encoding framework.
The framework integrates an external knowledge enhancement module through a medical knowledge graph and an internal clinical pathway encoding via medical entities and physician actions.
arXiv Detail & Related papers (2024-03-11T10:57:45Z) - Multi-level biomedical NER through multi-granularity embeddings and enhanced labeling [3.8599767910528917]
This paper proposes a hybrid approach that integrates the strengths of multiple models.
BERT provides contextualized word embeddings, a pre-trained multi-channel CNN captures character-level information, and a BiLSTM + CRF performs sequence labelling, modelling dependencies between the words in the text.
We evaluate our model on the benchmark i2b2/2010 dataset, achieving an F1-score of 90.11.
arXiv Detail & Related papers (2023-12-24T21:45:36Z) - Diversifying Knowledge Enhancement of Biomedical Language Models using Adapter Modules and Knowledge Graphs [54.223394825528665]
We develop an approach that uses lightweight adapter modules to inject structured biomedical knowledge into pre-trained language models.
We use two large KGs, the biomedical knowledge system UMLS and the novel biochemical OntoChem, with two prominent biomedical PLMs, PubMedBERT and BioLinkBERT.
We show that our methodology leads to performance improvements in several instances while keeping requirements in computing power low.
arXiv Detail & Related papers (2023-12-21T14:26:57Z) - An Analysis on Large Language Models in Healthcare: A Case Study of BioBERT [0.0]
This paper conducts a comprehensive investigation into applying large language models, particularly on BioBERT, in healthcare.
The analysis outlines a systematic methodology for fine-tuning BioBERT to meet the unique needs of the healthcare domain.
The paper thoroughly examines ethical considerations, particularly patient privacy and data security.
arXiv Detail & Related papers (2023-10-11T08:16:35Z) - A Review on Knowledge Graphs for Healthcare: Resources, Applications, and Promises [59.4999994297993]
This comprehensive review aims to provide an overview of the current state of Healthcare Knowledge Graphs (HKGs).
We thoroughly analyzed existing literature on HKGs, covering their construction methodologies, utilization techniques, and applications.
The review highlights the potential of HKGs to significantly impact biomedical research and clinical practice.
arXiv Detail & Related papers (2023-06-07T21:51:56Z) - Extrinsic Factors Affecting the Accuracy of Biomedical NER [0.1529342790344802]
Biomedical named entity recognition (NER) is a critical task that aims to identify structured information in clinical text.
NER in the biomedical domain is challenging due to limited data availability.
arXiv Detail & Related papers (2023-05-29T15:29:49Z) - BiomedGPT: A Generalist Vision-Language Foundation Model for Diverse Biomedical Tasks [68.39821375903591]
Generalist AI holds the potential to address limitations due to its versatility in interpreting different data types.
Here, we propose BiomedGPT, the first open-source and lightweight vision-language foundation model.
arXiv Detail & Related papers (2023-05-26T17:14:43Z) - UmlsBERT: Clinical Domain Knowledge Augmentation of Contextual Embeddings Using the Unified Medical Language System Metathesaurus [73.86656026386038]
We introduce UmlsBERT, a contextual embedding model that integrates domain knowledge during the pre-training process.
By applying these two strategies, UmlsBERT can encode clinical domain knowledge into word embeddings and outperform existing domain-specific models.
arXiv Detail & Related papers (2020-10-20T15:56:31Z)
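Several of the papers above (notably the multi-level biomedical NER work) pair a BiLSTM with a CRF layer, whose decoding step selects the highest-scoring label sequence via the Viterbi algorithm. A minimal plain-Python sketch of linear-chain CRF decoding; the emission and transition scores below are made up for illustration and are not from any of the listed models:

```python
def viterbi_decode(emissions, transitions, labels):
    """Return the highest-scoring label sequence under a linear-chain CRF.

    emissions: list of dicts, per-position score for each label
    transitions: dict mapping (prev_label, label) -> transition score
    """
    # scores[l] = best score of any sequence ending in label l so far
    scores = {l: emissions[0][l] for l in labels}
    backpointers = []
    for emission in emissions[1:]:
        new_scores, pointers = {}, {}
        for l in labels:
            # Pick the best previous label to transition from.
            prev, best = max(
                ((p, scores[p] + transitions[(p, l)]) for p in labels),
                key=lambda x: x[1],
            )
            new_scores[l] = best + emission[l]
            pointers[l] = prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace the best path backwards from the best final label.
    best_label = max(scores, key=scores.get)
    path = [best_label]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return list(reversed(path))

labels = ["O", "B"]
emissions = [{"O": 0.1, "B": 1.0}, {"O": 0.5, "B": 0.8}]
# Penalize B -> B so consecutive entity starts are discouraged.
transitions = {("O", "O"): 0.0, ("O", "B"): 0.0,
               ("B", "O"): 0.0, ("B", "B"): -2.0}
print(viterbi_decode(emissions, transitions, labels))
# ['B', 'O']
```

The transition scores are what let a CRF layer enforce tag-sequence constraints (e.g. penalizing an I- tag that does not follow a matching B- tag), which greedy per-token argmax over emissions alone cannot do.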
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.