Distantly supervised end-to-end medical entity extraction from
electronic health records with human-level quality
- URL: http://arxiv.org/abs/2201.10463v1
- Date: Tue, 25 Jan 2022 17:04:46 GMT
- Title: Distantly supervised end-to-end medical entity extraction from
electronic health records with human-level quality
- Authors: Alexander Nesterov and Dmitry Umerenkov
- Abstract summary: We propose a novel method of doing medical EE from electronic health records (EHR) as a single-step multi-label classification task.
Our model is trained end-to-end in a distantly supervised manner using targets automatically extracted from a medical knowledge base.
Our work demonstrates that medical entity extraction can be done end-to-end without human supervision and at human quality, given a large enough amount of unlabeled EHR data and a medical knowledge base.
- Score: 77.34726150561087
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Medical entity extraction (EE) is a standard first stage of medical text processing. Usually, medical EE is a two-step process: named entity recognition (NER) followed by named entity normalization (NEN). We propose a novel method of doing medical EE from electronic health records (EHR) as a single-step multi-label classification task, by fine-tuning a transformer model pretrained on a large EHR dataset. Our model is trained end-to-end in a distantly supervised manner, using targets automatically extracted from a medical knowledge base. We show that our model learns to generalize for entities that appear frequently enough, achieving human-level classification quality for the most frequent entities. Our work demonstrates that medical entity extraction can be done end-to-end without human supervision and at human quality, given a large enough amount of unlabeled EHR data and a medical knowledge base.
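A minimal sketch of this formulation follows, assuming a HuggingFace-style encoder: multi-label targets are derived by matching knowledge-base surface forms in raw EHR text (distant supervision), and the encoder is fine-tuned with a multi-label head over knowledge-base concepts. The toy knowledge base, the bert-base-uncased stand-in (the authors instead pretrain on a large EHR corpus), and the build_targets helper are all illustrative assumptions, not the authors' actual pipeline.

```python
# Sketch: single-step, distantly supervised medical entity extraction.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Toy knowledge base: concept id -> surface forms. In practice this would be
# thousands of concepts extracted from a medical knowledge base.
KNOWLEDGE_BASE = {
    0: ["diabetes", "diabetes mellitus"],
    1: ["hypertension", "high blood pressure"],
    2: ["pneumonia"],
}
NUM_CONCEPTS = len(KNOWLEDGE_BASE)

def build_targets(text: str) -> torch.Tensor:
    """Distant supervision: a concept is a positive label iff one of its
    surface forms occurs in the record. No human annotation involved."""
    labels = torch.zeros(NUM_CONCEPTS)
    lowered = text.lower()
    for concept_id, surface_forms in KNOWLEDGE_BASE.items():
        if any(form in lowered for form in surface_forms):
            labels[concept_id] = 1.0
    return labels

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=NUM_CONCEPTS,
    problem_type="multi_label_classification",  # BCE-with-logits loss
)

records = [
    "Patient with long-standing diabetes mellitus and high blood pressure.",
    "Chest X-ray consistent with pneumonia.",
]
batch = tokenizer(records, padding=True, truncation=True, return_tensors="pt")
targets = torch.stack([build_targets(r) for r in records])

# One end-to-end training step: the model maps raw text directly to
# normalized KB concepts in a single pass.
outputs = model(**batch, labels=targets)
outputs.loss.backward()
predicted = torch.sigmoid(outputs.logits) > 0.5
```

At inference, the per-concept sigmoid scores directly yield normalized entities, which is what collapses NER and NEN into one step.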
Related papers
- FEDMEKI: A Benchmark for Scaling Medical Foundation Models via Federated Knowledge Injection [83.54960238236548]
FEDMEKI not only preserves data privacy but also enhances the capability of medical foundation models, letting them learn from a broader spectrum of medical knowledge without direct data exposure.
arXiv Detail & Related papers (2024-08-17T15:18:56Z)
- STLLaVA-Med: Self-Training Large Language and Vision Assistant for Medical Question-Answering [58.79671189792399]
STLLaVA-Med is designed to train a policy model capable of auto-generating medical visual instruction data.
We validate the efficacy and data efficiency of STLLaVA-Med across three major medical Visual Question Answering (VQA) benchmarks.
arXiv Detail & Related papers (2024-06-28T15:01:23Z)
- Next Visit Diagnosis Prediction via Medical Code-Centric Multimodal Contrastive EHR Modelling with Hierarchical Regularisation [0.0]
We propose NECHO, a novel medical code-centric multimodal contrastive EHR learning framework with hierarchical regularisation.
First, we integrate multifaceted information encompassing medical codes, demographics, and clinical notes using a tailored network design.
We also regularise the modality-specific encoders using parent-level information in the medical ontology to learn the hierarchical structure of EHR data (a minimal regularisation sketch appears after this list).
arXiv Detail & Related papers (2024-01-22T01:58:32Z)
- INSPECT: A Multimodal Dataset for Pulmonary Embolism Diagnosis and Prognosis [19.32686665459374]
We introduce INSPECT, which contains de-identified longitudinal records from a large cohort of patients at risk for pulmonary embolism (PE).
INSPECT contains data from 19,402 patients, including CT images, radiology report impression sections, and structured electronic health record (EHR) data (i.e. demographics, diagnoses, procedures, vitals, and medications).
arXiv Detail & Related papers (2023-11-17T07:28:16Z)
- BiomedGPT: A Generalist Vision-Language Foundation Model for Diverse Biomedical Tasks [68.39821375903591]
Generalist AI holds the potential to address such limitations thanks to its versatility in interpreting different data types.
Here, we propose BiomedGPT, the first open-source and lightweight vision-language foundation model designed for diverse biomedical tasks.
arXiv Detail & Related papers (2023-05-26T17:14:43Z)
- Towards Medical Artificial General Intelligence via Knowledge-Enhanced Multimodal Pretraining [121.89793208683625]
Medical artificial general intelligence (MAGI) enables one foundation model to solve different medical tasks.
We propose a new paradigm called Medical-knOwledge-enhanced mulTimOdal pretRaining (MOTOR).
arXiv Detail & Related papers (2023-04-26T01:26:19Z)
- How to Leverage Multimodal EHR Data for Better Medical Predictions? [13.401754962583771]
The complexity of electronic health record (EHR) data is a challenge for the application of deep learning.
In this paper, we first extract the accompanying clinical notes from the EHR and propose a method to integrate them with the structured data (a generic fusion sketch appears after this list).
The results on two medical prediction tasks show that our model fusing these different data sources outperforms the state-of-the-art method.
arXiv Detail & Related papers (2021-10-29T13:26:05Z)
- Recognising Biomedical Names: Challenges and Solutions [9.51284672475743]
We propose a transition-based NER model which can recognise discontinuous mentions (a toy transition sketch appears after this list).
We also develop a cost-effective approach that nominates the suitable pre-training data.
Our contributions have clear practical implications, especially for new biomedical applications.
arXiv Detail & Related papers (2021-06-23T08:20:13Z)
- BiteNet: Bidirectional Temporal Encoder Network to Predict Medical Outcomes [53.163089893876645]
We propose a novel self-attention mechanism that captures the contextual dependency and temporal relationships within a patient's healthcare journey.
An end-to-end bidirectional temporal encoder network (BiteNet) then learns representations of the patient's journeys (a minimal attention sketch appears after this list).
We have evaluated the effectiveness of our methods on two supervised prediction and two unsupervised clustering tasks with a real-world EHR dataset.
arXiv Detail & Related papers (2020-09-24T00:42:36Z)
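For the hierarchical-regularisation entry above (NECHO), here is a minimal sketch of the general idea: alongside the main objective, an encoder is also trained to predict the ontology parent of each medical code, nudging its representations to respect the ontology hierarchy. All names, sizes, and the 0.1 weight are illustrative assumptions, not the paper's actual design.

```python
# Sketch: auxiliary parent-level prediction as hierarchical regularisation.
import torch
import torch.nn as nn

NUM_CODES, NUM_PARENTS, DIM = 1000, 20, 128

# Ontology lookup: each leaf code maps to its parent category (toy mapping).
code_to_parent = torch.randint(0, NUM_PARENTS, (NUM_CODES,))

encoder = nn.Sequential(nn.Embedding(NUM_CODES, DIM), nn.Linear(DIM, DIM), nn.ReLU())
main_head = nn.Linear(DIM, NUM_CODES)      # e.g. next-visit code prediction
parent_head = nn.Linear(DIM, NUM_PARENTS)  # auxiliary parent-level prediction

codes = torch.randint(0, NUM_CODES, (32,))       # a batch of medical codes
next_codes = torch.randint(0, NUM_CODES, (32,))  # toy prediction targets
h = encoder(codes)

main_loss = nn.functional.cross_entropy(main_head(h), next_codes)
parent_loss = nn.functional.cross_entropy(parent_head(h), code_to_parent[codes])
loss = main_loss + 0.1 * parent_loss  # illustrative regularisation weight
loss.backward()
```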
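For the multimodal EHR prediction entry above, a generic late-fusion sketch of the basic idea: clinical-note embeddings are concatenated with structured EHR features before a prediction head. The paper's actual fusion method differs; every dimension and name below is illustrative.

```python
# Sketch: late fusion of note embeddings with structured EHR features.
import torch
import torch.nn as nn

NOTE_DIM, STRUCT_DIM, HIDDEN = 768, 40, 128

fusion = nn.Sequential(
    nn.Linear(NOTE_DIM + STRUCT_DIM, HIDDEN), nn.ReLU(), nn.Linear(HIDDEN, 1)
)

note_emb = torch.randn(8, NOTE_DIM)        # e.g. from a clinical-text encoder
struct_feats = torch.randn(8, STRUCT_DIM)  # vitals, labs, demographics
logits = fusion(torch.cat([note_emb, struct_feats], dim=-1))  # e.g. risk score
```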
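For the transition-based NER entry above, a toy illustration of how transitions can assemble a discontinuous mention. In the real model a neural network predicts the actions; here a hand-written gold action sequence is applied, and the action inventory is heavily simplified from published transition systems (it cannot, for instance, handle overlapping mentions that share tokens).

```python
# Sketch: transitions assembling the discontinuous mention "muscle fatigue"
# from the text "muscle pain and fatigue".
def apply_transitions(tokens, actions):
    buffer, stack, mentions = list(tokens), [], []
    for action in actions:
        if action == "OUT":         # discard the next token
            buffer.pop(0)
        elif action == "SHIFT":     # move the next token into the open mention
            stack.append(buffer.pop(0))
        elif action == "COMPLETE":  # close the open mention
            mentions.append(" ".join(stack))
            stack = []
    return mentions

tokens = ["muscle", "pain", "and", "fatigue"]
# The mention is discontinuous: the system skips over "pain and".
actions = ["SHIFT", "OUT", "OUT", "SHIFT", "COMPLETE"]
print(apply_transitions(tokens, actions))  # ['muscle fatigue']
```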
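Finally, a minimal sketch in the spirit of BiteNet's bidirectional temporal encoding: each visit is embedded by pooling its code embeddings, and a bidirectional transformer encoder contextualises the visit sequence. The pooling, sizes, and positional scheme are assumptions, not the paper's architecture.

```python
# Sketch: self-attention over a patient's sequence of visits.
import torch
import torch.nn as nn

NUM_CODES, DIM, MAX_VISITS = 500, 64, 10

code_embed = nn.Embedding(NUM_CODES, DIM)
visit_pos = nn.Embedding(MAX_VISITS, DIM)  # order of visits in the journey
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=DIM, nhead=4, batch_first=True),
    num_layers=2,
)

# One patient journey: 3 visits, each a set of medical codes.
visits = [[3, 17, 42], [7, 99], [3, 250, 11]]
visit_vecs = torch.stack([code_embed(torch.tensor(v)).mean(dim=0) for v in visits])
visit_vecs = visit_vecs + visit_pos(torch.arange(len(visits)))

journey = encoder(visit_vecs.unsqueeze(0))  # (1, n_visits, DIM)
patient_repr = journey.mean(dim=1)          # pooled journey representation
```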