Applying unsupervised keyphrase methods on concepts extracted from
discharge sheets
- URL: http://arxiv.org/abs/2303.08928v1
- Date: Wed, 15 Mar 2023 20:55:25 GMT
- Title: Applying unsupervised keyphrase methods on concepts extracted from
discharge sheets
- Authors: Hoda Memarzadeh, Nasser Ghadiri, Matthias Samwald, Maryam Lotfi
Shahreza
- Abstract summary: It is necessary to identify the section in which each content is recorded and also to identify key concepts to extract meaning from clinical texts.
In this study, these challenges have been addressed by using clinical natural language processing techniques.
A set of popular unsupervised key phrase extraction methods has been verified and evaluated.
- Score: 7.102620843620572
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Clinical notes containing valuable patient information are written by
different health care providers with various scientific levels and writing
styles. It might be helpful for clinicians and researchers to understand what
information is essential when dealing with extensive electronic medical
records. Recognizing entities and mapping them to standard terminologies is
crucial in reducing ambiguity in processing clinical notes. Although named
entity recognition and entity linking are critical steps in clinical natural
language processing, they can also result in the production of repetitive and
low-value concepts. On the other hand, not all parts of a clinical text share
the same importance or content in predicting the patient's condition. As a
result, it is necessary to identify the section in which each content is
recorded and also to identify key concepts to extract meaning from clinical
texts. In this study, these challenges have been addressed by using clinical
natural language processing techniques. In addition, in order to identify key
concepts, a set of popular unsupervised key phrase extraction methods has been
verified and evaluated. Considering that most of the clinical concepts are in
the form of multi-word expressions and their accurate identification requires
the user to specify an n-gram range, we have proposed a shortcut method to
preserve the structure of the expression based on TF-IDF. In order to evaluate
the pre-processing method and select the concepts, we have designed two types
of downstream tasks (multiple and binary classification) using the capabilities
of transformer-based models. The obtained results show the superiority of
the proposed method in combination with the SciBERT model, and also offer
insight into the efficacy of general key phrase extraction methods for clinical notes.
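The abstract describes ranking concepts with TF-IDF while keeping multi-word expressions intact, but gives no implementation details. The following is only a minimal sketch of that general idea, not the authors' method: it assumes each note has already been reduced to a list of concept strings by an upstream entity linker, and it treats every concept (single- or multi-word) as one indivisible token so that no n-gram range has to be chosen. The function name `tfidf_key_concepts`, the smoothed IDF formula, and the toy notes are all illustrative assumptions.

```python
import math
from collections import Counter


def tfidf_key_concepts(docs, top_k=3):
    """Rank the concepts of each note by TF-IDF.

    docs: list of notes, each a list of concept strings. Multi-word
    concepts such as "chest pain" stay whole because each string is
    treated as a single token (hypothetical input format).
    """
    n_docs = len(docs)
    # Document frequency: in how many notes does each concept occur?
    df = Counter(concept for doc in docs for concept in set(doc))
    ranked = []
    for doc in docs:
        tf = Counter(doc)
        scores = {
            # Relative term frequency times a smoothed IDF (an assumption).
            concept: (count / len(doc))
            * (math.log((1 + n_docs) / (1 + df[concept])) + 1)
            for concept, count in tf.items()
        }
        # Highest score first; ties are broken alphabetically.
        top = sorted(scores, key=lambda c: (-scores[c], c))[:top_k]
        ranked.append(top)
    return ranked


# Toy, made-up notes already mapped to concept strings.
notes = [
    ["chest pain", "myocardial infarction", "chest pain", "aspirin"],
    ["diabetes mellitus", "insulin", "aspirin"],
    ["chest pain", "aspirin", "pneumonia"],
]

for key_concepts in tfidf_key_concepts(notes, top_k=2):
    print(key_concepts)
```

Because "aspirin" appears in every toy note, its IDF is the lowest and it drops out of each top-2 list, which illustrates how such a ranking can filter the repetitive, low-value concepts the abstract mentions.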
Related papers
- SNOBERT: A Benchmark for clinical notes entity linking in the SNOMED CT clinical terminology [43.89160296332471]
We propose a method for linking text spans in clinical notes to specific concepts in the SNOMED CT using BERT-based models.
The method consists of two stages: candidate selection and candidate matching. The models were trained on one of the largest publicly available datasets of labeled clinical notes.
arXiv Detail & Related papers (2024-05-25T08:00:44Z)
- Cross-Lingual Knowledge Transfer for Clinical Phenotyping [55.92262310716537]
We investigate cross-lingual knowledge transfer strategies to execute this task for clinics that do not use the English language.
We evaluate these strategies for a Greek and a Spanish clinic leveraging clinical notes from different clinical domains.
Our results show that using multilingual data overall improves clinical phenotyping models and can compensate for data sparseness.
arXiv Detail & Related papers (2022-08-03T08:33:21Z)
- Assessing mortality prediction through different representation models
based on concepts extracted from clinical notes [2.707154152696381]
Learning of embedding is a method for converting notes into a format that makes them comparable.
Transformer-based representation models have recently made a great leap forward.
We performed experiments to measure the usefulness of the learned embedding vectors in the task of hospital mortality prediction.
arXiv Detail & Related papers (2022-07-22T04:34:33Z)
- Semantic Search for Large Scale Clinical Ontologies [63.71950996116403]
We present a deep learning approach to build a search system for large clinical vocabularies.
We propose a Triplet-BERT model and a method that generates training data based on semantic training data.
The model is evaluated using five real benchmark data sets and the results show that our approach achieves high results on both free text to concept and concept to searching concept vocabularies.
arXiv Detail & Related papers (2022-01-01T05:15:42Z)
- Towards more patient friendly clinical notes through language models and
ontologies [57.51898902864543]
We present a novel approach to automated medical text simplification based on word simplification and language modelling.
We use a new dataset of pairs of publicly available medical sentences and a version of them simplified by clinicians.
Our method based on a language model trained on medical forum data generates simpler sentences while preserving both grammar and the original meaning.
arXiv Detail & Related papers (2021-12-23T16:11:19Z)
- Self-supervised Answer Retrieval on Clinical Notes [68.87777592015402]
We introduce CAPR, a rule-based self-supervision objective for training Transformer language models for domain-specific passage matching.
We apply our objective in four Transformer-based architectures: Contextual Document Vectors, Bi-, Poly- and Cross-encoders.
We report that CAPR outperforms strong baselines in the retrieval of domain-specific passages and effectively generalizes across rule-based and human-labeled passages.
arXiv Detail & Related papers (2021-08-02T10:42:52Z)
- Clinical Named Entity Recognition using Contextualized Token
Representations [49.036805795072645]
This paper introduces the technique of contextualized word embedding to better capture the semantic meaning of each word based on its context.
We pre-train two deep contextualized language models, Clinical Embeddings from Language Model (C-ELMo) and Clinical Contextual String Embeddings (C-Flair).
Explicit experiments show that our models gain dramatic improvements compared to both static word embeddings and domain-generic language models.
arXiv Detail & Related papers (2021-06-23T18:12:58Z)
- Drug and Disease Interpretation Learning with Biomedical Entity
Representation Transformer [9.152161078854146]
Concept normalization in free-form texts is a crucial step in every text-mining pipeline.
We propose a simple and effective two-stage neural approach based on fine-tuned BERT architectures.
arXiv Detail & Related papers (2021-01-22T20:01:25Z)
- Clinical Text Summarization with Syntax-Based Negation and Semantic
Concept Identification [22.556855536939878]
We use computational linguistics with a biomedical knowledge base curated by human experts to achieve interpretable and meaningful clinical text summarization.
Our research objective is to use the biomedical ontology with semantic information, and to take advantage of the hierarchical structure of language, the constituency tree, in order to identify the correct clinical concepts and the corresponding negation information.
arXiv Detail & Related papers (2020-02-29T22:15:15Z)