Application of Transformers based methods in Electronic Medical Records:
A Systematic Literature Review
- URL: http://arxiv.org/abs/2304.02768v1
- Date: Wed, 5 Apr 2023 22:19:42 GMT
- Title: Application of Transformers based methods in Electronic Medical Records:
A Systematic Literature Review
- Authors: Vitor Alcantara Batista, Alexandre Gonçalves Evsukoff
- Abstract summary: This work presents a systematic literature review of state-of-the-art advances using transformer-based methods on electronic medical records (EMRs) in different NLP tasks.
- Score: 77.34726150561087
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The combined growth of available data and their unstructured nature has
driven increased interest in natural language processing (NLP) techniques to
extract value from these data assets, since this format is not suitable for
statistical analysis. This work presents a systematic literature review of
state-of-the-art advances using transformer-based methods on electronic medical
records (EMRs) in different NLP tasks. To the best of our knowledge, this work
is unique in providing a comprehensive review of research on transformer-based
methods for NLP applied to the EMR field. In the initial query, 99 articles
were selected from three public databases and filtered into 65 articles for
detailed analysis. The papers were analyzed with respect to the business
problem, NLP task, models and techniques, availability of datasets,
reproducibility of modeling, language, and exchange format. The paper presents
some limitations of current research and some recommendations for further
research.
Related papers
- SciRIFF: A Resource to Enhance Language Model Instruction-Following over Scientific Literature [80.49349719239584]
We present SciRIFF (Scientific Resource for Instruction-Following and Finetuning), a dataset of 137K instruction-following demonstrations for 54 tasks.
SciRIFF is the first dataset focused on extracting and synthesizing information from research literature across a wide range of scientific fields.
arXiv Detail & Related papers (2024-06-10T21:22:08Z)
- A Survey on Data Selection for Language Models [151.6210632830082]
Data selection methods aim to determine which data points to include in a training dataset.
Deep learning is mostly driven by empirical evidence and experimentation on large-scale data is expensive.
Few organizations have the resources for extensive data selection research.
arXiv Detail & Related papers (2024-02-26T18:54:35Z)
- The SourceData-NLP dataset: integrating curation into scientific publishing for training large language models [1.0423199374671421]
We present the SourceData-NLP dataset produced through the routine curation of papers during the publication process.
This dataset contains more than 620,000 annotated biomedical entities, curated from 18,689 figures in 3,223 papers in molecular and cell biology.
arXiv Detail & Related papers (2023-10-31T13:22:38Z)
- An experiment on an automated literature survey of data-driven speech enhancement methods [5.931978628000179]
This work explores the use of a generative pre-trained transformer (GPT) model to automate a literature survey of 116 articles on data-driven speech enhancement methods.
arXiv Detail & Related papers (2023-10-10T02:07:24Z)
- Accelerated materials language processing enabled by GPT [5.518792725397679]
We develop generative transformer (GPT)-enabled pipelines for materials language processing.
First, we develop a GPT-enabled document classification method for screening relevant documents.
Second, for the NER task, we design entity-centric prompts, and few-shot learning with them improved performance.
Finally, we develop a GPT-enabled extractive QA model, which provides improved performance and shows the possibility of automatically correcting annotations.
arXiv Detail & Related papers (2023-08-18T07:31:13Z)
- Advancing Italian Biomedical Information Extraction with Transformers-based Models: Methodological Insights and Multicenter Practical Application [0.27027468002793437]
Information Extraction can help clinical practitioners overcome the limitation by using automated text-mining pipelines.
We created the first Italian neuropsychiatric Named Entity Recognition dataset, PsyNIT, and used it to develop a Transformers-based model.
The lessons learned are: (i) the crucial role of a consistent annotation process and (ii) a fine-tuning strategy that combines classical methods with a "low-resource" approach.
arXiv Detail & Related papers (2023-06-08T16:15:46Z)
- Efficient Methods for Natural Language Processing: A Survey [76.34572727185896]
This survey synthesizes and relates current methods and findings in efficient NLP.
We aim to provide both guidance for conducting NLP under limited resources, and point towards promising research directions for developing more efficient methods.
arXiv Detail & Related papers (2022-08-31T20:32:35Z)
- Human-in-the-Loop Disinformation Detection: Stance, Sentiment, or Something Else? [93.91375268580806]
Both politics and pandemics have recently provided ample motivation for the development of machine learning-enabled disinformation (a.k.a. fake news) detection algorithms.
Existing literature has focused primarily on the fully-automated case, but the resulting techniques cannot reliably detect disinformation on the varied topics, sources, and time scales required for military applications.
By leveraging an already-available analyst as a human-in-the-loop, canonical machine learning techniques of sentiment analysis, aspect-based sentiment analysis, and stance detection become plausible methods to use for a partially-automated disinformation detection system.
arXiv Detail & Related papers (2021-11-09T13:30:34Z)
- Pretrained Transformers for Text Ranking: BERT and Beyond [53.83210899683987]
This survey provides an overview of text ranking with neural network architectures known as transformers.
The combination of transformers and self-supervised pretraining has been responsible for a paradigm shift in natural language processing.
arXiv Detail & Related papers (2020-10-13T15:20:32Z)
- Data Mining in Clinical Trial Text: Transformers for Classification and Question Answering Tasks [2.127049691404299]
This research applies advances in natural language processing to evidence synthesis based on medical texts.
The main focus is on information characterized via the Population, Intervention, Comparator, and Outcome (PICO) framework.
Recent neural network architectures based on transformers show capacities for transfer learning and increased performance on downstream natural language processing tasks.
arXiv Detail & Related papers (2020-01-30T11:45:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.