A Scoping Review of Publicly Available Language Tasks in Clinical
Natural Language Processing
- URL: http://arxiv.org/abs/2112.05780v1
- Date: Tue, 7 Dec 2021 22:49:58 GMT
- Authors: Yanjun Gao, Dmitriy Dligach, Leslie Christensen, Samuel Tesch, Ryan
Laffin, Dongfang Xu, Timothy Miller, Ozlem Uzuner, Matthew M Churpek, Majid
Afshar
- Abstract summary: We searched six databases, including biomedical research and computer science literature databases.
A total of 35 papers with 47 clinical NLP tasks met inclusion criteria between 2007 and 2021.
- Score: 7.966218734325912
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Objective: to provide a scoping review of papers on clinical natural language
processing (NLP) tasks that use publicly available electronic health record
data from a cohort of patients. Materials and Methods: We searched six
databases, including biomedical research and computer science literature
databases. A round of title/abstract screening and a round of full-text
screening were conducted by two reviewers. Our method followed the Preferred
Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines.
Results: A total
of 35 papers with 47 clinical NLP tasks met inclusion criteria between 2007 and
2021. We categorized the tasks by the type of NLP problem, including named
entity recognition, summarization, and other NLP tasks. Some tasks were
introduced in the context of clinical decision support applications, such as
substance abuse, phenotyping, and cohort selection for clinical trials. We
summarized the tasks by publication and dataset information. Discussion: The
breadth of clinical NLP tasks keeps growing as the field of NLP evolves with
advancements in language systems. However, gaps exist between the divergent
interests of the general-domain NLP community and the clinical informatics
community, and in the generalizability of the data sources. We also identified issues in data
selection and preparation including the lack of time-sensitive data, and
invalidity of problem size and evaluation. Conclusions: The existing clinical
NLP tasks cover a wide range of topics and the field will continue to grow and
attract more attention from both the general-domain NLP and clinical informatics
communities. We encourage future work to incorporate multi-disciplinary
collaboration, reporting transparency, and standardization in data preparation.
Related papers
- Clinical trial cohort selection using Large Language Models on n2c2 Challenges [3.0208841164563838]
Large language models (LLMs) have gained popularity for various NLP tasks due to their ability to acquire a nuanced understanding of text.
Our results are promising with regard to the incorporation of LLMs for simple cohort selection tasks, but also highlight the difficulties encountered by these models as soon as fine-grained knowledge and reasoning are required.
arXiv Detail & Related papers (2025-01-19T17:07:02Z)
- A Systematic Review of NLP for Dementia -- Tasks, Datasets and Opportunities [15.879500944648237]
We review over 240 papers applying NLP to dementia-related efforts.
Half of all papers focus solely on dementia detection using clinical data.
We highlight gaps and opportunities around trust, scientific rigor, applicability and cross-community collaboration.
arXiv Detail & Related papers (2024-09-29T15:30:59Z)
- Large Language Models in the Clinic: A Comprehensive Benchmark [63.21278434331952]
We build a benchmark ClinicBench to better understand large language models (LLMs) in the clinic.
We first collect eleven existing datasets covering diverse clinical language generation, understanding, and reasoning tasks.
We then construct six novel datasets and clinical tasks that are complex but common in real-world practice.
We conduct an extensive evaluation of twenty-two LLMs under both zero-shot and few-shot settings.
arXiv Detail & Related papers (2024-04-25T15:51:06Z)
- Knowledge-Infused Prompting: Assessing and Advancing Clinical Text Data Generation with Large Language Models [46.32860360019374]
Large language models (LLMs) have shown promise in this domain, but their direct deployment can lead to privacy issues.
We propose an innovative, resource-efficient approach, ClinGen, which infuses knowledge into the process.
Our empirical study across 7 clinical NLP tasks and 16 datasets reveals that ClinGen consistently enhances performance across various tasks.
arXiv Detail & Related papers (2023-11-01T04:37:28Z)
- Natural Language Processing in Electronic Health Records in Relation to Healthcare Decision-making: A Systematic Review [2.555168694997103]
Natural Language Processing is widely used to extract clinical insights from Electronic Health Records.
Lack of annotated data, automated tools, and other challenges hinder the full utilisation of NLP for EHRs.
Various Machine Learning (ML), Deep Learning (DL) and NLP techniques are studied and compared to understand the limitations and opportunities in this space comprehensively.
arXiv Detail & Related papers (2023-06-22T12:10:41Z)
- Development and validation of a natural language processing algorithm to pseudonymize documents in the context of a clinical data warehouse [53.797797404164946]
The study highlights the difficulties faced in sharing tools and resources in this domain.
We annotated a corpus of clinical documents according to 12 types of identifying entities.
We build a hybrid system, merging the results of a deep learning model as well as manual rules.
arXiv Detail & Related papers (2023-03-23T17:17:46Z)
- GDPR Compliant Collection of Therapist-Patient-Dialogues [48.091760741427656]
We elaborate on the challenges we faced in starting our collection of therapist-patient dialogues in a psychiatry clinic under the General Data Privacy Regulation of the European Union.
We give an overview of each step in our procedure and point out the potential pitfalls to motivate further research in this field.
arXiv Detail & Related papers (2022-11-22T15:51:10Z)
- ITTC @ TREC 2021 Clinical Trials Track [54.141379782822206]
The task focuses on the problem of matching eligible clinical trials to topics constituting a summary of a patient's admission notes.
We explore different ways of representing trials and topics using NLP techniques, and then use a common retrieval model to generate the ranked list of relevant trials for each topic.
The results from all our submitted runs are well above the median scores for all topics, but there is still plenty of scope for improvement.
arXiv Detail & Related papers (2022-02-16T04:56:47Z)
- Benchmarking Automated Clinical Language Simplification: Dataset, Algorithm, and Evaluation [48.87254340298189]
We construct a new dataset named MedLane to support the development and evaluation of automated clinical language simplification approaches.
We propose a new model called DECLARE that follows the human annotation procedure and achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-12-04T06:09:02Z)