Keyword-optimized Template Insertion for Clinical Information Extraction via Prompt-based Learning
- URL: http://arxiv.org/abs/2310.20089v1
- Date: Tue, 31 Oct 2023 00:07:11 GMT
- Title: Keyword-optimized Template Insertion for Clinical Information Extraction via Prompt-based Learning
- Authors: Eugenia Alleva, Isotta Landi, Leslee J Shaw, Erwin Böttinger, Thomas J Fuchs, Ipek Ensari
- Abstract summary: We develop a keyword-optimized template insertion method (KOTI) for clinical notes.
We show how it can improve performance on several clinical tasks in a zero-shot and few-shot training setting.
- Score: 0.2939632869678985
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Clinical note classification is a common clinical NLP task. However,
annotated datasets are scarce. Prompt-based learning has recently emerged as
an effective method for adapting pre-trained models to text classification using
only a few training examples. A critical component of prompt design is the
definition of the template (i.e., the prompt text). The effect of template position,
however, has been insufficiently investigated. This seems particularly
important in the clinical setting, where task-relevant information is usually
sparse in clinical notes. In this study we develop a keyword-optimized template
insertion method (KOTI) and show how optimizing position can improve
performance on several clinical tasks in a zero-shot and few-shot training
setting.
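
A minimal sketch of the insertion idea, assuming the template is placed right after the first keyword-bearing sentence; the note, template, and keyword list below are made up for illustration, and the paper's actual position-optimization procedure is described in the full text:

```python
import re

def keyword_optimized_insert(note: str, template: str, keywords: list[str]) -> str:
    """Insert a cloze template right after the first sentence that
    mentions a task keyword; fall back to prefixing the note."""
    sentences = re.split(r"(?<=[.!?])\s+", note)
    for i, sent in enumerate(sentences):
        if any(kw.lower() in sent.lower() for kw in keywords):
            sentences.insert(i + 1, template)
            return " ".join(sentences)
    return template + " " + note  # no keyword found: default front insertion

note = ("Patient seen for follow-up. Reports intermittent chest pain "
        "on exertion. Vitals stable. Plan: stress test.")
template = "Regarding chest pain, the finding is [MASK]."
print(keyword_optimized_insert(note, template, ["chest pain"]))
```

Placing the template next to the keyword keeps the cloze question close to the sparse task-relevant text rather than at the start or end of a long note.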
Related papers
- Language Model Training Paradigms for Clinical Feature Embeddings [1.4513150969598638]
We use self-supervised training paradigms for language models to learn high-quality clinical feature embeddings.
We visualize the learnt embeddings via unsupervised dimension reduction techniques and observe a high degree of consistency with prior clinical knowledge.
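
A minimal sketch of the visualization step, assuming PCA as the unsupervised dimension-reduction technique; the embedding matrix and feature names are random stand-ins, not the paper's learned embeddings:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(20, 128))      # 20 clinical features, 128-d each
feature_names = [f"feat_{i}" for i in range(20)]

coords = PCA(n_components=2).fit_transform(embeddings)
plt.scatter(coords[:, 0], coords[:, 1])
for (x, y), name in zip(coords, feature_names):
    plt.annotate(name, (x, y), fontsize=7)   # compare clusters to prior clinical knowledge
plt.title("Clinical feature embeddings (2-D projection)")
plt.show()
```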
arXiv Detail & Related papers (2023-11-01T18:23:12Z)
- Clinical Text Deduplication Practices for Efficient Pretraining and Improved Clinical Tasks [39.65514468447604]
We provide a fine-grained characterization of duplicates stemming from common writing practices and clinical relevancy.
We demonstrate that deduplicating clinical text can help clinical LMs encode less redundant information in a more efficient manner.
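
A minimal sketch of one deduplication strategy, assuming exact-match hashing of whitespace-normalized paragraphs; the paper's characterization of duplicates (e.g. copy-forwarded sections) is finer-grained than this:

```python
import hashlib

def deduplicate_paragraphs(notes: list[str]) -> list[str]:
    """Keep only the first occurrence of each paragraph across a corpus."""
    seen: set[str] = set()
    kept = []
    for note in notes:
        fresh = []
        for para in note.split("\n\n"):
            normalized = " ".join(para.split()).lower()
            key = hashlib.sha1(normalized.encode()).hexdigest()
            if key not in seen:
                seen.add(key)
                fresh.append(para)
        kept.append("\n\n".join(fresh))
    return kept
```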
arXiv Detail & Related papers (2023-09-29T18:35:52Z)
- Development and validation of a natural language processing algorithm to pseudonymize documents in the context of a clinical data warehouse [53.797797404164946]
The study highlights the difficulties faced in sharing tools and resources in this domain.
We annotated a corpus of clinical documents according to 12 types of identifying entities.
We build a hybrid system, merging the results of a deep learning model as well as manual rules.
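
A minimal sketch of the hybrid idea, merging rule-based matches with spans from a (placeholder) deep model; the patterns, entity labels, and example below are illustrative assumptions, not the study's system:

```python
import re

RULES = {
    "PHONE": re.compile(r"\b\d{2}(?:[ .-]?\d{2}){4}\b"),
    "DATE":  re.compile(r"\b\d{2}/\d{2}/\d{4}\b"),
}

def pseudonymize(text: str, model_spans: list[tuple[int, int, str]]) -> str:
    """Replace identifying spans with placeholder tags.
    model_spans would come from the deep learning model; the rules catch
    well-structured identifiers it may miss. Overlapping spans would need
    merging in a real system."""
    spans = list(model_spans)
    for label, pattern in RULES.items():
        spans += [(m.start(), m.end(), label) for m in pattern.finditer(text)]
    for start, end, label in sorted(spans, key=lambda s: s[0], reverse=True):
        text = text[:start] + f"<{label}>" + text[end:]
    return text

print(pseudonymize("Mr Dupont seen on 23/03/2023, call 06 12 34 56 78.",
                   [(3, 9, "NAME")]))        # hypothetical model output
```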
arXiv Detail & Related papers (2023-03-23T17:17:46Z)
- Modelling Temporal Document Sequences for Clinical ICD Coding [9.906895077843663]
We propose a hierarchical transformer architecture that uses text across the entire sequence of clinical notes in each hospital stay for ICD coding.
While using all clinical notes increases the quantity of data substantially, superconvergence can be used to reduce training costs.
Our model exceeds the prior state-of-the-art when using only discharge summaries as input, and achieves further performance improvements when all clinical notes are used as input.
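
A minimal sketch of the superconvergence ingredient via PyTorch's one-cycle learning-rate schedule; the linear layer stands in for the hierarchical transformer, and all hyperparameters are assumptions:

```python
import torch

model = torch.nn.Linear(768, 50)             # placeholder for the real model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
total_steps = 1000
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=1e-3, total_steps=total_steps)

for step in range(total_steps):
    optimizer.zero_grad()
    loss = model(torch.randn(8, 768)).sum()  # dummy forward/backward pass
    loss.backward()
    optimizer.step()
    scheduler.step()                         # LR ramps up, then anneals
```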
arXiv Detail & Related papers (2023-02-24T14:41:48Z)
- Cross-Lingual Knowledge Transfer for Clinical Phenotyping [55.92262310716537]
We investigate cross-lingual knowledge transfer strategies to perform this task for clinics that do not document in English.
We evaluate these strategies for a Greek and a Spanish clinic leveraging clinical notes from different clinical domains.
Our results show that using multilingual data overall improves clinical phenotyping models and can compensate for data sparseness.
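
A minimal sketch of one such transfer strategy, fine-tuning a multilingual encoder on pooled English and target-language notes; the model choice (xlm-roberta-base), label count, and example notes are assumptions:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tok = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-base", num_labels=2)

notes = ["Patient denies chest pain.",         # English source clinic
         "El paciente niega dolor torácico."]  # Spanish target clinic
batch = tok(notes, padding=True, truncation=True, return_tensors="pt")
logits = model(**batch).logits                 # then train on the pooled data
```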
arXiv Detail & Related papers (2022-08-03T08:33:21Z)
- HealthPrompt: A Zero-shot Learning Paradigm for Clinical Natural Language Processing [3.762895631262445]
We developed a novel prompt-based clinical NLP framework called HealthPrompt.
We performed an in-depth analysis of HealthPrompt on six different PLMs in a no-data setting.
Our experiments prove that prompts effectively capture the context of clinical texts and perform remarkably well without any training data.
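
A minimal sketch of zero-shot cloze-style classification in the same spirit: a prompt template is filled by a masked PLM, and a verbalizer maps answer words to labels. The PLM, template, and verbalizer here are assumptions, not HealthPrompt's actual choices:

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
verbalizer = {"present": 1, "absent": 0}       # answer word -> class label

def classify(note: str) -> int:
    prompt = f"{note} The condition is {tok.mask_token}."
    inputs = tok(prompt, return_tensors="pt", truncation=True)
    mask_pos = (inputs.input_ids == tok.mask_token_id).nonzero()[0, 1]
    with torch.no_grad():
        logits = mlm(**inputs).logits[0, mask_pos]
    scores = {w: logits[tok.convert_tokens_to_ids(w)].item() for w in verbalizer}
    return verbalizer[max(scores, key=scores.get)]

print(classify("Patient reports worsening shortness of breath."))
```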
arXiv Detail & Related papers (2022-03-09T21:44:28Z)
- Colorectal Polyp Classification from White-light Colonoscopy Images via Domain Alignment [57.419727894848485]
A computer-aided diagnosis system is required to assist accurate diagnosis from colonoscopy images.
Most previous studies attempt to develop models for polyp differentiation using Narrow-Band Imaging (NBI) or other enhanced images.
We propose a novel framework based on a teacher-student architecture for accurate colorectal polyp classification.
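
A minimal sketch of the teacher-student ingredient using a standard distillation loss; the linear layers stand in for the NBI-trained teacher and white-light student networks, and the temperature is an assumption:

```python
import torch
import torch.nn.functional as F

teacher = torch.nn.Linear(512, 2).eval()       # stand-in NBI-trained teacher
student = torch.nn.Linear(512, 2)              # white-light student
T = 4.0                                        # softening temperature

feats = torch.randn(8, 512)                    # dummy image features
labels = torch.randint(0, 2, (8,))
with torch.no_grad():
    soft_targets = F.softmax(teacher(feats) / T, dim=-1)
student_logits = student(feats)
loss = (F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                 soft_targets, reduction="batchmean") * T * T
        + F.cross_entropy(student_logits, labels))
loss.backward()
```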
arXiv Detail & Related papers (2021-08-05T09:31:46Z)
- KnowPrompt: Knowledge-aware Prompt-tuning with Synergistic Optimization for Relation Extraction [111.74812895391672]
We propose a Knowledge-aware Prompt-tuning approach with synergistic optimization (KnowPrompt)
We inject latent knowledge contained in relation labels into prompt construction with learnable virtual type words and answer words.
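
A minimal sketch of the learnable virtual answer-word idea: each relation label is represented by a trainable vector scored against the [MASK] position's hidden state. Dimensions and the label count are assumptions:

```python
import torch

hidden_size, num_relations = 768, 5
# trainable "virtual answer words", one per relation label
virtual_answer_words = torch.nn.Parameter(torch.randn(num_relations, hidden_size))

mask_hidden = torch.randn(1, hidden_size)      # [MASK] hidden state from a PLM
logits = mask_hidden @ virtual_answer_words.T  # score every relation label
predicted_relation = logits.argmax(dim=-1)
```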
arXiv Detail & Related papers (2021-04-15T17:57:43Z)
- Benchmarking Automated Clinical Language Simplification: Dataset, Algorithm, and Evaluation [48.87254340298189]
We construct a new dataset named MedLane to support the development and evaluation of automated clinical language simplification approaches.
We propose a new model called DECLARE that follows the human annotation procedure and achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-12-04T06:09:02Z)
- An Interpretable End-to-end Fine-tuning Approach for Long Clinical Text [72.62848911347466]
Unstructured clinical text in EHRs contains crucial information for applications including decision support, trial matching, and retrospective research.
Recent work has applied BERT-based models to clinical information extraction and text classification, given these models' state-of-the-art performance in other NLP domains.
In this work, we propose a novel fine-tuning approach called SnipBERT. Instead of using entire notes, SnipBERT identifies crucial snippets and feeds them into a truncated BERT-based model in a hierarchical manner.
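
A minimal sketch of the snippet pipeline: rank passages by keyword overlap, keep the top-k, and pool per-snippet encodings. The keyword scorer and the random placeholder encoder are assumptions, not SnipBERT's actual components:

```python
import torch

def top_snippets(note: str, keywords: list[str], k: int = 3) -> list[str]:
    """Score each paragraph by how many task keywords it contains."""
    paras = [p for p in note.split("\n") if p.strip()]
    ranked = sorted(paras, reverse=True,
                    key=lambda p: sum(kw in p.lower() for kw in keywords))
    return ranked[:k]

def encode_note(snippets: list[str]) -> torch.Tensor:
    # placeholder: in practice each snippet goes through a truncated
    # BERT-based encoder before the hierarchical pooling step
    vecs = torch.stack([torch.randn(768) for _ in snippets])
    return vecs.mean(dim=0)
```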
arXiv Detail & Related papers (2020-11-12T17:14:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.