An Interpretable End-to-end Fine-tuning Approach for Long Clinical Text
- URL: http://arxiv.org/abs/2011.06504v1
- Date: Thu, 12 Nov 2020 17:14:32 GMT
- Title: An Interpretable End-to-end Fine-tuning Approach for Long Clinical Text
- Authors: Kexin Huang, Sankeerth Garapati, Alexander S. Rich
- Abstract summary: Unstructured clinical text in EHRs contains crucial information for applications including decision support, trial matching, and retrospective research.
Recent work has applied BERT-based models to clinical information extraction and text classification, given these models' state-of-the-art performance in other NLP domains.
In this work, we propose a novel fine-tuning approach called SnipBERT. Instead of using entire notes, SnipBERT identifies crucial snippets and feeds them into a truncated BERT-based model in a hierarchical manner.
- Score: 72.62848911347466
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unstructured clinical text in EHRs contains crucial information for
applications including decision support, trial matching, and retrospective
research. Recent work has applied BERT-based models to clinical information
extraction and text classification, given these models' state-of-the-art
performance in other NLP domains. However, BERT is difficult to apply to
clinical notes because it does not scale well to long sequences of text. In this
work, we propose a novel fine-tuning approach called SnipBERT. Instead of using
entire notes, SnipBERT identifies crucial snippets and then feeds them into a
truncated BERT-based model in a hierarchical manner. Empirically, SnipBERT not
only achieves significant predictive performance gains across three tasks but also
provides improved interpretability, as the model can identify key pieces of
text that led to its prediction.
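As a rough illustration of the snipping idea, the sketch below selects keyword-anchored snippets from a long note and encodes each with a truncated BERT encoder, pooling the snippet embeddings into a single note representation. The keyword list, window size, layer cutoff, and mean-pooling aggregation are illustrative assumptions, not the authors' exact method.

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Illustrative keyword list; SnipBERT derives its own notion of
# "crucial snippets" -- this is only a stand-in.
KEYWORDS = {"metastatic", "biopsy", "relapse"}

def extract_snippets(note: str, window: int = 30):
    """Return short windows of text around each keyword hit."""
    tokens = note.split()
    snippets = []
    for i, tok in enumerate(tokens):
        if tok.lower().strip(".,") in KEYWORDS:
            lo, hi = max(0, i - window), min(len(tokens), i + window)
            snippets.append(" ".join(tokens[lo:hi]))
    return snippets or [" ".join(tokens[:2 * window])]  # fall back to the note head

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
# "Truncate" the encoder by keeping only the first few transformer layers.
encoder.encoder.layer = encoder.encoder.layer[:4]

def encode_note(note: str) -> torch.Tensor:
    """Hierarchical encoding: encode each snippet, then pool across snippets."""
    snippet_vecs = []
    for snippet in extract_snippets(note):
        batch = tokenizer(snippet, truncation=True, max_length=128, return_tensors="pt")
        with torch.no_grad():
            out = encoder(**batch).last_hidden_state[:, 0]  # [CLS] embedding
        snippet_vecs.append(out)
    return torch.cat(snippet_vecs).mean(dim=0)  # note-level representation

note_vec = encode_note("Pathology confirmed metastatic disease after biopsy ...")
print(note_vec.shape)  # torch.Size([768])
```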
Related papers
- SNOBERT: A Benchmark for clinical notes entity linking in the SNOMED CT clinical terminology [43.89160296332471]
We propose a method for linking text spans in clinical notes to specific concepts in SNOMED CT using BERT-based models.
The method consists of two stages: candidate selection and candidate matching. The models were trained on one of the largest publicly available datasets of labeled clinical notes.
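A minimal sketch of a two-stage linker in that spirit: candidate selection by embedding similarity against concept names, then matching by taking the best-scoring candidate. The toy concept inventory and mean-pooled BERT embeddings are assumptions for illustration; SNOBERT's actual stages are trained models.

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Toy SNOMED CT concept inventory (id, preferred name) -- illustrative only.
CONCEPTS = [("22298006", "Myocardial infarction"),
            ("38341003", "Hypertensive disorder"),
            ("73211009", "Diabetes mellitus")]

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def embed(text: str) -> torch.Tensor:
    batch = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state  # (1, seq, 768)
    return hidden.mean(dim=1).squeeze(0)           # mean-pooled embedding

def link(span: str, k: int = 2):
    """Stage 1: rank concepts by cosine similarity (candidate selection).
    Stage 2: return the top candidate (a trained matcher in the real system)."""
    span_vec = embed(span)
    scored = [(torch.cosine_similarity(span_vec, embed(name), dim=0).item(), cid, name)
              for cid, name in CONCEPTS]
    candidates = sorted(scored, reverse=True)[:k]  # stage 1
    return candidates[0]                           # stage 2

print(link("heart attack"))  # expected to favor Myocardial infarction
```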
arXiv Detail & Related papers (2024-05-25T08:00:44Z)
- Attribute Structuring Improves LLM-Based Evaluation of Clinical Text Summaries [62.32403630651586]
Large language models (LLMs) have shown the potential to generate accurate clinical text summaries, but still struggle with issues regarding grounding and evaluation.
Here, we explore a general mitigation framework using Attribute Structuring (AS), which structures the summary evaluation process.
AS consistently improves the correspondence between human annotations and automated metrics in clinical text summarization.
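The blurb suggests grading summaries attribute by attribute rather than holistically. A toy version follows, with a hypothetical attribute schema and a trivial exact-match scorer standing in for the paper's LLM-based extraction and comparison.

```python
# Hypothetical attribute schema for a discharge summary; the real AS pipeline
# extracts these fields with an LLM before scoring each one separately.
ATTRIBUTES = ["diagnosis", "medications", "follow_up"]

reference = {"diagnosis": "community-acquired pneumonia",
             "medications": "azithromycin",
             "follow_up": "primary care in 2 weeks"}
candidate = {"diagnosis": "community-acquired pneumonia",
             "medications": "amoxicillin",
             "follow_up": "primary care in 2 weeks"}

def attribute_score(ref: dict, cand: dict) -> dict:
    """Score each attribute independently (exact match as a toy metric)."""
    return {attr: float(ref.get(attr) == cand.get(attr)) for attr in ATTRIBUTES}

scores = attribute_score(reference, candidate)
print(scores)                              # {'diagnosis': 1.0, 'medications': 0.0, ...}
print(sum(scores.values()) / len(scores))  # mean attribute score (2/3 here)
```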
arXiv Detail & Related papers (2024-03-01T21:59:03Z)
- Keyword-optimized Template Insertion for Clinical Information Extraction via Prompt-based Learning [0.2939632869678985]
We develop a keyword-optimized template insertion method (KOTI) for clinical notes.
We show how it can improve performance on several clinical tasks in zero-shot and few-shot training settings.
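A minimal sketch of keyword-anchored template insertion with a masked-LM prompt; the keyword, template wording, and generic BERT checkpoint are illustrative assumptions, not the KOTI recipe.

```python
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

def insert_template(note: str, keyword: str, template: str) -> str:
    """Insert the cloze template right after the sentence containing the keyword."""
    sentences = note.split(". ")
    out = []
    for s in sentences:
        out.append(s)
        if keyword in s.lower():
            out.append(template)
    return ". ".join(out)

note = "Patient reports chest pain on exertion. ECG was unremarkable."
prompt = insert_template(note, "chest pain", f"The symptom is {fill.tokenizer.mask_token}")
print(fill(prompt)[0]["token_str"])  # top prediction for the masked slot
```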
arXiv Detail & Related papers (2023-10-31T00:07:11Z)
- Making the Most Out of the Limited Context Length: Predictive Power Varies with Clinical Note Type and Note Section [70.37720062263176]
We propose a framework to analyze the sections with high predictive power.
Using MIMIC-III, we show that: 1) the distribution of predictive power differs between nursing notes and discharge notes, and 2) combining different types of notes can improve performance when the context length is large.
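A toy sketch of section-level analysis in this spirit: train a simple classifier per note section and compare predictive power. The sections, texts, and labels below are invented placeholders, and the bag-of-words model is a stand-in for the paper's framework.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Toy corpus of (section -> text) notes with binary outcomes; illustrative only.
notes = [
    {"history": "smoker with copd", "plan": "start steroids"},
    {"history": "healthy athlete", "plan": "routine follow up"},
    {"history": "copd exacerbation", "plan": "admit for oxygen"},
    {"history": "no complaints", "plan": "discharge home"},
] * 5  # repeat so cross-validation has enough samples
labels = [1, 0, 1, 0] * 5

for section in ["history", "plan"]:
    texts = [n[section] for n in notes]
    X = TfidfVectorizer().fit_transform(texts)  # fit on all data; fine for a toy demo
    auc = cross_val_score(LogisticRegression(), X, labels, cv=3, scoring="roc_auc").mean()
    print(f"{section}: AUC = {auc:.2f}")  # higher AUC => more predictive section
```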
arXiv Detail & Related papers (2023-07-13T20:04:05Z)
- Interpretable Medical Diagnostics with Structured Data Extraction by Large Language Models [59.89454513692417]
Tabular data is often hidden in text, particularly in medical diagnostic reports.
We propose a novel, simple, and effective methodology for extracting structured tabular data from textual medical reports, called TEMED-LLM.
We demonstrate that our approach significantly outperforms state-of-the-art text classification models in medical diagnostics.
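The idea lends itself to a prompt-then-parse loop. In the sketch below, `call_llm` is a hypothetical stand-in for any LLM client and the field schema is invented; the point is that the free-text report is reduced to a structured row that a simple, interpretable downstream model can consume.

```python
import json

SCHEMA = ["age", "tumor_size_mm", "lymph_nodes_positive"]  # invented fields

def build_prompt(report: str) -> str:
    return (
        "Extract the following fields from the medical report as JSON "
        f"with keys {SCHEMA}. Use null when a field is absent.\n\n" + report
    )

def call_llm(prompt: str) -> str:
    """Hypothetical LLM client; replace with a real API call."""
    return '{"age": 62, "tumor_size_mm": 18, "lymph_nodes_positive": 1}'

def extract_row(report: str) -> dict:
    raw = call_llm(build_prompt(report))
    row = json.loads(raw)                   # parse the structured output
    return {k: row.get(k) for k in SCHEMA}  # keep only schema fields

row = extract_row("62 y/o female; 18 mm lesion; 1/12 nodes positive.")
print(row)  # a tabular row ready for a downstream interpretable model
```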
arXiv Detail & Related papers (2023-06-08T09:12:28Z)
- HealthPrompt: A Zero-shot Learning Paradigm for Clinical Natural Language Processing [3.762895631262445]
We developed a novel prompt-based clinical NLP framework called HealthPrompt.
We performed an in-depth analysis of HealthPrompt on six different PLMs in a no-data setting.
Our experiments show that prompts effectively capture the context of clinical texts and perform remarkably well without any training data.
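A minimal zero-shot cloze classifier in this spirit: wrap the note in a prompt template and compare masked-token probabilities for a small verbalizer. The template wording, verbalizer words, and generic checkpoint are assumptions, not HealthPrompt's exact configuration.

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

VERBALIZER = {"cardiac": "heart", "respiratory": "lung"}  # label -> cue word

def zero_shot_label(note: str) -> str:
    prompt = f"{note} This note is about the {tokenizer.mask_token}."
    batch = tokenizer(prompt, return_tensors="pt", truncation=True)
    mask_pos = (batch["input_ids"][0] == tokenizer.mask_token_id).nonzero().item()
    with torch.no_grad():
        logits = model(**batch).logits[0, mask_pos]
    # Score each label by the logit of its verbalizer word at the mask position.
    scores = {label: logits[tokenizer.convert_tokens_to_ids(word)].item()
              for label, word in VERBALIZER.items()}
    return max(scores, key=scores.get)

print(zero_shot_label("Echocardiogram shows reduced ejection fraction."))
```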
arXiv Detail & Related papers (2022-03-09T21:44:28Z)
- Fine-Tuning Large Neural Language Models for Biomedical Natural Language Processing [55.52858954615655]
We conduct a systematic study on fine-tuning stability in biomedical NLP.
We show that fine-tuning performance may be sensitive to pretraining settings, especially in low-resource domains.
We show that these techniques can substantially improve fine-tuning performance for low-resource biomedical NLP applications.
arXiv Detail & Related papers (2021-12-15T04:20:35Z)
- Clinical Trial Information Extraction with BERT [0.0]
We propose a framework called CT-BERT for information extraction from clinical trial text.
We trained named entity recognition (NER) models to extract eligibility criteria entities.
The results demonstrate the superiority of CT-BERT in clinical trial NLP.
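A sketch of criteria-entity extraction with the token-classification pipeline; the general-purpose NER checkpoint below is only a stand-in, since the paper trains its own clinical NER models on eligibility criteria.

```python
from transformers import pipeline

# Stand-in checkpoint; CT-BERT trains NER models on eligibility criteria.
ner = pipeline("token-classification",
               model="dslim/bert-base-NER",
               aggregation_strategy="simple")

criterion = ("Patients must be 18-75 years old with confirmed type 2 diabetes "
             "and no prior treatment with metformin.")

for ent in ner(criterion):
    print(f'{ent["entity_group"]:>6}  {ent["word"]!r}  ({ent["score"]:.2f})')
```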
arXiv Detail & Related papers (2021-09-11T17:15:10Z)
- Artificial Text Detection via Examining the Topology of Attention Maps [58.46367297712477]
We propose three novel types of interpretable topological features for this task based on Topological Data Analysis (TDA).
We empirically show that features derived from the BERT model outperform count- and neural-based baselines by up to 10% on three common datasets.
A probing analysis of the features reveals their sensitivity to surface and syntactic properties.
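One can get a feel for such features by thresholding an attention map into a graph and reading off simple topological statistics; the sketch below uses connected components and independent cycles as crude stand-ins for the paper's TDA features, with a random matrix as a toy attention map.

```python
import numpy as np
import networkx as nx

def attention_topology(attn: np.ndarray, threshold: float = 0.1) -> dict:
    """Build a graph with an edge wherever attention exceeds the threshold,
    then compute simple topological summaries of that graph."""
    n = attn.shape[0]
    g = nx.Graph()
    g.add_nodes_from(range(n))
    for i in range(n):
        for j in range(i + 1, n):
            if attn[i, j] > threshold or attn[j, i] > threshold:
                g.add_edge(i, j)
    return {
        "components": nx.number_connected_components(g),  # ~ 0-dim features
        "cycles": len(nx.cycle_basis(g)),                 # ~ 1-dim features
        "edges": g.number_of_edges(),
    }

rng = np.random.default_rng(0)
attn = rng.dirichlet(np.ones(12), size=12)  # toy 12x12 attention map (rows sum to 1)
print(attention_topology(attn, threshold=0.15))
```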
arXiv Detail & Related papers (2021-09-10T12:13:45Z)
- Fine-tuning Pretrained Language Models with Label Attention for Explainable Biomedical Text Classification [1.066048003460524]
We develop an improved label attention-based architecture to inject semantic label descriptions into the fine-tuning process of PTMs.
Results on two public medical datasets show that the proposed fine-tuning scheme outperforms the conventionally fine-tuned PTMs and prior state-of-the-art models.
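A minimal PyTorch sketch of label attention: token states attend to encoded label descriptions, yielding one label-aware document vector per class. The dimensions and random inputs are placeholders; in the paper, the token states come from a fine-tuned PLM and the label vectors encode real label descriptions.

```python
import torch
import torch.nn as nn

class LabelAttention(nn.Module):
    """Attend over token states with one query per label description."""
    def __init__(self, hidden: int):
        super().__init__()
        self.classifier = nn.Linear(hidden, 1)

    def forward(self, tokens: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, seq, hidden); labels: (n_labels, hidden)
        scores = torch.einsum("bsh,lh->bls", tokens, labels)   # label-token affinity
        weights = scores.softmax(dim=-1)                       # attention per label
        label_aware = torch.einsum("bls,bsh->blh", weights, tokens)
        return self.classifier(label_aware).squeeze(-1)        # (batch, n_labels) logits

tokens = torch.randn(2, 50, 768)   # placeholder PLM token states
labels = torch.randn(5, 768)       # placeholder encoded label descriptions
print(LabelAttention(768)(tokens, labels).shape)  # torch.Size([2, 5])
```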
arXiv Detail & Related papers (2021-08-26T14:23:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.