PLM-ICD: Automatic ICD Coding with Pretrained Language Models
- URL: http://arxiv.org/abs/2207.05289v1
- Date: Tue, 12 Jul 2022 03:56:28 GMT
- Title: PLM-ICD: Automatic ICD Coding with Pretrained Language Models
- Authors: Chao-Wei Huang, Shang-Chi Tsai, Yun-Nung Chen
- Abstract summary: This paper develops a framework for automatic ICD coding with pretrained language models.
Three main issues are 1) large label space, 2) long input sequences, and 3) domain mismatch between pretraining and fine-tuning.
Our proposed framework can overcome the challenges and achieves state-of-the-art performance in terms of multiple metrics on the benchmark MIMIC data.
- Score: 35.161696760157824
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automatically classifying electronic health records (EHRs) into diagnostic
codes has been challenging to the NLP community. State-of-the-art methods
treated this problem as a multilabel classification problem and proposed
various architectures to model this problem. However, these systems did not
leverage pretrained language models, which have achieved
superb performance on natural language understanding tasks. Prior work has
shown that pretrained language models underperformed on this task with the
regular fine-tuning scheme. Therefore, this paper aims at analyzing the causes
of the underperformance and developing a framework for automatic ICD coding
with pretrained language models. We identified three main issues through our
experiments: 1) large label space, 2) long input sequences, and 3) domain
mismatch between pretraining and fine-tuning. We propose PLM-ICD, a framework
that tackles the challenges with various strategies. The experimental results
show that our proposed framework can overcome the challenges and achieves
state-of-the-art performance in terms of multiple metrics on the benchmark
MIMIC data. The source code is available at https://github.com/MiuLab/PLM-ICD
Related papers
- Unsupervised Pre-training with Language-Vision Prompts for Low-Data Instance Segmentation [105.23631749213729]
We propose a novel method for unsupervised pre-training in low-data regimes.
Inspired by the recently successful prompting technique, we introduce a new method, Unsupervised Pre-training with Language-Vision Prompts.
We show that our method can converge faster and perform better than CNN-based models in low-data regimes.
arXiv Detail & Related papers (2024-05-22T06:48:43Z)
- Low-Cost Language Models: Survey and Performance Evaluation on Python Code Generation [0.0]
Large Language Models (LLMs) have become the go-to solution for many Natural Language Processing (NLP) tasks.
We conduct a semi-manual evaluation of their strengths and weaknesses in generating Python code.
We propose a dataset of 60 programming problems with varying difficulty levels for evaluation purposes.
arXiv Detail & Related papers (2024-04-17T08:16:48Z)
- Rationale-Guided Few-Shot Classification to Detect Abusive Language [5.977278650516324]
We propose RGFS (Rationale-Guided Few-Shot Classification) for abusive language detection.
We introduce two rationale-integrated BERT-based architectures (the RGFS models) and evaluate our systems over five different abusive language datasets.
arXiv Detail & Related papers (2022-11-30T14:47:14Z)
- Improving Pre-trained Language Model Fine-tuning with Noise Stability Regularization [94.4409074435894]
We propose a novel and effective fine-tuning framework, named Layerwise Noise Stability Regularization (LNSR).
Specifically, we propose to inject the standard Gaussian noise and regularize hidden representations of the fine-tuned model.
We demonstrate the advantages of the proposed method over other state-of-the-art algorithms including L2-SP, Mixout and SMART.
arXiv Detail & Related papers (2022-06-12T04:42:49Z)
- Bridging the Gap Between Training and Inference of Bayesian Controllable Language Models [58.990214815032495]
Large-scale pre-trained language models have achieved great success on natural language generation tasks.
BCLMs have been shown to be efficient in controllable language generation.
We propose a "Gemini Discriminator" for controllable language generation which alleviates the mismatch problem with a small computational cost.
arXiv Detail & Related papers (2022-06-11T12:52:32Z)
- Prompt Tuning for Discriminative Pre-trained Language Models [96.04765512463415]
Recent works have shown promising results of prompt tuning in stimulating pre-trained language models (PLMs) for natural language processing (NLP) tasks.
It is still unknown whether and how discriminative PLMs, e.g., ELECTRA, can be effectively prompt-tuned.
We present DPT, the first prompt tuning framework for discriminative PLMs, which reformulates NLP tasks into a discriminative language modeling problem.
arXiv Detail & Related papers (2022-05-23T10:11:50Z)
- Switch Point biased Self-Training: Re-purposing Pretrained Models for Code-Switching [44.034300203700234]
Code-switching is a ubiquitous phenomenon due to the ease of communication it offers in multilingual communities.
We propose a self-training method to repurpose existing pretrained models using a switch-point bias.
Our approach performs well on both tasks by reducing the gap between the switch point performance.
arXiv Detail & Related papers (2021-11-01T19:42:08Z)
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
- Language Models are Few-Shot Learners [61.36677350504291]
We show that scaling up language models greatly improves task-agnostic, few-shot performance.
We train GPT-3, an autoregressive language model with 175 billion parameters, and test its performance in the few-shot setting.
GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks.
arXiv Detail & Related papers (2020-05-28T17:29:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.