Code Synonyms Do Matter: Multiple Synonyms Matching Network for
Automatic ICD Coding
- URL: http://arxiv.org/abs/2203.01515v1
- Date: Thu, 3 Mar 2022 04:57:08 GMT
- Title: Code Synonyms Do Matter: Multiple Synonyms Matching Network for
Automatic ICD Coding
- Authors: Zheng Yuan, Chuanqi Tan, Songfang Huang
- Abstract summary: We argue that code synonyms can provide more comprehensive knowledge, based on the observation that code expressions in EMRs differ from their descriptions in ICD.
We propose a multiple synonyms matching network that leverages synonyms for better code representation learning, which in turn helps code classification.
- Score: 26.718721379738813
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Automatic ICD coding is defined as assigning disease codes to electronic
medical records (EMRs). Existing methods usually apply label attention with
code representations to match related text snippets. Unlike these works that
model the label with the code hierarchy or description, we argue that code
synonyms can provide more comprehensive knowledge, based on the observation that
the code expressions in EMRs differ from their descriptions in ICD. By aligning
codes to concepts in UMLS, we collect synonyms of every code. We then propose
a multiple synonyms matching network that leverages synonyms for better code
representation learning, which in turn helps code classification. Experiments
on the MIMIC-III dataset show that our proposed method outperforms previous
state-of-the-art methods.
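The label attention with synonym matching described above can be illustrated with a minimal sketch. This is not the authors' exact MSMN architecture: the dimensions, the dot-product attention, and the max-over-synonyms aggregation are assumptions made purely for illustration.

```python
# A minimal sketch of synonym-aware label attention, assuming dot-product
# attention and max-pooling over synonyms (not the paper's exact design).
import torch

def synonym_attention_scores(tokens, synonyms):
    """
    tokens:   (seq_len, dim)         encoded EMR note tokens
    synonyms: (n_codes, n_syn, dim)  one embedding per code synonym
    returns:  (n_codes,)             one logit per ICD code
    """
    # each synonym attends over the note tokens
    attn = torch.softmax(synonyms @ tokens.T, dim=-1)    # (n_codes, n_syn, seq_len)
    context = attn @ tokens                              # (n_codes, n_syn, dim)
    # score each synonym against its attended context, then keep the best synonym
    syn_scores = (context * synonyms).sum(-1)            # (n_codes, n_syn)
    return syn_scores.max(dim=-1).values                 # (n_codes,)

tokens = torch.randn(128, 64)       # toy encoded clinical note
synonyms = torch.randn(50, 4, 64)   # 50 codes x 4 UMLS-derived synonyms each
print(synonym_attention_scores(tokens, synonyms).shape)  # torch.Size([50])
```

The intuition is that a code fires if any of its synonyms finds strongly matching text in the note, which is why the aggregation here takes a maximum rather than an average.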
Related papers
- A Novel ICD Coding Method Based on Associated and Hierarchical Code Description Distillation [6.524062529847299]
ICD coding is a challenging multilabel text classification problem due to noisy medical document inputs.
Recent advancements in automated ICD coding have enhanced performance by integrating additional data and knowledge bases with the encoding of medical notes and codes.
We propose a novel framework based on associated and hierarchical code description distillation (AHDD) for better code representation learning and avoidance of improper code assignment.
arXiv Detail & Related papers (2024-04-17T07:26:23Z)
- A Two-Stage Decoder for Efficient ICD Coding [10.634394331433322]
We propose a two-stage decoding mechanism to predict ICD codes: first predict the parent code, then predict the child code based on that prediction (a toy sketch follows this entry).
Experiments on the public MIMIC-III dataset show that our model performs well in single-model settings.
arXiv Detail & Related papers (2023-05-27T17:25:13Z)
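The parent-then-child decoding just described can be sketched as follows. The toy ICD-9 hierarchy, the sigmoid gate, and the threshold are illustrative assumptions, not the paper's actual model.

```python
# A toy two-stage (parent-then-child) ICD decoder over a hand-made hierarchy.
import numpy as np

CHILDREN = {"250": ["250.0", "250.1"], "401": ["401.1", "401.9"]}  # toy ICD-9 tree

def two_stage_decode(parents, parent_logits, child_logits, threshold=0.5):
    """Stage 1: keep parent codes whose sigmoid score passes the threshold.
       Stage 2: score only the children of the surviving parents."""
    predicted = []
    for parent, logit in zip(parents, parent_logits):
        if 1.0 / (1.0 + np.exp(-logit)) < threshold:   # sigmoid gate on the parent
            continue
        best_child = max(CHILDREN[parent], key=lambda c: child_logits[c])
        predicted.append(best_child)
    return predicted

parent_logits = np.array([2.0, -1.0])                  # only "250" passes the gate
child_logits = {"250.0": 0.3, "250.1": 1.2, "401.1": 0.0, "401.9": 0.5}
print(two_stage_decode(["250", "401"], parent_logits, child_logits))  # ['250.1']
```

Restricting stage 2 to the children of predicted parents is what makes the decoder efficient: the full code vocabulary is never scored at once.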
- Exploring Structured Semantic Prior for Multi Label Recognition with Incomplete Labels [60.675714333081466]
Multi-label recognition (MLR) with incomplete labels is very challenging.
Recent works strive to explore the image-to-label correspondence in the vision-language model, i.e., CLIP, to compensate for insufficient annotations.
We advocate remedying the deficiency of label supervision for the MLR with incomplete labels by deriving a structured semantic prior.
arXiv Detail & Related papers (2023-03-23T12:39:20Z)
- HieNet: Bidirectional Hierarchy Framework for Automated ICD Coding [2.9373912230684573]
International Classification of Diseases (ICD) is a set of classification codes for medical records.
In this work, we propose a novel Bidirectional Hierarchy Framework (HieNet) to address the challenges.
Specifically, a personalized PageRank routine captures the correlation of codes (sketched below), a bidirectional hierarchy passage encoder captures the codes' hierarchical representations, and a progressive predicting method narrows down the semantic search space of prediction.
arXiv Detail & Related papers (2022-12-09T14:51:12Z)
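Personalized PageRank itself is a standard algorithm; below is a generic implementation over a toy code co-occurrence graph. The graph, damping factor, and iteration count are assumptions, and HieNet's actual routine may differ in how the graph and restart distribution are built.

```python
# Generic personalized PageRank over a toy code co-occurrence graph.
import numpy as np

def personalized_pagerank(adj, seed, damping=0.85, iters=100):
    """adj: (n, n) symmetric co-occurrence counts; seed: (n,) restart distribution."""
    col_sums = adj.sum(axis=0, keepdims=True)
    transition = adj / np.where(col_sums == 0, 1, col_sums)  # column-stochastic
    rank = seed.copy()
    for _ in range(iters):
        rank = damping * (transition @ rank) + (1 - damping) * seed
    return rank

adj = np.array([[0, 3, 1], [3, 0, 0], [1, 0, 0]], dtype=float)  # 3 toy codes
seed = np.array([1.0, 0.0, 0.0])   # personalize on code 0
print(personalized_pagerank(adj, seed).round(3))  # codes co-occurring with code 0 rank higher
```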
- Soft-Labeled Contrastive Pre-training for Function-level Code Representation [127.71430696347174]
We present SCodeR, a soft-labeled contrastive pre-training framework with two positive sample construction methods.
Considering the relevance between codes in a large-scale code corpus, soft-labeled contrastive pre-training can obtain fine-grained soft labels (a loss sketch follows this entry).
SCodeR achieves new state-of-the-art performance on four code-related tasks over seven datasets.
arXiv Detail & Related papers (2022-10-18T05:17:37Z)
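A minimal version of a soft-labeled contrastive objective is cross-entropy against soft targets over in-batch similarities. How SCodeR actually derives its soft labels from the corpus is not shown here; the targets below are toy values.

```python
# Soft-labeled contrastive loss: InfoNCE generalized to soft (graded) targets.
import torch
import torch.nn.functional as F

def soft_contrastive_loss(z1, z2, soft_targets, temperature=0.07):
    """z1, z2: (batch, dim) paired embeddings;
       soft_targets: (batch, batch) rows summing to 1 (graded relevance)."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = (z1 @ z2.T) / temperature               # in-batch similarity matrix
    return -(soft_targets * F.log_softmax(logits, dim=-1)).sum(-1).mean()

z1, z2 = torch.randn(8, 32), torch.randn(8, 32)
hard = torch.eye(8)                    # standard InfoNCE would use these targets
soft = 0.9 * hard + 0.1 / 8            # toy soft labels: partial credit off-diagonal
print(soft_contrastive_loss(z1, z2, soft).item())
```

With `hard` targets this reduces to the usual InfoNCE loss; the soft targets let related (not just identical) code pairs contribute to the training signal.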
- Label Semantics for Few Shot Named Entity Recognition [68.01364012546402]
We study the problem of few-shot learning for named entity recognition.
We leverage the semantic information in the names of the labels as a way of giving the model additional signal and enriched priors.
Our model learns to match the representations of named entities computed by one encoder with label representations computed by a second encoder (a toy matcher is sketched below).
arXiv Detail & Related papers (2022-03-16T23:21:05Z)
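The two-encoder matching described above can be sketched with a toy bi-encoder. The random projections stand in for real pretrained encoders, so the outputs are meaningless beyond illustrating the matching step itself.

```python
# Toy bi-encoder label matcher: token vectors scored against encoded label names.
import torch

dim = 32
token_reprs = torch.randn(10, dim)                 # 10 tokens from a document encoder
label_names = ["other", "person", "location"]      # label semantics as natural language
label_reprs = torch.randn(len(label_names), dim)   # from a separate label encoder

scores = token_reprs @ label_reprs.T   # (10, 3) similarity of each token to each label
tags = scores.argmax(dim=-1)           # tag each token with its closest label
print([label_names[i] for i in tags])
```

Because labels are represented by encoding their names, a new label can be added at inference time simply by encoding its name, which is what makes the approach attractive in few-shot settings.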
- CodeRetriever: Unimodal and Bimodal Contrastive Learning [128.06072658302165]
We propose the CodeRetriever model, which combines unimodal and bimodal contrastive learning to train function-level code semantic representations.
For unimodal contrastive learning, we design a semantic-guided method to build positive code pairs based on the documentation and function name.
For bimodal contrastive learning, we leverage the documentation and in-line comments of code to build text-code pairs (both pair types are sketched below).
arXiv Detail & Related papers (2022-01-26T10:54:30Z)
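The two pair types described above can be sketched as follows. The pairing heuristics (shared function-name tokens for unimodal pairs, docstring-to-code pairing for bimodal ones) are simplified assumptions, not CodeRetriever's actual semantic-guided method.

```python
# Toy construction of unimodal (code-code) and bimodal (text-code) positive pairs.
def name_tokens(name):
    return set(name.lower().split("_"))

corpus = [
    {"name": "read_json_file", "doc": "Load a JSON file from disk.", "code": "..."},
    {"name": "read_yaml_file", "doc": "Load a YAML file from disk.", "code": "..."},
]

# unimodal positives: code pairs whose function names share enough tokens
unimodal_pairs = [
    (a["code"], b["code"])
    for i, a in enumerate(corpus) for b in corpus[i + 1:]
    if len(name_tokens(a["name"]) & name_tokens(b["name"])) >= 2
]

# bimodal positives: each function's documentation paired with its own code
bimodal_pairs = [(f["doc"], f["code"]) for f in corpus]
print(len(unimodal_pairs), len(bimodal_pairs))  # 1 2
```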
- A Meta-embedding-based Ensemble Approach for ICD Coding Prediction [64.42386426730695]
International Classification of Diseases (ICD) codes are the de facto standard used globally for clinical coding.
These codes enable healthcare providers to claim reimbursement and facilitate efficient storage and retrieval of diagnostic information.
Our proposed approach enhances the performance of neural models by effectively training word vectors using routine medical data as well as external knowledge from scientific articles.
arXiv Detail & Related papers (2021-02-26T17:49:58Z)
- COSEA: Convolutional Code Search with Layer-wise Attention [90.35777733464354]
We propose a new deep learning architecture, COSEA, which leverages convolutional neural networks with layer-wise attention to capture the code's intrinsic structural logic (a toy module follows this entry).
COSEA can achieve significant improvements over state-of-the-art methods on code search tasks.
arXiv Detail & Related papers (2020-10-19T13:53:38Z)
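A rough sketch of convolution with layer-wise attention: stacked 1-D convolutions over code token embeddings, with learned weights mixing the per-layer feature maps. All sizes are arbitrary toy choices, and COSEA's actual architecture is more involved.

```python
# Stacked 1-D convolutions with a learned attention mix over layer outputs.
import torch
import torch.nn as nn

class ConvLayerwiseAttention(nn.Module):
    def __init__(self, dim=64, n_layers=3):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv1d(dim, dim, kernel_size=3, padding=1) for _ in range(n_layers)
        )
        self.layer_logits = nn.Parameter(torch.zeros(n_layers))  # one weight per layer

    def forward(self, x):                    # x: (batch, seq_len, dim)
        h, pooled = x.transpose(1, 2), []
        for conv in self.convs:
            h = torch.relu(conv(h))
            pooled.append(h.mean(dim=-1))              # pool each layer: (batch, dim)
        stacked = torch.stack(pooled, dim=1)           # (batch, n_layers, dim)
        weights = torch.softmax(self.layer_logits, 0)  # attention over layers
        return (weights[None, :, None] * stacked).sum(dim=1)  # (batch, dim)

model = ConvLayerwiseAttention()
print(model(torch.randn(2, 50, 64)).shape)  # torch.Size([2, 64])
```

Attending over layers lets shallow layers contribute local token patterns while deeper layers contribute longer-range structure.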
- Self-Supervised Contrastive Learning for Code Retrieval and Summarization via Semantic-Preserving Transformations [28.61567319928316]
Corder is a self-supervised contrastive learning framework for source code models.
The key innovation is that we train the source code model by asking it to recognize similar and dissimilar code snippets (one such transformation is sketched below).
We have shown that code models pretrained by Corder substantially outperform the other baselines for code-to-code retrieval, text-to-code retrieval, and code-to-text summarization tasks.
arXiv Detail & Related papers (2020-09-06T13:31:16Z)
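One example of a semantic-preserving transformation of the kind such a framework might use to generate "similar" snippets is consistent variable renaming. The real paper uses richer program transformations; this regex-based version is only a toy.

```python
# Toy semantic-preserving transformation: consistent identifier renaming.
import re

def rename_variables(snippet, mapping):
    """Rename identifiers without changing what the code computes."""
    for old, new in mapping.items():
        snippet = re.sub(rf"\b{old}\b", new, snippet)
    return snippet

original = "total = 0\nfor item in items:\n    total += item"
positive = rename_variables(original, {"total": "acc", "item": "x", "items": "xs"})
print(positive)  # same semantics, different surface form: a positive training pair
```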
- OCoR: An Overlapping-Aware Code Retriever [15.531119719750807]
Given a natural language description, code retrieval aims to search for the most relevant code among a set of code snippets.
Existing state-of-the-art approaches apply neural networks to code retrieval.
We propose a novel neural architecture named OCoR, where we introduce two specifically-designed components to capture overlaps (an overlap feature is sketched below).
arXiv Detail & Related papers (2020-08-12T09:43:35Z)
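One simple way to make overlap explicit, loosely inspired by the description above, is a binary word-overlap matrix between the query and the code tokens, which a network could consume alongside learned embeddings. This is an assumption for illustration, not OCoR's actual components.

```python
# Binary word-overlap matrix between a query and tokenized code.
import numpy as np

def overlap_matrix(query_tokens, code_tokens):
    """M[i, j] = 1.0 if query token i equals code token j (case-insensitive)."""
    return np.array(
        [[float(q.lower() == c.lower()) for c in code_tokens] for q in query_tokens]
    )

query = "sorted list reverse".split()
code = "return sorted ( lst , reverse = True )".split()
print(overlap_matrix(query, code))  # 1s mark the 'sorted' and 'reverse' matches
```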
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.