Few-Shot Electronic Health Record Coding through Graph Contrastive
Learning
- URL: http://arxiv.org/abs/2106.15467v1
- Date: Tue, 29 Jun 2021 14:53:17 GMT
- Title: Few-Shot Electronic Health Record Coding through Graph Contrastive
Learning
- Authors: Shanshan Wang, Pengjie Ren, Zhumin Chen, Zhaochun Ren, Huasheng Liang,
Qiang Yan, Evangelos Kanoulas, Maarten de Rijke
- Abstract summary: We seek to improve the performance for both frequent and rare ICD codes by using a contrastive graph-based EHR coding framework, CoGraph.
CoGraph learns similarities and dissimilarities between HEWE graphs from different ICD codes so that information can be transferred among them.
Two graph contrastive learning schemes, GSCL and GECL, exploit the HEWE graph structures so as to encode transferable features.
- Score: 64.8138823920883
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Electronic health record (EHR) coding is the task of assigning ICD codes to
each EHR. Most previous studies either only focus on the frequent ICD codes or
treat rare and frequent ICD codes in the same way. These methods perform well
on frequent ICD codes but due to the extremely unbalanced distribution of ICD
codes, the performance on rare ones is far from satisfactory. We seek to
improve the performance for both frequent and rare ICD codes by using a
contrastive graph-based EHR coding framework, CoGraph, which re-casts EHR
coding as a few-shot learning task. First, we construct a heterogeneous EHR
word-entity (HEWE) graph for each EHR, where the words and entities extracted
from an EHR serve as nodes and the relations between them serve as edges. Then,
CoGraph learns similarities and dissimilarities between HEWE graphs from
different ICD codes so that information can be transferred among them. In a
few-shot learning scenario, the model only has access to frequent ICD codes
during training, which might force it to encode features that are useful for
frequent ICD codes only. To mitigate this risk, CoGraph devises two graph
contrastive learning schemes, GSCL and GECL, that exploit the HEWE graph
structures so as to encode transferable features. GSCL utilizes the
intra-correlation of different sub-graphs sampled from HEWE graphs while GECL
exploits the inter-correlation among HEWE graphs at different clinical stages.
Experiments on the MIMIC-III benchmark dataset show that CoGraph significantly
outperforms state-of-the-art methods on EHR coding, not only on frequent ICD
codes, but also on rare codes, in terms of several evaluation indicators. On
frequent ICD codes, GSCL and GECL improve classification accuracy and F1 by
1.31% and 0.61%, respectively; on rare ICD codes, CoGraph shows larger
improvements of 2.12% and 2.95%.
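The two ideas in the abstract, building a word-entity graph per EHR and contrasting graph representations across ICD codes, can be illustrated with a minimal sketch. The abstract does not specify CoGraph's actual objective or graph encoder, so everything below is an assumption for illustration: `build_hewe_graph`, `info_nce`, the toy vectors, and the choice of an InfoNCE-style loss (a common contrastive objective, swapped in here because the paper's loss is not given).

```python
import math
from collections import defaultdict


def build_hewe_graph(words, entities, relations):
    """Build a toy heterogeneous word-entity graph (hypothetical helper).

    Nodes are (type, name) pairs; `relations` lists undirected edges
    between nodes, mirroring "words and entities as nodes, relations
    as edges" from the abstract.
    """
    nodes = {("word", w) for w in words} | {("entity", e) for e in entities}
    adjacency = defaultdict(set)
    for source, target in relations:
        if source not in nodes or target not in nodes:
            raise ValueError(f"edge endpoint not in node set: {source} - {target}")
        adjacency[source].add(target)
        adjacency[target].add(source)
    return nodes, dict(adjacency)


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def info_nce(anchor, positive, negatives, temperature=0.5):
    """InfoNCE-style loss (an assumption, not the paper's stated loss).

    The loss is small when the anchor is more similar to the positive
    than to any of the negatives.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum = max_logit + math.log(sum(math.exp(x - max_logit) for x in logits))
    return log_sum - logits[0]


# Toy EHR graph: two words linked to one extracted entity.
nodes, adjacency = build_hewe_graph(
    words=["chest", "pain"],
    entities=["angina"],
    relations=[
        (("word", "chest"), ("entity", "angina")),
        (("word", "pain"), ("entity", "angina")),
    ],
)

# Made-up graph embeddings: two views that share an ICD code, one that does not.
anchor = [1.0, 0.0]
same_code = [0.9, 0.1]
other_code = [0.0, 1.0]

loss_aligned = info_nce(anchor, same_code, [other_code])
loss_misaligned = info_nce(anchor, other_code, [same_code])
```

In this sketch the loss is lower when the positive pair comes from the same code than when the roles are swapped, which is the behavior a contrastive scheme relies on. In a real system the embeddings would come from a graph encoder applied to sampled sub-graphs or to graphs from different clinical stages, not from hand-written vectors.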
Related papers
- A Two-Stage Decoder for Efficient ICD Coding [10.634394331433322]
We propose a two-stage decoding mechanism to predict ICD codes.
At first, we predict the parent code and then predict the child code based on the previous prediction.
Experiments on the public MIMIC-III data set show that our model performs well in single-model settings.
(2023-05-27T17:25:13Z)
- HieNet: Bidirectional Hierarchy Framework for Automated ICD Coding [2.9373912230684573]
International Classification of Diseases (ICD) is a set of classification codes for medical records.
In this work, we propose a novel Bidirectional Hierarchy Framework (HieNet) to address the challenges of automated ICD coding.
Specifically, a personalized PageRank routine captures the co-relation of codes, a bidirectional hierarchy passage encoder captures the codes' hierarchical representations, and a progressive predicting method narrows down the semantic search space of prediction.
(2022-12-09T14:51:12Z)
- New Frontiers in Graph Autoencoders: Joint Community Detection and Link Prediction [27.570978996576503]
Graph autoencoders (GAE) and variational graph autoencoders (VGAE) emerged as powerful methods for link prediction (LP).
It is still unclear to what extent one can improve community detection (CD) with GAE and VGAE, especially in the absence of node features.
We show that jointly addressing these two tasks with high accuracy is possible.
(2022-11-16T15:26:56Z)
- Self-supervised Representation Learning on Electronic Health Records with Graph Kernel Infomax [4.133378723518227]
We propose Graph Kernel Infomax, a self-supervised graph kernel learning approach on the graphical representation of EHR.
Unlike the state-of-the-art, we do not change the graph structure to construct augmented views.
Our approach yields performance on clinical downstream tasks that exceeds the state-of-the-art.
(2022-09-01T16:15:08Z)
- GraphCoCo: Graph Complementary Contrastive Learning [65.89743197355722]
Graph Contrastive Learning (GCL) has shown promising performance in graph representation learning (GRL) without the supervision of manual annotations.
This paper proposes an effective graph complementary contrastive learning approach named GraphCoCo to tackle the above issue.
(2022-03-24T02:58:36Z)
- Diversified Multiscale Graph Learning with Graph Self-Correction [55.43696999424127]
We propose a diversified multiscale graph learning model equipped with two core ingredients: a graph self-correction (GSC) mechanism to generate informative embedded graphs, and a diversity boosting regularizer (DBR) to achieve a comprehensive characterization of the input graph.
Experiments on popular graph classification benchmarks show that the proposed GSC mechanism leads to significant improvements over state-of-the-art graph pooling methods.
(2021-03-17T16:22:24Z)
- Heterogeneous Similarity Graph Neural Network on Electronic Health Records [74.66674469510251]
We propose Heterogeneous Similarity Graph Neural Network (HSGNN) to analyze EHRs with a novel heterogeneous GNN.
Our framework consists of two parts: one is a preprocessing method and the other is an end-to-end GNN.
The GNN takes all homogeneous graphs as input and fuses all of them into one graph to make a prediction.
(2021-01-17T23:14:29Z)
- SE-ECGNet: A Multi-scale Deep Residual Network with Squeeze-and-Excitation Module for ECG Signal Classification [6.124438924401066]
We develop a multi-scale deep residual network for the ECG signal classification task.
We are the first to propose to treat the multi-lead signal as a 2-dimensional matrix.
Our proposed model achieves a 99.2% F1-score on the MIT-BIH dataset and an 89.4% F1-score on the Alibaba dataset.
(2020-12-10T08:37:44Z)
- Attention-Driven Dynamic Graph Convolutional Network for Multi-Label Image Recognition [53.17837649440601]
We propose an Attention-Driven Dynamic Graph Convolutional Network (ADD-GCN) to dynamically generate a specific graph for each image.
Experiments on public multi-label benchmarks demonstrate the effectiveness of our method.
(2020-12-05T10:10:12Z)
- Inverse Graph Identification: Can We Identify Node Labels Given Graph Labels? [89.13567439679709]
Graph Identification (GI) has long been researched in graph learning and is essential in certain applications.
This paper defines a novel problem dubbed Inverse Graph Identification (IGI).
We propose a simple yet effective method that performs node-level message passing using a Graph Attention Network (GAT) under the protocol of GI.
(2020-07-12T12:06:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.