Text2Tree: Aligning Text Representation to the Label Tree Hierarchy for
Imbalanced Medical Classification
- URL: http://arxiv.org/abs/2311.16650v1
- Date: Tue, 28 Nov 2023 10:02:08 GMT
- Title: Text2Tree: Aligning Text Representation to the Label Tree Hierarchy for
Imbalanced Medical Classification
- Authors: Jiahuan Yan, Haojun Gao, Zhang Kai, Weize Liu, Danny Chen, Jian Wu,
Jintai Chen
- Abstract summary: This paper aims to rethink the data challenges in medical texts and present a novel framework-agnostic algorithm called Text2Tree.
We embed the ICD code tree structure of labels into cascade attention modules for learning hierarchy-aware label representations.
Two new learning schemes, Similarity Surrogate Learning (SSL) and Dissimilarity Mixup Learning (DML), are devised to boost text classification by reusing and distinguishing samples of other labels.
- Score: 9.391704905671476
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning approaches exhibit promising performances on various text
tasks. However, they are still struggling on medical text classification since
samples are often extremely imbalanced and scarce. Different from existing
mainstream approaches that focus on supplementary semantics with external
medical information, this paper aims to rethink the data challenges in medical
texts and present a novel framework-agnostic algorithm called Text2Tree that
only utilizes internal label hierarchy in training deep learning models. We
embed the ICD code tree structure of labels into cascade attention modules for
learning hierarchy-aware label representations. Two new learning schemes,
Similarity Surrogate Learning (SSL) and Dissimilarity Mixup Learning (DML), are
devised to boost text classification by reusing and distinguishing samples of
other labels following the label representation hierarchy, respectively.
Experiments on authoritative public datasets and real-world medical records
show that our approach stably achieves superior performances over classical and
advanced imbalanced classification methods.
Related papers
- Leveraging Label Semantics and Meta-Label Refinement for Multi-Label Question Classification [11.19022605804112]
This paper introduces RR2QC, a novel Retrieval Reranking method To multi-label Question Classification.
It uses label semantics and meta-label refinement to enhance personalized learning and resource recommendation.
Experimental results demonstrate that RR2QC outperforms existing classification methods in Precision@k and F1 scores.
arXiv Detail & Related papers (2024-11-04T06:27:14Z) - Description-Enhanced Label Embedding Contrastive Learning for Text
Classification [65.01077813330559]
Self-Supervised Learning (SSL) in model learning process and design a novel self-supervised Relation of Relation (R2) classification task.
Relation of Relation Learning Network (R2-Net) for text classification, in which text classification and R2 classification are treated as optimization targets.
external knowledge from WordNet to obtain multi-aspect descriptions for label semantic learning.
arXiv Detail & Related papers (2023-06-15T02:19:34Z) - Exploring Structured Semantic Prior for Multi Label Recognition with
Incomplete Labels [60.675714333081466]
Multi-label recognition (MLR) with incomplete labels is very challenging.
Recent works strive to explore the image-to-label correspondence in the vision-language model, ie, CLIP, to compensate for insufficient annotations.
We advocate remedying the deficiency of label supervision for the MLR with incomplete labels by deriving a structured semantic prior.
arXiv Detail & Related papers (2023-03-23T12:39:20Z) - Label Semantic Aware Pre-training for Few-shot Text Classification [53.80908620663974]
We propose Label Semantic Aware Pre-training (LSAP) to improve the generalization and data efficiency of text classification systems.
LSAP incorporates label semantics into pre-trained generative models (T5 in our case) by performing secondary pre-training on labeled sentences from a variety of domains.
arXiv Detail & Related papers (2022-04-14T17:33:34Z) - Academic Resource Text Level Multi-label Classification based on
Attention [16.71166207897885]
Hierarchical multi-label academic text classification (HMTC) is to assign academic texts into a hierarchically structured labeling system.
We propose an attention-based hierarchical multi-label classification algorithm of academic texts (AHMCA) by integrating features such as text, keywords, and hierarchical structure.
arXiv Detail & Related papers (2022-03-21T05:32:35Z) - MATCH: Metadata-Aware Text Classification in A Large Hierarchy [60.59183151617578]
MATCH is an end-to-end framework that leverages both metadata and hierarchy information.
We propose different ways to regularize the parameters and output probability of each child label by its parents.
Experiments on two massive text datasets with large-scale label hierarchies demonstrate the effectiveness of MATCH.
arXiv Detail & Related papers (2021-02-15T05:23:08Z) - Joint Learning of Hyperbolic Label Embeddings for Hierarchical
Multi-label Classification [9.996804039553858]
We consider the problem of multi-label classification where the labels lie in a hierarchy.
We propose a novel formulation for the joint learning and empirically evaluate its efficacy.
arXiv Detail & Related papers (2021-01-13T10:58:54Z) - A Teacher-Student Framework for Semi-supervised Medical Image
Segmentation From Mixed Supervision [62.4773770041279]
We develop a semi-supervised learning framework based on a teacher-student fashion for organ and lesion segmentation.
We show our model is robust to the quality of bounding box and achieves comparable performance compared with full-supervised learning methods.
arXiv Detail & Related papers (2020-10-23T07:58:20Z) - Joint Embedding of Words and Category Labels for Hierarchical
Multi-label Text Classification [4.2750700546937335]
hierarchical text classification (HTC) has received extensive attention and has broad application prospects.
We propose a joint embedding of text and parent category based on hierarchical fine-tuning ordered neurons LSTM (HFT-ONLSTM) for HTC.
arXiv Detail & Related papers (2020-04-06T11:06:08Z) - Hierarchical Image Classification using Entailment Cone Embeddings [68.82490011036263]
We first inject label-hierarchy knowledge into an arbitrary CNN-based classifier.
We empirically show that availability of such external semantic information in conjunction with the visual semantics from images boosts overall performance.
arXiv Detail & Related papers (2020-04-02T10:22:02Z) - An Ontology-Aware Framework for Audio Event Classification [19.11706899266862]
Recent advancements in audio event classification often ignore the structure and relation between the label classes available as prior information.
We propose an ontology-aware neural network containing two components: feed-forward ontology layers and graph convolutional networks (GCN)
The framework is evaluated on two benchmark datasets for single-label and multi-label audio event classification tasks.
arXiv Detail & Related papers (2020-01-27T20:07:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.