Hierarchical Deep Learning Classification of Unstructured Pathology
Reports to Automate ICD-O Morphology Grading
- URL: http://arxiv.org/abs/2009.00542v1
- Date: Fri, 28 Aug 2020 12:36:58 GMT
- Title: Hierarchical Deep Learning Classification of Unstructured Pathology
Reports to Automate ICD-O Morphology Grading
- Authors: Waheeda Saib, Tapiwa Chiwewe, Elvira Singh
- Abstract summary: We present a hierarchical deep learning classification method that employs convolutional neural network models to automate the classification of 1813 breast cancer pathology reports.
We demonstrate that the hierarchical deep learning classification method outperforms a flat multiclass CNN model for ICD-O morphology classification of the same reports.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Timely cancer reporting data are required in order to understand the impact
of cancer, inform public health resource planning and implement cancer policy,
especially in Sub-Saharan Africa, where reporting lags behind world
averages. Unstructured pathology reports, which contain tumor specific data,
are the main source of information collected by cancer registries. The
manual processing and labelling of pathology reports with International
Classification of Diseases for Oncology (ICD-O) codes, by human coders employed
by cancer registries, has led to a considerable lag in cancer reporting. We
present a hierarchical deep learning classification method that employs
convolutional neural network models to automate the classification of 1813
anonymized breast cancer pathology reports with applicable ICD-O morphology
codes across 9 classes. We demonstrate that the hierarchical deep learning
classification method outperforms a flat
multiclass CNN model for ICD-O morphology classification of the same reports.
Related papers
- Medical-GAT: Cancer Document Classification Leveraging Graph-Based Residual Network for Scenarios with Limited Data [2.913761513290171]
We present a curated dataset of 1,874 biomedical abstracts, categorized into thyroid cancer, colon cancer, lung cancer, and generic topics.
Our research focuses on leveraging this dataset to improve classification performance, particularly in data-scarce scenarios.
We introduce a Residual Graph Attention Network (R-GAT) with multiple graph attention layers that capture the semantic information and structural relationships within cancer-related documents.
arXiv Detail & Related papers (2024-10-19T20:07:40Z)
- Hierarchical Classification System for Breast Cancer Specimen Report (HCSBC) -- an end-to-end model for characterizing severity and diagnosis [3.4454444815042735]
We develop a hierarchical hybrid transformer-based pipeline (59 labels), the Hierarchical Classification System for Breast Cancer Specimen Report (HCSBC).
We trained the model on the EUH data and evaluated our model's performance on two external datasets - MGH and Mayo Clinic.
arXiv Detail & Related papers (2023-11-02T18:37:45Z)
- PathLDM: Text conditioned Latent Diffusion Model for Histopathology [62.970593674481414]
We introduce PathLDM, the first text-conditioned Latent Diffusion Model tailored for generating high-quality histopathology images.
Our approach fuses image and textual data to enhance the generation process.
We achieved a SoTA FID score of 7.64 for text-to-image generation on the TCGA-BRCA dataset, significantly outperforming the closest text-conditioned competitor with FID 30.1.
arXiv Detail & Related papers (2023-09-01T22:08:32Z)
- Automated Knowledge Modeling for Cancer Clinical Practice Guidelines [1.1083289076967895]
Clinical Practice Guidelines (CPGs) for cancer diseases evolve rapidly due to new evidence generated by active research.
Currently, CPGs are primarily published in a document format that is ill-suited for managing this developing knowledge.
This work proposes an automated method for extraction of knowledge from NCCN CPGs in Oncology.
arXiv Detail & Related papers (2023-07-15T18:07:08Z)
- Data and Knowledge Co-driving for Cancer Subtype Classification on Multi-Scale Histopathological Slides [4.22412600279685]
We propose a Data and Knowledge Co-driving (D&K) model to replicate the process of cancer subtype classification on a histological slide like a pathologist.
Specifically, in the data-driven module, the bagging mechanism in ensemble learning is leveraged to integrate the histological features from various bags extracted by the embedding representation unit.
arXiv Detail & Related papers (2023-04-18T21:57:37Z)
- Lung Cancer Lesion Detection in Histopathology Images Using Graph-Based Sparse PCA Network [93.22587316229954]
We propose a graph-based sparse principal component analysis (GS-PCA) network for automated detection of cancerous lesions on histological lung slides stained by hematoxylin and eosin (H&E).
We evaluate the performance of the proposed algorithm on H&E slides obtained from an SVM K-rasG12D lung cancer mouse model using precision/recall rates, F-score, Tanimoto coefficient, and area under the curve (AUC) of the receiver operator characteristic (ROC).
arXiv Detail & Related papers (2021-10-27T19:28:36Z)
- Triplet Contrastive Learning for Brain Tumor Classification [99.07846518148494]
We present a novel approach of directly learning deep embeddings for brain tumor types, which can be used for downstream tasks such as classification.
We evaluate our method on an extensive brain tumor dataset which consists of 27 different tumor classes, out of which 13 are defined as rare.
arXiv Detail & Related papers (2021-08-08T11:26:34Z)
- G-MIND: An End-to-End Multimodal Imaging-Genetics Framework for Biomarker Identification and Disease Classification [49.53651166356737]
We propose a novel deep neural network architecture to integrate imaging and genetics data, as guided by diagnosis, that provides interpretable biomarkers.
We have evaluated our model on a population study of schizophrenia that includes two functional MRI (fMRI) paradigms and Single Nucleotide Polymorphism (SNP) data.
arXiv Detail & Related papers (2021-01-27T19:28:04Z)
- Topological Data Analysis of copy number alterations in cancer [70.85487611525896]
We explore the potential to capture information contained in cancer genomic information using a novel topology-based approach.
We find that this technique has the potential to extract meaningful low-dimensional representations in cancer somatic genetic data.
arXiv Detail & Related papers (2020-11-22T17:31:23Z)
- Hierarchical Deep Learning Ensemble to Automate the Classification of Breast Cancer Pathology Reports by ICD-O Topography [0.0]
We present a hierarchical deep learning ensemble method incorporating state of the art convolutional neural network models for the automatic labelling of 2201 pathology reports.
Our results show an improvement in primary site classification over the state of the art CNN model by greater than 14% for F1 micro and 55% for F1 macro scores.
arXiv Detail & Related papers (2020-08-28T10:29:56Z)
- A Systematic Approach to Featurization for Cancer Drug Sensitivity Predictions with Deep Learning [49.86828302591469]
We train >35,000 neural network models, sweeping over common featurization techniques.
We found the RNA-seq to be highly redundant and informative even with subsets larger than 128 features.
arXiv Detail & Related papers (2020-04-30T20:42:17Z)