Hierarchical Deep Learning Ensemble to Automate the Classification of
Breast Cancer Pathology Reports by ICD-O Topography
- URL: http://arxiv.org/abs/2008.12571v1
- Date: Fri, 28 Aug 2020 10:29:56 GMT
- Title: Hierarchical Deep Learning Ensemble to Automate the Classification of
Breast Cancer Pathology Reports by ICD-O Topography
- Authors: Waheeda Saib, David Sengeh, Gcininwe Dlamini, Elvira Singh
- Abstract summary: We present a hierarchical deep learning ensemble method incorporating state-of-the-art convolutional neural network models for the automatic labelling of 2201 pathology reports.
Our results show an improvement in primary site classification over the state-of-the-art CNN model of more than 14% for F1 micro and 55% for F1 macro scores.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Like most global cancer registries, the National Cancer Registry in South
Africa employs expert human coders to label pathology reports using appropriate
International Classification of Diseases for Oncology (ICD-O) codes spanning 42
different cancer types. The annotation is extensive for the large volume of
cancer pathology reports the registry receives annually from public and private
sector institutions. This manual process, coupled with other challenges, results
in a significant 4-year lag in reporting of annual cancer statistics in South
Africa. We present a hierarchical deep learning ensemble method incorporating
state-of-the-art convolutional neural network models for the automatic
labelling of 2201 de-identified, free-text pathology reports with appropriate
ICD-O breast cancer topography codes across 8 classes. Our results show an
improvement in primary site classification over the state-of-the-art CNN model
of more than 14% for F1 micro and 55% for F1 macro scores. We demonstrate
that the hierarchical deep learning ensemble improves on state-of-the-art
models for ICD-O topography classification in comparison to a flat multiclass
model for predicting ICD-O topography codes for pathology reports.
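The contrast between a flat multiclass model and a hierarchical one, and the difference between F1 micro and F1 macro, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the keyword rules stand in for the CNN models, the two-level grouping ("unspecified" versus "quadrant") is hypothetical and not the hierarchy used in the paper, and the ten toy reports are invented. The label strings C50.2, C50.4 and C50.9 are ICD-O breast topography codes (upper-inner quadrant, upper-outer quadrant, and breast not otherwise specified).

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (micro_f1, macro_f1) for single-label multiclass predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Micro: pool the counts over all classes, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per class, then take the unweighted mean.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return micro, macro


def predict_hierarchical(report, parent_clf, child_clfs):
    """Route a report through the parent model, then the matching child model."""
    group = parent_clf(report)
    return child_clfs[group](report)


# Toy, invented data: an imbalanced label distribution.
reports = (
    ["breast mass, site not stated"] * 6
    + ["upper outer quadrant lesion"] * 2
    + ["upper inner quadrant lesion"] * 2
)
y_true = ["C50.9"] * 6 + ["C50.4"] * 2 + ["C50.2"] * 2


# Hypothetical flat baseline: always predicts the majority class.
def flat_clf(report):
    return "C50.9"


# Hypothetical hierarchy: the parent picks a group, a child picks the code.
def parent_clf(report):
    return "quadrant" if "quadrant" in report else "unspecified"


child_clfs = {
    "unspecified": lambda report: "C50.9",
    "quadrant": lambda report: "C50.4" if "outer" in report else "C50.2",
}

flat_micro, flat_macro = f1_scores(y_true, [flat_clf(r) for r in reports])
hier_micro, hier_macro = f1_scores(
    y_true, [predict_hierarchical(r, parent_clf, child_clfs) for r in reports]
)
print(f"flat:         micro={flat_micro:.2f} macro={flat_macro:.2f}")
print(f"hierarchical: micro={hier_micro:.2f} macro={hier_macro:.2f}")
```

On this toy data the flat baseline scores much lower on F1 macro than on F1 micro, because macro averaging gives the two rare classes the same weight as the frequent one. That is one plausible reason why an improvement on minority classes would show up more strongly in F1 macro than in F1 micro. The numbers printed here say nothing about the paper's actual results.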
Related papers
- Medical-GAT: Cancer Document Classification Leveraging Graph-Based Residual Network for Scenarios with Limited Data [2.913761513290171]
We present a curated dataset of 1,874 biomedical abstracts, categorized into thyroid cancer, colon cancer, lung cancer, and generic topics.
Our research focuses on leveraging this dataset to improve classification performance, particularly in data-scarce scenarios.
We introduce a Residual Graph Attention Network (R-GAT) with multiple graph attention layers that capture the semantic information and structural relationships within cancer-related documents.
arXiv Detail & Related papers (2024-10-19T20:07:40Z)
- Hierarchical Classification System for Breast Cancer Specimen Report (HCSBC) -- an end-to-end model for characterizing severity and diagnosis [3.4454444815042735]
We develop a hierarchical hybrid transformer-based pipeline (59 labels): the Hierarchical Classification System for Breast Cancer Specimen Report (HCSBC).
We trained the model on the EUH data and evaluated our model's performance on two external datasets - MGH and Mayo Clinic.
arXiv Detail & Related papers (2023-11-02T18:37:45Z)
- A Multi-Institutional Open-Source Benchmark Dataset for Breast Cancer Clinical Decision Support using Synthetic Correlated Diffusion Imaging Data [82.74877848011798]
Cancer-Net BCa is a multi-institutional open-source benchmark dataset of volumetric CDI$^s$ imaging data of breast cancer patients.
Cancer-Net BCa is publicly available as a part of a global open-source initiative dedicated to accelerating advancement in machine learning to aid clinicians in the fight against cancer.
arXiv Detail & Related papers (2023-04-12T05:41:44Z)
- Significantly improving zero-shot X-ray pathology classification via fine-tuning pre-trained image-text encoders [50.689585476660554]
We propose a new fine-tuning strategy that includes positive-pair loss relaxation and random sentence sampling.
Our approach consistently improves overall zero-shot pathology classification across four chest X-ray datasets and three pre-trained models.
arXiv Detail & Related papers (2022-12-14T06:04:18Z)
- ICDBigBird: A Contextual Embedding Model for ICD Code Classification [71.58299917476195]
Contextual word embedding models have achieved state-of-the-art results in multiple NLP tasks.
ICDBigBird is a BigBird-based model which can integrate a Graph Convolutional Network (GCN)
Our experiments on a real-world clinical dataset demonstrate the effectiveness of our BigBird-based model on the ICD classification task.
arXiv Detail & Related papers (2022-04-21T20:59:56Z)
- WSSS4LUAD: Grand Challenge on Weakly-supervised Tissue Semantic Segmentation for Lung Adenocarcinoma [51.50991881342181]
This challenge includes 10,091 patch-level annotations and over 130 million labeled pixels.
The first-place team achieved an mIoU of 0.8413 (tumor: 0.8389, stroma: 0.7931, normal: 0.8919).
arXiv Detail & Related papers (2022-04-13T15:27:05Z)
- Automated risk classification of colon biopsies based on semantic segmentation of histopathology images [4.144141972397873]
We present an approach to address two major challenges in automated assessment of colorectal histopathology whole-slide images.
First, we present an AI-based method to segment multiple tissue compartments in the H&E-stained whole-slide image.
Second, we use the best performing AI model as the basis for a computer-aided diagnosis system.
arXiv Detail & Related papers (2021-09-16T11:50:10Z)
- A Novel Self-Learning Framework for Bladder Cancer Grading Using Histopathological Images [1.244681179922733]
We present a self-learning framework to grade bladder cancer from histological images stained via immunohistochemical techniques.
We propose a novel Deep Convolutional Embedded Attention Clustering (DCEAC) which allows classifying histological patches into different levels of the disease.
arXiv Detail & Related papers (2021-06-25T11:04:04Z)
- Wide & Deep neural network model for patch aggregation in CNN-based prostate cancer detection systems [51.19354417900591]
Prostate cancer (PCa) is one of the leading causes of death among men, with almost 1.41 million new cases and around 375,000 deaths in 2020.
To perform an automatic diagnosis, prostate tissue samples are first digitized into gigapixel-resolution whole-slide images.
Small subimages called patches are extracted and predicted, obtaining a patch-level classification.
arXiv Detail & Related papers (2021-05-20T18:13:58Z)
- G-MIND: An End-to-End Multimodal Imaging-Genetics Framework for Biomarker Identification and Disease Classification [49.53651166356737]
We propose a novel deep neural network architecture to integrate imaging and genetics data, as guided by diagnosis, that provides interpretable biomarkers.
We have evaluated our model on a population study of schizophrenia that includes two functional MRI (fMRI) paradigms and Single Nucleotide Polymorphism (SNP) data.
arXiv Detail & Related papers (2021-01-27T19:28:04Z)
- Hierarchical Deep Learning Classification of Unstructured Pathology Reports to Automate ICD-O Morphology Grading [0.0]
We present a hierarchical deep learning classification method that employs convolutional neural network models to automate the classification of 1813 breast cancer pathology reports.
We demonstrate that the hierarchical deep learning classification method improves on performance in comparison to a flat multiclass CNN model for ICD-O morphology classification of the same reports.
arXiv Detail & Related papers (2020-08-28T12:36:58Z)