Hierarchical Deep Learning Ensemble to Automate the Classification of
Breast Cancer Pathology Reports by ICD-O Topography
- URL: http://arxiv.org/abs/2008.12571v1
- Date: Fri, 28 Aug 2020 10:29:56 GMT
- Authors: Waheeda Saib, David Sengeh, Gcininwe Dlamini, Elvira Singh
- Abstract summary: We present a hierarchical deep learning ensemble method incorporating state-of-the-art convolutional neural network models for the automatic labelling of 2201 pathology reports.
Our results show an improvement in primary site classification over the state-of-the-art CNN model of more than 14% for F1 micro and 55% for F1 macro scores.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Like most global cancer registries, the National Cancer Registry in South
Africa employs expert human coders to label pathology reports using appropriate
International Classification of Diseases for Oncology (ICD-O) codes spanning 42
different cancer types. The annotation effort is extensive given the large volume
of cancer pathology reports the registry receives annually from public and private
sector institutions. This manual process, coupled with other challenges, results
in a significant 4-year lag in the reporting of annual cancer statistics in South
Africa. We present a hierarchical deep learning ensemble method incorporating
state-of-the-art convolutional neural network models for the automatic
labelling of 2201 de-identified, free-text pathology reports with appropriate
ICD-O breast cancer topography codes across 8 classes. Our results show an
improvement in primary site classification over the state-of-the-art CNN model
of more than 14% for F1 micro and 55% for F1 macro scores. We demonstrate
that the hierarchical deep learning ensemble improves on state-of-the-art
models for ICD-O topography classification when compared with a flat multiclass
model that predicts ICD-O topography codes for pathology reports.
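
The abstract reports separate gains for F1 micro and F1 macro. The following is a minimal sketch of how the two averages differ on imbalanced multiclass labels. The helper name `f1_scores` and the toy labels are illustrative assumptions, not the authors' code or data: micro F1 pools counts over all reports, while macro F1 averages per-class F1 so that rare classes weigh as much as common ones.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (micro_f1, macro_f1) for single-label multiclass predictions.

    Hypothetical helper written for illustration only.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Micro: pool the counts across classes, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per class, then take the unweighted mean.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return micro, macro


# Toy, made-up labels (not from the paper): one frequent and one rare code.
y_true = ["C50.9"] * 8 + ["C50.4"] * 2
# A classifier that always predicts the frequent class.
y_pred = ["C50.9"] * 10

micro, macro = f1_scores(y_true, y_pred)
print(f"micro F1 = {micro:.3f}, macro F1 = {macro:.3f}")
```

In this toy case the rare class is never predicted, so its per-class F1 is zero and the macro score falls well below the micro score. This is one reason a model that handles rare classes better could move F1 macro far more than F1 micro, although the abstract itself does not state the cause of the reported gap.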