Improving Precancerous Case Characterization via Transformer-based
Ensemble Learning
- URL: http://arxiv.org/abs/2212.05150v1
- Date: Sat, 10 Dec 2022 00:06:28 GMT
- Title: Improving Precancerous Case Characterization via Transformer-based
Ensemble Learning
- Authors: Yizhen Zhong, Jiajie Xiao, Thomas Vetterli, Mahan Matin, Ellen Loo,
Jimmy Lin, Richard Bourgon, Ofer Shapira
- Abstract summary: The application of natural language processing to cancer pathology reports has been focused on detecting cancer cases.
Improving the characterization of precancerous adenomas assists in developing diagnostic tests for early cancer detection and prevention.
Our results demonstrated the potential of using NLP to leverage real-world health record data to facilitate the development of diagnostic tests for early cancer prevention.
- Score: 31.891340667123124
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The application of natural language processing (NLP) to cancer pathology
reports has been focused on detecting cancer cases, largely ignoring
precancerous cases. Improving the characterization of precancerous adenomas
assists in developing diagnostic tests for early cancer detection and
prevention, especially for colorectal cancer (CRC). Here we developed
transformer-based deep neural network NLP models to perform CRC
phenotyping, with the goal of extracting precancerous lesion attributes and
distinguishing cancer and precancerous cases. We achieved a macro-F1 score of
0.914 for classifying patients into negative, non-advanced adenoma, advanced
adenoma, and CRC. We further improved the performance to 0.923 using an ensemble of
classifiers for cancer status classification and lesion size named entity
recognition (NER). Our results demonstrated the potential of using NLP to
leverage real-world health record data to facilitate the development of
diagnostic tests for early cancer prevention.
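The macro-F1 metric quoted in the abstract is the unweighted mean of the per-class F1 scores, so each of the four classes counts equally regardless of how many patients fall into it. The following is a minimal sketch of that computation, not the authors' evaluation code; the function name `macro_f1` and the toy labels are assumptions made for illustration only.

```python
from collections import Counter

# The four patient-level classes named in the abstract.
CLASSES = ["negative", "non-advanced adenoma", "advanced adenoma", "CRC"]


def macro_f1(y_true, y_pred, classes=CLASSES):
    """Return the unweighted mean of per-class F1 scores."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for c in classes:
        denom = 2 * tp[c] + fp[c] + fn[c]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(classes)


# Toy labels (hypothetical, not data from the paper).
y_true = ["negative", "negative", "non-advanced adenoma",
          "advanced adenoma", "advanced adenoma", "CRC"]
y_pred = ["negative", "non-advanced adenoma", "non-advanced adenoma",
          "advanced adenoma", "CRC", "CRC"]

score = macro_f1(y_true, y_pred)
```

Because every class contributes equally, an error on a rare class such as advanced adenoma lowers macro-F1 as much as an error on a common class, which is one reason the metric suits imbalanced phenotyping tasks.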
Related papers
- Boosting Medical Image-based Cancer Detection via Text-guided Supervision from Reports [68.39938936308023]
We propose a novel text-guided learning method to achieve highly accurate cancer detection results.
Our approach can leverage clinical knowledge by large-scale pre-trained VLM to enhance generalization ability.
arXiv Detail & Related papers (2024-05-23T07:03:38Z)
- Improving Breast Cancer Grade Prediction with Multiparametric MRI Created Using Optimized Synthetic Correlated Diffusion Imaging [71.91773485443125]
Grading plays a vital role in breast cancer treatment planning.
The current tumor grading method involves extracting tissue from patients, leading to stress, discomfort, and high medical costs.
This paper examines using optimized CDI$^s$ to improve breast cancer grade prediction.
arXiv Detail & Related papers (2024-05-13T15:48:26Z)
- Variational Autoencoders for Feature Exploration and Malignancy Prediction of Lung Lesions [0.0]
Lung cancer is responsible for 21% of cancer deaths in the UK.
Recent studies have demonstrated the capability of AI methods for accurate and early diagnosis of lung cancer from routine scans.
This study investigates the application of Variational Autoencoders (VAEs), a type of generative AI model, to lung cancer lesions.
arXiv Detail & Related papers (2023-11-27T11:12:33Z)
- Cancer-Net PCa-Data: An Open-Source Benchmark Dataset for Prostate Cancer Clinical Decision Support using Synthetic Correlated Diffusion Imaging Data [75.77035221531261]
Cancer-Net PCa-Data is an open-source benchmark dataset of volumetric CDI$^s$ imaging data of PCa patients.
Cancer-Net PCa-Data is the first-ever public dataset of CDI$^s$ imaging data for PCa.
arXiv Detail & Related papers (2023-11-20T10:28:52Z)
- A Multi-Institutional Open-Source Benchmark Dataset for Breast Cancer Clinical Decision Support using Synthetic Correlated Diffusion Imaging Data [82.74877848011798]
Cancer-Net BCa is a multi-institutional open-source benchmark dataset of volumetric CDI$^s$ imaging data of breast cancer patients.
Cancer-Net BCa is publicly available as a part of a global open-source initiative dedicated to accelerating advancement in machine learning to aid clinicians in the fight against cancer.
arXiv Detail & Related papers (2023-04-12T05:41:44Z)
- Enhancing Clinical Support for Breast Cancer with Deep Learning Models using Synthetic Correlated Diffusion Imaging [66.63200823918429]
We investigate enhancing clinical support for breast cancer with deep learning models.
We leverage a volumetric convolutional neural network to learn deep radiomic features from a pre-treatment cohort.
We find that the proposed approach can achieve better performance for both grade and post-treatment response prediction.
arXiv Detail & Related papers (2022-11-10T03:02:12Z)
- A Combined PCA-MLP Network for Early Breast Cancer Detection [0.0]
We have studied different machine learning algorithms to detect whether a patient is likely to face breast cancer or not.
Our 4-layer PCA network obtained a best accuracy of 100%, with a mean of 90.48%, on the BCCD dataset.
arXiv Detail & Related papers (2022-06-18T06:17:40Z)
- Gene selection from microarray expression data: A Multi-objective PSO with adaptive K-nearest neighborhood [0.0]
This paper deals with the classification problem of human cancer diseases by using gene expression data.
A new methodology is presented to analyze microarray datasets and efficiently classify cancer diseases.
arXiv Detail & Related papers (2022-05-27T04:22:10Z)
- Leveraging a Joint of Phenotypic and Genetic Features on Cancer Patient Subgrouping [7.381190270069632]
We developed a system leveraging a joint of phenotypic and genetic features for cancer patient subgrouping.
In feature preprocessing, we performed filtering, retaining the most relevant features.
In cancer patient classification, we utilized nine different machine learning models: Random Forests (RF), Decision Tree (DT), Support Vector Machine (SVM), Naive Bayes (NB), Logistic Regression (LR), Multilayer Perceptron (MLP), Gradient Boosting (GB), Convolutional Neural Network (CNN), and Feedforward Neural Network (FNN).
arXiv Detail & Related papers (2021-03-30T13:07:05Z)
- Topological Data Analysis of copy number alterations in cancer [70.85487611525896]
We explore the potential to capture information contained in cancer genomic information using a novel topology-based approach.
We find that this technique has the potential to extract meaningful low-dimensional representations in cancer somatic genetic data.
arXiv Detail & Related papers (2020-11-22T17:31:23Z)
- Machine Learning Against Cancer: Accurate Diagnosis of Cancer by Machine Learning Classification of the Whole Genome Sequencing Data [0.0]
We have developed novel methods of MLAC (Machine Learning Against Cancer) achieving perfect results with perfect precision, sensitivity, and specificity.
We have used the whole genome sequencing data acquired by next-generation RNA sequencing techniques in The Cancer Genome Atlas and Genotype-Tissue Expression projects for cancerous and healthy tissues respectively.
arXiv Detail & Related papers (2020-09-12T18:51:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.