DepViT-CAD: Deployable Vision Transformer-Based Cancer Diagnosis in Histopathology
- URL: http://arxiv.org/abs/2507.10250v1
- Date: Mon, 14 Jul 2025 13:17:46 GMT
- Title: DepViT-CAD: Deployable Vision Transformer-Based Cancer Diagnosis in Histopathology
- Authors: Ashkan Shakarami, Lorenzo Nicole, Rocco Cappellesso, Angelo Paolo Dei Tos, Stefano Ghidoni
- Abstract summary: DepViT-CAD is a deployable AI system for multi-class cancer diagnosis in histopathology. At its core is MAViT, a novel Multi-Attention Vision Transformer designed to capture fine-grained morphological patterns.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurate and timely cancer diagnosis from histopathological slides is vital for effective clinical decision-making. This paper introduces DepViT-CAD, a deployable AI system for multi-class cancer diagnosis in histopathology. At its core is MAViT, a novel Multi-Attention Vision Transformer designed to capture fine-grained morphological patterns across diverse tumor types. MAViT was trained on expert-annotated patches from 1008 whole-slide images, covering 11 diagnostic categories, including 10 major cancers and non-tumor tissue. DepViT-CAD was validated on two independent cohorts: 275 WSIs from The Cancer Genome Atlas and 50 routine clinical cases from pathology labs, achieving diagnostic sensitivities of 94.11% and 92%, respectively. By combining state-of-the-art transformer architecture with large-scale real-world validation, DepViT-CAD offers a robust and scalable approach for AI-assisted cancer diagnostics. To support transparency and reproducibility, software and code will be made publicly available on GitHub.
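The abstract describes patch-level classification over 11 diagnostic categories (10 cancers plus non-tumor) that is then validated at the whole-slide level. The exact slide-level decision rule is not given in the abstract, so the following is a minimal hypothetical sketch assuming a common approach: average the per-patch class probabilities across a slide and take the arg-max as the slide diagnosis. The class names and the `slide_diagnosis` helper are illustrative, not from the paper.

```python
# Hypothetical slide-level aggregation for an 11-class patch classifier
# (10 cancer types + non-tumor, per the DepViT-CAD abstract). The mean-
# probability voting rule below is an assumption, not the paper's method.

CLASSES = [f"cancer_{i}" for i in range(10)] + ["non_tumor"]

def slide_diagnosis(patch_probs):
    """Aggregate per-patch probability vectors (each of length 11)
    into a single slide-level (label, confidence) pair."""
    n = len(patch_probs)
    mean = [sum(p[c] for p in patch_probs) / n for c in range(len(CLASSES))]
    best = max(range(len(CLASSES)), key=mean.__getitem__)
    return CLASSES[best], mean[best]

# Toy example: two patches that both favour class index 2.
patches = [[0.0] * 11, [0.0] * 11]
patches[0][2] = 1.0
patches[1][2] = 0.8
patches[1][10] = 0.2

label, conf = slide_diagnosis(patches)
# label == "cancer_2", conf == 0.9
```

In practice such systems often weight patches by tissue area or attention scores rather than averaging uniformly; uniform averaging is used here only to keep the sketch self-contained.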
Related papers
- RadFabric: Agentic AI System with Reasoning Capability for Radiology [61.25593938175618]
RadFabric is a multi-agent, multimodal reasoning framework that unifies visual and textual analysis for comprehensive CXR interpretation. The system employs specialized CXR agents for pathology detection, an Anatomical Interpretation Agent to map visual findings to precise anatomical structures, and a Reasoning Agent powered by large multimodal reasoning models to synthesize visual, anatomical, and clinical data into transparent and evidence-based diagnoses.
arXiv Detail & Related papers (2025-06-17T03:10:33Z) - PathOrchestra: A Comprehensive Foundation Model for Computational Pathology with Over 100 Diverse Clinical-Grade Tasks [39.97710183184273]
We present PathOrchestra, a versatile pathology foundation model trained via self-supervised learning on a dataset comprising 300K pathological slides. The model was rigorously evaluated on 112 clinical tasks using a combination of 61 private and 51 public datasets. PathOrchestra demonstrated exceptional performance across 27,755 WSIs and 9,415,729 ROIs, achieving over 0.950 accuracy in 47 tasks.
arXiv Detail & Related papers (2025-03-31T17:28:02Z) - GRAPHITE: Graph-Based Interpretable Tissue Examination for Enhanced Explainability in Breast Cancer Histopathology [14.812589661794592]
GRAPHITE is a post-hoc explainable framework designed for breast cancer tissue microarray (TMA) analysis. We trained the model on 140 tumour TMA cores and four benign whole slide images from which 140 benign samples were created, and tested it on 53 pathologist-annotated TMA samples. It achieved a mean average precision (mAP) of 0.56, an area under the receiver operating characteristic curve (AUROC) of 0.94, and a threshold robustness (ThR) of 0.70.
arXiv Detail & Related papers (2025-01-08T00:54:43Z) - A Knowledge-enhanced Pathology Vision-language Foundation Model for Cancer Diagnosis [58.85247337449624]
We propose a knowledge-enhanced vision-language pre-training approach that integrates disease knowledge into the alignment within hierarchical semantic groups. KEEP achieves state-of-the-art performance in zero-shot cancer diagnostic tasks.
arXiv Detail & Related papers (2024-12-17T17:45:21Z) - Towards a Comprehensive Benchmark for Pathological Lymph Node Metastasis in Breast Cancer Sections [21.75452517154339]
We reprocessed 1,399 whole slide images (WSIs) and labels from the Camelyon-16 and Camelyon-17 datasets.
Based on the sizes of re-annotated tumor regions, we upgraded the binary cancer screening task to a four-class task.
arXiv Detail & Related papers (2024-11-16T09:19:24Z) - Large-scale Long-tailed Disease Diagnosis on Radiology Images [51.453990034460304]
RadDiag is a foundational model supporting 2D and 3D inputs across various modalities and anatomies.
Our dataset, RP3D-DiagDS, contains 40,936 cases with 195,010 scans covering 5,568 disorders.
arXiv Detail & Related papers (2023-12-26T18:20:48Z) - Virchow: A Million-Slide Digital Pathology Foundation Model [34.38679208931425]
We present Virchow, a foundation model for computational pathology.
Virchow is a vision transformer model with 632 million parameters trained on 1.5 million hematoxylin and eosin stained whole slide images.
arXiv Detail & Related papers (2023-09-14T15:09:35Z) - A Pathologist-Informed Workflow for Classification of Prostate Glands in Histopathology [62.997667081978825]
Pathologists diagnose and grade prostate cancer by examining tissue from needle biopsies on glass slides.
Cancer's severity and risk of metastasis are determined by the Gleason grade, a score based on the organization and morphology of prostate cancer glands.
This paper proposes an automated workflow that follows pathologists' modus operandi, isolating and classifying multi-scale patches of individual glands.
arXiv Detail & Related papers (2022-09-27T14:08:19Z) - Multi-Scale Hybrid Vision Transformer for Learning Gastric Histology: AI-Based Decision Support System for Gastric Cancer Treatment [50.89811515036067]
Gastric endoscopic screening is an effective way to decide appropriate gastric cancer (GC) treatment at an early stage, reducing the GC-associated mortality rate.
We propose a practical AI system that enables five subclassifications of GC pathology, which can be directly matched to general GC treatment guidance.
arXiv Detail & Related papers (2022-02-17T08:33:52Z) - In-Line Image Transformations for Imbalanced, Multiclass Computer Vision Classification of Lung Chest X-Rays [91.3755431537592]
This study aims to leverage a body of literature in order to apply image transformations that would serve to balance the lack of COVID-19 LCXR data.
Deep learning techniques such as convolutional neural networks (CNNs) are able to select features that distinguish between healthy and disease states.
This study utilizes a simple CNN architecture for high-performance multiclass LCXR classification at 94 percent accuracy.
arXiv Detail & Related papers (2021-04-06T02:01:43Z)