Histologic Dataset of Normal and Atypical Mitotic Figures on Human Breast Cancer (AMi-Br)
- URL: http://arxiv.org/abs/2501.04467v2
- Date: Thu, 13 Mar 2025 07:10:26 GMT
- Title: Histologic Dataset of Normal and Atypical Mitotic Figures on Human Breast Cancer (AMi-Br)
- Authors: Christof A. Bertram, Viktoria Weiss, Taryn A. Donovan, Sweta Banerjee, Thomas Conrad, Jonas Ammeling, Robert Klopfleisch, Christopher Kaltenecker, Marc Aubreville
- Abstract summary: Assessment of the density of mitotic figures (MFs) in histologic tumor sections is an important prognostic marker for many tumor types. Recently, it has been reported in multiple works that the quantity of MFs with an atypical morphology might be an independent prognostic criterion for breast cancer. We present the first ever publicly available dataset of atypical and normal MFs (AMi-Br).
- Score: 0.2786153781225932
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Assessment of the density of mitotic figures (MFs) in histologic tumor sections is an important prognostic marker for many tumor types, including breast cancer. Recently, it has been reported in multiple works that the quantity of MFs with an atypical morphology (atypical MFs, AMFs) might be an independent prognostic criterion for breast cancer. AMFs are an indicator of mutations in the genes regulating the cell cycle and can lead to aberrant chromosome constitution (aneuploidy) of the tumor cells. To facilitate further research on this topic using pattern recognition, we present the first ever publicly available dataset of atypical and normal MFs (AMi-Br). For this, we utilized two of the most popular MF datasets (MIDOG 2021 and TUPAC) and subclassified all MFs using a three-expert majority vote. Our final dataset consists of 3,720 MFs, split into 832 AMFs (22.4%) and 2,888 normal MFs (77.6%) across all 223 tumor cases in the combined set. We provide baseline classification experiments to investigate the consistency of the dataset, using a Monte Carlo cross-validation and different strategies to combat class imbalance. We found an averaged balanced accuracy of up to 0.806 when using a patch-level dataset split, and up to 0.713 when using a patient-level split.
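The evaluation terms used in the abstract (majority vote, balanced accuracy, patient-level split, Monte Carlo cross-validation) can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the 20% test fraction, the number of repeats, and the label strings are all assumptions chosen for illustration.

```python
import random
from collections import Counter, defaultdict


def majority_vote(labels):
    """Return the most common label among an odd number of expert votes."""
    return Counter(labels).most_common(1)[0][0]


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so the minority class counts as much as the majority."""
    totals = Counter(y_true)
    hits = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            hits[truth] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def patient_level_split(case_ids, test_fraction, rng):
    """Hold out whole cases so no case contributes to both train and test."""
    cases = sorted(set(case_ids))
    rng.shuffle(cases)
    n_test = max(1, round(len(cases) * test_fraction))
    test_cases = set(cases[:n_test])
    train_idx = [i for i, c in enumerate(case_ids) if c not in test_cases]
    test_idx = [i for i, c in enumerate(case_ids) if c in test_cases]
    return train_idx, test_idx


def monte_carlo_splits(case_ids, n_repeats=5, test_fraction=0.2, seed=0):
    """Monte Carlo cross-validation: repeated, independently drawn random splits."""
    rng = random.Random(seed)  # hypothetical seed, fixed for reproducibility
    for _ in range(n_repeats):
        yield patient_level_split(case_ids, test_fraction, rng)
```

A patch-level split would instead shuffle individual MF patches, so patches from one case can land on both sides of the split. That leakage is one plausible reason the patch-level score reported above (0.806) exceeds the patient-level one (0.713).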
Related papers
- PanCanBench: A Comprehensive Benchmark for Evaluating Large Language Models in Pancreatic Oncology [48.732366302949515]
Large language models (LLMs) have achieved expert-level performance on standardized examinations, yet multiple-choice accuracy poorly reflects real-world clinical utility and safety. We developed a human-in-the-loop pipeline to create expert rubrics for de-identified patient questions. We evaluated 22 proprietary and open-source LLMs using an LLM-as-a-judge framework, measuring clinical completeness, factual accuracy, and web-search integration.
arXiv Detail & Related papers (2026-03-02T00:50:39Z) - OMG-Net: A Deep Learning Framework Deploying Segment Anything to Detect Pan-Cancer Mitotic Figures from Haematoxylin and Eosin-Stained Slides [27.84599956781646]
In this study, we propose an artificial intelligence (AI) approach to detect MFs in digitised whole slide images (WSIs).
Here we establish the largest pan-cancer dataset of mitotic figures by combining an in-house dataset of soft tissue tumours (STMF) with five open-source mitotic datasets (IPAC, TUPAC, CCMCT, CMC and MIDOG++).
We then employed a two-stage framework (Optimised Mitoses Generator Network (OMG-Net)) to classify MFs.
arXiv Detail & Related papers (2024-07-17T17:53:37Z) - Analysis of the BraTS 2023 Intracranial Meningioma Segmentation Challenge [44.76736949127792]
We describe the design and results from the BraTS 2023 Intracranial Meningioma Challenge.
The BraTS Meningioma Challenge differed from prior BraTS Glioma challenges in that it focused on meningiomas.
The top ranked team had a lesion-wise median dice similarity coefficient (DSC) of 0.976, 0.976, and 0.964 for enhancing tumor, tumor core, and whole tumor.
arXiv Detail & Related papers (2024-05-16T03:23:57Z) - Breast Ultrasound Tumor Classification Using a Hybrid Multitask CNN-Transformer Network [63.845552349914186]
Capturing global contextual information plays a critical role in breast ultrasound (BUS) image classification.
Vision Transformers have an improved capability of capturing global contextual information but may distort the local image patterns due to the tokenization operations.
In this study, we proposed a hybrid multitask deep neural network called Hybrid-MT-ESTAN, designed to perform BUS tumor classification and segmentation.
arXiv Detail & Related papers (2023-08-04T01:19:32Z) - Deep learning-based Subtyping of Atypical and Normal Mitoses using a Hierarchical Anchor-Free Object Detector [0.802219018904343]
Atypical mitotic figures (MF) can be identified morphologically as having segregation abnormalities of the chromatids.
In this work, we perform, for the first time, automatic subtyping of mitotic figures into normal and atypical categories.
We set up a state-of-the-art object detection pipeline extending the anchor-free FCOS approach with a gated hierarchical subclassification branch.
arXiv Detail & Related papers (2022-12-12T13:57:38Z) - Federated Learning Enables Big Data for Rare Cancer Boundary Detection [98.5549882883963]
We present findings from the largest Federated ML study to-date, involving data from 71 healthcare institutions across 6 continents.
We generate an automatic tumor boundary detector for the rare disease of glioblastoma.
We demonstrate a 33% improvement over a publicly trained model to delineate the surgically targetable tumor, and 23% improvement over the tumor's entire extent.
arXiv Detail & Related papers (2022-04-22T17:27:00Z) - Deep learning-based approach to reveal tumor mutational burden status from whole slide images across multiple cancer types [41.61294299606317]
Tumor mutational burden (TMB) is a potential genomic biomarker of immunotherapy.
TMB detected through whole exome sequencing lacks clinical penetration in low-resource settings.
In this study, we proposed a multi-scale deep learning framework to address the detection of TMB status from routinely used whole slide images.
arXiv Detail & Related papers (2022-04-07T07:02:32Z) - Multi-Scale Input Strategies for Medulloblastoma Tumor Classification using Deep Transfer Learning [59.30734371401316]
Medulloblastoma is the most common malignant brain cancer among children.
CNNs have shown promising results for medulloblastoma (MB) subtype classification.
We study the impact of tile size and input strategy.
arXiv Detail & Related papers (2021-09-14T09:42:37Z) - Cancer Gene Profiling through Unsupervised Discovery [49.28556294619424]
We introduce a novel, automatic and unsupervised framework to discover low-dimensional gene biomarkers.
Our method is based on the LP-Stability algorithm, a high dimensional center-based unsupervised clustering algorithm.
Our signature reports promising results on distinguishing immune inflammatory and immune desert tumors.
arXiv Detail & Related papers (2021-02-11T09:04:45Z) - A completely annotated whole slide image dataset of canine breast cancer to aid human breast cancer research [6.960375869417005]
Current datasets on human breast cancer only provide annotations for small subsets of whole slide images (WSIs).
We present a novel dataset of 21 WSIs of canine mammary carcinoma (CMC) completely annotated for MF.
We used machine learning to identify previously undetected MF.
arXiv Detail & Related papers (2020-08-24T08:06:55Z) - Deep Learning-based Computational Pathology Predicts Origins for Cancers of Unknown Primary [2.645435564532842]
Cancer of unknown primary (CUP) is an enigmatic group of diagnoses where the primary anatomical site of tumor origin cannot be determined.
Recent work has focused on using genomics and transcriptomics for identification of tumor origins.
We present a deep learning-based computational pathology algorithm that can provide a differential diagnosis for CUP.
arXiv Detail & Related papers (2020-06-24T17:59:36Z) - Machine-Learning-Based Multiple Abnormality Prediction with Large-Scale Chest Computed Tomography Volumes [64.21642241351857]
We curated and analyzed a chest computed tomography (CT) data set of 36,316 volumes from 19,993 unique patients.
We developed a rule-based method for automatically extracting abnormality labels from free-text radiology reports.
We also developed a model for multi-organ, multi-disease classification of chest CT volumes.
arXiv Detail & Related papers (2020-02-12T00:59:23Z)