Related papers: A Lung Nodule Dataset with Histopathology-based Cancer Type Annotation

URL: http://arxiv.org/abs/2406.18102v1
Date: Wed, 26 Jun 2024 06:39:11 GMT
Title: A Lung Nodule Dataset with Histopathology-based Cancer Type Annotation
Authors: Muwei Jian, Hongyu Chen, Zaiyong Zhang, Nan Yang, Haorang Zhang, Lifu Ma, Wenjing Xu, Huixiang Zhi,
Abstract summary: This research aims to bridge the gap by providing publicly accessible datasets and reliable tools for medical diagnosis. We curated a diverse dataset of lung Computed Tomography (CT) images, comprising 330 annotated nodules (nodules are labeled as bounding boxes) from 95 distinct patients. These promising results demonstrate that the dataset has a feasible application and further facilitate intelligent auxiliary diagnosis.
Score: 12.617587827105496
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recently, Computer-Aided Diagnosis (CAD) systems have emerged as indispensable tools in clinical diagnostic workflows, significantly alleviating the burden on radiologists. Nevertheless, despite their integration into clinical settings, CAD systems encounter limitations. Specifically, while CAD systems can achieve high performance in the detection of lung nodules, they face challenges in accurately predicting multiple cancer types. This limitation can be attributed to the scarcity of publicly available datasets annotated with expert-level cancer type information. This research aims to bridge this gap by providing publicly accessible datasets and reliable tools for medical diagnosis, facilitating a finer categorization of different types of lung diseases so as to offer precise treatment recommendations. To achieve this objective, we curated a diverse dataset of lung Computed Tomography (CT) images, comprising 330 annotated nodules (nodules are labeled as bounding boxes) from 95 distinct patients. The quality of the dataset was evaluated using a variety of classical classification and detection models, and these promising results demonstrate that the dataset has a feasible application and further facilitate intelligent auxiliary diagnosis.

Related papers

A Narrative Review on Large AI Models in Lung Cancer Screening, Diagnosis, and Treatment Planning [8.431488361911754]
Lung cancer remains one of the most prevalent and fatal diseases worldwide. Recent advancements in large AI models have significantly enhanced medical image understanding and clinical decision-making. This review systematically surveys the state-of-the-art in applying large AI models to lung cancer screening, diagnosis, prognosis, and treatment.
arXiv Detail & Related papers (2025-06-08T17:42:24Z)
PathOrchestra: A Comprehensive Foundation Model for Computational Pathology with Over 100 Diverse Clinical-Grade Tasks [39.97710183184273]
We present PathOrchestra, a versatile pathology foundation model trained via self-supervised learning on a dataset comprising 300K pathological slides. The model was rigorously evaluated on 112 clinical tasks using a combination of 61 private and 51 public datasets. PathOrchestra demonstrated exceptional performance across 27,755 WSIs and 9,415,729 ROIs, achieving over 0.950 accuracy in 47 tasks.
arXiv Detail & Related papers (2025-03-31T17:28:02Z)
Anatomy-Guided Radiology Report Generation with Pathology-Aware Regional Prompts [3.1019279528120363]
Radiology reporting generative AI holds significant potential to alleviate clinical workloads and streamline medical care. Existing systems often fall short due to their reliance on fixed size, patch-level image features and insufficient incorporation of pathological information. We propose an innovative approach that leverages pathology-aware regional prompts to explicitly integrate anatomical and pathological information of various scales.
arXiv Detail & Related papers (2024-11-16T12:36:20Z)
Large-scale Long-tailed Disease Diagnosis on Radiology Images [51.453990034460304]
RadDiag is a foundational model supporting 2D and 3D inputs across various modalities and anatomies. Our dataset, RP3D-DiagDS, contains 40,936 cases with 195,010 scans covering 5,568 disorders.
arXiv Detail & Related papers (2023-12-26T18:20:48Z)
CARE: A Large Scale CT Image Dataset and Clinical Applicable Benchmark Model for Rectal Cancer Segmentation [8.728236864462302]
Rectal cancer segmentation of CT image plays a crucial role in timely clinical diagnosis, radiotherapy treatment, and follow-up. These obstacles arise from the intricate anatomical structures of the rectum and the difficulties in performing differential diagnosis of rectal cancer. To address these issues, this work introduces a novel large scale rectal cancer CT image dataset CARE with pixel-level annotations for both normal and cancerous rectum. We also propose a novel medical cancer lesion segmentation benchmark model named U-SAM. The model is specifically designed to tackle the challenges posed by the intricate anatomical structures of abdominal organs by incorporating prompt information.
arXiv Detail & Related papers (2023-08-16T10:51:27Z)
Towards Reliable and Explainable AI Model for Solid Pulmonary Nodule Diagnosis [20.510918720980467]
Lung cancer has the highest mortality rate of deadly cancers in the world. Computer-aided diagnosis (CAD) systems have been developed to assist radiologists in nodule detection and diagnosis. Lack of model reliability and interpretability remains a major obstacle for its large-scale clinical application.
arXiv Detail & Related papers (2022-04-08T08:21:00Z)
A Personalized Diagnostic Generation Framework Based on Multi-source Heterogeneous Data [8.115713756776119]
We propose a framework that combines pathological images and medical reports to generate a personalized diagnosis result for individual patient. We use nuclei-level image feature similarity and content-based deep learning method to search for a personalized group of population with similar pathological characteristics.
arXiv Detail & Related papers (2021-10-26T13:12:52Z)
Variational Knowledge Distillation for Disease Classification in Chest X-Rays [102.04931207504173]
We propose variational knowledge distillation (VKD), which is a new probabilistic inference framework for disease classification based on X-rays. We demonstrate the effectiveness of our method on three public benchmark datasets with paired X-ray images and EHRs.
arXiv Detail & Related papers (2021-03-19T14:13:56Z)
Inheritance-guided Hierarchical Assignment for Clinical Automatic Diagnosis [50.15205065710629]
Clinical diagnosis, which aims to assign diagnosis codes for a patient based on the clinical note, plays an essential role in clinical decision-making. We propose a novel framework to combine the inheritance-guided hierarchical assignment and co-occurrence graph propagation for clinical automatic diagnosis.
arXiv Detail & Related papers (2021-01-27T13:16:51Z)
Topological Data Analysis of copy number alterations in cancer [70.85487611525896]
We explore the potential to capture information contained in cancer genomic information using a novel topology-based approach. We find that this technique has the potential to extract meaningful low-dimensional representations in cancer somatic genetic data.
arXiv Detail & Related papers (2020-11-22T17:31:23Z)
Hierarchical Classification of Pulmonary Lesions: A Large-Scale Radio-Pathomics Study [38.78350161086617]
Diagnosis of pulmonary lesions from computed tomography (CT) is important but challenging for clinical decision making in lung cancer related diseases. Deep learning has achieved great success in computer aided diagnosis (CADx) area for lung cancer, whereas it suffers from label ambiguity due to the difficulty in the radiological diagnosis. Considering that invasive pathological analysis serves as the clinical golden standard of lung cancer diagnosis, in this study, we solve the label ambiguity issue via a large-scale radio-pathomics dataset. This retrospective dataset, named Pulmonary-RadPath, enables development and validation of accurate deep learning systems to predict invasive pathological labels with a non-
arXiv Detail & Related papers (2020-10-08T15:14:34Z)
Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on few-shot disease subtype prediction problem, identifying subgroups of similar patients. We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks. Our new model is built upon a carefully designed meta-learner, called Prototypical Network, that is a simple yet effective meta learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
Trajectories, bifurcations and pseudotime in large clinical datasets: applications to myocardial infarction and diabetes data [94.37521840642141]
We suggest a semi-supervised methodology for the analysis of large clinical datasets, characterized by mixed data types and missing values. The methodology is based on application of elastic principal graphs which can address simultaneously the tasks of dimensionality reduction, data visualization, clustering, feature selection and quantifying the geodesic distances (pseudotime) in partially ordered sequences of observations.
arXiv Detail & Related papers (2020-07-07T21:04:55Z)

This list is automatically generated from the titles and abstracts of the papers on this site.