Class Imbalance Correction for Improved Universal Lesion Detection and Tagging in CT
- URL: http://arxiv.org/abs/2504.05591v1
- Date: Tue, 08 Apr 2025 00:58:26 GMT
- Authors: Peter D. Erickson, Tejas Sudharshan Mathai, Ronald M. Summers
- Abstract summary: We utilize a limited subset of DeepLesion (6%, 1331 lesions, 1309 slices) to train a VFNet model to detect lesions and tag them. We are the first to report the class imbalance in DeepLesion, and have taken data-driven steps to address it.
- Score: 1.7098468543752943
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Radiologists routinely detect and size lesions in CT to stage cancer and assess tumor burden. To potentially aid their efforts, multiple lesion detection algorithms have been developed with a large public dataset called DeepLesion (32,735 lesions, 32,120 CT slices, 10,594 studies, 4,427 patients, 8 body part labels). However, this dataset contains missing measurements and lesion tags, and exhibits a severe imbalance in the number of lesions per label category. In this work, we utilize a limited subset of DeepLesion (6%, 1331 lesions, 1309 slices) containing lesion annotations and body part label tags to train a VFNet model to detect lesions and tag them. We address the class imbalance by conducting three experiments: 1) Balancing data by the body part labels, 2) Balancing data by the number of lesions per patient, and 3) Balancing data by the lesion size. In contrast to a randomly sampled (unbalanced) data subset, our results indicated that balancing the body part labels always increased sensitivity for lesions >= 1 cm for classes with low data quantities (Bone: 80% vs. 46%, Kidney: 77% vs. 61%, Soft Tissue: 70% vs. 60%, Pelvis: 83% vs. 76%). Similar trends were seen for three other models tested (FasterRCNN, RetinaNet, FoveaBox). Balancing data by lesion size also helped the VFNet model improve recalls for all classes in contrast to an unbalanced dataset. We also provide a structured reporting guideline for a "Lesions" subsection to be entered into the "Findings" section of a radiology report. To our knowledge, we are the first to report the class imbalance in DeepLesion, and have taken data-driven steps to address it in the context of joint lesion detection and tagging.
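The abstract does not spell out how the balanced subsets were built. As a rough illustration only, the first experiment (balancing by body part label) could be approximated by downsampling every class to the size of the rarest one. The sketch below assumes that approach; the function name `balance_by_label`, the record layout, and the toy counts are all hypothetical and are not taken from the paper.

```python
import random
from collections import Counter, defaultdict


def balance_by_label(records, label_key="body_part", per_class=None, seed=0):
    """Downsample records so each label contributes the same number of items.

    Hypothetical helper: this is one simple balancing strategy, not
    necessarily the procedure used by the authors.
    """
    # Group the records by their label value.
    groups = defaultdict(list)
    for rec in records:
        groups[rec[label_key]].append(rec)

    # Default target: the size of the rarest class.
    if per_class is None:
        per_class = min(len(items) for items in groups.values())

    rng = random.Random(seed)  # fixed seed for a reproducible subset
    balanced = []
    for label in sorted(groups):
        items = groups[label]
        # Sample without replacement, capped at the class size.
        balanced.extend(rng.sample(items, min(per_class, len(items))))
    return balanced


# Toy, invented lesion records with an imbalanced label distribution.
toy = (
    [{"id": i, "body_part": "Lung"} for i in range(50)]
    + [{"id": 50 + i, "body_part": "Kidney"} for i in range(30)]
    + [{"id": 80 + i, "body_part": "Bone"} for i in range(5)]
)

balanced = balance_by_label(toy)
counts = Counter(rec["body_part"] for rec in balanced)
print(counts)
```

The same grouping idea would extend to the other two experiments by swapping the key, for example a per-patient lesion count or a lesion size bin, again as an assumption rather than a description of the paper's pipeline.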
Related papers
- Correcting Class Imbalances with Self-Training for Improved Universal Lesion Detection and Tagging [43.06199185109424]
Universal lesion detection and tagging (ULDT) in CT studies is critical for tumor burden assessment and tracking the progression of lesion status (growth/shrinkage) over time. Prior work used the DeepLesion dataset (4,427 patients, 10,594 studies, 32,120 CT slices, 32,735 lesions, 8 body part labels) for algorithmic development, but this dataset is not completely annotated and contains class imbalances. We developed a self-training pipeline for ULDT using a limited 11.5% subset of DeepLesion.
arXiv Detail & Related papers (2025-04-07T15:57:03Z)
- 3D Universal Lesion Detection and Tagging in CT with Self-Training [3.68620908362189]
We propose a self-training pipeline to detect 3D lesions and tag them according to the body part they occur in. To our knowledge, we are the first to jointly detect lesions in 3D and tag them according to the body part label.
arXiv Detail & Related papers (2025-04-07T15:50:27Z)
- Brain Tumor Classification on MRI in Light of Molecular Markers [61.77272414423481]
Co-deletion of the 1p/19q gene is associated with clinical outcomes in low-grade gliomas. This study aims to utilize a specially MRI-based convolutional neural network for brain cancer detection.
arXiv Detail & Related papers (2024-09-29T07:04:26Z)
- Weakly-Supervised Detection of Bone Lesions in CT [48.34559062736031]
The skeletal region is one of the common sites of metastatic spread of cancer in the breast and prostate.
We developed a pipeline to detect bone lesions in CT volumes via a proxy segmentation task.
Our method detected bone lesions in CT with a precision of 96.7% and recall of 47.3% despite the use of incomplete and partial training data.
arXiv Detail & Related papers (2024-01-31T21:05:34Z)
- Improving Segmentation and Detection of Lesions in CT Scans Using Intensity Distribution Supervision [5.162622771922123]
We build an intensity-based lesion probability function from an intensity histogram of the target lesion.
The computed ILP map of each input CT scan is provided as additional supervision for network training.
The effectiveness of the proposed method on a detection task was also investigated.
arXiv Detail & Related papers (2023-07-11T21:00:47Z)
- Transfer learning with weak labels from radiology reports: application to glioma change detection [0.2010294990327175]
We propose a combined use of weak labels (imprecise, but fast-to-create annotations) and Transfer Learning (TL).
Specifically, we explore inductive TL, where source and target domains are identical, but tasks are different due to a label shift.
We investigate the relationship between model size and TL, comparing a low-capacity VGG with a higher-capacity SEResNeXt.
arXiv Detail & Related papers (2022-10-18T09:15:27Z)
- TotalSegmentator: robust segmentation of 104 anatomical structures in CT images [48.50994220135258]
We present a deep learning segmentation model for body CT images.
The model can segment 104 anatomical structures relevant for use cases such as organ volumetry, disease characterization, and surgical or radiotherapy planning.
arXiv Detail & Related papers (2022-08-11T15:16:40Z)
- Self-Supervised Learning as a Means To Reduce the Need for Labeled Data in Medical Image Analysis [64.4093648042484]
We use a dataset of chest X-ray images with bounding box labels for 13 different classes of anomalies.
We show that it is possible to achieve similar performance to a fully supervised model in terms of mean average precision and accuracy with only 60% of the labeled data.
arXiv Detail & Related papers (2022-06-01T09:20:30Z)
- Federated Learning Enables Big Data for Rare Cancer Boundary Detection [98.5549882883963]
We present findings from the largest Federated ML study to date, involving data from 71 healthcare institutions across 6 continents.
We generate an automatic tumor boundary detector for the rare disease of glioblastoma.
We demonstrate a 33% improvement over a publicly trained model to delineate the surgically targetable tumor, and 23% improvement over the tumor's entire extent.
arXiv Detail & Related papers (2022-04-22T17:27:00Z)
- Machine-Learning-Based Multiple Abnormality Prediction with Large-Scale Chest Computed Tomography Volumes [64.21642241351857]
We curated and analyzed a chest computed tomography (CT) data set of 36,316 volumes from 19,993 unique patients.
We developed a rule-based method for automatically extracting abnormality labels from free-text radiology reports.
We also developed a model for multi-organ, multi-disease classification of chest CT volumes.
arXiv Detail & Related papers (2020-02-12T00:59:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.