From ACR O-RADS 2022 to Explainable Deep Learning: Comparative Performance of Expert Radiologists, Convolutional Neural Networks, Vision Transformers, and Fusion Models in Ovarian Masses
- URL: http://arxiv.org/abs/2511.06282v1
- Date: Sun, 09 Nov 2025 08:36:42 GMT
- Title: From ACR O-RADS 2022 to Explainable Deep Learning: Comparative Performance of Expert Radiologists, Convolutional Neural Networks, Vision Transformers, and Fusion Models in Ovarian Masses
- Authors: Ali Abbasian Ardakani, Afshin Mohammadi, Alisa Mohebbi, Anushya Vijayananthan, Sook Sam Leong, Lim Yi Ting, Mohd Kamil Bin Mohamad Fabell, U Rajendra Acharya, Sepideh Hatamikia
- Abstract summary: Deep learning models have demonstrated promise in image-based ovarian lesion characterization. This study evaluates radiologist performance applying O-RADS v2022 and compares it to leading convolutional neural network (CNN) and Vision Transformer (ViT) models. CNN models yielded AUCs of 0.620 to 0.908 and accuracies of 59.2% to 86.4%, while ViT16-384 reached the best performance, with an AUC of 0.941 and an accuracy of 87.4%.
- Score: 8.734125009057918
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Background: The 2022 update of the Ovarian-Adnexal Reporting and Data System (O-RADS) ultrasound classification refines risk stratification for adnexal lesions, yet human interpretation remains subject to variability and conservative thresholds. Concurrently, deep learning (DL) models have demonstrated promise in image-based ovarian lesion characterization. This study evaluates radiologist performance applying O-RADS v2022, compares it to leading convolutional neural network (CNN) and Vision Transformer (ViT) models, and investigates the diagnostic gains achieved by hybrid human-AI frameworks. Methods: In this single-center, retrospective cohort study, a total of 512 adnexal mass images from 227 patients (110 with at least one malignant cyst) were included. Sixteen DL models, including DenseNets, EfficientNets, ResNets, VGGs, Xception, and ViTs, were trained and validated. A hybrid model integrating radiologist O-RADS scores with DL-predicted probabilities was also built for each scheme. Results: Radiologist-only O-RADS assessment achieved an AUC of 0.683 and an overall accuracy of 68.0%. CNN models yielded AUCs of 0.620 to 0.908 and accuracies of 59.2% to 86.4%, while ViT16-384 reached the best performance, with an AUC of 0.941 and an accuracy of 87.4%. Hybrid human-AI frameworks further significantly enhanced the performance of CNN models; however, the improvement for ViT models was not statistically significant (P-value >0.05). Conclusions: DL models markedly outperform radiologist-only O-RADS v2022 assessment, and the integration of expert scores with AI yields the highest diagnostic accuracy and discrimination. Hybrid human-AI paradigms hold substantial potential to standardize pelvic ultrasound interpretation, reduce false positives, and improve detection of high-risk lesions.
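The abstract describes a hybrid framework that integrates radiologist O-RADS scores with DL-predicted probabilities and compares schemes by AUC. The paper does not specify the fusion mechanism, so the sketch below uses an assumed simple convex combination of the two risk estimates, with AUC computed from the Mann-Whitney U statistic; the toy labels and scores are illustrative only, not data from the study.

```python
import numpy as np

def auc_score(y_true, y_prob):
    """ROC AUC via the Mann-Whitney U statistic (ties count half)."""
    y_true, y_prob = np.asarray(y_true), np.asarray(y_prob)
    pos, neg = y_prob[y_true == 1], y_prob[y_true == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)

def fuse(radiologist_risk, dl_prob, w=0.5):
    """Assumed fusion rule: convex combination of the two risk estimates."""
    return w * np.asarray(radiologist_risk) + (1 - w) * np.asarray(dl_prob)

# Toy data: 1 = malignant lesion (values are hypothetical)
y     = np.array([0, 0, 1, 1, 0, 1])
orads = np.array([0.2, 0.6, 0.5, 0.9, 0.3, 0.4])  # risk mapped from O-RADS scores
dl    = np.array([0.1, 0.3, 0.8, 0.7, 0.2, 0.9])  # model-predicted probability

print(f"radiologist AUC: {auc_score(y, orads):.3f}")
print(f"fused AUC:       {auc_score(y, fuse(orads, dl)):.3f}")
```

On this toy data the fused score ranks every malignant case above every benign one even though the radiologist-only score does not, mirroring the paper's finding that combining the two signals can improve discrimination. The actual study may have used a different fusion scheme (e.g., a trained meta-classifier).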
Related papers
- A DeepSeek-Powered AI System for Automated Chest Radiograph Interpretation in Clinical Practice [83.11942224668127]
Janus-Pro-CXR (1B) is a chest X-ray interpretation system based on the DeepSeek Janus-Pro model. The system outperforms state-of-the-art X-ray report generation models in automated report generation.
arXiv Detail & Related papers (2025-12-23T13:26:13Z)
- Data reuse enables cost-efficient randomized trials of medical AI models [38.36499561588967]
We propose BRIDGE, a data-reuse RCT design for AI-based risk models. BRIDGE trials recycle participant-level data from completed trials of AI models when legacy and updated models make concordant predictions. In simulated breast cancer screening studies, the design reduced required enrollment by 46.6% (saving over US$2.8 million) while maintaining 80% power.
arXiv Detail & Related papers (2025-11-12T05:09:00Z)
- Validating Vision Transformers for Otoscopy: Performance and Data-Leakage Effects [42.465094107111646]
This study evaluates the efficacy of vision transformer models, specifically Swin transformers, in enhancing the diagnostic accuracy of ear diseases. The research utilised a real-world dataset from the Department of Otolaryngology at the Clinical Hospital of the Universidad de Chile.
arXiv Detail & Related papers (2025-11-06T23:20:37Z)
- Curriculum Learning with Synthetic Data for Enhanced Pulmonary Nodule Detection in Chest Radiographs [0.0]
This study evaluates whether integrating curriculum learning with synthetic augmentation can enhance the detection of difficult pulmonary nodules. A Faster R-CNN with a Feature Pyramid Network (FPN) backbone was trained on a hybrid dataset.
arXiv Detail & Related papers (2025-10-09T02:06:13Z)
- Development and validation of an AI foundation model for endoscopic diagnosis of esophagogastric junction adenocarcinoma: a cohort and deep learning study [33.84976409983329]
The early detection of esophagogastric junction adenocarcinoma (EGJA) is crucial for improving patient prognosis, yet its current diagnosis is highly operator-dependent. This paper presents the first artificial intelligence foundation model-based method for both screening and staging diagnosis of EGJA using endoscopic images.
arXiv Detail & Related papers (2025-09-22T12:03:40Z)
- Design and Validation of a Responsible Artificial Intelligence-based System for the Referral of Diabetic Retinopathy Patients [65.57160385098935]
Early detection of Diabetic Retinopathy can reduce the risk of vision loss by up to 95%. We developed RAIS-DR, a Responsible AI System for DR screening that incorporates ethical principles across the AI lifecycle. We evaluated RAIS-DR against the FDA-approved EyeArt system on a local dataset of 1,046 patients, unseen by both systems.
arXiv Detail & Related papers (2025-08-17T21:54:11Z)
- Handcrafted vs. Deep Radiomics vs. Fusion vs. Deep Learning: A Comprehensive Review of Machine Learning-Based Cancer Outcome Prediction in PET and SPECT Imaging [2.3507313809321233]
This systematic review analyzed 226 studies published from 2020 to 2025 that applied machine learning to PET or SPECT imaging for outcome prediction. PET-based studies generally outperformed those using SPECT, likely due to higher spatial resolution and sensitivity. Common limitations included inadequate handling of class imbalance, missing data, and low population diversity.
arXiv Detail & Related papers (2025-07-21T21:03:12Z)
- Artificial Intelligence-Based Triaging of Cutaneous Melanocytic Lesions [0.8864540224289991]
Pathologists are facing an increasing workload due to a growing volume of cases and the need for more comprehensive diagnoses.
We developed an artificial intelligence (AI) model for triaging cutaneous melanocytic lesions based on whole slide images.
arXiv Detail & Related papers (2024-10-14T13:49:04Z)
- Incorporating Anatomical Awareness for Enhanced Generalizability and Progression Prediction in Deep Learning-Based Radiographic Sacroiliitis Detection [0.8248058061511542]
The aim of this study was to examine whether incorporating anatomical awareness into a deep learning model can improve generalizability and enable prediction of disease progression.
The performance of the models was compared using the area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, and specificity.
arXiv Detail & Related papers (2024-05-12T20:02:25Z)
- Deep-Learning Tool for Early Identifying Non-Traumatic Intracranial Hemorrhage Etiology based on CT Scan [40.51754649947294]
The deep learning model was developed with 1868 eligible NCCT scans with non-traumatic ICH collected between January 2011 and April 2018.
The model's diagnostic performance was compared with clinicians' performance. With the proposed system's augmentation, the clinicians achieved significant improvements in the sensitivity, specificity, and accuracy of diagnoses of certain hemorrhage etiologies.
arXiv Detail & Related papers (2023-02-02T08:45:17Z)
- CIRCA: comprehensible online system in support of chest X-rays-based COVID-19 diagnosis [37.41181188499616]
Deep learning techniques can help in the faster detection of COVID-19 cases and monitoring of disease progression.
Five different datasets were used to construct a representative dataset of 23 799 CXRs for model training.
A U-Net-based model was developed to identify a clinically relevant region of the CXR.
arXiv Detail & Related papers (2022-10-11T13:30:34Z)
- Efficient and Visualizable Convolutional Neural Networks for COVID-19 Classification Using Chest CT [0.0]
COVID-19 has infected over 65 million people worldwide as of December 4, 2020.
Deep learning has emerged as a promising diagnosis technique.
In this paper, we evaluate and compare 40 different convolutional neural network architectures for COVID-19 diagnosis.
arXiv Detail & Related papers (2020-12-22T07:09:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.