Automated Machine Learning in Radiomics: A Comparative Evaluation of Performance, Efficiency and Accessibility
- URL: http://arxiv.org/abs/2601.08334v2
- Date: Tue, 20 Jan 2026 08:31:33 GMT
- Title: Automated Machine Learning in Radiomics: A Comparative Evaluation of Performance, Efficiency and Accessibility
- Authors: Jose Lozano-Montoya, Emilio Soria-Olivas, Almudena Fuster-Matanzo, Angel Alberich-Bayarri, Ana Jimenez-Pastor
- Abstract summary: This study evaluates the performance, efficiency, and accessibility of general-purpose and radiomics-specific AutoML frameworks. Ten public/private radiomics datasets with varied imaging modalities (CT/MRI), sizes, anatomies and endpoints were used. Simplatab provides an effective balance of performance, efficiency, and accessibility for radiomics classification problems.
- Score: 1.0631763329108546
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automated machine learning (AutoML) frameworks can lower technical barriers for predictive and prognostic model development in radiomics by enabling researchers without programming expertise to build models. However, their effectiveness in addressing radiomics-specific challenges remains unclear. This study evaluates the performance, efficiency, and accessibility of general-purpose and radiomics-specific AutoML frameworks on diverse radiomics classification tasks, thereby highlighting development needs for radiomics. Ten public/private radiomics datasets with varied imaging modalities (CT/MRI), sizes, anatomies and endpoints were used. Six general-purpose and five radiomics-specific frameworks were tested with predefined parameters using standardized cross-validation. Evaluation metrics included AUC and runtime, together with qualitative aspects related to software status, accessibility, and interpretability. Simplatab, a radiomics-specific tool with a no-code interface, achieved the highest average test AUC (81.81%) with a moderate runtime (~1 hour). LightAutoML, a general-purpose framework, showed the fastest execution with competitive performance (78.74% mean AUC in six minutes). Most radiomics-specific frameworks were excluded from the performance analysis due to obsolescence, extensive programming requirements, or computational inefficiency. Conversely, general-purpose frameworks demonstrated higher accessibility and ease of implementation. Simplatab provides an effective balance of performance, efficiency, and accessibility for radiomics classification problems. However, significant gaps remain, including the lack of accessible survival analysis support and the limited integration of feature reproducibility and harmonization within current AutoML frameworks. Future research should focus on adapting AutoML solutions to better address these radiomics-specific challenges.
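The abstract reports a mean test AUC across cross-validation folds together with a runtime. As a rough illustration of how those two numbers could be computed, here is a minimal sketch in plain Python. It is not the authors' pipeline: the function names `auc_score` and `evaluate_folds`, the toy folds, and the identity `predict` function are all hypothetical, and a real study would plug in a trained model and proper stratified splits.

```python
import time
from statistics import mean


def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs where the
    positive sample gets the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def evaluate_folds(folds, predict):
    """Score every held-out fold and time the whole run.

    folds   -- list of (features, labels) test splits (hypothetical layout)
    predict -- hypothetical function mapping one feature value to a score
    Returns (mean AUC in percent, elapsed seconds).
    """
    start = time.perf_counter()
    aucs = [auc_score(y, [predict(x) for x in X]) for X, y in folds]
    runtime = time.perf_counter() - start
    return 100.0 * mean(aucs), runtime


# Toy data only: two folds with one feature per sample, scored by identity.
toy_folds = [
    ([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]),
    ([0.2, 0.9, 0.6, 0.3], [0, 1, 1, 0]),
]
mean_auc_pct, runtime_s = evaluate_folds(toy_folds, lambda x: x)
print(f"mean test AUC: {mean_auc_pct:.2f}%")
```

The pairwise definition is quadratic in the number of samples, which is acceptable for a sketch; library implementations typically use a rank-based computation instead.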
Related papers
- An AI Framework for Microanastomosis Motion Assessment [3.9524886416531753]
We propose a novel AI framework for the automated assessment of microanastomosis instrument handling skills. The system integrates four core components: (1) an instrument detection module based on the You Only Look Once (YOLO) architecture; (2) an instrument tracking module developed from Deep Simple Online and Realtime Tracking (DeepSORT); and (3) an instrument tip localization module employing shape descriptors. Experimental results demonstrate the effectiveness of the framework, achieving an instrument detection precision of 97%, with a mean Average Precision (mAP) of 96%, measured by Intersection over Union (IoU) thresholds ranging from 50% to 95% (m
arXiv Detail & Related papers (2026-01-28T23:23:37Z)
- A DeepSeek-Powered AI System for Automated Chest Radiograph Interpretation in Clinical Practice [83.11942224668127]
Janus-Pro-CXR (1B) is a chest X-ray interpretation system based on DeepSeek Janus-Pro model. Our system outperforms state-of-the-art X-ray report generation models in automated report generation.
arXiv Detail & Related papers (2025-12-23T13:26:13Z)
- Diagnostic Performance of Universal-Learning Ultrasound AI Across Multiple Organs and Tasks: the UUSIC25 Challenge [34.86849736082012]
Current ultrasound AI remains fragmented into single-task tools. General-purpose AI models achieve high accuracy and efficiency across multiple tasks using a single architecture.
arXiv Detail & Related papers (2025-12-19T06:54:30Z)
- FAIM: Frequency-Aware Interactive Mamba for Time Series Classification [87.84511960413715]
Time series classification (TSC) is crucial in numerous real-world applications, such as environmental monitoring, medical diagnosis, and posture recognition. We propose FAIM, a lightweight Frequency-Aware Interactive Mamba model. We show that FAIM consistently outperforms existing state-of-the-art (SOTA) methods, achieving a superior trade-off between accuracy and efficiency.
arXiv Detail & Related papers (2025-11-26T08:36:33Z)
- A Neural Network Approach to Multi-radionuclide TDCR Beta Spectroscopy [12.470638217209851]
Liquid scintillation triple-to-doubly coincident ratio (TDCR) spectroscopy is widely adopted as a standard method for radionuclide quantification. Here, we present an Artificial Intelligence framework that combines numerical spectral simulation and deep learning for standard-free automated analysis.
arXiv Detail & Related papers (2025-09-03T08:40:02Z)
- An Interpretable Transformer-Based Foundation Model for Cross-Procedural Skill Assessment Using Raw fNIRS Signals [0.0]
We introduce an interpretable transformer-based foundation model trained on minimally processed fNIRS signals for cross-procedural skill assessment. The model achieves greater than 88% classification accuracy on all tasks, with Matthews Correlation Coefficient exceeding 0.91 on ETI. It generalizes to a novel emergency airway procedure--cricothyrotomy--using fewer than 30 labeled samples and a lightweight (less than 2k parameter) adapter module.
arXiv Detail & Related papers (2025-06-21T18:30:58Z)
- End-to-End Deep Learning for Real-Time Neuroimaging-Based Assessment of Bimanual Motor Skills [1.710146779965826]
This study presents a novel end-to-end deep learning framework that processes raw fNIRS signals directly. It achieved a mean classification accuracy of 93.9% (SD 4.4) and a generalization accuracy of 92.6% (SD 1.9) on unseen skill retention datasets.
arXiv Detail & Related papers (2025-03-21T22:56:54Z)
- How Well Can Modern LLMs Act as Agent Cores in Radiology Environments? [54.36730060680139]
RadA-BenchPlat is an evaluation platform that benchmarks the performance of large language models (LLMs) in radiology environments. The platform also defines ten categories of tools for agent-driven task solving and evaluates seven leading LLMs.
arXiv Detail & Related papers (2024-12-12T18:20:16Z)
- A Hybrid Framework for Statistical Feature Selection and Image-Based Noise-Defect Detection [55.2480439325792]
This paper presents a hybrid framework that integrates both statistical feature selection and classification techniques to improve defect detection accuracy. We present around 55 distinguished features that are extracted from industrial images, which are then analyzed using statistical methods. By integrating these methods with flexible machine learning applications, the proposed framework improves detection accuracy and reduces false positives and misclassifications.
arXiv Detail & Related papers (2024-12-11T22:12:21Z)
- Reproducible Machine Learning-based Voice Pathology Detection: Introducing the Pitch Difference Feature [1.7779568951268254]
We introduce a novel methodology for voice pathology detection using the publicly available Saarbrücken Voice Database. We evaluate six machine learning (ML) algorithms -- support vector machine, k-nearest neighbors, naive Bayes, decision tree, random forest, and AdaBoost. Our approach achieves 85.61%, 84.69% and 85.22% unweighted average recall (UAR) for females, males and combined results respectively.
arXiv Detail & Related papers (2024-10-14T14:17:52Z)
- Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation [113.5002649181103]
We train open-source small multimodal models (SMMs) to bridge competency gaps for unmet clinical needs in radiology.
For training, we assemble a large dataset of over 697 thousand radiology image-text pairs.
For evaluation, we propose CheXprompt, a GPT-4-based metric for factuality evaluation, and demonstrate its parity with expert evaluation.
The inference of LLaVA-Rad is fast and can be performed on a single V100 GPU in private settings, offering a promising state-of-the-art tool for real-world clinical applications.
arXiv Detail & Related papers (2024-03-12T18:12:02Z)
- Position: AI Evaluation Should Learn from How We Test Humans [65.36614996495983]
We argue that psychometrics, a theory originating in the 20th century for human assessment, could be a powerful solution to the challenges in today's AI evaluations.
arXiv Detail & Related papers (2023-06-18T09:54:33Z)
- Causal Feature Selection for Algorithmic Fairness [61.767399505764736]
We consider fairness in the integration component of data management.
We propose an approach to identify a sub-collection of features that ensure the fairness of the dataset.
arXiv Detail & Related papers (2020-06-10T20:20:10Z)