Toward Automated Regulatory Decision-Making: Trustworthy Medical Device Risk Classification with Multimodal Transformers and Self-Training
- URL: http://arxiv.org/abs/2505.00422v1
- Date: Thu, 01 May 2025 09:41:41 GMT
- Title: Toward Automated Regulatory Decision-Making: Trustworthy Medical Device Risk Classification with Multimodal Transformers and Self-Training
- Authors: Yu Han, Aaron Ceross, Jeroen H. M. Bergmann
- Abstract summary: A Transformer-based framework integrates textual descriptions and visual information to predict device regulatory classification. The approach achieves up to 90.4% accuracy and 97.9% AUROC, significantly outperforming text-only (77.2%) and image-only (54.8%) baselines.
- Score: 3.439579933384111
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurate classification of medical device risk levels is essential for regulatory oversight and clinical safety. We present a Transformer-based multimodal framework that integrates textual descriptions and visual information to predict device regulatory classification. The model incorporates a cross-attention mechanism to capture intermodal dependencies and employs a self-training strategy for improved generalization under limited supervision. Experiments on a real-world regulatory dataset demonstrate that our approach achieves up to 90.4% accuracy and 97.9% AUROC, significantly outperforming text-only (77.2%) and image-only (54.8%) baselines. Compared to standard multimodal fusion, the self-training mechanism improved SVM performance by 3.3 percentage points in accuracy (from 87.1% to 90.4%) and 1.4 points in macro-F1, suggesting that pseudo-labeling can effectively enhance generalization under limited supervision. Ablation studies further confirm the complementary benefits of both cross-modal attention and self-training.
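The self-training strategy described in the abstract can be sketched as a standard pseudo-labeling loop: train on labeled data, pseudo-label confident unlabeled examples, and retrain. This is a minimal illustrative sketch only; the toy nearest-centroid classifier and the confidence threshold stand in for the paper's multimodal Transformer, whose details are not given here.

```python
# Hedged sketch of self-training (pseudo-labeling). The nearest-centroid
# "model" is an illustrative assumption, not the paper's architecture.

def centroid_fit(X, y):
    """Fit one centroid per class (toy stand-in for the real model)."""
    groups = {}
    for xi, yi in zip(X, y):
        groups.setdefault(yi, []).append(xi)
    return {c: [sum(col) / len(pts) for col in zip(*pts)]
            for c, pts in groups.items()}

def centroid_predict(centroids, x):
    """Return (label, confidence); confidence is inverse-distance weight."""
    dists = {c: sum((a - b) ** 2 for a, b in zip(ctr, x)) ** 0.5
             for c, ctr in centroids.items()}
    label = min(dists, key=dists.get)
    total = sum(1.0 / (d + 1e-9) for d in dists.values())
    return label, (1.0 / (dists[label] + 1e-9)) / total

def self_train(X_lab, y_lab, X_unlab, threshold=0.9, rounds=3):
    """Iteratively absorb confident pseudo-labels, then refit."""
    X, y, pool = list(X_lab), list(y_lab), list(X_unlab)
    for _ in range(rounds):
        model = centroid_fit(X, y)
        remaining = []
        for x in pool:
            label, conf = centroid_predict(model, x)
            if conf >= threshold:      # accept only confident pseudo-labels
                X.append(x); y.append(label)
            else:
                remaining.append(x)    # keep uncertain points in the pool
        pool = remaining
    return centroid_fit(X, y)
```

The threshold is the key design choice: too low and noisy pseudo-labels pollute training; too high and the unlabeled pool is never used.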
Related papers
- Large Language Model's Multi-Capability Alignment in Biomedical Domain [3.1427813443719868]
BalancedBio is a framework for parameter-efficient biomedical reasoning. It addresses multi-capability integration in domain-specific AI alignment and achieves state-of-the-art results in its parameter class. Real-world deployment yields 78% cost reduction, 23% improved diagnostic accuracy, and 89% clinician acceptance.
arXiv Detail & Related papers (2025-08-06T10:06:11Z) - Beyond Benchmarks: Dynamic, Automatic And Systematic Red-Teaming Agents For Trustworthy Medical Language Models [87.66870367661342]
Large language models (LLMs) are used in AI applications in healthcare. A red-teaming framework that continuously stress-tests LLMs can reveal significant weaknesses in four safety-critical domains. A suite of adversarial agents is applied to autonomously mutate test cases, identify and evolve unsafe-triggering strategies, and evaluate responses. The framework delivers an evolvable, scalable, and reliable safeguard for the next generation of medical AI.
arXiv Detail & Related papers (2025-07-30T08:44:22Z) - ASDA: Audio Spectrogram Differential Attention Mechanism for Self-Supervised Representation Learning [57.67273340380651]
Experimental results demonstrate that the ASDA model achieves state-of-the-art (SOTA) performance across multiple benchmarks. These results highlight ASDA's effectiveness in audio tasks, paving the way for broader applications.
arXiv Detail & Related papers (2025-07-03T14:29:43Z) - Tiered Agentic Oversight: A Hierarchical Multi-Agent System for AI Safety in Healthcare [43.75158832964138]
Tiered Agentic Oversight (TAO) is a hierarchical multi-agent framework that enhances AI safety through layered, automated supervision. Inspired by clinical hierarchies (e.g., nurse, physician, specialist), TAO routes agents based on task complexity and agent roles.
arXiv Detail & Related papers (2025-06-14T12:46:10Z) - EXGnet: a single-lead explainable-AI guided multiresolution network with train-only quantitative features for trustworthy ECG arrhythmia classification [1.5162243843944596]
We propose EXGnet, a novel ECG arrhythmia classification network tailored for single-lead signals. XAI supervision during training directs the model's attention to clinically relevant ECG regions, and an innovative multiresolution block efficiently captures both short- and long-term signal features.
arXiv Detail & Related papers (2025-06-14T08:48:44Z) - Multi-Modal Explainable Medical AI Assistant for Trustworthy Human-AI Collaboration [17.11245701879749]
Generalist Medical AI (GMAI) systems have demonstrated expert-level performance in biomedical perception tasks. Here, we present XMedGPT, a clinician-centric, multi-modal AI assistant that integrates textual and visual interpretability. We validate XMedGPT across four pillars: multi-modal interpretability, uncertainty quantification, prognostic modeling, and rigorous benchmarking.
arXiv Detail & Related papers (2025-05-11T08:32:01Z) - Lie Detector: Unified Backdoor Detection via Cross-Examination Framework [68.45399098884364]
We propose a unified backdoor detection framework in the semi-honest setting.
Our method achieves superior detection performance, improving accuracy by 5.4%, 1.6%, and 11.9% over SoTA baselines.
Notably, it is the first to effectively detect backdoors in multimodal large language models.
arXiv Detail & Related papers (2025-03-21T06:12:06Z) - Beyond Confidence: Adaptive Abstention in Dual-Threshold Conformal Prediction for Autonomous System Perception [0.4124847249415279]
Safety-critical perception systems require reliable uncertainty quantification and principled abstention mechanisms to maintain safety. We present a novel dual-threshold conformalization framework that provides statistically guaranteed uncertainty estimates while enabling selective prediction in high-risk scenarios.
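The dual-threshold idea in the summary above can be sketched in the spirit of split conformal prediction: calibrate one threshold on held-out nonconformity scores, then abstain whenever the conformal prediction set is ambiguous or the top-class confidence falls below a second floor. The threshold names and the exact abstention rule here are illustrative assumptions, not the paper's algorithm.

```python
# Hedged sketch of dual-threshold selective prediction via split
# conformal calibration. All parameter names are illustrative.
import math

def conformal_quantile(cal_scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(cal_scores)
    k = min(n - 1, math.ceil((n + 1) * (1 - alpha)) - 1)
    return sorted(cal_scores)[k]

def predict_or_abstain(probs, q_hat, tau):
    """Return the predicted label, or None to abstain.

    probs: class-probability dict for one input.
    q_hat: conformal threshold on nonconformity (1 - probability).
    tau:   secondary confidence floor for the top class.
    """
    # Conformal prediction set: classes whose nonconformity is within q_hat.
    pred_set = [c for c, p in probs.items() if 1.0 - p <= q_hat]
    top = max(probs, key=probs.get)
    if len(pred_set) == 1 and probs[top] >= tau:
        return pred_set[0]
    return None  # ambiguous set or low confidence: abstain
```

Abstention converts statistical uncertainty into an explicit "defer to a human" signal, which is the behavior safety-critical perception needs.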
arXiv Detail & Related papers (2025-02-11T04:45:31Z) - Towards Robust Unsupervised Attention Prediction in Autonomous Driving [40.84001015982244]
We propose a robust unsupervised attention prediction method for self-driving systems. An Uncertainty Mining Branch refines predictions by analyzing commonalities and differences across multiple models pre-trained on natural scenes. A Knowledge Embedding Block bridges the domain gap by incorporating driving knowledge to adaptively enhance pseudo-labels. A novel data augmentation method improves robustness against corruption through soft attention and dynamic augmentation.
arXiv Detail & Related papers (2025-01-25T03:01:26Z) - Enhancing Precision of Automated Teller Machines Network Quality Assessment: Machine Learning and Multi Classifier Fusion Approaches [2.2670946312994]
This study introduces a data fusion approach that utilizes multi-classifier fusion techniques to enhance ATM reliability. The proposed framework integrates diverse classification models within a stacking ensemble, achieving a dramatic reduction in false alarms from 3.56 percent to just 0.71 percent. This multi-classifier fusion method synthesizes the strengths of individual models, leading to significant cost savings and improved operational decision-making.
arXiv Detail & Related papers (2025-01-02T05:33:01Z) - Re-evaluating Group Robustness via Adaptive Class-Specific Scaling [47.41034887474166]
Group distributionally robust optimization is a prominent algorithm used to mitigate spurious correlations and address dataset bias. Existing approaches have reported improvements in robust accuracy, but at the cost of average accuracy due to inherent trade-offs. We propose a class-specific scaling strategy, directly applicable to existing debiasing algorithms with no additional training, and develop an instance-wise adaptive scaling technique that alleviates this trade-off, even improving both robust and average accuracies.
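The class-specific scaling summarized above is an inference-time reweighting: per-class scale factors adjust scores before the argmax, with no retraining. This minimal sketch uses made-up scale values and a dict-of-scores format purely for illustration.

```python
# Illustrative sketch of inference-time class-specific scaling.
# Scale factors (e.g., boosting a minority class) are assumptions.

def scaled_prediction(scores, class_scales):
    """Apply per-class scaling to raw scores and return the argmax label."""
    adjusted = {c: s * class_scales.get(c, 1.0) for c, s in scores.items()}
    return max(adjusted, key=adjusted.get)
```

Choosing scales that favor under-performing groups trades a little average accuracy for worst-group (robust) accuracy; the adaptive variant in the paper selects the scaling per instance.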
arXiv Detail & Related papers (2024-12-19T16:01:51Z) - iFuzzyTL: Interpretable Fuzzy Transfer Learning for SSVEP BCI System [24.898026682692688]
This study explores advanced classification techniques leveraging interpretable fuzzy transfer learning (iFuzzyTL).
iFuzzyTL refines input signal processing and classification in a human-interpretable format by integrating fuzzy inference systems and attention mechanisms.
The model's efficacy is demonstrated across three datasets.
arXiv Detail & Related papers (2024-10-16T06:07:23Z) - MLAE: Masked LoRA Experts for Visual Parameter-Efficient Fine-Tuning [45.93128932828256]
Masked LoRA Experts (MLAE) is an innovative approach that applies the concept of masking to visual PEFT.
Our method incorporates a cellular decomposition strategy that transforms a low-rank matrix into independent rank-1 submatrices.
We show that MLAE achieves new state-of-the-art (SOTA) performance with an average accuracy score of 78.8% on the VTAB-1k benchmark and 90.9% on the FGVC benchmark.
arXiv Detail & Related papers (2024-05-29T08:57:23Z) - Confidence-aware multi-modality learning for eye disease screening [58.861421804458395]
We propose a novel multi-modality evidential fusion pipeline for eye disease screening.
It provides a measure of confidence for each modality and elegantly integrates the multi-modality information.
Experimental results on both public and internal datasets demonstrate that our model excels in robustness.
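A simple way to picture confidence-aware fusion as summarized above: weight each modality's class probabilities by that modality's confidence before combining. This weighting rule is a loose illustrative assumption, not the paper's exact evidential (Dempster-Shafer style) combination.

```python
# Hedged sketch of confidence-weighted multi-modality fusion.
# The confidence values and class names are illustrative assumptions.

def fuse_modalities(modality_probs, modality_conf):
    """Fuse per-modality class-probability dicts, weighted by confidence."""
    classes = modality_probs[0].keys()
    total_conf = sum(modality_conf)
    return {c: sum(w * p[c] for p, w in zip(modality_probs, modality_conf))
               / total_conf
            for c in classes}
```

A low-confidence modality (e.g., a blurry image) then contributes little to the fused decision, which is what makes such pipelines robust.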
arXiv Detail & Related papers (2024-05-28T13:27:30Z) - Learning to diagnose cirrhosis from radiological and histological labels
with joint self and weakly-supervised pretraining strategies [62.840338941861134]
We propose to leverage transfer learning from large datasets annotated by radiologists, to predict the histological score available on a small annex dataset.
We compare different pretraining methods, namely weakly-supervised and self-supervised ones, to improve the prediction of cirrhosis.
This method outperforms the baseline classification of the METAVIR score, reaching an AUC of 0.84 and a balanced accuracy of 0.75.
arXiv Detail & Related papers (2023-02-16T17:06:23Z) - ERNIE-SPARSE: Learning Hierarchical Efficient Transformer Through
Regularized Self-Attention [48.697458429460184]
Two factors, information bottleneck sensitivity and inconsistency between different attention topologies, could affect the performance of the Sparse Transformer.
This paper proposes a well-designed model named ERNIE-Sparse.
It consists of two distinctive parts: (i) Hierarchical Sparse Transformer (HST) to sequentially unify local and global information, and (ii) Self-Attention Regularization (SAR) to minimize the distance for transformers with different attention topologies.
arXiv Detail & Related papers (2022-03-23T08:47:01Z) - Pruning Redundant Mappings in Transformer Models via Spectral-Normalized
Identity Prior [54.629850694790036]
Spectral-normalized identity prior (SNIP) is a structured pruning approach that penalizes an entire residual module in a Transformer model toward an identity mapping.
We conduct experiments with BERT on 5 GLUE benchmark tasks to demonstrate that SNIP achieves effective pruning results while maintaining comparable performance.
arXiv Detail & Related papers (2020-10-05T05:40:56Z) - Incorporating Effective Global Information via Adaptive Gate Attention
for Text Classification [13.45504908358177]
We show that simple statistical information can enhance classification performance both efficiently and significantly compared with several baseline models.
We propose a classifier with gate mechanism named Adaptive Gate Attention model with Global Information (AGA+GI) in which the adaptive gate mechanism incorporates global statistical features into latent semantic features.
Our experiments show that the proposed method can achieve better accuracy than CNN-based and RNN-based approaches without global information on several benchmarks.
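The adaptive gate in the AGA+GI summary can be pictured as a learned sigmoid gate mixing global statistical features into local semantic features. The elementwise sigmoid gate and the feature vectors below are illustrative assumptions, not the paper's exact architecture.

```python
# Toy sketch of a gate that mixes global statistical features into
# semantic features: out = g * global + (1 - g) * semantic.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_mix(semantic, global_stats, gate_logits):
    """Elementwise gated mixture of two feature vectors."""
    return [sigmoid(g) * gs + (1.0 - sigmoid(g)) * se
            for se, gs, g in zip(semantic, global_stats, gate_logits)]
```

In a trained model the gate logits would themselves be produced from the input, letting each position decide how much global statistics should override local semantics.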
arXiv Detail & Related papers (2020-02-22T10:06:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.