Related papers: Beyond the First Read: AI-Assisted Perceptual Error Detection in Chest Radiography Accounting for Interobserver Variability

Beyond the First Read: AI-Assisted Perceptual Error Detection in Chest Radiography Accounting for Interobserver Variability

URL: http://arxiv.org/abs/2506.13049v1
Date: Mon, 16 Jun 2025 02:36:38 GMT
Title: Beyond the First Read: AI-Assisted Perceptual Error Detection in Chest Radiography Accounting for Interobserver Variability
Authors: Adhrith Vutukuri, Akash Awasthi, David Yang, Carol C. Wu, Hien Van Nguyen,
Abstract summary: We introduce RADAR, a post-interpretation companion system.<n>RADAR ingests radiologist annotations and CXR images, then performs regional-level analysis.<n>RADAR achieved a recall of 0.78, precision of 0.44, and an F1 score of 0.56 in detecting missed abnormalities.
Score: 3.32947201358052
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Chest radiography is widely used in diagnostic imaging. However, perceptual errors -- especially overlooked but visible abnormalities -- remain common and clinically significant. Current workflows and AI systems provide limited support for detecting such errors after interpretation and often lack meaningful human--AI collaboration. We introduce RADAR (Radiologist--AI Diagnostic Assistance and Review), a post-interpretation companion system. RADAR ingests finalized radiologist annotations and CXR images, then performs regional-level analysis to detect and refer potentially missed abnormal regions. The system supports a "second-look" workflow and offers suggested regions of interest (ROIs) rather than fixed labels to accommodate inter-observer variation. We evaluated RADAR on a simulated perceptual-error dataset derived from de-identified CXR cases, using F1 score and Intersection over Union (IoU) as primary metrics. RADAR achieved a recall of 0.78, precision of 0.44, and an F1 score of 0.56 in detecting missed abnormalities in the simulated perceptual-error dataset. Although precision is moderate, this reduces over-reliance on AI by encouraging radiologist oversight in human--AI collaboration. The median IoU was 0.78, with more than 90% of referrals exceeding 0.5 IoU, indicating accurate regional localization. RADAR effectively complements radiologist judgment, providing valuable post-read support for perceptual-error detection in CXR interpretation. Its flexible ROI suggestions and non-intrusive integration position it as a promising tool in real-world radiology workflows. To facilitate reproducibility and further evaluation, we release a fully open-source web implementation alongside a simulated error dataset. All code, data, demonstration videos, and the application are publicly available at https://github.com/avutukuri01/RADAR.

Related papers

XAI-Driven Diagnosis of Generalization Failure in State-Space Cerebrovascular Segmentation Models: A Case Study on Domain Shift Between RSNA and TopCoW Datasets [0.5735035463793009]
We present a rigorous, two-phase approach to diagnose the generalization failure of state-of-the-art State-Space Models (SSMs)<n>We quantified the model's focus by measuring the overlap between its attention maps and the Ground Truth segmentations.<n>Our analysis proves the model failed to generalize because its attention mechanism abandoned true anatomical features in the Target domain.
arXiv Detail & Related papers (2025-12-16T00:34:32Z)
ActiTect: A Generalizable Machine Learning Pipeline for REM Sleep Behavior Disorder Screening through Standardized Actigraphy [1.2106870940376342]
Isolated rapid eye movement sleep behavior disorder (iRBD) is a major prodromal marker of $$-synucleinopathies.<n> wrist-worn actimeters hold significant potential for detecting RBD in large-scale screening efforts.<n>This study presents ActiTect, a fully automated, open-source machine learning tool to identify RBD from actigraphy recordings.
arXiv Detail & Related papers (2025-11-07T13:18:20Z)
U2AD: Uncertainty-based Unsupervised Anomaly Detection Framework for Detecting T2 Hyperintensity in MRI Spinal Cord [7.811634659561162]
T2 hyperintensities in spinal cord MR images are crucial biomarkers for conditions such as degenerative cervical myelopathy.<n>Deep learning methods have shown promise in lesion detection, but most supervised approaches are heavily dependent on large, annotated datasets.<n>We propose an Uncertainty-based Unsupervised Anomaly Detection framework, termed U2AD, to address these limitations.
arXiv Detail & Related papers (2025-03-17T17:33:32Z)
DDxT: Deep Generative Transformer Models for Differential Diagnosis [51.25660111437394]
We show that a generative approach trained with simpler supervised and self-supervised learning signals can achieve superior results on the current benchmark. The proposed Transformer-based generative network, named DDxT, autoregressively produces a set of possible pathologies, i.e., DDx, and predicts the actual pathology using a neural network.
arXiv Detail & Related papers (2023-12-02T22:57:25Z)
StenUNet: Automatic Stenosis Detection from X-ray Coronary Angiography [5.430434855741553]
The severity of coronary artery disease (CAD) is quantified by the location, degree of narrowing (stenosis) and number of arteries involved. The MICCAI grand challenge: Automatic Region-based Coronary Artery Disease diagnostics using the X-ray angiography imagEs (ARCADE) curated a dataset with stenosis annotations. We propose the architecture and algorithm StenUNet to accurately detect stenosis from X-ray Coronary Angiography.
arXiv Detail & Related papers (2023-10-23T14:04:18Z)
I-AI: A Controllable & Interpretable AI System for Decoding Radiologists' Intense Focus for Accurate CXR Diagnoses [9.260958560874812]
Interpretable Artificial Intelligence (I-AI) is a novel and unified controllable interpretable pipeline. Our I-AI addresses three key questions: where a radiologist looks, how long they focus on specific areas, and what findings they diagnose.
arXiv Detail & Related papers (2023-09-24T04:48:44Z)
Cross-Modal Causal Intervention for Medical Report Generation [107.76649943399168]
Radiology Report Generation (RRG) is essential for computer-aided diagnosis and medication guidance.<n> generating accurate lesion descriptions remains challenging due to spurious correlations from visual-linguistic biases.<n>We propose a two-stage framework named CrossModal Causal Representation Learning (CMCRL)<n> Experiments on IU-Xray and MIMIC-CXR show that our CMCRL pipeline significantly outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-03-16T07:23:55Z)
Learning to diagnose cirrhosis from radiological and histological labels with joint self and weakly-supervised pretraining strategies [62.840338941861134]
We propose to leverage transfer learning from large datasets annotated by radiologists, to predict the histological score available on a small annex dataset. We compare different pretraining methods, namely weakly-supervised and self-supervised ones, to improve the prediction of the cirrhosis. This method outperforms the baseline classification of the METAVIR score, reaching an AUC of 0.84 and a balanced accuracy of 0.75.
arXiv Detail & Related papers (2023-02-16T17:06:23Z)
Are we certain it's anomalous? [57.729669157989235]
Anomaly detection in time series is a complex task since anomalies are rare due to highly non-linear temporal correlations. Here we propose the novel use of Hyperbolic uncertainty for Anomaly Detection (HypAD) HypAD learns self-supervisedly to reconstruct the input signal.
arXiv Detail & Related papers (2022-11-16T21:31:39Z)
Radiomics-Guided Global-Local Transformer for Weakly Supervised Pathology Localization in Chest X-Rays [65.88435151891369]
Radiomics-Guided Transformer (RGT) fuses textitglobal image information with textitlocal knowledge-guided radiomics information. RGT consists of an image Transformer branch, a radiomics Transformer branch, and fusion layers that aggregate image and radiomic information.
arXiv Detail & Related papers (2022-07-10T06:32:56Z)
StRegA: Unsupervised Anomaly Detection in Brain MRIs using a Compact Context-encoding Variational Autoencoder [48.2010192865749]
Unsupervised anomaly detection (UAD) can learn a data distribution from an unlabelled dataset of healthy subjects and then be applied to detect out of distribution samples. This research proposes a compact version of the "context-encoding" VAE (ceVAE) model, combined with pre and post-processing steps, creating a UAD pipeline (StRegA) The proposed pipeline achieved a Dice score of 0.642$pm$0.101 while detecting tumours in T2w images of the BraTS dataset and 0.859$pm$0.112 while detecting artificially induced anomalies.
arXiv Detail & Related papers (2022-01-31T14:27:35Z)
Generative Residual Attention Network for Disease Detection [51.60842580044539]
We present a novel approach for disease generation in X-rays using a conditional generative adversarial learning. We generate a corresponding radiology image in a target domain while preserving the identity of the patient. We then use the generated X-ray image in the target domain to augment our training to improve the detection performance.
arXiv Detail & Related papers (2021-10-25T14:15:57Z)
A clinical validation of VinDr-CXR, an AI system for detecting abnormal chest radiographs [0.0]
We demonstrate a mechanism for validating an AI-based system for detecting abnormalities on X-ray scans. The system achieves an F1 score - the harmonic average of the recall and the precision - of 0.653 CI 0.635, 0.671) for detecting any abnormalities on chest X-rays.
arXiv Detail & Related papers (2021-04-06T02:53:35Z)

This list is automatically generated from the titles and abstracts of the papers in this site.