Measuring and Predicting Where and When Pathologists Focus their Visual Attention while Grading Whole Slide Images of Cancer
- URL: http://arxiv.org/abs/2508.01668v1
- Date: Sun, 03 Aug 2025 08:53:45 GMT
- Title: Measuring and Predicting Where and When Pathologists Focus their Visual Attention while Grading Whole Slide Images of Cancer
- Authors: Souradeep Chakraborty, Ruoyu Xue, Rajarsi Gupta, Oksana Yaskiv, Constantin Friedman, Natallia Sheuka, Dana Perez, Paul Friedman, Won-Tak Choi, Waqas Mahmud, Beatrice Knudsen, Gregory Zelinsky, Joel Saltz, Dimitris Samaras
- Abstract summary: We develop methods to predict movements of pathologists' attention as they grade slide images of prostate cancer. Tools developed from this model could assist pathology trainees in learning to allocate their attention during reading like an expert.
- Score: 15.910358358714165
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The ability to predict the attention of expert pathologists could lead to decision support systems for better pathology training. We developed methods to predict the spatio-temporal (where and when) movements of pathologists' attention as they grade whole slide images (WSIs) of prostate cancer. We characterize a pathologist's attention trajectory by their x, y, and m (magnification) movements of a viewport as they navigate WSIs using a digital microscope. This information was obtained from 43 pathologists across 123 WSIs, and we consider the task of predicting the pathologist attention scanpaths constructed from the viewport centers. We introduce a fixation extraction algorithm that simplifies an attention trajectory by extracting fixations in the pathologist's viewing while preserving semantic information, and we use these pre-processed data to train and test a two-stage model to predict the dynamic (scanpath) allocation of attention during WSI reading via intermediate attention heatmap prediction. In the first stage, a transformer-based sub-network predicts the attention heatmaps (static attention) across different magnifications. In the second stage, we predict the attention scanpath by sequentially modeling the next fixation points in an autoregressive manner using a transformer-based approach, starting at the WSI center and leveraging multi-magnification feature representations from the first stage. Experimental results show that our scanpath prediction model outperforms chance and baseline models. Tools developed from this model could assist pathology trainees in learning to allocate their attention during WSI reading like an expert.
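To make the fixation-extraction idea concrete, here is a minimal Python sketch of one plausible reduction of a raw viewport trajectory (timestamped x, y, magnification samples) to fixations. The dispersion/duration rule, the thresholds, and the function signature are illustrative assumptions, not the paper's actual algorithm.

```python
import numpy as np

def extract_fixations(viewport_log, dist_thresh=200.0, min_duration=0.5):
    """Reduce a raw viewport trajectory to fixations.

    viewport_log : list of (t, x, y, m) tuples -- timestamp in seconds,
        viewport center in base-resolution pixels, and magnification.
    Returns a list of (x, y, m, duration) fixations.

    The thresholds and the dispersion/duration rule are illustrative,
    not the paper's actual algorithm.
    """
    if not viewport_log:
        return []

    def flush(cluster, out):
        duration = cluster[-1][0] - cluster[0][0]
        if duration >= min_duration:
            xs = [s[1] for s in cluster]
            ys = [s[2] for s in cluster]
            out.append((float(np.mean(xs)), float(np.mean(ys)),
                        cluster[0][3], duration))

    fixations, cluster = [], [viewport_log[0]]
    for sample in viewport_log[1:]:
        _, x, y, m = sample
        _, x0, y0, m0 = cluster[0]
        # A large jump or a magnification change ends the current fixation:
        # zooming usually signals a shift in what is being inspected.
        if np.hypot(x - x0, y - y0) > dist_thresh or m != m0:
            flush(cluster, fixations)
            cluster = [sample]
        else:
            cluster.append(sample)
    flush(cluster, fixations)
    return fixations
```

Runs of samples that stay within the distance threshold at a fixed magnification collapse to a single fixation, preserving where and how closely the pathologist looked while discarding transit samples.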
Related papers
- Human Scanpath Prediction in Target-Present Visual Search with Semantic-Foveal Bayesian Attention [49.99728312519117]
SemBA-FAST is a top-down framework designed for predicting human visual attention in target-present visual search. We evaluate SemBA-FAST on the COCO-Search18 benchmark dataset, comparing its performance against other scanpath prediction models. These findings provide valuable insights into the capabilities of semantic-foveal probabilistic frameworks for human-like attention modelling.
arXiv Detail & Related papers (2025-07-24T15:19:23Z)
- Semantic Segmentation for Preoperative Planning in Transcatheter Aortic Valve Replacement [61.573750959726475]
We consider medical guidelines for preoperative planning of the transcatheter aortic valve replacement (TAVR) and identify tasks that may be supported via semantic segmentation models. We first derive fine-grained TAVR-relevant pseudo-labels from coarse-grained anatomical information, in order to train segmentation models and quantify how well they are able to find these structures in the scans.
arXiv Detail & Related papers (2025-07-22T13:24:45Z)
- EchoWorld: Learning Motion-Aware World Models for Echocardiography Probe Guidance [79.66329903007869]
We present EchoWorld, a motion-aware world modeling framework for probe guidance. It encodes anatomical knowledge and motion-induced visual dynamics. It is trained on more than one million ultrasound images from over 200 routine scans.
arXiv Detail & Related papers (2025-04-17T16:19:05Z)
- PathSegDiff: Pathology Segmentation using Diffusion model representations [63.20694440934692]
We propose PathSegDiff, a novel approach for histopathology image segmentation that leverages Latent Diffusion Models (LDMs) as pre-trained feature extractors. Our method utilizes a pathology-specific LDM, guided by a self-supervised encoder, to extract rich semantic information from H&E-stained histopathology images. Our experiments demonstrate significant improvements over traditional methods on the BCSS and GlaS datasets.
arXiv Detail & Related papers (2025-04-09T14:58:21Z)
- Multimodal Learning and Cognitive Processes in Radiology: MedGaze for Chest X-ray Scanpath Prediction [10.388541520456714]
Our proposed system aims to predict eye gaze sequences from radiology reports and CXR images.
Our model predicts fixation coordinates and durations critical for medical scanpath prediction, outperforming existing models in the computer vision community.
Based on the radiologist's evaluation, MedGaze can generate human-like gaze sequences with a high focus on relevant regions.
arXiv Detail & Related papers (2024-06-28T06:38:58Z)
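MedGaze and the prostate-grading model above both generate scanpaths autoregressively, predicting the next fixation conditioned on image features and the fixations emitted so far. The sketch below shows the general shape of such a decoder; the module names, sizes, and greedy rollout are assumptions for illustration, not the architecture of either paper.

```python
import torch
import torch.nn as nn

class NextFixationDecoder(nn.Module):
    """Hypothetical autoregressive fixation decoder (illustrative only).

    Given the fixations emitted so far and a set of image patch features,
    predict the next (x, y, duration) triple. Names and sizes are
    assumptions, not the architecture of MedGaze or the paper above.
    """
    def __init__(self, d_model=256, n_heads=8, n_layers=4):
        super().__init__()
        self.embed = nn.Linear(3, d_model)           # (x, y, duration) -> token
        layer = nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, n_layers)
        self.head = nn.Linear(d_model, 3)            # next (x, y, duration)

    def forward(self, fixations, image_feats):
        # fixations: (B, T, 3); image_feats: (B, N, d_model) patch features.
        tokens = self.embed(fixations)
        causal = nn.Transformer.generate_square_subsequent_mask(tokens.size(1))
        out = self.decoder(tokens, image_feats, tgt_mask=causal)
        return self.head(out[:, -1])                 # prediction for step T+1

def rollout(model, image_feats, seed, n_steps=10):
    """Greedy decoding from a seed fixation, e.g. the image center."""
    seq = seed.unsqueeze(1)                          # (B, 1, 3)
    for _ in range(n_steps):
        nxt = model(seq, image_feats)                # (B, 3)
        seq = torch.cat([seq, nxt.unsqueeze(1)], dim=1)
    return seq                                       # (B, n_steps + 1, 3)
```

Seeding the rollout at the image center mirrors the initialization described in the main abstract above.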
- Self-Contrastive Weakly Supervised Learning Framework for Prognostic Prediction Using Whole Slide Images [3.6330373579181927]
Prognostic prediction poses a unique challenge as the ground truth labels are inherently weak.
We propose a novel three-part framework comprising a convolutional network-based tissue segmentation algorithm for region-of-interest delineation.
Our best models yield AUCs of 0.721 and 0.678 for recurrence and treatment-outcome prediction, respectively.
arXiv Detail & Related papers (2024-05-24T06:45:36Z)
- Decoding the visual attention of pathologists to reveal their level of expertise [20.552161727506235]
We present a method for classifying the expertise of a pathologist based on how they allocated their attention during a cancer reading.
Based solely on a pathologist's attention during a reading, our model was able to predict their level of expertise with accuracies of 75.3%, 56.1%, and 77.2%.
arXiv Detail & Related papers (2024-03-25T23:03:51Z)
- An Inter-observer consistent deep adversarial training for visual scanpath prediction [66.46953851227454]
We propose an inter-observer consistent adversarial training approach for scanpath prediction through a lightweight deep neural network.
We show the competitiveness of our approach with respect to state-of-the-art methods.
arXiv Detail & Related papers (2022-11-14T13:22:29Z)
- End-to-end Learning for Image-based Detection of Molecular Alterations in Digital Pathology [1.916179040410189]
Current approaches for classification of whole slide images (WSI) in digital pathology predominantly utilize a two-stage learning pipeline.
A major drawback of such approaches is the requirement for task-specific auxiliary labels which are not acquired in clinical routine.
We propose a novel learning pipeline for WSI classification that is trainable end-to-end and does not require any auxiliary annotations.
arXiv Detail & Related papers (2022-06-30T20:30:33Z)
- Scanpath Prediction on Information Visualisations [19.591855190022667]
We propose a model that learns to predict visual saliency and scanpaths on information visualisations.
We present in-depth analyses of gaze behaviour for different information visualisation elements on the popular MASSVIS dataset.
arXiv Detail & Related papers (2021-12-04T13:59:52Z)
- BiteNet: Bidirectional Temporal Encoder Network to Predict Medical Outcomes [53.163089893876645]
We propose a novel self-attention mechanism that captures the contextual dependency and temporal relationships within a patient's healthcare journey.
An end-to-end bidirectional temporal encoder network (BiteNet) then learns representations of the patient's journeys.
We have evaluated the effectiveness of our methods on two supervised prediction and two unsupervised clustering tasks with a real-world EHR dataset.
arXiv Detail & Related papers (2020-09-24T00:42:36Z)
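As a rough sketch of the bidirectional self-attention idea BiteNet builds on: encode a patient's sequence of visit embeddings with a non-causal (hence bidirectional) transformer encoder and pool it into a single journey representation. Everything here (names, sizes, mean-pooling) is an illustrative assumption, not BiteNet's published design.

```python
import torch
import torch.nn as nn

class JourneyEncoder(nn.Module):
    """Illustrative bidirectional self-attention over visit embeddings.

    A generic sketch of the idea, not BiteNet's actual architecture;
    names, sizes, and the mean-pooling readout are assumptions.
    """
    def __init__(self, d_model=128, n_heads=4, n_layers=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)

    def forward(self, visits, pad_mask):
        # visits: (B, T, d_model) visit embeddings; pad_mask: (B, T), True = pad.
        h = self.encoder(visits, src_key_padding_mask=pad_mask)
        keep = (~pad_mask).unsqueeze(-1).float()
        # Mean-pool non-padded positions into one journey representation.
        return (h * keep).sum(dim=1) / keep.sum(dim=1).clamp(min=1.0)
```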