Related papers: Mitigating Biases in Surgical Operating Rooms with Geometry

URL: http://arxiv.org/abs/2508.08028v2
Date: Wed, 27 Aug 2025 07:42:18 GMT
Title: Mitigating Biases in Surgical Operating Rooms with Geometry
Authors: Tony Danjun Wang, Tobias Czempiel, Nassir Navab, Lennart Bastian
Abstract summary: Deep neural networks are prone to learning spurious correlations, exploiting dataset-specific artifacts for prediction. In surgical operating rooms (OR), these manifest through the standardization of smocks and gowns that obscure robust identifying landmarks. We address this problem by encoding personnel as 3D point cloud sequences, disentangling identity-relevant shape and motion patterns from appearance-based confounders.
Score: 40.5145973787288
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Deep neural networks are prone to learning spurious correlations, exploiting dataset-specific artifacts rather than meaningful features for prediction. In surgical operating rooms (OR), these manifest through the standardization of smocks and gowns that obscure robust identifying landmarks, introducing model bias for tasks related to modeling OR personnel. Through gradient-based saliency analysis on two public OR datasets, we reveal that CNN models succumb to such shortcuts, fixating on incidental visual cues such as footwear beneath surgical gowns, distinctive eyewear, or other role-specific identifiers. Avoiding such biases is essential for the next generation of intelligent assistance systems in the OR, which should accurately recognize personalized workflow traits, such as surgical skill level or coordination with other staff members. We address this problem by encoding personnel as 3D point cloud sequences, disentangling identity-relevant shape and motion patterns from appearance-based confounders. Our experiments demonstrate that while RGB and geometric methods achieve comparable performance on datasets with apparent simulation artifacts, RGB models suffer a 12% accuracy drop in realistic clinical settings with decreased visual diversity due to standardizations. This performance gap confirms that geometric representations capture more meaningful biometric features, providing an avenue to developing robust methods of modeling humans in the OR.

Related papers

TRELLIS-Enhanced Surface Features for Comprehensive Intracranial Aneurysm Analysis [2.624902795082451]
Intracranial aneurysms pose a significant clinical risk yet are difficult to detect, delineate and model due to limited annotated 3D data. We propose a cross-domain feature-transfer approach that leverages the latent geometric embeddings learned by TRELLIS, a generative model trained on large-scale non-medical 3D datasets.
arXiv Detail & Related papers (2025-09-03T07:51:17Z)
Privacy-Preserving Operating Room Workflow Analysis using Digital Twins [38.744671293771695]
We propose a two-stage pipeline for privacy-preserving operating room (OR) video analysis and event detection. First, we leverage vision foundation models for depth estimation and semantic segmentation to generate Digital Twins of the OR from conventional RGB videos. Second, we employ the SafeOR model, a fused two-stream approach that processes segmentation masks and depth maps for OR event detection.
arXiv Detail & Related papers (2025-04-17T00:46:06Z)
Handling Geometric Domain Shifts in Semantic Segmentation of Surgical RGB and Hyperspectral Images [67.66644395272075]
We present the first analysis of state-of-the-art semantic segmentation models when faced with geometric out-of-distribution data. We propose an augmentation technique called "Organ Transplantation" to enhance generalizability. Our augmentation technique improves SOA model performance by up to 67% for RGB data and 90% for HSI data, achieving performance at the level of in-distribution performance on real OOD test data.
arXiv Detail & Related papers (2024-08-27T19:13:15Z)
Uncertainty modeling for fine-tuned implicit functions [10.902709236602536]
Implicit functions have become pivotal in computer vision for reconstructing detailed object shapes from sparse views. We introduce Dropsembles, a novel method for uncertainty estimation in tuned implicit functions. Our results show that Dropsembles achieve the accuracy and calibration levels of deep ensembles but with significantly less computational cost.
arXiv Detail & Related papers (2024-06-17T20:46:18Z)
Learning Discriminative Representations for Skeleton Based Action Recognition [49.45405879193866]
We propose an auxiliary feature refinement head (FR Head) to obtain discriminative representations of skeletons. Our proposed models obtain results competitive with state-of-the-art methods and can help to discriminate ambiguous samples.
arXiv Detail & Related papers (2023-03-07T08:37:48Z)
Improving Deep Facial Phenotyping for Ultra-rare Disorder Verification Using Model Ensembles [52.77024349608834]
We analyze the influence of replacing a DCNN with a state-of-the-art face recognition approach, iResNet with ArcFace. Our proposed ensemble model achieves state-of-the-art performance on both seen and unseen disorders.
arXiv Detail & Related papers (2022-11-12T23:28:54Z)
Mixed Effects Neural ODE: A Variational Approximation for Analyzing the Dynamics of Panel Data [50.23363975709122]
We propose a probabilistic model called ME-NODE to incorporate (fixed + random) mixed effects for analyzing panel data. We show that our model can be derived using smooth approximations of SDEs provided by the Wong-Zakai theorem. We then derive Evidence Based Lower Bounds for ME-NODE, and develop (efficient) training algorithms.
arXiv Detail & Related papers (2022-02-18T22:41:51Z)
Facial Anatomical Landmark Detection using Regularized Transfer Learning with Application to Fetal Alcohol Syndrome Recognition [24.27777060287004]
Fetal alcohol syndrome (FAS) caused by prenatal alcohol exposure can result in a series of cranio-facial anomalies. Anatomical landmark detection is important to detect the presence of FAS associated facial anomalies. Current deep learning-based heatmap regression methods designed for facial landmark detection in natural images assume availability of large datasets. We develop a new regularized transfer learning approach that exploits the knowledge of a network learned on large facial recognition datasets.
arXiv Detail & Related papers (2021-09-12T11:05:06Z)
TSGCNet: Discriminative Geometric Feature Learning with Two-Stream GraphConvolutional Network for 3D Dental Model Segmentation [141.2690520327948]
We propose a two-stream graph convolutional network (TSGCNet) to learn multi-view information from different geometric attributes. We evaluate our proposed TSGCNet on a real-patient dataset of dental models acquired by 3D intraoral scanners.
arXiv Detail & Related papers (2020-12-26T08:02:56Z)
Classifying Eye-Tracking Data Using Saliency Maps [8.524684315458245]
This paper proposes a visual saliency based novel feature extraction method for automatic and quantitative classification of eye-tracking data. Comparing the saliency amplitudes, similarity and dissimilarity of saliency maps with the corresponding eye fixations maps gives an extra dimension of information which is effectively utilized to generate discriminative features to classify the eye-tracking data.
arXiv Detail & Related papers (2020-10-24T15:18:07Z)
Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients. We introduce meta-learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks. Our new model is built upon a carefully designed meta-learner, called Prototypical Network, which is a simple yet effective meta-learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
"Name that manufacturer". Relating image acquisition bias with task complexity when training deep learning models: experiments on head CT [0.0]
We analyze how the distribution of scanner manufacturers in a dataset can contribute to the overall bias of deep learning models. We demonstrate that CNNs can learn to distinguish the imaging scanner manufacturer and that this bias can substantially impact model performance.
arXiv Detail & Related papers (2020-08-19T16:05:58Z)

This list is automatically generated from the titles and abstracts of the papers on this site.