MURPHY: Relations Matter in Surgical Workflow Analysis
- URL: http://arxiv.org/abs/2212.12719v1
- Date: Sat, 24 Dec 2022 12:09:38 GMT
- Title: MURPHY: Relations Matter in Surgical Workflow Analysis
- Authors: Shang Zhao, Yanzhe Liu, Qiyuan Wang, Dai Sun, Rong Liu, S. Kevin Zhou
- Abstract summary: This paper systematically investigates the importance of relational cues in surgery.
We contribute the RLLS12M dataset, a large-scale collection of robotic left lateral sectionectomy (RLLS) videos.
We propose a multi-relation purification hybrid network (MURPHY), which aptly incorporates novel relation modules to augment the feature representation.
- Score: 12.460554004034472
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Autonomous robotic surgery has advanced significantly based on analysis of
visual and temporal cues in surgical workflow, but relational cues from domain
knowledge remain under investigation. Complex relations in surgical annotations
can be divided into intra- and inter-relations, both valuable to autonomous
systems to comprehend surgical workflows. Intra- and inter-relations describe
the relevance of various categories within a particular annotation type and the
relevance of different annotation types, respectively. This paper aims to
systematically investigate the importance of relational cues in surgery. First,
we contribute the RLLS12M dataset, a large-scale collection of robotic left
lateral sectionectomy (RLLS), by curating 50 videos of 50 patients operated on by
5 surgeons and annotating a hierarchical workflow, which consists of 3 inter-
and 6 intra-relations, 6 steps, 15 tasks, and 38 activities represented as the
triplet of 11 instruments, 8 actions, and 16 objects, totaling 2,113,510 video
frames and 12,681,060 annotation entities. Correspondingly, we propose a
multi-relation purification hybrid network (MURPHY), which aptly incorporates
novel relation modules to augment the feature representation by purifying
relational features using the intra- and inter-relations embodied in
annotations. The intra-relation module leverages an R-GCN to implant visual
features in different graph relations, which are aggregated using a targeted
relation purification with affinity information measuring label consistency and
feature similarity. The inter-relation module is motivated by attention
mechanisms to regularize the influence of relational features based on the
hierarchy of annotation types from the domain knowledge. Extensive experimental
results on the curated RLLS dataset confirm the effectiveness of our approach,
demonstrating that relations matter in surgical workflow analysis.
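
To make the R-GCN idea mentioned in the abstract concrete, the following is a minimal sketch of one generic relational graph convolution step: each relation has its own weight matrix, messages are mean-aggregated per relation, and the results are summed with a self-connection. It is not the MURPHY intra-relation module; the relation purification and affinity terms described above are not reproduced. The function name `rgcn_layer`, its parameters, and the use of NumPy are assumptions made for illustration.

```python
import numpy as np

def rgcn_layer(h, adjs, w_rel, w_self):
    """One generic R-GCN propagation step (illustrative, not the paper's code).

    h      : (N, d_in) node features
    adjs   : (R, N, N) one adjacency matrix per relation;
             adjs[r, i, j] != 0 means node j sends a message to node i
    w_rel  : (R, d_in, d_out) one weight matrix per relation
    w_self : (d_in, d_out) weight matrix for the self-connection
    """
    h = np.asarray(h, dtype=float)
    out = h @ w_self  # self-connection term
    for a, w in zip(np.asarray(adjs, dtype=float), w_rel):
        deg = a.sum(axis=1, keepdims=True)  # neighbors per node under this relation
        # Row-normalize so each node averages its neighbors; isolated nodes get zeros.
        norm = np.divide(a, deg, out=np.zeros_like(a), where=deg > 0)
        out = out + norm @ (h @ w)  # relation-specific transform, then aggregate
    return np.maximum(out, 0.0)  # ReLU
```

Keeping a separate weight matrix per relation is what lets the layer treat different graph relations differently; a plain GCN would share one matrix across all edges.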
Related papers
- Surgical Triplet Recognition via Diffusion Model [59.50938852117371]
Surgical triplet recognition is an essential building block to enable next-generation context-aware operating rooms.
We propose Difft, a new generative framework for surgical triplet recognition employing the diffusion model.
Experiments on the CholecT45 and CholecT50 datasets show the superiority of the proposed method in achieving a new state-of-the-art performance for surgical triplet recognition.
arXiv Detail & Related papers (2024-06-19T04:43:41Z)
- Instrument-tissue Interaction Detection Framework for Surgical Video Understanding [31.822025965225016]
We present an Instrument-Tissue Interaction Detection Network (ITIDNet) to detect the quintuple for surgical video understanding.
Specifically, we propose a Snippet Consecutive Feature (SCF) Layer to enhance features by modeling relationships of proposals in the current frame using global context information in the video snippet.
To reason relationships between instruments and tissues, a Temporal Graph (TG) Layer is proposed with intra-frame connections to exploit relationships between instruments and tissues in the same frame and inter-frame connections to model the temporal information for the same instance.
arXiv Detail & Related papers (2024-03-30T11:21:11Z)
- Document-Level Relation Extraction with Relation Correlation Enhancement [10.684005956288347]
Document-level relation extraction (DocRE) is a task that focuses on identifying relations between entities within a document.
Existing DocRE models often overlook the correlation between relations and lack a quantitative analysis of relation correlations.
We propose a relation graph method, which aims to explicitly exploit the interdependency among relations.
arXiv Detail & Related papers (2023-10-06T10:59:00Z)
- Learning Complete Topology-Aware Correlations Between Relations for Inductive Link Prediction [121.65152276851619]
We show that semantic correlations between relations are inherently edge-level and entity-independent.
We propose a novel subgraph-based method, namely TACO, to model Topology-Aware COrrelations between relations.
To further exploit the potential of RCN, we propose Complete Common Neighbor induced subgraph.
arXiv Detail & Related papers (2023-09-20T08:11:58Z)
- Dynamic Interactive Relation Capturing via Scene Graph Learning for Robotic Surgical Report Generation [14.711668177329244]
For robot-assisted surgery, an accurate surgical report reflects clinical operations during surgery and helps document entry tasks, post-operative analysis and follow-up treatment.
It is a challenging task due to many complex and diverse interactions between instruments and tissues in the surgical scene.
This paper presents a neural network to boost surgical report generation by explicitly exploring the interactive relation between tissues and surgical instruments.
arXiv Detail & Related papers (2023-06-05T07:34:41Z)
- Nested Named Entity Recognition from Medical Texts: An Adaptive Shared Network Architecture with Attentive CRF [53.55504611255664]
We propose a novel method, referred to as ASAC, to solve the dilemma caused by the nested phenomenon.
The proposed method contains two key modules: the adaptive shared (AS) part and the attentive conditional random field (ACRF) module.
Our model could learn better entity representations by capturing the implicit distinctions and relationships between different categories of entities.
arXiv Detail & Related papers (2022-11-09T09:23:56Z)
- Unified Embeddings of Structural and Functional Connectome via a Function-Constrained Structural Graph Variational Auto-Encoder [2.8719792727222364]
We propose a function-constrained structural graph variational autoencoder capable of incorporating information from both functional and structural connectomes in an unsupervised fashion.
This leads to a joint low-dimensional embedding that establishes a unified spatial coordinate system for comparing across different subjects.
arXiv Detail & Related papers (2022-07-05T21:39:13Z)
- MedFACT: Modeling Medical Feature Correlations in Patient Health Representation Learning via Feature Clustering [20.68759679109556]
In this paper, we propose a general patient health representation learning framework MedFACT.
We estimate correlations via measuring similarity between temporal patterns of medical features with kernel methods, and cluster features with strong correlations into groups.
We employ graph convolutional networks to conduct group-wise feature interactions for better representation learning.
arXiv Detail & Related papers (2022-04-21T10:27:24Z)
- CholecTriplet2021: A benchmark challenge for surgical action triplet recognition [66.51610049869393]
This paper presents CholecTriplet 2021: an endoscopic vision challenge organized at MICCAI 2021 for the recognition of surgical action triplets in laparoscopic videos.
We present the challenge setup and assessment of the state-of-the-art deep learning methods proposed by the participants during the challenge.
A total of 4 baseline methods and 19 new deep learning algorithms are presented to recognize surgical action triplets directly from surgical videos, achieving mean average precision (mAP) ranging from 4.2% to 38.1%.
arXiv Detail & Related papers (2022-04-10T18:51:55Z)
- Cross-Supervised Joint-Event-Extraction with Heterogeneous Information Networks [61.950353376870154]
Joint-event-extraction is a sequence-to-sequence labeling task with a tag set composed of tags of triggers and entities.
We propose a Cross-Supervised Mechanism (CSM) to alternately supervise the extraction of triggers or entities.
Our approach outperforms the state-of-the-art methods in both entity and trigger extraction.
arXiv Detail & Related papers (2020-10-13T11:51:17Z)
- Cascaded Human-Object Interaction Recognition [175.60439054047043]
We introduce a cascade architecture for multi-stage, coarse-to-fine HOI understanding.
At each stage, an instance localization network progressively refines HOI proposals and feeds them into an interaction recognition network.
With our carefully-designed human-centric relation features, these two modules work collaboratively towards effective interaction understanding.
arXiv Detail & Related papers (2020-03-09T17:05:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.