Tracking and Mapping in Medical Computer Vision: A Review
- URL: http://arxiv.org/abs/2310.11475v2
- Date: Fri, 1 Mar 2024 00:11:26 GMT
- Title: Tracking and Mapping in Medical Computer Vision: A Review
- Authors: Adam Schmidt, Omid Mohareri, Simon DiMaio, Michael C. Yip, Septimiu E. Salcudean
- Abstract summary: As computer vision algorithms increase in capability, their applications in clinical systems will become more pervasive.
These applications include: diagnostics, such as colonoscopy and bronchoscopy; guiding biopsies, minimally invasive interventions, and surgery.
Many of these applications depend on the specific visual nature of medical scenes and require designing algorithms to perform in this environment.
- Score: 23.28261994515735
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: As computer vision algorithms increase in capability, their applications in
clinical systems will become more pervasive. These applications include:
diagnostics, such as colonoscopy and bronchoscopy; guiding biopsies, minimally
invasive interventions, and surgery; automating instrument motion; and
providing image guidance using pre-operative scans. Many of these applications
depend on the specific visual nature of medical scenes and require designing
algorithms to perform in this environment.
In this review, we provide an update to the field of camera-based tracking
and scene mapping in surgery and diagnostics in medical computer vision. We
begin by describing our review process, which results in a final list of 515
papers that we cover. We then give a high-level summary of the state of the art
and provide relevant background for those who need tracking and mapping for
their clinical applications. We then review datasets provided in the
field and the clinical needs that motivate their design. Next, we delve into
the algorithmic side and summarize recent developments. This summary should be
especially useful for algorithm designers and for those looking to understand
the capability of off-the-shelf methods. We maintain focus on algorithms for
deformable environments while also reviewing the essential building blocks in
rigid tracking and mapping since there is a large amount of crossover in
methods. With the field summarized, we discuss the current state of the
tracking and mapping methods along with needs for future algorithms, needs for
quantification, and the viability of clinical applications. We then provide
some research directions and questions. We conclude that new methods need to be
designed or combined to support clinical applications in deformable
environments, and more focus needs to be put into collecting datasets for
training and evaluation.
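As background for the "essential building blocks in rigid tracking and mapping" mentioned in the abstract, the sketch below shows one such block: sparse feature matching followed by two-view relative pose estimation. It is a rough illustration only, not a method taken from the review or from any paper listed below; the frame file names and the camera intrinsics matrix K are hypothetical placeholders that would come from a real endoscopic sequence and calibration.

```python
# Illustrative sketch of a rigid tracking building block:
# sparse feature matching + two-view relative pose with OpenCV.
# Frame paths and the intrinsics K below are hypothetical placeholders.
import cv2
import numpy as np

# Hypothetical pinhole intrinsics; real values come from endoscope calibration.
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])

img1 = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)

# Detect and describe sparse ORB keypoints in both frames.
orb = cv2.ORB_create(nfeatures=2000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Brute-force Hamming matching with cross-check; keep the best matches.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)[:500]
pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

# RANSAC essential-matrix estimation, then recover the relative rotation R
# and the (scale-free) translation direction t between the two views.
E, inliers = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC,
                                  prob=0.999, threshold=1.0)
_, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=inliers)
print("Relative rotation:\n", R)
print("Translation direction (up to scale):\n", t.ravel())
```

In deformable surgical scenes this rigid two-view model breaks down, which is why the review places its main focus on methods designed for deformable environments.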
Related papers
- Dr-LLaVA: Visual Instruction Tuning with Symbolic Clinical Grounding [53.629132242389716]
Vision-Language Models (VLM) can support clinicians by analyzing medical images and engaging in natural language interactions.
VLMs often exhibit "hallucinogenic" behavior, generating textual outputs not grounded in contextual multimodal information.
We propose a new alignment algorithm that uses symbolic representations of clinical reasoning to ground VLMs in medical knowledge.
arXiv Detail & Related papers (2024-05-29T23:19:28Z)
- VISION: Toward a Standardized Process for Radiology Image Management at the National Level [3.793492459789475]
We describe our experiences in establishing a trusted collection of radiology images linked to the United States Department of Veterans Affairs (VA) electronic health record database.
Key insights include uncovering the specific procedures required for transferring images from a clinical to a research-ready environment.
arXiv Detail & Related papers (2024-04-29T16:30:24Z)
- Medical Vision-Language Pre-Training for Brain Abnormalities [96.1408455065347]
We show how to automatically collect medical image-text aligned data for pretraining from public resources such as PubMed.
In particular, we present a pipeline that streamlines the pre-training process by initially collecting a large brain image-text dataset.
We also investigate the unique challenge of mapping subfigures to subcaptions in the medical domain.
arXiv Detail & Related papers (2024-04-27T05:03:42Z)
- Methods and datasets for segmentation of minimally invasive surgical instruments in endoscopic images and videos: A review of the state of the art [0.0]
We identify and characterize datasets used for method development and evaluation.
The paper focuses on methods that work purely visually, without markers of any kind attached to the instruments.
A discussion of the reviewed literature is provided, highlighting existing shortcomings and emphasizing the available potential for future developments.
arXiv Detail & Related papers (2023-04-25T17:38:41Z)
- Computer Vision on X-ray Data in Industrial Production and Security Applications: A survey [89.45221564651145]
This survey reviews the recent research on using computer vision and machine learning for X-ray analysis in industrial production and security applications.
It covers the applications, techniques, evaluation metrics, datasets, and performance comparison of those techniques on publicly available datasets.
arXiv Detail & Related papers (2022-11-10T13:37:36Z)
- Morphology-Aware Interactive Keypoint Estimation [32.52024944963992]
Diagnosis based on medical images often involves manual annotation of anatomical keypoints.
We propose a novel deep neural network that automatically detects and refines the anatomical keypoints through a user-interactive system.
arXiv Detail & Related papers (2022-09-15T09:27:14Z)
- A survey on attention mechanisms for medical applications: are we moving towards better algorithms? [2.8101673772585736]
This paper extensively reviews the use of attention mechanisms in machine learning for several medical applications.
It proposes a critical analysis of the claims and potentialities of attention mechanisms presented in the literature.
It also proposes future research lines in medical applications that may benefit from these frameworks.
arXiv Detail & Related papers (2022-04-26T16:04:19Z)
- PosePipe: Open-Source Human Pose Estimation Pipeline for Clinical Research [0.0]
We develop a human pose estimation pipeline that facilitates running state-of-the-art algorithms on data acquired in clinical context.
Our goal in this work is not to train new algorithms, but to advance the use of cutting-edge human pose estimation algorithms for clinical and translational research.
arXiv Detail & Related papers (2022-03-16T17:54:37Z)
- Leveraging Human Selective Attention for Medical Image Analysis with Limited Training Data [72.1187887376849]
The selective attention mechanism helps the cognition system focus on task-relevant visual clues by ignoring the presence of distractors.
We propose a framework to leverage gaze for medical image analysis tasks with small training data.
Our method is demonstrated to achieve superior performance on both 3D tumor segmentation and 2D chest X-ray classification tasks.
arXiv Detail & Related papers (2021-12-02T07:55:25Z)
- Domain Shift in Computer Vision models for MRI data analysis: An Overview [64.69150970967524]
Machine learning and computer vision methods are showing good performance in medical imagery analysis.
Yet only a few applications are now in clinical use.
Poor transferability of the models to data from different sources or acquisition domains is one of the reasons for that.
arXiv Detail & Related papers (2020-10-14T16:34:21Z)
- Robust Medical Instrument Segmentation Challenge 2019 [56.148440125599905]
Intraoperative tracking of laparoscopic instruments is often a prerequisite for computer and robotic-assisted interventions.
Our challenge was based on a surgical data set comprising 10,040 annotated images acquired from a total of 30 surgical procedures.
The results confirm the initial hypothesis, namely that algorithm performance degrades with an increasing domain gap.
arXiv Detail & Related papers (2020-03-23T14:35:08Z)