ColonMapper: topological mapping and localization for colonoscopy
- URL: http://arxiv.org/abs/2305.05546v3
- Date: Wed, 10 Jul 2024 13:00:34 GMT
- Title: ColonMapper: topological mapping and localization for colonoscopy
- Authors: Javier Morlana, Juan D. Tardós, J. M. M. Montiel
- Abstract summary: We propose a topological mapping and localization system able to operate on real human colonoscopies.
The map is a graph where each node codes a colon location by a set of real images, while edges represent traversability between nodes.
Experiments show that ColonMapper is able to autonomously build a map and localize against it in two important use cases.
- Score: 7.242530499990028
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a topological mapping and localization system able to operate on real human colonoscopies, despite significant shape and illumination changes. The map is a graph where each node codes a colon location by a set of real images, while edges represent traversability between nodes. For close-in-time images, where scene changes are minor, place recognition can be successfully managed with recent transformer-based local feature matching algorithms. However, under long-term changes, such as different colonoscopies of the same patient, feature-based matching fails. To address this, we train a deep global descriptor on real colonoscopies that achieves high recall despite significant changes in the scene. The addition of a Bayesian filter boosts the accuracy of long-term place recognition, enabling relocalization in a previously built map. Our experiments show that ColonMapper is able to autonomously build a map and localize against it in two important use cases: localization within the same colonoscopy or within different colonoscopies of the same patient. Code: https://github.com/jmorlana/ColonMapper.
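To make the abstract's ingredients concrete, the following is a minimal sketch of a topological map (nodes holding sets of image descriptors, edges marking traversability) combined with a generic discrete Bayes filter for localization. It is not the ColonMapper implementation: the class `TopoMap`, the functions `predict` and `update`, the `stay` and `sharpness` parameters, the node names, and the toy 3-dimensional descriptors are all assumptions made for illustration. In the paper the descriptors come from a learned deep network, and the exact form of the filter is not given in the abstract.

```python
import math


def cosine(a, b):
    """Cosine similarity between two descriptor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class TopoMap:
    """Hypothetical topological map: nodes store descriptors, edges store traversability."""

    def __init__(self):
        self.descriptors = {}  # node id -> list of global descriptors (one per image)
        self.edges = {}        # node id -> set of neighbouring node ids

    def add_node(self, node, descriptors):
        self.descriptors[node] = list(descriptors)
        self.edges.setdefault(node, set())

    def add_edge(self, a, b):
        # Traversability is treated as symmetric in this sketch.
        self.edges[a].add(b)
        self.edges[b].add(a)


def predict(topo, belief, stay=0.6):
    """Motion step: keep `stay` of the mass in place, spread the rest over neighbours."""
    new_belief = {n: 0.0 for n in belief}
    for node, p in belief.items():
        neighbours = topo.edges[node]
        if not neighbours:
            new_belief[node] += p
            continue
        new_belief[node] += stay * p
        share = (1.0 - stay) * p / len(neighbours)
        for m in neighbours:
            new_belief[m] += share
    return new_belief


def update(topo, belief, query, sharpness=10.0):
    """Measurement step: weight each node by its best descriptor similarity to the query."""
    posterior = {}
    for node, p in belief.items():
        best = max(cosine(query, d) for d in topo.descriptors[node])
        # Assumed likelihood model: exponential in the cosine similarity.
        posterior[node] = p * math.exp(sharpness * best)
    total = sum(posterior.values())
    return {n: v / total for n, v in posterior.items()}


# Toy map with three nodes in a chain; the descriptors are made up.
topo = TopoMap()
topo.add_node("n0", [[1.0, 0.0, 0.0]])
topo.add_node("n1", [[0.0, 1.0, 0.0]])
topo.add_node("n2", [[0.0, 0.0, 1.0]])
topo.add_edge("n0", "n1")
topo.add_edge("n1", "n2")

# Start from a uniform belief and filter two query descriptors.
belief = {n: 1.0 / 3.0 for n in topo.descriptors}
for query in ([0.1, 0.9, 0.1], [0.0, 0.8, 0.3]):
    belief = update(topo, predict(topo, belief), query)

best_node = max(belief, key=belief.get)
print(best_node, belief)
```

The point of the filter in this sketch is that a single ambiguous descriptor match cannot move the estimate to a node that is not reachable through the graph from where the belief currently sits, which is one plausible reading of why temporal filtering helps long-term place recognition.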
Related papers
- Frontiers in Intelligent Colonoscopy [96.57251132744446]
This study investigates the frontiers of intelligent colonoscopy techniques and their prospective implications for multimodal medical applications.
We assess the current data-centric and model-centric landscapes through four tasks for colonoscopic scene perception.
To embrace the coming multimodal era, we establish three foundational initiatives: a large-scale multimodal instruction tuning dataset ColonINST, a colonoscopy-designed multimodal language model ColonGPT, and a multimodal benchmark.
arXiv Detail & Related papers (2024-10-22T17:57:12Z)
- Topological SLAM in colonoscopies leveraging deep features and topological priors [6.234802839923542]
ColonSLAM is a system that combines classical multiple-map metric SLAM with deep features and topological priors to create topological maps of the whole colon.
We demonstrate our approach in the Endomapper dataset, showing its potential for producing maps of the whole colon in real human explorations.
arXiv Detail & Related papers (2024-09-25T10:56:08Z)
- Deep Homography Estimation for Visual Place Recognition [49.235432979736395]
We propose a transformer-based deep homography estimation (DHE) network.
It takes the dense feature map extracted by a backbone network as input and fits homography for fast and learnable geometric verification.
Experiments on benchmark datasets show that our method can outperform several state-of-the-art methods.
arXiv Detail & Related papers (2024-02-25T13:22:17Z)
- LoCUS: Learning Multiscale 3D-consistent Features from Posed Images [18.648772607057175]
We train a versatile neural representation without supervision.
We find that it is possible to balance retrieval and reusability by constructing a retrieval set carefully.
We show results creating sparse, multi-scale, semantic spatial maps.
arXiv Detail & Related papers (2023-10-02T11:11:23Z)
- GeoCLIP: Clip-Inspired Alignment between Locations and Images for Effective Worldwide Geo-localization [61.10806364001535]
Worldwide Geo-localization aims to pinpoint the precise location of images taken anywhere on Earth.
Existing approaches divide the globe into discrete geographic cells, transforming the problem into a classification task.
We propose GeoCLIP, a novel CLIP-inspired Image-to-GPS retrieval approach that enforces alignment between the image and its corresponding GPS locations.
arXiv Detail & Related papers (2023-09-27T20:54:56Z)
- Colonoscopy Coverage Revisited: Identifying Scanning Gaps in Real-Time [0.0]
Colonoscopy is the most widely used medical technique for preventing Colorectal Cancer, by detecting and removing polyps before they become malignant.
Recent studies show that around one quarter of the existing polyps are routinely missed.
While some of these do appear in the endoscopist's field of view, others are missed due to a partial coverage of the colon.
arXiv Detail & Related papers (2023-05-17T08:12:56Z)
- SoftEnNet: Symbiotic Monocular Depth Estimation and Lumen Segmentation for Colonoscopy Endorobots [2.9696400288366127]
Colorectal cancer is the third most common cause of cancer death worldwide.
A vision-based autonomous endorobot can improve colonoscopy procedures significantly.
arXiv Detail & Related papers (2023-01-19T16:22:17Z)
- Colonoscopy Landmark Detection using Vision Transformers [0.0]
We have collected a dataset of 120 videos and 2416 snapshots taken during the procedure.
We have developed a novel, vision-transformer based landmark detection algorithm.
We report an accuracy of 82% with the vision transformer backbone on a test dataset of snapshots.
arXiv Detail & Related papers (2022-09-22T20:39:07Z)
- Colonoscopy Polyp Detection: Domain Adaptation From Medical Report Images to Real-time Videos [76.37907640271806]
We propose an Image-video-joint polyp detection network (Ivy-Net) to address the domain gap between colonoscopy images from historical medical reports and real-time videos.
Experiments on the collected dataset demonstrate that our Ivy-Net achieves the state-of-the-art result on colonoscopy video.
arXiv Detail & Related papers (2020-12-31T10:33:09Z)
- Cross-Descriptor Visual Localization and Mapping [81.16435356103133]
Visual localization and mapping is the key technology underlying the majority of Mixed Reality and robotics systems.
We present three novel scenarios for localization and mapping which require the continuous update of feature representations.
Our data-driven approach is agnostic to the feature descriptor type, has low computational requirements, and scales linearly with the number of description algorithms.
arXiv Detail & Related papers (2020-12-02T18:19:51Z)
- PraNet: Parallel Reverse Attention Network for Polyp Segmentation [155.93344756264824]
We propose a parallel reverse attention network (PraNet) for accurate polyp segmentation in colonoscopy images.
We first aggregate the features in high-level layers using a parallel partial decoder (PPD).
In addition, we mine the boundary cues using a reverse attention (RA) module, which is able to establish the relationship between areas and boundary cues.
arXiv Detail & Related papers (2020-06-13T08:13:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.