Localizing Anatomical Landmarks in Ocular Images using Zoom-In Attentive
Networks
- URL: http://arxiv.org/abs/2210.02445v1
- Date: Sun, 25 Sep 2022 15:08:20 GMT
- Title: Localizing Anatomical Landmarks in Ocular Images using Zoom-In Attentive
Networks
- Authors: Xiaofeng Lei, Shaohua Li, Xinxing Xu, Huazhu Fu, Yong Liu, Yih-Chung
Tham, Yangqin Feng, Mingrui Tan, Yanyu Xu, Jocelyn Hui Lin Goh, Rick Siow
Mong Goh, Ching-Yu Cheng
- Abstract summary: We propose a zoom-in attentive network (ZIAN) for anatomical landmark localization in ocular images.
Experiments show that ZIAN achieves promising performance and outperforms state-of-the-art localization methods.
- Score: 34.05575237813503
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Localizing anatomical landmarks is an important task in medical image
analysis. However, the landmarks to be localized often lack prominent visual
features. Their locations are elusive and easily confused with the background,
so precise localization depends heavily on the context formed by their
surrounding areas. In addition, the required precision is usually higher than
in segmentation and object detection tasks. Localization therefore poses unique
challenges distinct from segmentation or detection. In this paper, we propose
a zoom-in attentive network (ZIAN) for anatomical landmark localization in
ocular images. First, a coarse-to-fine, or "zoom-in" strategy is utilized to
learn the contextualized features in different scales. Then, an attentive
fusion module is adopted to aggregate multi-scale features, which consists of
1) a co-attention network with a multiple regions-of-interest (ROIs) scheme
that learns complementary features from the multiple ROIs, 2) an
attention-based fusion module which integrates the multi-ROIs features and
non-ROI features. We evaluated ZIAN on two open challenge tasks, i.e., the
fovea localization in fundus images and scleral spur localization in AS-OCT
images. Experiments show that ZIAN achieves promising performances and
outperforms state-of-the-art localization methods. The source code and trained
models of ZIAN are available at
https://github.com/leixiaofeng-astar/OMIA9-ZIAN.
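The coarse-to-fine "zoom-in" strategy described in the abstract can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: `coarse_heatmap_fn`, `fine_heatmap_fn`, the ROI sizes, and the mean over ROI estimates are hypothetical stand-ins for ZIAN's CNN backbones and attentive fusion module.

```python
import numpy as np

def coarse_to_fine_localize(image, coarse_heatmap_fn, fine_heatmap_fn,
                            roi_sizes=(32, 64)):
    """Sketch of a zoom-in landmark localizer:
    1) predict a coarse heatmap over the full image,
    2) crop ROIs of several sizes around the coarse peak,
    3) refine the peak inside each ROI and fuse the estimates.
    Here fusion is a plain average; ZIAN instead uses co-attention
    over the ROIs plus an attention-based fusion module."""
    h, w = image.shape
    coarse = coarse_heatmap_fn(image)                 # (h, w) heatmap
    cy, cx = np.unravel_index(np.argmax(coarse), coarse.shape)

    refined = []
    for size in roi_sizes:                            # multiple ROI scales
        half = size // 2
        y0, x0 = max(cy - half, 0), max(cx - half, 0)
        y1, x1 = min(cy + half, h), min(cx + half, w)
        roi = image[y0:y1, x0:x1]                     # zoomed-in crop
        fine = fine_heatmap_fn(roi)                   # heatmap inside the ROI
        fy, fx = np.unravel_index(np.argmax(fine), fine.shape)
        refined.append((y0 + fy, x0 + fx))            # back to image coordinates

    return tuple(np.mean(refined, axis=0))            # fuse multi-ROI estimates
```

The point of the multiple ROI sizes is that a small crop gives high-resolution local evidence while a large crop keeps the surrounding context that, per the abstract, the landmark itself lacks.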
Related papers
- MSA$^2$Net: Multi-scale Adaptive Attention-guided Network for Medical Image Segmentation [8.404273502720136]
We introduce MSA$2$Net, a new deep segmentation framework featuring an expedient design of skip-connections.
We propose a Multi-Scale Adaptive Spatial Attention Gate (MASAG) to ensure that spatially relevant features are selectively highlighted.
Our MSA$2$Net outperforms state-of-the-art (SOTA) works or matches their performance.
arXiv Detail & Related papers (2024-07-31T14:41:10Z)
- SA2-Net: Scale-aware Attention Network for Microscopic Image Segmentation [36.286876343282565]
Microscopic image segmentation is a challenging task, wherein the objective is to assign semantic labels to each pixel in a given microscopic image.
We introduce SA2-Net, an attention-guided method that leverages multi-scale feature learning to handle diverse structures within microscopic images.
arXiv Detail & Related papers (2023-09-28T17:58:05Z)
- Multi-View Vertebra Localization and Identification from CT Images [57.56509107412658]
We propose a multi-view approach to vertebra localization and identification from CT images.
We convert the 3D problem into a 2D localization and identification task on different views.
Our method can learn the multi-view global information naturally.
arXiv Detail & Related papers (2023-07-24T14:43:07Z)
- Retinal Structure Detection in OCTA Image via Voting-based Multi-task Learning [27.637273690432608]
We propose a novel Voting-based Adaptive Feature Fusion multi-task network (VAFF-Net) for joint segmentation, detection, and classification of RV, FAZ, and RVJ.
A task-specific voting gate module is proposed to adaptively extract and fuse different features for specific tasks at two levels.
To facilitate further research, part of these datasets with the source code and evaluation benchmark have been released for public access.
arXiv Detail & Related papers (2022-08-23T05:53:04Z)
- Hierarchical Attention Fusion for Geo-Localization [7.544917072241684]
We introduce a hierarchical attention fusion network using multi-scale features for geo-localization.
We extract the hierarchical feature maps from a convolutional neural network (CNN) and organically fuse the extracted features for image representations.
Our training is self-supervised using adaptive weights to control the attention of feature emphasis from each hierarchical level.
arXiv Detail & Related papers (2021-02-18T07:07:03Z)
- Dual-Level Collaborative Transformer for Image Captioning [126.59298716978577]
We introduce a novel Dual-Level Collaborative Transformer (DLCT) network to realize the complementary advantages of the two features.
In addition, we propose a Locality-Constrained Cross Attention module to address the semantic noises caused by the direct fusion of these two features.
arXiv Detail & Related papers (2021-01-16T15:43:17Z)
- PGL: Prior-Guided Local Self-supervised Learning for 3D Medical Image Segmentation [87.50205728818601]
We propose a PriorGuided Local (PGL) self-supervised model that learns the region-wise local consistency in the latent feature space.
Our PGL model learns the distinctive representations of local regions, and hence is able to retain structural information.
arXiv Detail & Related papers (2020-11-25T11:03:11Z)
- City-Scale Visual Place Recognition with Deep Local Features Based on Multi-Scale Ordered VLAD Pooling [5.274399407597545]
We present a fully-automated system for place recognition at a city-scale based on content-based image retrieval.
First, we present a comprehensive analysis of visual place recognition and sketch out the unique challenges of the task.
Next, we propose a simple pooling approach on top of convolutional neural network activations to embed spatial information into the image representation vector.
arXiv Detail & Related papers (2020-09-19T15:21:59Z)
- Each Part Matters: Local Patterns Facilitate Cross-view Geo-localization [54.00111565818903]
Cross-view geo-localization is to spot images of the same geographic target from different platforms.
Existing methods usually concentrate on mining the fine-grained feature of the geographic target in the image center.
We introduce a simple and effective deep neural network, called Local Pattern Network (LPN), to take advantage of contextual information.
arXiv Detail & Related papers (2020-08-26T16:06:11Z)
- Inter-Image Communication for Weakly Supervised Localization [77.2171924626778]
Weakly supervised localization aims at finding target object regions using only image-level supervision.
We propose to leverage pixel-level similarities across different objects for learning more accurate object locations.
Our method achieves the Top-1 localization error rate of 45.17% on the ILSVRC validation set.
arXiv Detail & Related papers (2020-08-12T04:14:11Z)
- Geometrically Mappable Image Features [85.81073893916414]
Vision-based localization of an agent in a map is an important problem in robotics and computer vision.
We propose a method that learns image features targeted for image-retrieval-based localization.
arXiv Detail & Related papers (2020-03-21T15:36:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.