Coordinated Transformer with Position & Sample-aware Central Loss for
Anatomical Landmark Detection
- URL: http://arxiv.org/abs/2305.11338v1
- Date: Thu, 18 May 2023 23:05:01 GMT
- Title: Coordinated Transformer with Position & Sample-aware Central Loss for
Anatomical Landmark Detection
- Authors: Qikui Zhu, Yihui Bi, Danxin Wang, Xiangpeng Chu, Jie Chen, Yanqing
Wang
- Abstract summary: Heatmap-based anatomical landmark detection is still facing two unresolved challenges.
We propose a novel position-aware and sample-aware central loss.
A Coordinated Transformer, called CoorTransformer, is proposed to address the challenge of ignoring structure information.
- Score: 6.004522909994631
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Heatmap-based anatomical landmark detection is still facing two unresolved
challenges: 1) inability to accurately evaluate the distribution of heatmap; 2)
inability to effectively exploit global spatial structure information. To
address the first challenge, we propose a novel
position-aware and sample-aware central loss. Specifically, our central loss
can absorb position information, enabling accurate evaluation of the heatmap
distribution. Moreover, our central loss is sample-aware: it can
adaptively distinguish easy and hard samples and make the model more focused on
hard samples while solving the challenge of extreme imbalance between landmarks
and non-landmarks. To address the challenge of ignoring structure information,
a Coordinated Transformer, called CoorTransformer, is proposed, which
establishes long-range dependencies under the guidance of landmark coordination
information, making the attention more focused on the sparse landmarks while
taking advantage of global spatial structure. Furthermore, CoorTransformer can
speed up convergence, effectively avoiding the defect that Transformers have
difficulty converging in sparse representation learning. Using the advanced
CoorTransformer and central loss, we propose a generalized detection model that
can handle various scenarios, inherently exploiting the underlying relationship
between landmarks and incorporating rich structural knowledge around the target
landmarks. We analyzed and evaluated CoorTransformer and central loss on three
challenging landmark detection tasks. The experimental results show that our
CoorTransformer outperforms state-of-the-art methods, and the central loss
significantly improves the performance of the model with p-values < 0.05.
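For readers new to heatmap-based landmark detection, the setup the abstract refers to can be sketched as follows. This is a minimal illustration, not the paper's method: the summary above does not give the formula of the central loss, so the loss shown here is the generic penalty-reduced focal loss from the keypoint-detection literature, used only as a stand-in to show how a loss can down-weight easy pixels and cope with the landmark/non-landmark imbalance. All function names and default parameters (`sigma`, `alpha`, `beta`) are assumptions made for this example.

```python
import math

def gaussian_heatmap(height, width, center, sigma=1.5):
    """Build a target heatmap: a 2D Gaussian that equals 1.0 at the landmark (row, col)."""
    cy, cx = center
    return [[math.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2 * sigma ** 2))
             for x in range(width)] for y in range(height)]

def decode_landmark(heatmap):
    """Read the predicted landmark off a heatmap as the location of its maximum response."""
    value, position = max((v, (y, x))
                          for y, row in enumerate(heatmap)
                          for x, v in enumerate(row))
    return position

def focal_heatmap_loss(pred, target, alpha=2.0, beta=4.0, eps=1e-6):
    """Stand-in loss (assumption): penalty-reduced focal loss, NOT the paper's central loss.

    Landmark pixels (target == 1) are weighted by (1 - p)^alpha, so confidently
    correct (easy) pixels contribute little. Non-landmark pixels are additionally
    weighted by (1 - target)^beta, which softens the penalty near the landmark.
    """
    total, num_landmarks = 0.0, 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
            if t == 1.0:
                total -= (1.0 - p) ** alpha * math.log(p)
                num_landmarks += 1
            else:
                total -= (1.0 - t) ** beta * p ** alpha * math.log(1.0 - p)
    return total / max(num_landmarks, 1)

target = gaussian_heatmap(8, 8, center=(3, 5))
uniform_guess = [[0.5] * 8 for _ in range(8)]
print(decode_landmark(target))
print(focal_heatmap_loss(target, target) < focal_heatmap_loss(uniform_guess, target))
```

A prediction that matches the target heatmap incurs a much smaller loss than an uninformative uniform prediction, because the many background pixels in the uniform guess each add a penalty while the single landmark pixel is the only positive.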
Related papers
- Kriformer: A Novel Spatiotemporal Kriging Approach Based on Graph Transformers [5.4381914710364665]
This study addresses the difficulties posed by sparse sensor deployment and unreliable data by framing the problem as an environmental challenge.
A graph transformer model, Kriformer, estimates data at locations without sensors by mining spatial and temporal correlations, even with limited resources.
arXiv Detail & Related papers (2024-09-23T11:01:18Z)
- Localized Gaussians as Self-Attention Weights for Point Clouds Correspondence [92.07601770031236]
We investigate semantically meaningful patterns in the attention heads of an encoder-only Transformer architecture.
We find that fixing the attention weights not only accelerates the training process but also enhances the stability of the optimization.
arXiv Detail & Related papers (2024-09-20T07:41:47Z)
- Unsupervised Landmark Discovery Using Consistency Guided Bottleneck [63.624186864522315]
We introduce a consistency-guided bottleneck in an image reconstruction-based pipeline.
We propose obtaining pseudo-supervision via forming landmark correspondence across images.
The consistency then modulates the uncertainty of the discovered landmarks in the generation of adaptive heatmaps.
arXiv Detail & Related papers (2023-09-19T10:57:53Z)
- Weakly-supervised 3D Pose Transfer with Keypoints [57.66991032263699]
Main challenges of 3D pose transfer are: 1) Lack of paired training data with different characters performing the same pose; 2) Disentangling pose and shape information from the target mesh; 3) Difficulty in applying to meshes with different topologies.
We propose a novel weakly-supervised keypoint-based framework to overcome these difficulties.
arXiv Detail & Related papers (2023-07-25T12:40:24Z)
- Semi-Supervised Building Footprint Generation with Feature and Output Consistency Training [17.6179873429447]
State-of-the-art semi-supervised semantic segmentation networks with consistency training can help to deal with this issue.
We propose to integrate the consistency of both features and outputs in the end-to-end network training of unlabeled samples.
Experimental results show that the proposed approach can well extract more complete building structures.
arXiv Detail & Related papers (2022-05-17T14:55:13Z)
- Robust Self-Supervised LiDAR Odometry via Representative Structure Discovery and 3D Inherent Error Modeling [67.75095378830694]
We develop a two-stage odometry estimation network, where we obtain the ego-motion by estimating a set of sub-region transformations.
In this paper, we aim to alleviate the influence of unreliable structures in training, inference and mapping phases.
Our two-frame odometry outperforms the previous state of the art by 16%/12% in terms of translational/rotational errors.
arXiv Detail & Related papers (2022-02-27T12:52:27Z)
- The KFIoU Loss for Rotated Object Detection [115.334070064346]
In this paper, we argue that one effective alternative is to devise an approximate loss that can achieve trend-level alignment with the SkewIoU loss.
Specifically, we model the objects as Gaussian distributions and adopt a Kalman filter to inherently mimic the mechanism of SkewIoU.
The resulting new loss called KFIoU is easier to implement and works better compared with exact SkewIoU.
arXiv Detail & Related papers (2022-01-29T10:54:57Z)
- Toward Minimal Misalignment at Minimal Cost in One-Stage and Anchor-Free Object Detection [6.486325109549893]
Classification and regression branches have different sensitivities to features from the same scale level and the same spatial location.
The point-based prediction method, which assumes that a point with high classification confidence also has high regression quality, leads to the misalignment problem.
We aim to resolve the phenomenon at minimal cost: a minor adjustment of the head network and a new label assignment method replacing the rigid one.
arXiv Detail & Related papers (2021-12-16T14:22:13Z)
- Pretrained equivariant features improve unsupervised landmark discovery [69.02115180674885]
We formulate a two-step unsupervised approach that overcomes this challenge by first learning powerful pixel-based features.
Our method produces state-of-the-art results in several challenging landmark detection datasets.
arXiv Detail & Related papers (2021-04-07T05:42:11Z)