AMD-HookNet++: Evolution of AMD-HookNet with Hybrid CNN-Transformer Feature Enhancement for Glacier Calving Front Segmentation
- URL: http://arxiv.org/abs/2512.14639v1
- Date: Tue, 16 Dec 2025 17:57:52 GMT
- Title: AMD-HookNet++: Evolution of AMD-HookNet with Hybrid CNN-Transformer Feature Enhancement for Glacier Calving Front Segmentation
- Authors: Fei Wu, Marcel Dreier, Nora Gourmelon, Sebastian Wind, Jianlin Zhang, Thorsten Seehaus, Matthias Braun, Andreas Maier, Vincent Christlein,
- Abstract summary: AMD-HookNet firstly introduces a pure two-branch convolutional neural network (CNN) for glacier segmentation.<n>We propose AMD-HookNet++, a novel advanced hybrid CNN-Transformer feature enhancement method for segmenting glaciers.<n>We show that AMD-HookNet++ sets a new state of the art with an IoU of 78.2 and a HD95 of 1,318 m, while maintaining a competitive MDE of 367 m.
- Score: 12.638003974433554
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The dynamics of glaciers and ice shelf fronts significantly impact the mass balance of ice sheets and coastal sea levels. To effectively monitor glacier conditions, it is crucial to consistently estimate positional shifts of glacier calving fronts. AMD-HookNet firstly introduces a pure two-branch convolutional neural network (CNN) for glacier segmentation. Yet, the local nature and translational invariance of convolution operations, while beneficial for capturing low-level details, restricts the model ability to maintain long-range dependencies. In this study, we propose AMD-HookNet++, a novel advanced hybrid CNN-Transformer feature enhancement method for segmenting glaciers and delineating calving fronts in synthetic aperture radar images. Our hybrid structure consists of two branches: a Transformer-based context branch to capture long-range dependencies, which provides global contextual information in a larger view, and a CNN-based target branch to preserve local details. To strengthen the representation of the connected hybrid features, we devise an enhanced spatial-channel attention module to foster interactions between the hybrid CNN-Transformer branches through dynamically adjusting the token relationships from both spatial and channel perspectives. Additionally, we develop a pixel-to-pixel contrastive deep supervision to optimize our hybrid model by integrating pixelwise metric learning into glacier segmentation. Through extensive experiments and comprehensive quantitative and qualitative analyses on the challenging glacier segmentation benchmark dataset CaFFe, we show that AMD-HookNet++ sets a new state of the art with an IoU of 78.2 and a HD95 of 1,318 m, while maintaining a competitive MDE of 367 m. More importantly, our hybrid model produces smoother delineations of calving fronts, resolving the issue of jagged edges typically seen in pure Transformer-based approaches.
Related papers
- A Rapid GeoSAM-Based Workflow for Multi-Temporal Glacier Delineation: Case Study from Svalbard [51.56484100374058]
We present a GeoSAM-based, semi-automatic workflow for rapid glacier delineation from Sentinel-2 imagery.<n>Results show that the approach produces spatially coherent and temporally consistent outlines for major glacier bodies.<n>The reliance on derived RGB imagery makes the method flexible and transferable to other optical datasets.
arXiv Detail & Related papers (2025-12-28T09:42:01Z) - KAN-GCN: Combining Kolmogorov-Arnold Network with Graph Convolution Network for an Accurate Ice Sheet Emulator [1.433758865948252]
We introduce KAN-GCN, a fast and accurate emulator for ice sheet modeling.<n>KAN-GCN places a Kolmogorov-Arnold Network (KAN) as a feature-wise calibrator before graph convolution networks (GCNs)
arXiv Detail & Related papers (2025-10-28T19:55:29Z) - SSL4SAR: Self-Supervised Learning for Glacier Calving Front Extraction from SAR Imagery [5.385992181979485]
Glaciers are losing ice mass at unprecedented rates, increasing the need for accurate, year-round monitoring.<n>Deep learning models can extract calving front positions to track seasonal ice losses at the calving fronts of marine- and lake-terminating glaciers.<n>We introduce a novel hybrid model architecture that combines a Swin Transformer encoder with a residual Convolutional Neural Network (CNN) decoder.
arXiv Detail & Related papers (2025-07-02T14:24:23Z) - VRS-UIE: Value-Driven Reordering Scanning for Underwater Image Enhancement [104.78586859995333]
State Space Models (SSMs) have emerged as a promising backbone for vision tasks due to their linear complexity and global receptive field.<n>The predominance of large-portion, homogeneous but useless oceanic backgrounds can dilute the feature representation responses of sparse yet valuable targets.<n>We propose a novel Value-Driven Reordering Scanning framework for Underwater Image Enhancement (UIE)<n>Our framework sets a new state-of-the-art, delivering superior enhancement performance (surpassing WMamba by 0.89 dB on average) by effectively suppressing water bias and preserving structural and color fidelity.
arXiv Detail & Related papers (2025-05-02T12:21:44Z) - STEAM: Squeeze and Transform Enhanced Attention Module [1.3370933421481221]
We propose a graph-based approach for modeling both channel and spatial attention, utilizing concepts from multi-head graph transformers.<n>STEAM achieves a 2% increase in accuracy over the standard ResNet-50 model with only a meager increase in GFLOPs.<n>STEAM outperforms leading modules ECA and GCT in terms of accuracy while achieving a three-fold reduction in GFLOPs.
arXiv Detail & Related papers (2024-12-12T07:38:10Z) - A Hybrid Transformer-Mamba Network for Single Image Deraining [70.64069487982916]
Existing deraining Transformers employ self-attention mechanisms with fixed-range windows or along channel dimensions.
We introduce a novel dual-branch hybrid Transformer-Mamba network, denoted as TransMamba, aimed at effectively capturing long-range rain-related dependencies.
arXiv Detail & Related papers (2024-08-31T10:03:19Z) - Hybrid Convolutional and Attention Network for Hyperspectral Image Denoising [54.110544509099526]
Hyperspectral image (HSI) denoising is critical for the effective analysis and interpretation of hyperspectral data.
We propose a hybrid convolution and attention network (HCANet) to enhance HSI denoising.
Experimental results on mainstream HSI datasets demonstrate the rationality and effectiveness of the proposed HCANet.
arXiv Detail & Related papers (2024-03-15T07:18:43Z) - AMD-HookNet for Glacier Front Segmentation [17.60067480799222]
knowledge on changes in glacier calving front positions is important for assessing the status of glaciers.
Deep learning-based methods have shown great potential for glacier calving front delineation from optical and radar satellite imagery.
We propose Attention-Multi-hooking-Deep-supervision HookNet (AMD-HookNet), a novel glacier calving front segmentation framework.
arXiv Detail & Related papers (2023-02-06T12:39:40Z) - Adaptive Split-Fusion Transformer [90.04885335911729]
We propose an Adaptive Split-Fusion Transformer (ASF-former) to treat convolutional and attention branches differently with adaptive weights.
Experiments on standard benchmarks, such as ImageNet-1K, show that our ASF-former outperforms its CNN, transformer counterparts, and hybrid pilots in terms of accuracy.
arXiv Detail & Related papers (2022-04-26T10:00:28Z) - DepthFormer: Exploiting Long-Range Correlation and Local Information for
Accurate Monocular Depth Estimation [50.08080424613603]
Long-range correlation is essential for accurate monocular depth estimation.
We propose to leverage the Transformer to model this global context with an effective attention mechanism.
Our proposed model, termed DepthFormer, surpasses state-of-the-art monocular depth estimation methods with prominent margins.
arXiv Detail & Related papers (2022-03-27T05:03:56Z) - Bayesian U-Net for Segmenting Glaciers in SAR Imagery [7.960675807187592]
We propose to compute uncertainty and use it in an Uncertainty Optimization regime as a novel two-stage process.
We show that feeding the uncertainty map to the network leads to 95.24% Dice similarity.
This is an overall improvement in the segmentation performance compared to the state-of-the-art deterministic U-Net-based glacier segmentation pipelines.
arXiv Detail & Related papers (2021-01-08T23:17:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.