UKANFormer: Noise-Robust Semantic Segmentation for Coral Reef Mapping via a Kolmogorov-Arnold Network-Transformer Hybrid
- URL: http://arxiv.org/abs/2510.16730v2
- Date: Mon, 27 Oct 2025 06:44:38 GMT
- Title: UKANFormer: Noise-Robust Semantic Segmentation for Coral Reef Mapping via a Kolmogorov-Arnold Network-Transformer Hybrid
- Authors: Tianyang Dou, Ming Li, Jiangying Qin, Xuan Liao, Jiageng Zhong, Armin Gruen, Mengyi Deng,
- Abstract summary: We propose a novel se-mantic segmentation model to achieve high-precision mapping under noisy supervision.<n>UKANFormer incorporates a Global-Local Transformer (GL-Trans) block in the decoder, enabling the extraction of both global semantic structures and local boundary details.<n>In experiments, UKANFormer achieved a coral-class IoU of 67.00% and pixel accuracy of 83.98%, outperforming conventional baselines under the same noisy labels setting.
- Score: 4.147037463809748
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Coral reefs are vital yet fragile ecosystems that require accurate large-scale mapping for effective conservation. Although global products such as the Allen Coral Atlas provide unprecedented coverage of global coral reef distri-bution, their predictions are frequently limited in spatial precision and semantic consistency, especially in regions requiring fine-grained boundary delineation. To address these challenges, we propose UKANFormer, a novel se-mantic segmentation model designed to achieve high-precision mapping under noisy supervision derived from Allen Coral Atlas. Building upon the UKAN architecture, UKANFormer incorporates a Global-Local Transformer (GL-Trans) block in the decoder, enabling the extraction of both global semantic structures and local boundary details. In experiments, UKANFormer achieved a coral-class IoU of 67.00% and pixel accuracy of 83.98%, outperforming conventional baselines under the same noisy labels setting. Remarkably, the model produces predictions that are visually and structurally more accurate than the noisy labels used for training. These results challenge the notion that data quality directly limits model performance, showing that architectural design can mitigate label noise and sup-port scalable mapping under imperfect supervision. UKANFormer provides a foundation for ecological monitoring where reliable labels are scarce.
Related papers
- StepVAR: Structure-Texture Guided Pruning for Visual Autoregressive Models [98.72926158261937]
We propose a training-free token pruning framework for Visual AutoRegressive models.<n>We employ a lightweight high-pass filter to capture local texture details, while leveraging Principal Component Analysis (PCA) to preserve global structural information.<n>To maintain valid next-scale prediction under sparse tokens, we introduce a nearest neighbor feature propagation strategy.
arXiv Detail & Related papers (2026-03-02T11:35:05Z) - Breaking the Regional Barrier: Inductive Semantic Topology Learning for Worldwide Air Quality Forecasting [99.4484686548807]
We propose OmniAir, a semantic topology learning framework tailored for global station-level prediction.<n>Our approach effectively captures long-range non-Euclidean correlations and physical diffusion patterns across unevenly distributed global networks.<n>Experiments show that OmniAir achieves state-of-the-art performance against 18 baselines, maintaining high efficiency and scalability with speeds nearly 10 times faster than existing models.
arXiv Detail & Related papers (2026-01-29T15:58:07Z) - UAGLNet: Uncertainty-Aggregated Global-Local Fusion Network with Cooperative CNN-Transformer for Building Extraction [83.48950950780554]
Building extraction from remote sensing images is a challenging task due to the complex structure variations of buildings.<n>Existing methods employ convolutional or self-attention blocks to capture the multi-scale features in the segmentation models.<n>We present an Uncertainty-Aggregated Global-Local Fusion Network (UAGLNet) to exploit high-quality global-local visual semantics.
arXiv Detail & Related papers (2025-12-15T02:59:16Z) - ReefNet: A Large scale, Taxonomically Enriched Dataset and Benchmark for Hard Coral Classification [32.33672517443114]
We introduce ReefNet, a large public coral reef image dataset with point-label annotations mapped to the World Register of Marine Species (WoRMS)<n>ReefNet aggregates imagery from 76 curated CoralNet sources and an additional site from Al Wajh in the Red Sea.<n>We analyze both supervised and zero-shot classification performance on ReefNet and find that while supervised within-source performance is promising, supervised performance drops sharply across domains.
arXiv Detail & Related papers (2025-10-19T13:18:44Z) - Needles in the Landscape: Semi-Supervised Pseudolabeling for Archaeological Site Discovery under Label Scarcity [0.3957768262206626]
Archaeological predictive modelling estimates where undiscovered sites are likely to occur by combining known locations with environmental, cultural, and geospatial variables.<n>We address this challenge using a deep learning approach but must contend with structural label scarcity inherent to archaeology: positives are rare, and most locations are unlabeled.<n>Our results indicate that semi-supervised learning offers a promising approach to identifying undiscovered sites across large, sparsely annotated landscapes.
arXiv Detail & Related papers (2025-10-19T12:54:38Z) - Ecologically Valid Benchmarking and Adaptive Attention: Scalable Marine Bioacoustic Monitoring [2.558238597112103]
GetNetUPAM is a nested cross-validation framework to model stability under realistic variability.<n>Data are partitioned into distinct site-year segments, preserving recording and ensuring each validation fold reflects a unique environmental subset.<n>ARPA-N achieves a 14.4% gain in average precision over DenseNet baselines and a log2-scale order-of-magnitude drop in variability across all metrics.
arXiv Detail & Related papers (2025-09-04T22:03:05Z) - TerraFM: A Scalable Foundation Model for Unified Multisensor Earth Observation [65.74990259650984]
We introduce TerraFM, a scalable self-supervised learning model that leverages globally distributed Sentinel-1 and Sentinel-2 imagery.<n>Our training strategy integrates local-global contrastive learning and introduces a dual-centering mechanism.<n>TerraFM achieves strong generalization on both classification and segmentation tasks, outperforming prior models on GEO-Bench and Copernicus-Bench.
arXiv Detail & Related papers (2025-06-06T17:59:50Z) - VRS-UIE: Value-Driven Reordering Scanning for Underwater Image Enhancement [104.78586859995333]
State Space Models (SSMs) have emerged as a promising backbone for vision tasks due to their linear complexity and global receptive field.<n>The predominance of large-portion, homogeneous but useless oceanic backgrounds can dilute the feature representation responses of sparse yet valuable targets.<n>We propose a novel Value-Driven Reordering Scanning framework for Underwater Image Enhancement (UIE)<n>Our framework sets a new state-of-the-art, delivering superior enhancement performance (surpassing WMamba by 0.89 dB on average) by effectively suppressing water bias and preserving structural and color fidelity.
arXiv Detail & Related papers (2025-05-02T12:21:44Z) - The Coralscapes Dataset: Semantic Scene Understanding in Coral Reefs [4.096374910845255]
We release the first general-purpose dense semantic segmentation dataset for coral reefs, covering 2075 images, 39 benthic classes, and 174k segmentation masks annotated by experts.<n>We benchmark a wide range of semantic segmentation models, and find that transfer learning from Coralscapes to existing smaller datasets consistently leads to state-of-the-art performance.<n>Coralscapes will catalyze research on efficient, scalable, and standardized coral reef surveying methods based on computer vision, and holds the potential to streamline the development of underwater ecological robotics.
arXiv Detail & Related papers (2025-03-25T18:33:59Z) - Global High Categorical Resolution Land Cover Mapping via Weak Supervision [19.52604717907002]
We propose to combine fully labeled source domain and weakly labeled target domain for weakly supervised domain adaptation (WSDA)
This is beneficial as the utilization of sparse and coarse weak labels can considerably alleviate the labor required for precise and detailed land cover annotation.
We carry out high categorical resolution land cover mapping for 10 cities in different regions around the world, severally using PlanetScope, Gaofen-1, and Sentinel-2 satellite images.
arXiv Detail & Related papers (2024-06-02T23:18:12Z) - Progressive Feature Self-reinforcement for Weakly Supervised Semantic
Segmentation [55.69128107473125]
We propose a single-stage approach for Weakly Supervised Semantic (WSSS) with image-level labels.
We adaptively partition the image content into deterministic regions (e.g., confident foreground and background) and uncertain regions (e.g., object boundaries and misclassified categories) for separate processing.
Building upon this, we introduce a complementary self-enhancement method that constrains the semantic consistency between these confident regions and an augmented image with the same class labels.
arXiv Detail & Related papers (2023-12-14T13:21:52Z) - Confident Sinkhorn Allocation for Pseudo-Labeling [40.883130133661304]
Semi-supervised learning is a critical tool in reducing machine learning's dependence on labeled data.
This paper studies theoretically the role of uncertainty to pseudo-labeling and proposes Confident Sinkhorn Allocation (CSA)
CSA identifies the best pseudo-label allocation via optimal transport to only samples with high confidence scores.
arXiv Detail & Related papers (2022-06-13T02:16:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.