Related papers: Enhanced LULC Segmentation via Lightweight Model Refinements on ALOS-2 SAR Data

Enhanced LULC Segmentation via Lightweight Model Refinements on ALOS-2 SAR Data

URL: http://arxiv.org/abs/2601.15705v1
Date: Thu, 22 Jan 2026 07:18:06 GMT
Title: Enhanced LULC Segmentation via Lightweight Model Refinements on ALOS-2 SAR Data
Authors: Ali Caglayan, Nevrez Imamoglu, Toru Kouyama,
Abstract summary: This work focuses on national-scale land-use/land-cover (LULC) semantic segmentation using ALOS-2 single-polarization (HH) SAR data over Japan.<n>We address common SAR dense-prediction failure modes, boundary over-smoothing, missed thin/slender structures, and rare-class degradation under long-tailed labels.<n>The resulting model yields consistent improvements on the Japan-wide ALOS-2 LULC benchmark, particularly for under-represented classes.
Score: 1.4401311275746886
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This work focuses on national-scale land-use/land-cover (LULC) semantic segmentation using ALOS-2 single-polarization (HH) SAR data over Japan, together with a companion binary water detection task. Building on SAR-W-MixMAE self-supervised pretraining [1], we address common SAR dense-prediction failure modes, boundary over-smoothing, missed thin/slender structures, and rare-class degradation under long-tailed labels, without increasing pipeline complexity. We introduce three lightweight refinements: (i) injecting high-resolution features into multi-scale decoding, (ii) a progressive refine-up head that alternates convolutional refinement and stepwise upsampling, and (iii) an $α$-scale factor that tempers class reweighting within a focal+dice objective. The resulting model yields consistent improvements on the Japan-wide ALOS-2 LULC benchmark, particularly for under-represented classes, and improves water detection across standard evaluation metrics.

Related papers

Multi-modal Loop Closure Detection with Foundation Models in Severely Unstructured Environments [10.028232479762075]
This paper presents MPRF, a multimodal pipeline that leverages transformer-based foundation models for both vision and LiDAR modalities to achieve robust loop closure.<n> Experiments on the S3LI dataset and S3LI Vulcano dataset show that MPRF outperforms state-of-the-art retrieval methods in precision.<n>By providing interpretable correspondences suitable for SLAM back-ends, MPRF achieves a favorable trade-off between accuracy, efficiency, and reliability.
arXiv Detail & Related papers (2025-11-07T16:30:35Z)
Dual-Stage Reweighted MoE for Long-Tailed Egocentric Mistake Detection [85.0189917888094]
We propose a Dual-Stage Reweighted Mixture-of-Experts (DR-MoE) framework to handle the challenges posed by subtle and infrequent mistakes.<n>The proposed method achieves strong performance, particularly in identifying rare and ambiguous mistake instances.
arXiv Detail & Related papers (2025-09-16T12:00:42Z)
O2Former:Direction-Aware and Multi-Scale Query Enhancement for SAR Ship Instance Segmentation [0.3611754783778107]
Instance segmentation of ships in synthetic aperture radar (SAR) imagery is critical for applications such as maritime monitoring, environmental analysis, and national security.<n> SAR ship images present challenges including scale variation, object density, and fuzzy target boundary.<n>We propose O2Former, a tailored instance segmentation framework that extends Mask2Former by fully leveraging the structural characteristics of SAR imagery.
arXiv Detail & Related papers (2025-06-13T16:06:51Z)
R-Sparse: Rank-Aware Activation Sparsity for Efficient LLM Inference [77.47238561728459]
R-Sparse is a training-free activation sparsity approach capable of achieving high sparsity levels in advanced LLMs.<n> Experiments on Llama-2/3 and Mistral models across ten diverse tasks demonstrate that R-Sparse achieves comparable performance at 50% model-level sparsity.
arXiv Detail & Related papers (2025-04-28T03:30:32Z)
Fast and Accurate Gigapixel Pathological Image Classification with Hierarchical Distillation Multi-Instance Learning [51.525891360380285]
HDMIL is a hierarchical distillation multi-instance learning framework that achieves fast and accurate classification by eliminating irrelevant patches.<n> HDMIL consists of two key components: the dynamic multi-instance network (DMIN) and the lightweight instance pre-screening network (LIPN)
arXiv Detail & Related papers (2025-02-28T15:10:07Z)
X2-DFD: A framework for eXplainable and eXtendable Deepfake Detection [55.77552681618732]
X2-DFD is an eXplainable and eXtendable framework based on multimodal large-language models (MLLMs) for deepfake detection.<n>The first stage, Model Feature Assessment, systematically evaluates the detectability of forgery-related features for the MLLM.<n>The second stage, Explainable dataset Construction, consists of two key modules: Strong Feature Strengthening and Weak Feature Supplementing.<n>The third stage, Fine-tuning and Inference, involves fine-tuning the MLLM on the constructed dataset and deploying it for final detection and explanation.
arXiv Detail & Related papers (2024-10-08T15:28:33Z)
SARATR-X: Toward Building A Foundation Model for SAR Target Recognition [22.770010893572973]
We make the first attempt towards building a foundation model for SAR ATR, termed SARATR-X.<n>SARATR-X learns generalizable representations via self-supervised learning (SSL) and provides a cornerstone for label-efficient model adaptation to generic SAR target detection and classification tasks.<n>SARATR-X is trained on 0.18 M unlabelled SAR target samples, which are curated by combining contemporary benchmarks and constitute the largest publicly available dataset till now.
arXiv Detail & Related papers (2024-05-15T14:17:44Z)
Heterogeneous Network Based Contrastive Learning Method for PolSAR Land Cover Classification [18.37842655634498]
Supervised learning (SL) requires a large amount of labeled PolSAR data with high quality to achieve better performance. This article proposes a Heterogeneous Network based Contrastive Learning method(HCLNet) It aims to learn high-level representation from unlabeled PolSAR data for few-shot classification according to multi-features and superpixels.
arXiv Detail & Related papers (2024-03-29T01:05:23Z)
Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning. CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z)
A Global Model Approach to Robust Few-Shot SAR Automatic Target Recognition [6.260916845720537]
It may not always be possible to collect hundreds of labeled samples per class for training deep learning-based SAR Automatic Target Recognition (ATR) models. This work specifically tackles the few-shot SAR ATR problem, where only a handful of labeled samples may be available to support the task of interest.
arXiv Detail & Related papers (2023-03-20T00:24:05Z)
The KFIoU Loss for Rotated Object Detection [115.334070064346]
In this paper, we argue that one effective alternative is to devise an approximate loss who can achieve trend-level alignment with SkewIoU loss. Specifically, we model the objects as Gaussian distribution and adopt Kalman filter to inherently mimic the mechanism of SkewIoU. The resulting new loss called KFIoU is easier to implement and works better compared with exact SkewIoU.
arXiv Detail & Related papers (2022-01-29T10:54:57Z)
DML-GANR: Deep Metric Learning With Generative Adversarial Network Regularization for High Spatial Resolution Remote Sensing Image Retrieval [9.423185775609426]
We develop a deep metric learning approach with generative adversarial network regularization (DML-GANR) for HSR-RSI retrieval. The experimental results on the three data sets demonstrate the superior performance of DML-GANR over state-of-the-art techniques in HSR-RSI retrieval.
arXiv Detail & Related papers (2020-10-07T02:26:03Z)

This list is automatically generated from the titles and abstracts of the papers in this site.