Baltimore Atlas: FreqWeaver Adapter for Semi-supervised Ultra-high Spatial Resolution Land Cover Classification
- URL: http://arxiv.org/abs/2506.15565v1
- Date: Wed, 18 Jun 2025 15:41:29 GMT
- Title: Baltimore Atlas: FreqWeaver Adapter for Semi-supervised Ultra-high Spatial Resolution Land Cover Classification
- Authors: Junhao Wu, Aboagye-Ntow Stephen, Chuyuan Wang, Gang Chen, Xin Huang,
- Abstract summary: The proposed method delivers robust segmentation results with superior structural consistency.<n>It achieves a 1.78% improvement over existing parameter-efficient tuning strategies and a 3.44% gain compared to state-of-the-art high-resolution remote sensing segmentation approaches.
- Score: 6.922029098479534
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Ultra-high Spatial Resolution Land Cover Classification is essential for fine-grained land cover analysis, yet it remains challenging due to the high cost of pixel-level annotations, significant scale variation, and the limited adaptability of large-scale vision models. Existing methods typically focus on 1-meter spatial resolution imagery and rely heavily on annotated data, whereas practical applications often require processing higher-resolution imagery under weak supervision. To address this, we propose a parameter-efficient semi-supervised segmentation framework for 0.3 m spatial resolution imagery, which leverages the knowledge of SAM2 and introduces a remote sensing-specific FreqWeaver Adapter to enhance fine-grained detail modeling while maintaining a lightweight design at only 5.96% of the total model parameters. By effectively leveraging unlabeled data and maintaining minimal parameter overhead, the proposed method delivers robust segmentation results with superior structural consistency, achieving a 1.78% improvement over existing parameter-efficient tuning strategies and a 3.44% gain compared to state-of-the-art high-resolution remote sensing segmentation approaches.
Related papers
- Minimal High-Resolution Patches Are Sufficient for Whole Slide Image Representation via Cascaded Dual-Scale Reconstruction [13.897013242536849]
Whole-slide image (WSI) analysis remains challenging due to gigapixel scale and sparsely distributed diagnostic regions.<n>We propose a Cascaded Dual-Scale Reconstruction framework, demonstrating that only an average of 9 high-resolution patches per WSI are sufficient for robust slide-level representation.
arXiv Detail & Related papers (2025-08-03T08:01:30Z) - One Look is Enough: A Novel Seamless Patchwise Refinement for Zero-Shot Monocular Depth Estimation Models on High-Resolution Images [25.48185527420231]
We propose Patch Refine Once (PRO), an efficient and generalizable tile-based framework.<n>Our PRO consists of two key components: (i) Grouped Patch Consistency Training that enhances test-time efficiency while mitigating the depth discontinuity problem.<n>Our PRO can be well harmonized, making their DE capabilities still effective for the grid input of high-resolution images with little depth discontinuities at the grid boundaries.
arXiv Detail & Related papers (2025-03-28T11:46:50Z) - Towards Degradation-Robust Reconstruction in Generalizable NeRF [58.33351079982745]
Generalizable Radiance Field (GNeRF) across scenes has been proven to be an effective way to avoid per-scene optimization.
There has been limited research on the robustness of GNeRFs to different types of degradation present in the source images.
arXiv Detail & Related papers (2024-11-18T16:13:47Z) - High-Precision Dichotomous Image Segmentation via Probing Diffusion Capacity [69.32473738284374]
Diffusion models have revolutionized text-to-image synthesis by delivering exceptional quality, fine detail resolution, and strong contextual awareness.<n>We propose DiffDIS, a diffusion-driven segmentation model that taps into the potential of the pre-trained U-Net within diffusion models.<n>Experiments on the DIS5K dataset demonstrate the superiority of DiffDIS, achieving state-of-the-art results through a streamlined inference process.
arXiv Detail & Related papers (2024-10-14T02:49:23Z) - PatchRefiner: Leveraging Synthetic Data for Real-Domain High-Resolution Monocular Metric Depth Estimation [42.29746147944489]
PatchRefiner is an advanced framework for metric single image depth estimation aimed at high-resolution real-domain inputs.
PatchRefiner adopts a tile-based methodology, reconceptualizing high-resolution depth estimation as a refinement process.
Our evaluations demonstrate PatchRefiner's superior performance, significantly outperforming existing benchmarks on the Unreal4KStereo dataset.
arXiv Detail & Related papers (2024-06-10T18:00:03Z) - 360 Layout Estimation via Orthogonal Planes Disentanglement and Multi-view Geometric Consistency Perception [56.84921040837699]
Existing panoramic layout estimation solutions tend to recover room boundaries from a vertically compressed sequence, yielding imprecise results.
We propose an orthogonal plane disentanglement network (termed DOPNet) to distinguish ambiguous semantics.
We also present an unsupervised adaptation technique tailored for horizon-depth and ratio representations.
Our solution outperforms other SoTA models on both monocular layout estimation and multi-view layout estimation tasks.
arXiv Detail & Related papers (2023-12-26T12:16:03Z) - Low-Resolution Self-Attention for Semantic Segmentation [93.30597515880079]
We introduce the Low-Resolution Self-Attention (LRSA) mechanism to capture global context at a significantly reduced computational cost.<n>Our approach involves computing self-attention in a fixed low-resolution space regardless of the input image's resolution.<n>We demonstrate the effectiveness of our LRSA approach by building the LRFormer, a vision transformer with an encoder-decoder structure.
arXiv Detail & Related papers (2023-10-08T06:10:09Z) - Multi-tiling Neural Radiance Field (NeRF) -- Geometric Assessment on Large-scale Aerial Datasets [5.391764618878545]
In this paper, we aim to scale the Neural Radiance Fields (NeRF) on large-scael aerial datasets.
Specifically, we introduce a location-specific sampling technique as well as a multi-camera tiling (MCT) strategy to reduce memory consumption.
We implement our method on a representative approach, Mip-NeRF, and compare its geometry performance with threephotgrammetric MVS pipelines.
arXiv Detail & Related papers (2023-10-01T00:21:01Z) - Rethinking Lightweight Salient Object Detection via Network Depth-Width
Tradeoff [26.566339984225756]
Existing salient object detection methods often adopt deeper and wider networks for better performance.
We propose a novel trilateral decoder framework by decoupling the U-shape structure into three complementary branches.
We show that our method achieves better efficiency-accuracy balance across five benchmarks.
arXiv Detail & Related papers (2023-01-17T03:43:25Z) - High Quality Segmentation for Ultra High-resolution Images [72.97958314291648]
We propose the Continuous Refinement Model for the ultra high-resolution segmentation refinement task.
Our proposed method is fast and effective on image segmentation refinement.
arXiv Detail & Related papers (2021-11-29T11:53:06Z) - Best-Buddy GANs for Highly Detailed Image Super-Resolution [71.13466303340192]
We consider the single image super-resolution (SISR) problem, where a high-resolution (HR) image is generated based on a low-resolution (LR) input.
Most methods along this line rely on a predefined single-LR-single-HR mapping, which is not flexible enough for the SISR task.
We propose best-buddy GANs (Beby-GAN) for rich-detail SISR. Relaxing the immutable one-to-one constraint, we allow the estimated patches to dynamically seek the best supervision.
arXiv Detail & Related papers (2021-03-29T02:58:27Z) - Foveation for Segmentation of Ultra-High Resolution Images [8.037287701125832]
We introduce foveation module, a learnable "dataloader" which adaptively chooses the appropriate configuration (FoV/resolution trade-off) of the input patch to feed to the downstream segmentation model.
We demonstrate that the foveation module consistently improves segmentation performance over the cases trained with patches of fixed FoV/resolution trade-off.
arXiv Detail & Related papers (2020-07-29T21:44:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.