Zooming Out on Zooming In: Advancing Super-Resolution for Remote Sensing
- URL: http://arxiv.org/abs/2311.18082v1
- Date: Wed, 29 Nov 2023 21:06:45 GMT
- Title: Zooming Out on Zooming In: Advancing Super-Resolution for Remote Sensing
- Authors: Piper Wolters, Favyen Bastani, Aniruddha Kembhavi
- Abstract summary: Super-Resolution for remote sensing has the potential for huge impact on planet monitoring.
Despite a lot of attention, several inconsistencies and challenges have prevented it from being deployed in practice.
This work presents a new metric for super-resolution, CLIPScore, that corresponds far better with human judgments than previous metrics.
- Score: 31.409817016287704
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Super-Resolution for remote sensing has the potential for huge impact on
planet monitoring by producing accurate and realistic high resolution imagery
on a frequent basis and a global scale. Despite a lot of attention, several
inconsistencies and challenges have prevented it from being deployed in
practice. These include the lack of effective metrics, fragmented and
relatively small-scale datasets for training, insufficient comparisons across a
suite of methods, and unclear evidence for the use of super-resolution outputs
for machine consumption. This work presents a new metric for super-resolution,
CLIPScore, that corresponds far better with human judgments than previous
metrics on an extensive study. We use CLIPScore to evaluate four standard
methods on a new large-scale dataset, S2-NAIP, and three existing benchmark
datasets, and find that generative adversarial networks easily outperform more
traditional L2 loss-based models and are more semantically accurate than modern
diffusion models. We also find that using CLIPScore as an auxiliary loss can
speed up the training of GANs by 18x and lead to improved outputs, resulting in
an effective model in diverse geographies across the world which we will
release publicly. The dataset, pre-trained model weights, and code are
available at https://github.com/allenai/satlas-super-resolution/.
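As a rough illustration of how a CLIP-based quality score for super-resolved imagery can be computed, the sketch below embeds an SR output and a reference image with CLIP and takes the cosine similarity of their image features. This is a minimal sketch under stated assumptions: the paper's exact CLIPScore formulation, CLIP variant, and preprocessing are not given here, and the openai/clip-vit-base-patch32 checkpoint and the clip_similarity helper are illustrative choices, not the released implementation.

```python
# Minimal sketch (assumption): score an SR output against a reference image by
# cosine similarity of CLIP image embeddings. Not the paper's released code.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def clip_similarity(sr_path: str, ref_path: str) -> float:
    """Cosine similarity of CLIP image features for an SR/reference pair (higher = more similar)."""
    images = [Image.open(sr_path).convert("RGB"), Image.open(ref_path).convert("RGB")]
    inputs = processor(images=images, return_tensors="pt")
    with torch.no_grad():
        feats = model.get_image_features(pixel_values=inputs["pixel_values"])
    feats = feats / feats.norm(dim=-1, keepdim=True)  # unit-normalize embeddings
    return (feats[0] @ feats[1]).item()
```

When such a score is used as an auxiliary training loss, as the abstract describes for GAN training, the same similarity would be computed on differentiable image tensors inside the training loop rather than on files loaded from disk.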
Related papers
- On Improving the Algorithm-, Model-, and Data- Efficiency of Self-Supervised Learning [18.318758111829386]
We propose an efficient single-branch SSL method based on non-parametric instance discrimination.
We also propose a novel self-distillation loss that minimizes the KL divergence between the probability distribution and its square root version.
arXiv Detail & Related papers (2024-04-30T06:39:04Z)
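One plausible reading of the self-distillation loss described in the entry above, sketched here as an assumption rather than the authors' implementation: take the model's probability distribution p, form its square-root version q (renormalized to sum to one), and minimize the KL divergence between the two.

```python
# Sketch (assumption) of a self-distillation loss between a distribution p and
# its renormalized square-root version q; the direction of the KL is a guess.
import torch
import torch.nn.functional as F

def sqrt_self_distillation_loss(logits: torch.Tensor) -> torch.Tensor:
    log_p = F.log_softmax(logits, dim=-1)              # log of the model's distribution p
    p = log_p.exp()
    q = p.sqrt()
    q = q / q.sum(dim=-1, keepdim=True)                # square-root version, renormalized
    return F.kl_div(log_p, q, reduction="batchmean")   # KL(q || p)
```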
- Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data [87.61900472933523]
This work presents Depth Anything, a highly practical solution for robust monocular depth estimation.
We scale up the dataset by designing a data engine to collect and automatically annotate large-scale unlabeled data.
We evaluate its zero-shot capabilities extensively, including six public datasets and randomly captured photos.
arXiv Detail & Related papers (2024-01-19T18:59:52Z)
- Low-Resolution Self-Attention for Semantic Segmentation [96.81482872022237]
We introduce the Low-Resolution Self-Attention (LRSA) mechanism to capture global context at a significantly reduced computational cost.
Our approach involves computing self-attention in a fixed low-resolution space regardless of the input image's resolution.
We demonstrate the effectiveness of our LRSA approach by building the LRFormer, a vision transformer with an encoder-decoder structure.
arXiv Detail & Related papers (2023-10-08T06:10:09Z)
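A minimal sketch of the low-resolution self-attention idea described in the entry above, assuming a PyTorch module; the pooling size, residual connection, and bilinear upsampling here are illustrative choices, not taken from the LRFormer code.

```python
# Minimal sketch (assumption, not the LRFormer implementation): attend in a
# fixed low-resolution space, then upsample the result back to the input size.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LowResSelfAttention(nn.Module):
    def __init__(self, dim: int, heads: int = 4, pool_size: int = 16):
        super().__init__()
        self.pool_size = pool_size                      # attention grid is pool_size x pool_size
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, C, H, W)
        b, c, h, w = x.shape
        low = F.adaptive_avg_pool2d(x, self.pool_size)  # cost is independent of H and W
        tokens = low.flatten(2).transpose(1, 2)         # (B, pool_size^2, C)
        out, _ = self.attn(tokens, tokens, tokens)      # self-attention in low-res space
        out = out.transpose(1, 2).reshape(b, c, self.pool_size, self.pool_size)
        out = F.interpolate(out, size=(h, w), mode="bilinear", align_corners=False)
        return x + out                                  # residual connection
```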
- SqueezeLLM: Dense-and-Sparse Quantization [80.32162537942138]
For single-batch generative inference with LLMs, the main bottleneck is memory bandwidth rather than compute.
We introduce SqueezeLLM, a post-training quantization framework that enables lossless compression to ultra-low precisions of up to 3-bit.
Our framework incorporates two novel ideas: (i) sensitivity-based non-uniform quantization, which searches for the optimal bit precision assignment based on second-order information; and (ii) the Dense-and-Sparse decomposition that stores outliers and sensitive weight values in an efficient sparse format.
arXiv Detail & Related papers (2023-06-13T08:57:54Z)
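The Dense-and-Sparse decomposition mentioned in the entry above can be pictured as splitting each weight matrix into a small set of full-precision outliers stored sparsely plus a dense remainder that is quantized to low bit-width. The sketch below illustrates only that split; the sensitivity-based non-uniform quantization itself (which relies on second-order information) is not shown, and the magnitude-based threshold is an assumption.

```python
# Sketch (assumption) of the Dense-and-Sparse idea: keep a small fraction of
# large-magnitude outlier weights in full precision (sparse) and quantize the rest.
import torch

def dense_and_sparse_split(w: torch.Tensor, outlier_frac: float = 0.005):
    """Split a weight matrix into a sparse outlier part and a dense remainder."""
    k = max(1, int(outlier_frac * w.numel()))
    threshold = w.abs().flatten().topk(k).values[-1]   # k-th largest magnitude
    outlier_mask = w.abs() >= threshold
    sparse_part = (w * outlier_mask).to_sparse()       # outliers kept in full precision
    dense_part = w * (~outlier_mask)                   # remainder, to be quantized to low bit-width
    return dense_part, sparse_part

def reconstruct(dense_quantized: torch.Tensor, sparse_part: torch.Tensor) -> torch.Tensor:
    """Approximate the original weights: quantized dense part plus sparse outliers."""
    return dense_quantized + sparse_part.to_dense()
```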
- Evaluating the Label Efficiency of Contrastive Self-Supervised Learning for Multi-Resolution Satellite Imagery [0.0]
Self-supervised learning has been applied in the remote sensing domain to exploit readily-available unlabeled data.
In this paper, we study self-supervised visual representation learning through the lens of label efficiency.
arXiv Detail & Related papers (2022-10-13T06:54:13Z)
- Few-Shot Non-Parametric Learning with Deep Latent Variable Model [50.746273235463754]
We propose Non-Parametric learning by Compression with Latent Variables (NPC-LV).
NPC-LV is a learning framework for any dataset with abundant unlabeled data but very few labeled examples.
We show that NPC-LV outperforms supervised methods on image classification on all three datasets in the low-data regime.
arXiv Detail & Related papers (2022-06-23T09:35:03Z)
- A novel attention-based network for fast salient object detection [14.246237737452105]
Among current salient object detection networks, the most popular design is the U-shape structure.
We propose a new deep convolutional network architecture with three contributions.
Results demonstrate that the proposed method can compress the model to roughly 1/3 of the original size with almost no loss in accuracy.
arXiv Detail & Related papers (2021-12-20T12:30:20Z)
- MogFace: Rethinking Scale Augmentation on the Face Detector [17.570686622370403]
We investigate the differences among previous solutions, including the foreground and background information of an image and the scale information.
We propose a Selective Scale Enhancement (SSE) strategy which assimilates these two kinds of information efficiently and simultaneously.
Our method achieves state-of-the-art detection performance on all common face detection benchmarks.
arXiv Detail & Related papers (2021-03-20T09:17:04Z)
- HR-Depth: High Resolution Self-Supervised Monocular Depth Estimation [14.81943833870932]
We present an improved DepthNet, HR-Depth, with two effective strategies.
Using ResNet-18 as the encoder, HR-Depth surpasses all previous state-of-the-art (SoTA) methods with the fewest parameters at both high and low resolution.
arXiv Detail & Related papers (2020-12-14T09:15:15Z)
- Towards Accurate Knowledge Transfer via Target-awareness Representation Disentanglement [56.40587594647692]
We propose a novel transfer learning algorithm, introducing the idea of Target-awareness REpresentation Disentanglement (TRED).
TRED disentangles the knowledge relevant to the target task from the original source model and uses it as a regularizer while fine-tuning the target model.
Experiments on various real-world datasets show that our method stably improves standard fine-tuning by more than 2% on average.
arXiv Detail & Related papers (2020-10-16T17:45:08Z)
- Pixel-in-Pixel Net: Towards Efficient Facial Landmark Detection in the Wild [104.61677518999976]
We propose Pixel-in-Pixel Net (PIPNet) for facial landmark detection.
The proposed model is equipped with a novel detection head based on heatmap regression.
To further improve the cross-domain generalization capability of PIPNet, we propose self-training with curriculum.
arXiv Detail & Related papers (2020-03-08T12:23:42Z)
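As a rough picture of what the heatmap-regression detection head in the entry above produces, the sketch below decodes one (x, y) coordinate per landmark by taking the argmax of each predicted heatmap. PIPNet's actual head also regresses offsets on a low-resolution grid, so this is an assumption-level simplification, not the paper's implementation.

```python
# Sketch (assumption, not PIPNet's actual head): decode landmark coordinates
# from per-landmark heatmaps by taking each channel's argmax.
import torch

def decode_heatmaps(heatmaps: torch.Tensor) -> torch.Tensor:
    """heatmaps: (B, num_landmarks, H, W) -> coordinates (B, num_landmarks, 2) as (x, y)."""
    b, n, h, w = heatmaps.shape
    flat = heatmaps.reshape(b, n, -1)
    idx = flat.argmax(dim=-1)                                 # index of the peak in each heatmap
    ys = torch.div(idx, w, rounding_mode="floor").float()
    xs = (idx % w).float()
    return torch.stack([xs, ys], dim=-1)                      # grid coordinates; scale by stride for pixels
```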