Sampling Strategy for Fine-Tuning Segmentation Models to Crisis Area
under Scarcity of Data
- URL: http://arxiv.org/abs/2202.04766v1
- Date: Wed, 9 Feb 2022 23:16:58 GMT
- Title: Sampling Strategy for Fine-Tuning Segmentation Models to Crisis Area
under Scarcity of Data
- Authors: Adrianna Janik and Kris Sankaran
- Abstract summary: We propose a method to guide data collection during fine-tuning, based on estimated model and sample properties.
We have applied our method to a deep learning model for semantic segmentation, U-Net, in a remote sensing application of building detection.
- Score: 0.76146285961466
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The use of remote sensing in humanitarian crisis response missions is
well-established and has proven relevant repeatedly. One of the problems is
obtaining gold annotations, as it is costly and time-consuming, which makes it
almost impossible to fine-tune models to new regions affected by a crisis.
Where time is critical, resources are limited and the environment is constantly
changing, models have to evolve and provide flexible ways to adapt to a new
situation. The question we want to answer is whether prioritization of samples
provides better results in fine-tuning than other classical sampling methods
under annotated-data scarcity. We propose a method to guide data collection
during fine-tuning, based on estimated model and sample properties, such as the
predicted IoU score. We propose two formulas for calculating sample priority.
Our approach blends techniques from interpretability, representation learning
and active learning. We have applied our method to a deep learning model for
semantic segmentation, U-Net, in a remote sensing application of building
detection, one of the core use cases of remote sensing in humanitarian
applications. Preliminary results show the utility of prioritizing samples when
tuning semantic segmentation models under data scarcity.
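The abstract describes ranking candidate samples by estimated properties such as a predicted IoU score, but does not spell out the two priority formulas. The following Python sketch illustrates the general idea with two hypothetical formulas (hard-sample priority and uncertainty-style priority); the function names and the formulas themselves are illustrative assumptions, not the paper's actual method.

```python
def priority_low_iou(pred_iou: float) -> float:
    """Hypothetical formula 1: prefer samples the model segments
    poorly, i.e. those with the lowest predicted IoU."""
    return 1.0 - pred_iou

def priority_uncertainty(pred_iou: float) -> float:
    """Hypothetical formula 2: prefer samples where the predicted
    IoU is least decisive (closest to 0.5)."""
    return 1.0 - abs(pred_iou - 0.5) * 2.0

def rank_samples(samples, priority_fn):
    """Sort (sample_id, predicted_iou) pairs by descending priority.

    The resulting order is the order in which annotations would be
    requested during fine-tuning under data scarcity."""
    scored = ((sid, priority_fn(iou)) for sid, iou in samples)
    return [sid for sid, _ in sorted(scored, key=lambda t: t[1], reverse=True)]

# Toy pool of image tiles with predicted IoU scores from the current model.
pool = [("tile_a", 0.9), ("tile_b", 0.3), ("tile_c", 0.55)]
ranked = rank_samples(pool, priority_low_iou)  # tile_b (lowest IoU) comes first
```

Under the hard-sample formula, annotation effort goes first to tiles the model handles worst; the uncertainty-style formula instead surfaces tiles where the model's quality estimate is most ambiguous.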
Related papers
- Downstream-Pretext Domain Knowledge Traceback for Active Learning [138.02530777915362]
We propose a downstream-pretext domain knowledge traceback (DOKT) method that traces the data interactions of downstream knowledge and pre-training guidance.
DOKT consists of a traceback diversity indicator and a domain-based uncertainty estimator.
Experiments conducted on ten datasets show that our model outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2024-07-20T01:34:13Z)
- Self-Guided Generation of Minority Samples Using Diffusion Models [57.319845580050924]
We present a novel approach for generating minority samples that live on low-density regions of a data manifold.
Our framework is built upon diffusion models, leveraging the principle of guided sampling.
Experiments on benchmark real datasets demonstrate that our approach can greatly improve the capability of creating realistic low-likelihood minority instances.
arXiv Detail & Related papers (2024-07-16T10:03:29Z)
- Model-Free Active Exploration in Reinforcement Learning [53.786439742572995]
We study the problem of exploration in Reinforcement Learning and present a novel model-free solution.
Our strategy is able to identify efficient policies faster than state-of-the-art exploration approaches.
arXiv Detail & Related papers (2024-06-30T19:00:49Z)
- Federated Continual Learning Goes Online: Leveraging Uncertainty for Modality-Agnostic Class-Incremental Learning [13.867793835583463]
We propose a new modality-agnostic approach to deal with the online scenario where new data arrive in streams of mini-batches that can only be processed once.
In particular, we suggest using an estimator based on the Bregman Information (BI) to compute the model's variance at the sample level.
arXiv Detail & Related papers (2024-05-29T09:29:39Z)
- Learning with Noisy Foundation Models [95.50968225050012]
This paper is the first work to comprehensively understand and analyze the nature of noise in pre-training datasets.
We propose a tuning method (NMTune) to affine the feature space to mitigate the malignant effect of noise and improve generalization.
arXiv Detail & Related papers (2024-03-11T16:22:41Z)
- Spatial-temporal Forecasting for Regions without Observations [13.805203053973772]
We study spatial-temporal forecasting for a region of interest without any historical observations.
We propose a model named STSM for the task.
Our key insight is to learn from the locations that resemble those in the region of interest.
arXiv Detail & Related papers (2024-01-19T06:26:05Z)
- Gradient and Uncertainty Enhanced Sequential Sampling for Global Fit [0.0]
This paper proposes a new sampling strategy for global fit called Gradient and Uncertainty Enhanced Sequential Sampling (GUESS).
We show that GUESS achieved on average the highest sample efficiency compared to other surrogate-based strategies on the tested examples.
arXiv Detail & Related papers (2023-09-29T19:49:39Z)
- Informative Data Mining for One-Shot Cross-Domain Semantic Segmentation [84.82153655786183]
We propose a novel framework called Informative Data Mining (IDM) to enable efficient one-shot domain adaptation for semantic segmentation.
IDM provides an uncertainty-based selection criterion to identify the most informative samples, which facilitates quick adaptation and reduces redundant training.
Our approach outperforms existing methods and achieves a new state-of-the-art one-shot performance of 56.7%/55.4% on the GTA5/SYNTHIA to Cityscapes adaptation tasks.
arXiv Detail & Related papers (2023-09-25T15:56:01Z)
- Optimal Sample Selection Through Uncertainty Estimation and Its Application in Deep Learning [22.410220040736235]
We present a theoretically optimal solution for addressing both coreset selection and active learning.
Our proposed method, COPS, is designed to minimize the expected loss of a model trained on subsampled data.
arXiv Detail & Related papers (2023-09-05T14:06:33Z)
- Less is More: Mitigate Spurious Correlations for Open-Domain Dialogue Response Generation Models by Causal Discovery [52.95935278819512]
We conduct the first study on spurious correlations for open-domain response generation models based on a corpus CGDIALOG curated in our work.
Inspired by causal discovery algorithms, we propose a novel model-agnostic method for training and inference of response generation model.
arXiv Detail & Related papers (2023-03-02T06:33:48Z)
- Identifying Wrongly Predicted Samples: A Method for Active Learning [6.976600214375139]
We propose a simple sample selection criterion that moves beyond uncertainty.
We show state-of-the-art results and better rates at identifying wrongly predicted samples.
arXiv Detail & Related papers (2020-10-14T09:00:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.