Mind the Gap: Bridging Prior Shift in Realistic Few-Shot Crop-Type Classification
- URL: http://arxiv.org/abs/2511.16218v1
- Date: Thu, 20 Nov 2025 10:39:25 GMT
- Title: Mind the Gap: Bridging Prior Shift in Realistic Few-Shot Crop-Type Classification
- Authors: Joana Reuss, Ekaterina Gikalo, Marco Körner
- Abstract summary: Real-world agricultural distributions often suffer from severe class imbalance, typically following a long-tailed distribution. We propose Dirichlet Prior Augmentation (DirPA), a novel method that simulates an unknown label distribution skew of the target domain proactively during model training. Our experiments show that DirPA successfully shifts the decision boundary and stabilizes the training process by acting as a dynamic feature regularizer.
- Score: 0.3823356975862005
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Real-world agricultural distributions often suffer from severe class imbalance, typically following a long-tailed distribution. Labeled datasets for crop-type classification are inherently scarce and remain costly to obtain. When working with such limited data, training sets are frequently constructed to be artificially balanced -- in particular in the case of few-shot learning -- failing to reflect real-world conditions. This mismatch induces a shift between training and test label distributions, degrading real-world generalization. To address this, we propose Dirichlet Prior Augmentation (DirPA), a novel method that simulates an unknown label distribution skew of the target domain proactively during model training. Specifically, we model the real-world distribution as Dirichlet-distributed random variables, effectively performing a prior augmentation during few-shot learning. Our experiments show that DirPA successfully shifts the decision boundary and stabilizes the training process by acting as a dynamic feature regularizer.
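The abstract says the target label distribution is modeled with Dirichlet-distributed random variables, but it does not say how a sampled prior enters training. The following is a minimal sketch, not the authors' implementation. It assumes that a fresh prior is drawn per training episode and applied as an additive log-prior shift on the logits (a logit-adjustment style correction). The function names, the concentration `alpha`, and the scale `tau` are all hypothetical.

```python
import math
import random


def sample_dirichlet_prior(num_classes, alpha, rng):
    """Draw a class-prior vector from a symmetric Dirichlet(alpha).

    Standard construction: normalize independent Gamma(alpha, 1) draws.
    """
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(num_classes)]
    total = sum(draws)
    return [d / total for d in draws]


def adjust_logits(logits, prior, tau=1.0, eps=1e-12):
    """Assumed augmentation step: shift each logit by tau * log(prior)."""
    return [z + tau * math.log(p + eps) for z, p in zip(logits, prior)]


def cross_entropy(logits, label):
    """Numerically stable softmax cross-entropy for one sample."""
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_norm - logits[label]


def episode_loss(batch_logits, labels, alpha, rng):
    """Sample one prior for the episode and average the adjusted losses."""
    prior = sample_dirichlet_prior(len(batch_logits[0]), alpha, rng)
    losses = [
        cross_entropy(adjust_logits(z, prior), y)
        for z, y in zip(batch_logits, labels)
    ]
    return sum(losses) / len(losses), prior


# Stand-in for model outputs on a balanced 4-way few-shot episode.
rng = random.Random(0)
batch_logits = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(8)]
labels = [i % 4 for i in range(8)]

loss, prior = episode_loss(batch_logits, labels, alpha=0.5, rng=rng)
print("sampled prior:", [round(p, 3) for p in prior])
print("episode loss:", round(loss, 3))
```

In this sketch a small `alpha` yields strongly skewed priors, which mimics a long-tailed target domain, while a large `alpha` stays close to the balanced training distribution. A uniform prior shifts every logit by the same constant and therefore leaves the loss unchanged.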
Related papers
- PAC Learnability in the Presence of Performativity [5.996298190476913]
We study whether and when performative binary classification problems are learnable, via the lens of the classic PAC (Probably Approximately Correct) learning framework. We construct a performative empirical risk function, which depends on data from the original distribution and on the type of performative effect. Minimizing this notion of performative risk allows us to show that any PAC-learnable hypothesis space in the standard binary classification setting remains PAC-learnable.
arXiv Detail & Related papers (2025-10-09T15:22:52Z)
- Towards the Mitigation of Confirmation Bias in Semi-supervised Learning: a Debiased Training Perspective [6.164100243945264]
Semi-supervised learning (SSL) commonly exhibits confirmation bias, where models disproportionately favor certain classes.
We introduce TaMatch, a unified framework for debiased training in SSL.
We show that TaMatch significantly outperforms existing state-of-the-art methods across a range of challenging image classification tasks.
arXiv Detail & Related papers (2024-09-26T21:50:30Z)
- SimPro: A Simple Probabilistic Framework Towards Realistic Long-Tailed Semi-Supervised Learning [49.94607673097326]
We propose a highly adaptable framework, designated as SimPro, which does not rely on any predefined assumptions about the distribution of unlabeled data.
Our framework, grounded in a probabilistic model, innovatively refines the expectation-maximization algorithm.
Our method showcases consistent state-of-the-art performance across diverse benchmarks and data distribution scenarios.
arXiv Detail & Related papers (2024-02-21T03:39:04Z)
- Exploring Vacant Classes in Label-Skewed Federated Learning [113.65301899666645]
This paper introduces FedVLS, a novel approach to label-skewed federated learning. It integrates vacant-class distillation and logit suppression simultaneously. Experiments validate the efficacy of FedVLS, demonstrating superior performance compared to previous state-of-the-art (SOTA) methods.
arXiv Detail & Related papers (2024-01-04T16:06:31Z)
- Class Distribution Shifts in Zero-Shot Learning: Learning Robust Representations [3.8980564330208662]
We propose and analyze a model that assumes that the attribute responsible for the shift is unknown in advance. We show that our algorithm improves generalization to diverse class distributions in both simulations and experiments on real-world datasets.
arXiv Detail & Related papers (2023-11-30T14:14:31Z)
- Time-series Generation by Contrastive Imitation [87.51882102248395]
We study a generative framework that seeks to combine the strengths of both: Motivated by a moment-matching objective to mitigate compounding error, we optimize a local (but forward-looking) transition policy.
At inference, the learned policy serves as the generator for iterative sampling, and the learned energy serves as a trajectory-level measure for evaluating sample quality.
arXiv Detail & Related papers (2023-11-02T16:45:25Z)
- Fairness Transferability Subject to Bounded Distribution Shift [5.62716254065607]
Given an algorithmic predictor that is "fair" on some source distribution, will it still be fair on an unknown target distribution that differs from the source within some bound?
We study the transferability of statistical group fairness for machine learning predictors subject to bounded distribution shifts.
arXiv Detail & Related papers (2022-05-31T22:16:44Z)
- Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z)
- Evaluating Predictive Uncertainty and Robustness to Distributional Shift Using Real World Data [0.0]
We propose metrics for general regression tasks using the Shifts Weather Prediction dataset.
We also present an evaluation of the baseline methods using these metrics.
arXiv Detail & Related papers (2021-11-08T17:32:10Z)
- WILDS: A Benchmark of in-the-Wild Distribution Shifts [157.53410583509924]
Distribution shifts can substantially degrade the accuracy of machine learning systems deployed in the wild.
We present WILDS, a curated collection of 8 benchmark datasets that reflect a diverse range of distribution shifts.
We show that standard training results in substantially lower out-of-distribution than in-distribution performance.
arXiv Detail & Related papers (2020-12-14T11:14:56Z)
- Learning Invariant Representations and Risks for Semi-supervised Domain Adaptation [109.73983088432364]
We propose the first method that aims to simultaneously learn invariant representations and risks under the setting of semi-supervised domain adaptation (Semi-DA).
We introduce the LIRR algorithm for jointly Learning Invariant Representations and Risks.
arXiv Detail & Related papers (2020-10-09T15:42:35Z)
- Estimating Generalization under Distribution Shifts via Domain-Invariant Representations [75.74928159249225]
We use a set of domain-invariant predictors as a proxy for the unknown, true target labels.
The error of the resulting risk estimate depends on the target risk of the proxy model.
arXiv Detail & Related papers (2020-07-06T17:21:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.