On-the-Fly OVD Adaptation with FLAME: Few-shot Localization via Active Marginal-Samples Exploration
- URL: http://arxiv.org/abs/2510.17670v2
- Date: Thu, 30 Oct 2025 12:05:58 GMT
- Title: On-the-Fly OVD Adaptation with FLAME: Few-shot Localization via Active Marginal-Samples Exploration
- Authors: Yehonathan Refael, Amit Aides, Aviad Barzilai, George Leifman, Genady Beryozkin, Vered Silverman, Bolous Jaber, Tomer Shekel,
- Abstract summary: Open-vocabulary object detection (OVD) models offer remarkable flexibility by detecting objects from arbitrary text queries.<n>Their zero-shot performance in specialized domains like Remote Sensing (RS) is often compromised by the inherent ambiguity of natural language.<n>We propose a cascaded approach that couples the broad generalization of a large pre-trained OVD model with a lightweight few-shot classifier.
- Score: 1.7975230539002824
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Open-vocabulary object detection (OVD) models offer remarkable flexibility by detecting objects from arbitrary text queries. However, their zero-shot performance in specialized domains like Remote Sensing (RS) is often compromised by the inherent ambiguity of natural language, limiting critical downstream applications. For instance, an OVD model may struggle to distinguish between fine-grained classes such as "fishing boat" and "yacht" since their embeddings are similar and often inseparable. This can hamper specific user goals, such as monitoring illegal fishing, by producing irrelevant detections. To address this, we propose a cascaded approach that couples the broad generalization of a large pre-trained OVD model with a lightweight few-shot classifier. Our method first employs the zero-shot model to generate high-recall object proposals. These proposals are then refined for high precision by a compact classifier trained in real-time on only a handful of user-annotated examples - drastically reducing the high costs of RS imagery annotation.The core of our framework is FLAME, a one-step active learning strategy that selects the most informative samples for training. FLAME identifies, on the fly, uncertain marginal candidates near the decision boundary using density estimation, followed by clustering to ensure sample diversity. This efficient sampling technique achieves high accuracy without costly full-model fine-tuning and enables instant adaptation, within less then a minute, which is significantly faster than state-of-the-art alternatives.Our method consistently surpasses state-of-the-art performance on RS benchmarks, establishing a practical and resource-efficient framework for adapting foundation models to specific user needs.
Related papers
- MARS: Unleashing the Power of Speculative Decoding via Margin-Aware Verification [7.935725883885573]
Speculative Decoding (SD) accelerates autoregressive large language model (LLM) inference by decoupling generation and verification.<n>We propose Margin-Aware Speculative Verification, a training-free and domain-agnostic verification strategy that adapts to the target model's local decisiveness.<n>Our method conditions verification on decision stability measured directly from the target logits and relaxes rejection only when strict verification provides minimal benefit.
arXiv Detail & Related papers (2026-01-21T22:03:06Z) - Efficient Adaptive Rejection Sampling for Accelerating Speculative Decoding in Large Language Models [2.4065240342323384]
This paper introduces Efficient Adaptive Rejection Sampling (EARS)<n>EARS dynamically adjusts the acceptance threshold by incorporating the target model's own predictive uncertainty, measured as 1 - max(P_target)<n>It significantly enhances the efficiency of speculative decoding, achieving up to an 18.12% increase in throughput with a negligible 0.84% accuracy drop on the GSM8K benchmark.
arXiv Detail & Related papers (2025-12-15T11:08:56Z) - Open-Set Deepfake Detection: A Parameter-Efficient Adaptation Method with Forgery Style Mixture [81.93945602120453]
We introduce an approach that is both general and parameter-efficient for face forgery detection.<n>We design a forgery-style mixture formulation that augments the diversity of forgery source domains.<n>We show that the designed model achieves state-of-the-art generalizability with significantly reduced trainable parameters.
arXiv Detail & Related papers (2024-08-23T01:53:36Z) - Adversarial Robustification via Text-to-Image Diffusion Models [56.37291240867549]
Adrial robustness has been conventionally believed as a challenging property to encode for neural networks.
We develop a scalable and model-agnostic solution to achieve adversarial robustness without using any data.
arXiv Detail & Related papers (2024-07-26T10:49:14Z) - A Closer Look at the Few-Shot Adaptation of Large Vision-Language Models [19.17722702457403]
We show that state-of-the-artETL approaches exhibit strong performance only in narrowly-defined experimental setups.
We propose a CLass-Adaptive linear Probe (CLAP) objective, whose balancing term is optimized via an adaptation of the general Augmented Lagrangian method.
arXiv Detail & Related papers (2023-12-20T02:58:25Z) - Soft ascent-descent as a stable and flexible alternative to flooding [6.527016551650139]
We propose a softened, pointwise mechanism called SoftAD that downweights points on the borderline, limits the effects of outliers, and retains the ascent-descent effect of flooding.
We demonstrate how SoftAD can realize classification accuracy competitive with flooding while enjoying a much smaller loss generalization gap and model norm.
arXiv Detail & Related papers (2023-10-16T02:02:56Z) - Small Object Detection via Coarse-to-fine Proposal Generation and
Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z) - Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z) - Discriminative Nearest Neighbor Few-Shot Intent Detection by
Transferring Natural Language Inference [150.07326223077405]
Few-shot learning is attracting much attention to mitigate data scarcity.
We present a discriminative nearest neighbor classification with deep self-attention.
We propose to boost the discriminative ability by transferring a natural language inference (NLI) model.
arXiv Detail & Related papers (2020-10-25T00:39:32Z) - Scope Head for Accurate Localization in Object Detection [135.9979405835606]
We propose a novel detector coined as ScopeNet, which models anchors of each location as a mutually dependent relationship.
With our concise and effective design, the proposed ScopeNet achieves state-of-the-art results on COCO.
arXiv Detail & Related papers (2020-05-11T04:00:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.