Prototype-Driven Adaptation for Few-Shot Object Detection
- URL: http://arxiv.org/abs/2510.25318v1
- Date: Wed, 29 Oct 2025 09:32:42 GMT
- Title: Prototype-Driven Adaptation for Few-Shot Object Detection
- Authors: Yushen Huang, Zhiming Wang
- Abstract summary: Prototype-Driven Alignment (PDA) is a lightweight, plug-in metric head for DeFRCN. PDA maintains support-only prototypes and applies prototype-conditioned RoI alignment to reduce mismatch. Experiments on FSOD and GFSOD benchmarks show that PDA consistently improves novel-class performance with minimal impact on base classes and negligible computational overhead.
- Score: 9.557416198772673
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-shot object detection (FSOD) often suffers from base-class bias and unstable calibration when only a few novel samples are available. We propose Prototype-Driven Alignment (PDA), a lightweight, plug-in metric head for DeFRCN that provides a prototype-based "second opinion" complementary to the linear classifier. PDA maintains support-only prototypes in a learnable identity-initialized projection space and optionally applies prototype-conditioned RoI alignment to reduce geometric mismatch. During fine-tuning, prototypes can be adapted via exponential moving average (EMA) updates on labeled foreground RoIs, without introducing class-specific parameters, and are frozen at inference to ensure strict protocol compliance. PDA employs a best-of-K matching scheme to capture intra-class multi-modality and temperature-scaled fusion to combine metric similarities with detector logits. Experiments on VOC FSOD and GFSOD benchmarks show that PDA consistently improves novel-class performance with minimal impact on base classes and negligible computational overhead.
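The abstract names three mechanisms: EMA prototype updates, best-of-K matching, and temperature-scaled fusion with detector logits. The following is a minimal NumPy sketch of how such pieces could fit together, not the authors' implementation. The function names, the use of cosine similarity, the additive fusion form, and the `momentum`, `temperature`, and `weight` parameters are all assumptions made for illustration; the learnable projection and the prototype-conditioned RoI alignment are omitted.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along `axis`."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def ema_update(prototype, roi_feature, momentum=0.9):
    """Move one prototype toward a labeled foreground RoI feature (assumed EMA form)."""
    return momentum * prototype + (1.0 - momentum) * roi_feature


def best_of_k_similarity(roi_feature, prototypes):
    """Per-class maximum cosine similarity over K prototypes.

    roi_feature: shape (D,); prototypes: shape (C, K, D). Returns shape (C,).
    """
    q = l2_normalize(roi_feature)
    p = l2_normalize(prototypes)
    sims = p @ q  # shape (C, K)
    return sims.max(axis=1)


def fuse_logits(detector_logits, metric_sims, temperature=0.1, weight=0.5):
    """Add temperature-scaled similarities to detector logits (assumed additive fusion)."""
    return detector_logits + weight * metric_sims / temperature


# Toy example: 2 classes, K=2 prototypes per class, 3-dimensional features.
prototypes = np.array([
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],  # class 0 has two distinct modes
    [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]],  # class 1
])
roi = np.array([0.0, 2.0, 0.0])          # matches the second mode of class 0
detector_logits = np.array([0.0, 1.0])   # the linear classifier prefers class 1

sims = best_of_k_similarity(roi, prototypes)
fused = fuse_logits(detector_logits, sims, temperature=0.5, weight=1.0)
updated = ema_update(prototypes[0, 0], np.array([0.0, 1.0, 0.0]), momentum=0.9)
```

In this toy case the RoI matches only one of the two class-0 prototypes, so averaging over prototypes would dilute the match while taking the maximum keeps it, and the fused score flips the prediction from class 1 to class 0.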
Related papers
- FSP-DETR: Few-Shot Prototypical Parasitic Ova Detection [6.732071883787906]
We present FSP-DETR, a unified detection framework that enables robust few-shot detection, open-set recognition, and generalization to unseen biomedical tasks. Built upon a class-agnostic DETR backbone, our approach constructs class prototypes from original support images and learns an embedding space using augmented views and a lightweight transformer decoder. Tests across ova, blood cell, and malaria detection tasks demonstrate that FSP-DETR significantly outperforms prior few-shot and prototype-based detectors.
arXiv Detail & Related papers (2025-10-10T17:38:40Z) - Multi-Cache Enhanced Prototype Learning for Test-Time Generalization of Vision-Language Models [7.542207462593201]
In the zero-shot setting, test-time adaptation adjusts pre-trained models using unlabeled data from the test phase to enhance performance on unknown test distributions. This study identifies a positive correlation between cache-enhanced performance and intra-class compactness. We propose a Multi-Cache enhanced Prototype-based Test-Time Adaptation (MCP) featuring three caches.
arXiv Detail & Related papers (2025-08-02T06:43:43Z) - Probabilistic Prototype Calibration of Vision-Language Models for Generalized Few-shot Semantic Segmentation [75.18058114915327]
Generalized Few-Shot Semantic Segmentation (GFSS) aims to extend a segmentation model to novel classes with only a few annotated examples. We propose FewCLIP, a probabilistic prototype calibration framework over multi-modal prototypes from the pretrained CLIP. We show FewCLIP significantly outperforms state-of-the-art approaches across both GFSS and class-incremental settings.
arXiv Detail & Related papers (2025-06-28T18:36:22Z) - FastRef:Fast Prototype Refinement for Few-Shot Industrial Anomaly Detection [18.487111110151115]
Few-shot industrial anomaly detection (FS-IAD) presents a critical challenge for practical automated inspection systems. We propose FastRef, a novel and efficient prototype refinement framework for FS-IAD. For comprehensive evaluation, we integrate FastRef with three competitive prototype-based FS-IAD methods: PatchCore, FastRecon, WinCLIP, and AnomalyDINO.
arXiv Detail & Related papers (2025-06-26T15:46:28Z) - Demystifying Catastrophic Forgetting in Two-Stage Incremental Object Detector [42.40881712297689]
Catastrophic forgetting is predominantly localized to the RoI Head. NSGP-RePRE mitigates forgetting via replay of two types of prototypes. NSGP-RePRE achieves state-of-the-art performance on the Pascal VOC and MS COCO datasets.
arXiv Detail & Related papers (2025-02-08T12:10:02Z) - Prototypical Calibrating Ambiguous Samples for Micro-Action Recognition [34.4463059961465]
Micro-Action Recognition (MAR) has gained increasing attention due to its crucial role as a form of non-verbal communication in social interactions. Current approaches often overlook the inherent ambiguity in micro-actions, which arises from the wide category range and subtle visual differences between categories. We propose a novel Prototypical Calibrating Ambiguous Network (PCAN) to unleash and mitigate the ambiguity of MAR.
arXiv Detail & Related papers (2024-12-19T10:41:24Z) - PCoTTA: Continual Test-Time Adaptation for Multi-Task Point Cloud Understanding [40.42904797189929]
We present PCoTTA, an innovative framework for Continual Test-Time Adaptation (CoTTA) in multi-task point cloud understanding.
Our PCoTTA involves three key components: automatic prototype mixture (APM), Gaussian Splatted feature shifting (GSFS), and contrastive prototype repulsion (CPR).
CPR is proposed to pull the nearest learnable prototype close to the testing feature and push it away from other prototypes, making each prototype distinguishable during the adaptation.
arXiv Detail & Related papers (2024-11-01T14:41:36Z) - Rethinking Few-shot 3D Point Cloud Semantic Segmentation [62.80639841429669]
This paper revisits few-shot 3D point cloud semantic segmentation (FS-PCS).
We focus on two significant issues in the state-of-the-art: foreground leakage and sparse point distribution.
To address these issues, we introduce a standardized FS-PCS setting, upon which a new benchmark is built.
arXiv Detail & Related papers (2024-03-01T15:14:47Z) - Decoupled Prototype Learning for Reliable Test-Time Adaptation [50.779896759106784]
Test-time adaptation (TTA) is a task that continually adapts a pre-trained source model to the target domain during inference.
One popular approach involves fine-tuning the model with cross-entropy loss according to estimated pseudo-labels.
This study reveals that minimizing the classification error of each sample causes the cross-entropy loss's vulnerability to label noise.
We propose a novel Decoupled Prototype Learning (DPL) method that features prototype-centric loss computation.
arXiv Detail & Related papers (2024-01-15T03:33:39Z) - AdjointDPM: Adjoint Sensitivity Method for Gradient Backpropagation of Diffusion Probabilistic Models [103.41269503488546]
Existing customization methods require access to multiple reference examples to align pre-trained diffusion probabilistic models with user-provided concepts.
This paper aims to address the challenge of DPM customization when the only available supervision is a differentiable metric defined on the generated contents.
We propose a novel method AdjointDPM, which first generates new samples from diffusion models by solving the corresponding probability-flow ODEs.
It then uses the adjoint sensitivity method to backpropagate the gradients of the loss to the models' parameters.
arXiv Detail & Related papers (2023-07-20T09:06:21Z) - UniPC: A Unified Predictor-Corrector Framework for Fast Sampling of Diffusion Models [92.43617471204963]
Diffusion probabilistic models (DPMs) have demonstrated a very promising ability in high-resolution image synthesis.
We develop a unified corrector (UniC) that can be applied after any existing DPM sampler to increase the order of accuracy.
We propose a unified predictor-corrector framework called UniPC for the fast sampling of DPMs.
arXiv Detail & Related papers (2023-02-09T18:59:48Z) - Contrastive Prototype Learning with Augmented Embeddings for Few-Shot Learning [58.2091760793799]
We propose a novel contrastive prototype learning with augmented embeddings (CPLAE) model.
With a class prototype as an anchor, CPL aims to pull the query samples of the same class closer and those of different classes further away.
Extensive experiments on several benchmarks demonstrate that our proposed CPLAE achieves a new state of the art.
arXiv Detail & Related papers (2021-01-23T13:22:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.