Related papers: Re-Scoring Using Image-Language Similarity for Few-Shot Object Detection

URL: http://arxiv.org/abs/2311.00278v1
Date: Wed, 1 Nov 2023 04:04:34 GMT
Title: Re-Scoring Using Image-Language Similarity for Few-Shot Object Detection
Authors: Min Jae Jung, Seung Dae Han and Joohee Kim
Abstract summary: Few-shot object detection, which focuses on detecting novel objects with few labels, is an emerging challenge in the community. Recent studies show that adapting a pre-trained model or modifying the loss function can improve performance. We propose Re-scoring using Image-language Similarity for Few-shot object detection (RISF), which extends Faster R-CNN.
Score: 4.0208298639821525
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Few-shot object detection, which focuses on detecting novel objects with few labels, is an emerging challenge in the community. Recent studies show that adapting a pre-trained model or modifying the loss function can improve performance. In this paper, we explore leveraging the power of Contrastive Language-Image Pre-training (CLIP) and a hard negative classification loss in the low-data setting. Specifically, we propose Re-scoring using Image-language Similarity for Few-shot object detection (RISF), which extends Faster R-CNN by introducing a Calibration Module using CLIP (CM-CLIP) and a Background Negative Re-scale Loss (BNRL). The former adapts CLIP, which performs zero-shot classification, to re-score the classification scores of a detector using image-class similarities; the latter is a modified classification loss that penalizes fake backgrounds as well as confusing categories on a generalized few-shot object detection dataset. Extensive experiments on MS-COCO and PASCAL VOC show that the proposed RISF substantially outperforms the state-of-the-art approaches. The code will be available.
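
The re-scoring idea in the abstract can be illustrated with a small sketch. This is not the paper's CM-CLIP formulation, which the abstract does not spell out. It assumes that region and class-prompt embeddings have already been computed by some image-language model, and it fuses detector scores with similarity-based scores by a weighted geometric mean. The function name `rescore`, the parameters `alpha` and `temperature`, and the fusion rule itself are all hypothetical choices made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def rescore(det_scores, region_emb, text_emb, alpha=0.5, temperature=0.01):
    """Fuse detector class scores with image-class similarity scores.

    det_scores: (N, C) per-region class probabilities from a detector.
    region_emb: (N, D) embeddings of the cropped regions (assumed given).
    text_emb:   (C, D) embeddings of the class prompts (assumed given).
    alpha:      hypothetical weight of the similarity term (0 = detector only).
    temperature: hypothetical softmax temperature for the similarities.
    """
    # Cosine similarity between every region and every class prompt.
    region = region_emb / np.linalg.norm(region_emb, axis=1, keepdims=True)
    text = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    sim_probs = softmax(region @ text.T / temperature, axis=1)

    # Weighted geometric mean (an assumed fusion rule), renormalized per region.
    fused = det_scores ** (1.0 - alpha) * sim_probs ** alpha
    return fused / fused.sum(axis=1, keepdims=True)


# Toy example: the detector is undecided, the similarity favors class 1.
det = np.array([[0.5, 0.5]])
regions = np.array([[1.0, 0.0]])
prompts = np.array([[0.0, 1.0], [1.0, 0.0]])
print(rescore(det, regions, prompts).argmax(axis=1))
```

With `alpha=0` the detector scores pass through unchanged, so the weight controls how strongly the similarity term is allowed to override the detector.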

Related papers

CSPCL: Category Semantic Prior Contrastive Learning for Deformable DETR-Based Prohibited Item Detectors [8.23801404004195]
Prohibited item detection based on X-ray images is one of the most effective security inspection methods. Foreground-background feature coupling makes general detectors designed for natural images perform poorly. We propose a Category Semantic Prior Contrastive Learning mechanism to align the class prototypes perceived by the classifier with the content queries.
arXiv Detail & Related papers (2025-01-28T03:04:22Z)
Few-shot Algorithm Assurance [11.924406021826606]
Deep learning models are vulnerable to image distortion. Model Assurance under Image Distortion is a classification task. We propose a novel Conditional Level Set Estimation algorithm.
arXiv Detail & Related papers (2024-12-28T21:11:55Z)
CLIP-FSAC++: Few-Shot Anomaly Classification with Anomaly Descriptor Based on CLIP [22.850815902535988]
We propose an effective few-shot anomaly classification framework with one-stage training, dubbed CLIP-FSAC++. In the anomaly descriptor, an image-to-text cross-attention module is used to obtain image-specific text embeddings. Comprehensive experimental results are provided for evaluating our method in few-normal shot anomaly classification on VisA and MVTEC-AD for 1, 2, 4 and 8-shot settings.
arXiv Detail & Related papers (2024-12-05T02:44:45Z)
Multi-Level Correlation Network For Few-Shot Image Classification [36.44416763952161]
Few-shot image classification aims to recognize novel classes given few labeled images from base classes. We propose a multi-level correlation network (MLCN) for FSIC to tackle this problem by effectively capturing local information.
arXiv Detail & Related papers (2024-12-04T09:36:24Z)
RAFIC: Retrieval-Augmented Few-shot Image Classification [0.0]
Few-shot image classification is the task of classifying unseen images into one of N mutually exclusive classes. We have developed a method for augmenting the set of K with an additional set of A retrieved images. We demonstrate that RAFIC markedly improves performance of few-shot image classification across two challenging datasets.
arXiv Detail & Related papers (2023-12-11T22:28:51Z)
Spuriosity Rankings for Free: A Simple Framework for Last Layer Retraining Based on Object Detection [5.199218657137718]
We propose a novel ranking framework to identify images without spurious cues. We use the object detector as a measure to score the presence of the target object in the images. Next, the images are sorted based on this score, and the last layer of the model is retrained on a subset of the data with the highest scores.
arXiv Detail & Related papers (2023-10-31T18:44:03Z)
Zero-Shot Visual Classification with Guided Cropping [9.321383320998262]
We propose using an off-the-shelf zero-shot object detection model in a preprocessing step to increase the focus of the zero-shot classifier on the object of interest. We empirically show that our approach improves zero-shot classification results across architectures and datasets, favorably for small objects.
arXiv Detail & Related papers (2023-09-12T20:09:12Z)
Image-free Classifier Injection for Zero-Shot Classification [72.66409483088995]
Zero-shot learning models achieve remarkable results on image classification for samples from classes that were not seen during training. We aim to equip pre-trained models with zero-shot classification capabilities without the use of image data. We achieve this with our proposed Image-free Injection with Semantics (ICIS).
arXiv Detail & Related papers (2023-08-21T09:56:48Z)
Prefix Conditioning Unifies Language and Label Supervision [84.11127588805138]
We show that dataset biases negatively affect pre-training by reducing the generalizability of learned representations. In experiments, we show that this simple technique improves zero-shot image recognition accuracy and robustness to image-level distribution shift.
arXiv Detail & Related papers (2022-06-02T16:12:26Z)
Incremental-DETR: Incremental Few-Shot Object Detection via Self-Supervised Learning [60.64535309016623]
We propose the Incremental-DETR that does incremental few-shot object detection via fine-tuning and self-supervised learning on the DETR object detector. To alleviate severe over-fitting with few novel class data, we first fine-tune the class-specific components of DETR with self-supervision. We further introduce an incremental few-shot fine-tuning strategy with knowledge distillation on the class-specific components of DETR to encourage the network to detect novel classes without catastrophic forgetting.
arXiv Detail & Related papers (2022-05-09T05:08:08Z)
No Token Left Behind: Explainability-Aided Image Classification and Generation [79.4957965474334]
We present a novel explainability-based approach, which adds a loss term to ensure that CLIP focuses on all relevant semantic parts of the input. Our method yields an improvement in the recognition rate, without additional training or fine-tuning.
arXiv Detail & Related papers (2022-04-11T07:16:39Z)
Experience feedback using Representation Learning for Few-Shot Object Detection on Aerial Images [2.8560476609689185]
The performance of our method is assessed on DOTA, a large-scale remote sensing image dataset. It highlights in particular some intrinsic weaknesses for the few-shot object detection task.
arXiv Detail & Related papers (2021-09-27T13:04:53Z)
Rectifying the Shortcut Learning of Background: Shared Object Concentration for Few-Shot Image Recognition [101.59989523028264]
Few-Shot image classification aims to utilize pretrained knowledge learned from a large-scale dataset to tackle a series of downstream classification tasks. We propose COSOC, a novel Few-Shot Learning framework, to automatically figure out foreground objects at both pretraining and evaluation stage.
arXiv Detail & Related papers (2021-07-16T07:46:41Z)
Meta Faster R-CNN: Towards Accurate Few-Shot Object Detection with Attentive Feature Alignment [33.446875089255876]
Few-shot object detection (FSOD) aims to detect objects using only a few examples. We propose a meta-learning based few-shot object detection method by transferring meta-knowledge learned from data-abundant base classes to data-scarce novel classes.
arXiv Detail & Related papers (2021-04-15T19:01:27Z)
One-Shot Object Detection without Fine-Tuning [62.39210447209698]
We introduce a two-stage model consisting of a first stage Matching-FCOS network and a second stage Structure-Aware Relation Module. We also propose novel training strategies that effectively improve detection performance. Our method exceeds the state-of-the-art one-shot performance consistently on multiple datasets.
arXiv Detail & Related papers (2020-05-08T01:59:23Z)

This list is automatically generated from the titles and abstracts of the papers on this site.