Zero-Shot Object Counting with Language-Vision Models
- URL: http://arxiv.org/abs/2309.13097v1
- Date: Fri, 22 Sep 2023 14:48:42 GMT
- Title: Zero-Shot Object Counting with Language-Vision Models
- Authors: Jingyi Xu, Hieu Le, Dimitris Samaras
- Abstract summary: Class-agnostic object counting aims to count object instances of an arbitrary class at test time.
Current methods require human-annotated exemplars as inputs which are often unavailable for novel categories.
We propose zero-shot object counting (ZSC), a new setting where only the class name is available during test time.
- Score: 50.1159882903028
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Class-agnostic object counting aims to count object instances of an arbitrary
class at test time. It is challenging but also enables many potential
applications. Current methods require human-annotated exemplars as inputs which
are often unavailable for novel categories, especially for autonomous systems.
Thus, we propose zero-shot object counting (ZSC), a new setting where only the
class name is available during test time. This obviates the need for human
annotators and enables automated operation. To perform ZSC, we propose finding
a few object crops from the input image and using them as counting exemplars. The
goal is to identify patches that contain the objects of interest while also being
visually representative of all instances in the image. To do this, we first
construct class prototypes using large language-vision models, including CLIP
and Stable Diffusion, to select the patches containing the target objects.
Furthermore, we propose a ranking model that estimates the counting error of
each patch to select the most suitable exemplars for counting. Experimental
results on a recent class-agnostic counting dataset, FSC-147, validate the
effectiveness of our method.
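As an illustration of the patch-selection idea described above, here is a minimal sketch that scores sliding-window crops against a CLIP text prototype and keeps the highest-scoring crops as candidate exemplars. It assumes CLIP accessed through the Hugging Face transformers library; the prompt template, crop size, stride, and top-k values are placeholder choices, and the paper's Stable Diffusion prototypes and error-ranking model are not reproduced here.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Illustrative sketch only: a generic CLIP-based patch scorer, not the paper's exact pipeline.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")


def select_exemplar_patches(image, class_name, patch_size=128, stride=64, top_k=3):
    """Return bounding boxes of the crops most similar to the class-name prototype."""
    # Text prototype for the class name (prompt template is an assumption).
    text_inputs = processor(text=[f"a photo of a {class_name}"],
                            return_tensors="pt", padding=True)
    with torch.no_grad():
        text_feat = model.get_text_features(**text_inputs)
    text_feat = text_feat / text_feat.norm(dim=-1, keepdim=True)

    # Enumerate sliding-window crops over the image.
    crops, boxes = [], []
    w, h = image.size
    for y in range(0, max(h - patch_size, 0) + 1, stride):
        for x in range(0, max(w - patch_size, 0) + 1, stride):
            crops.append(image.crop((x, y, x + patch_size, y + patch_size)))
            boxes.append((x, y, x + patch_size, y + patch_size))

    image_inputs = processor(images=crops, return_tensors="pt")
    with torch.no_grad():
        img_feats = model.get_image_features(**image_inputs)
    img_feats = img_feats / img_feats.norm(dim=-1, keepdim=True)

    # Cosine similarity between each crop and the class prototype; keep the top-k crops.
    scores = (img_feats @ text_feat.T).squeeze(-1)
    top = scores.topk(min(top_k, len(crops))).indices.tolist()
    return [boxes[i] for i in top]
```

For example, select_exemplar_patches(Image.open("image.jpg"), "apple") would return a few boxes that an exemplar-based counting model could then take as input; in the paper, a separate ranking model additionally estimates the counting error of each candidate patch before the final exemplars are chosen.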
Related papers
- Mind the Prompt: A Novel Benchmark for Prompt-based Class-Agnostic Counting [8.000723123087473]
Class-agnostic counting (CAC) is a recent task in computer vision that aims to estimate the number of instances of arbitrary object classes never seen during model training.
We introduce the Prompt-Aware Counting benchmark, which comprises two targeted tests, each accompanied by appropriate evaluation metrics.
arXiv Detail & Related papers (2024-09-24T10:35:42Z)
- Learning from Pseudo-labeled Segmentation for Multi-Class Object Counting [35.652092907690694]
Class-agnostic counting (CAC) has numerous potential applications across various domains.
The goal is to count objects of an arbitrary category during testing, based on only a few annotated exemplars.
We show that a segmentation model trained on pseudo-labeled masks can effectively localize objects of interest in an arbitrary multi-class image.
arXiv Detail & Related papers (2023-07-15T01:33:19Z)
- Zero-shot Object Counting [31.192588671258775]
Class-agnostic object counting aims to count object instances of an arbitrary class at test time.
Current methods require human-annotated exemplars as inputs which are often unavailable for novel categories.
We propose zero-shot object counting (ZSC), a new setting where only the class name is available during test time.
arXiv Detail & Related papers (2023-03-03T15:14:36Z)
- Exploiting Unlabeled Data with Vision and Language Models for Object Detection [64.94365501586118]
Building robust and generic object detection frameworks requires scaling to larger label spaces and bigger training datasets.
We propose a novel method that leverages the rich semantics available in recent vision and language models to localize and classify objects in unlabeled images.
We demonstrate the value of the generated pseudo labels in two specific tasks, open-vocabulary detection and semi-supervised object detection.
arXiv Detail & Related papers (2022-07-18T21:47:15Z)
- Exemplar Free Class Agnostic Counting [28.41525571128706]
Class agnostic counting aims to count objects in a novel object category at test time without access to labeled training data for that category.
Our proposed approach first identifies exemplars from repeating objects in an image, and then counts the repeating objects.
We evaluate our proposed approach on FSC-147 dataset, and show that it achieves superior performance compared to the existing approaches.
arXiv Detail & Related papers (2022-05-27T19:44:39Z)
- Synthesizing the Unseen for Zero-shot Object Detection [72.38031440014463]
We propose to synthesize visual features for unseen classes, so that the model learns both seen and unseen objects in the visual domain.
We use a novel generative model that uses class-semantics to not only generate the features but also to discriminatively separate them.
arXiv Detail & Related papers (2020-10-19T12:36:11Z)
- A Few-Shot Sequential Approach for Object Counting [63.82757025821265]
We introduce a class attention mechanism that sequentially attends to objects in the image and extracts their relevant features.
The proposed technique is trained on point-level annotations and uses a novel loss function that disentangles class-dependent and class-agnostic aspects of the model.
We present our results on a variety of object-counting/detection datasets, including FSOD and MS COCO.
arXiv Detail & Related papers (2020-07-03T18:23:39Z)
- Any-Shot Object Detection [81.88153407655334]
'Any-shot detection' is the setting in which totally unseen and few-shot categories can co-occur during inference.
We propose a unified any-shot detection model, that can concurrently learn to detect both zero-shot and few-shot object classes.
Our framework can also be used solely for Zero-shot detection and Few-shot detection tasks.
arXiv Detail & Related papers (2020-03-16T03:43:15Z)
- Incremental Few-Shot Object Detection [96.02543873402813]
OpeN-ended Centre nEt (ONCE) is a detector for incrementally learning to detect objects of novel classes from only a few examples.
ONCE fully respects the incremental learning paradigm, with novel class registration requiring only a single forward pass of few-shot training samples.
arXiv Detail & Related papers (2020-03-10T12:56:59Z)