OmniCount: Multi-label Object Counting with Semantic-Geometric Priors
- URL: http://arxiv.org/abs/2403.05435v4
- Date: Tue, 20 Aug 2024 18:08:48 GMT
- Title: OmniCount: Multi-label Object Counting with Semantic-Geometric Priors
- Authors: Anindya Mondal, Sauradip Nag, Xiatian Zhu, Anjan Dutta
- Abstract summary: This paper introduces a more practical approach enabling simultaneous counting of multiple object categories using an open-vocabulary framework.
Our solution, OmniCount, stands out by using semantic and geometric insights (priors) from pre-trained models to count multiple categories of objects as specified by users.
Our comprehensive evaluation on OmniCount-191, alongside other leading benchmarks, demonstrates OmniCount's exceptional performance, significantly outpacing existing solutions.
- Score: 42.38571663534819
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Object counting is pivotal for understanding the composition of scenes. Previously, this task was dominated by class-specific methods, which have gradually evolved into more adaptable class-agnostic strategies. However, these strategies come with their own set of limitations, such as the need for manual exemplar input and multiple passes for multiple categories, resulting in significant inefficiencies. This paper introduces a more practical approach enabling simultaneous counting of multiple object categories using an open-vocabulary framework. Our solution, OmniCount, stands out by using semantic and geometric insights (priors) from pre-trained models to count multiple categories of objects as specified by users, all without additional training. OmniCount distinguishes itself by generating precise object masks and leveraging varied interactive prompts via the Segment Anything Model for efficient counting. To evaluate OmniCount, we created the OmniCount-191 benchmark, a first-of-its-kind dataset with multi-label object counts, including points, bounding boxes, and VQA annotations. Our comprehensive evaluation on OmniCount-191, alongside other leading benchmarks, demonstrates OmniCount's exceptional performance, significantly outpacing existing solutions.
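A minimal sketch of the prompt-then-count step described in the abstract, assuming the official segment-anything package. The per-category point prompts (category_points) stand in for the semantic and geometric priors that OmniCount derives from pre-trained models; how those priors are produced is not reproduced here, and the overlap-based de-duplication is an illustrative assumption rather than the paper's exact procedure.

```python
# Hedged sketch: count per-category instances by prompting SAM with candidate points.
# `category_points` (dict: category name -> list of (x, y) points) is a stand-in for
# the semantic/geometric priors described in the paper.
import numpy as np
from segment_anything import sam_model_registry, SamPredictor  # pip install segment-anything

def count_with_sam(image, category_points, checkpoint="sam_vit_h_4b8939.pth"):
    """image: HxWx3 uint8 RGB array; returns {category: count}."""
    sam = sam_model_registry["vit_h"](checkpoint=checkpoint)
    predictor = SamPredictor(sam)
    predictor.set_image(image)

    counts = {}
    for category, points in category_points.items():
        kept = []  # accepted instance masks for this category
        for x, y in points:
            masks, _, _ = predictor.predict(
                point_coords=np.array([[x, y]], dtype=np.float32),
                point_labels=np.array([1]),      # 1 marks a foreground point
                multimask_output=False,
            )
            mask = masks[0]
            # Skip masks that mostly overlap an already accepted instance
            # (illustrative de-duplication, not the paper's exact rule).
            if all((mask & m).sum() / max(1, min(mask.sum(), m.sum())) < 0.5 for m in kept):
                kept.append(mask)
        counts[category] = len(kept)
    return counts
```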
Related papers
- CountGD: Multi-Modal Open-World Counting [54.88804890463491]
This paper aims to improve the generality and accuracy of open-vocabulary object counting in images.
We introduce the first open-world counting model, CountGD, where the prompt can be specified by a text description or visual exemplars or both.
arXiv Detail & Related papers (2024-07-05T16:20:48Z)
- AFreeCA: Annotation-Free Counting for All [17.581015609730017]
We introduce an unsupervised sorting methodology to learn object-related features that are subsequently refined and anchored for counting purposes.
We also present a density classifier-guided method for dividing an image into patches containing objects that can be reliably counted.
arXiv Detail & Related papers (2024-03-07T23:18:34Z) - Zero-Shot Object Counting with Language-Vision Models [50.1159882903028]
Class-agnostic object counting aims to count object instances of an arbitrary class at test time.
Current methods require human-annotated exemplars as inputs which are often unavailable for novel categories.
We propose zero-shot object counting (ZSC), a new setting where only the class name is available during test time.
arXiv Detail & Related papers (2023-09-22T14:48:42Z)
- Learning from Pseudo-labeled Segmentation for Multi-Class Object Counting [35.652092907690694]
Class-agnostic counting (CAC) has numerous potential applications across various domains.
The goal is to count objects of an arbitrary category during testing, based on only a few annotated exemplars.
We show that a segmentation model trained on these pseudo-labeled masks can effectively localize objects of interest in an arbitrary multi-class image.
arXiv Detail & Related papers (2023-07-15T01:33:19Z)
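As a generic illustration of the localize-then-count idea in the entry above (not the paper's actual pseudo-labeling or training pipeline), instances can be counted from any per-class binary segmentation mask via connected components; the masks dict and the min_area threshold below are assumptions.

```python
# Generic sketch: turn per-class binary masks from any segmentation model into counts.
import numpy as np
from scipy import ndimage

def counts_from_masks(masks, min_area=20):
    """masks: dict of class name -> HxW boolean array; returns {class: count}."""
    counts = {}
    for cls, mask in masks.items():
        labeled, num = ndimage.label(mask)            # connected-component labeling
        if num == 0:
            counts[cls] = 0
            continue
        areas = np.bincount(labeled.ravel())[1:]      # area of each component (skip background)
        counts[cls] = int(np.sum(areas >= min_area))  # discard tiny fragments as noise
    return counts
```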
- Universal Instance Perception as Object Discovery and Retrieval [90.96031157557806]
UNINEXT reformulates diverse instance perception tasks into a unified object discovery and retrieval paradigm.
It can flexibly perceive different types of objects by simply changing the input prompts.
UNINEXT shows superior performance on 20 challenging benchmarks from 10 instance-level tasks.
arXiv Detail & Related papers (2023-03-12T14:28:24Z)
- Learning to Count Anything: Reference-less Class-agnostic Counting with Weak Supervision [11.037585450795357]
We show that counting is, at its core, a repetition-recognition task.
We demonstrate that self-supervised vision transformer features combined with a lightweight count regression head achieve competitive results.
arXiv Detail & Related papers (2022-05-20T14:26:38Z)
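A minimal sketch of the feature-plus-regression-head recipe mentioned in the Learning to Count Anything entry above, assuming a frozen self-supervised DINO ViT-S/16 backbone loaded via torch.hub; the head sizes and the use of the CLS embedding are illustrative assumptions, not the paper's exact model or its weak-supervision objective.

```python
# Hedged sketch: frozen self-supervised ViT features + a lightweight count regressor.
import torch
import torch.nn as nn

class CountRegressor(nn.Module):
    def __init__(self):
        super().__init__()
        # DINO ViT-S/16 backbone (384-dim CLS embedding), kept frozen.
        self.vit = torch.hub.load("facebookresearch/dino:main", "dino_vits16")
        for p in self.vit.parameters():
            p.requires_grad = False
        self.head = nn.Sequential(nn.Linear(384, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, images):                 # images: B x 3 x 224 x 224, ImageNet-normalized
        with torch.no_grad():
            feats = self.vit(images)           # B x 384 CLS features
        return torch.relu(self.head(feats)).squeeze(-1)  # non-negative count per image
```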
- Dilated-Scale-Aware Attention ConvNet For Multi-Class Object Counting [18.733301622920102]
Multi-class object counting expands the scope of application of the object counting task.
The multi-target detection task can achieve multi-class object counting in some scenarios.
We propose a simple yet efficient counting network based on point-level annotations.
arXiv Detail & Related papers (2020-12-15T08:38:28Z)
- A Few-Shot Sequential Approach for Object Counting [63.82757025821265]
We introduce a class attention mechanism that sequentially attends to objects in the image and extracts their relevant features.
The proposed technique is trained on point-level annotations and uses a novel loss function that disentangles class-dependent and class-agnostic aspects of the model.
We present our results on a variety of object-counting/detection datasets, including FSOD and MS COCO.
arXiv Detail & Related papers (2020-07-03T18:23:39Z)
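A generic sketch of class-conditioned density counting in the spirit of the class-attention entry above; the toy backbone, the embedding-based attention, and the density-map head are illustrative stand-ins, not the paper's architecture or its disentangling loss.

```python
# Generic sketch: a class embedding attends over image features to produce a
# per-class density map whose spatial sum is the predicted count.
import torch
import torch.nn as nn

class ClassAttentionCounter(nn.Module):
    def __init__(self, num_classes, feat_dim=256):
        super().__init__()
        self.backbone = nn.Sequential(         # toy stand-in feature extractor
            nn.Conv2d(3, feat_dim, 3, stride=4, padding=1), nn.ReLU(),
            nn.Conv2d(feat_dim, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.class_emb = nn.Embedding(num_classes, feat_dim)
        self.density_head = nn.Conv2d(feat_dim, 1, 1)

    def forward(self, images, class_ids):      # images: B x 3 x H x W, class_ids: B (long)
        feats = self.backbone(images)                              # B x C x h x w
        emb = self.class_emb(class_ids)[:, :, None, None]          # B x C x 1 x 1
        attn = torch.sigmoid((feats * emb).sum(1, keepdim=True))   # B x 1 x h x w
        density = torch.relu(self.density_head(feats * attn))      # per-class density map
        return density.sum(dim=(1, 2, 3))                          # predicted counts, shape B
```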
- Rethinking Object Detection in Retail Stores [55.359582952686175]
We propose a new task, simultaneous object localization and counting, abbreviated as Locount.
Locount requires algorithms to localize groups of objects of interest together with the number of instances in each group.
We collect a large-scale object localization and counting dataset with rich annotations in retail stores.
arXiv Detail & Related papers (2020-03-18T14:01:54Z)