SQLNet: Scale-Modulated Query and Localization Network for Few-Shot
Class-Agnostic Counting
- URL: http://arxiv.org/abs/2311.10011v1
- Date: Thu, 16 Nov 2023 16:50:56 GMT
- Title: SQLNet: Scale-Modulated Query and Localization Network for Few-Shot
Class-Agnostic Counting
- Authors: Hefeng Wu, Yandong Chen, Lingbo Liu, Tianshui Chen, Keze Wang, Liang
Lin
- Abstract summary: The class-agnostic counting (CAC) task has recently been proposed to solve the problem of counting all objects of an arbitrary class with several exemplars given in the input image.
We propose a novel localization-based CAC approach, termed Scale-modulated Query and Localization Network (SQLNet).
It fully explores the scales of exemplars in both the query and localization stages and achieves effective counting by accurately locating each object and predicting its approximate size.
- Score: 71.38754976584009
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The class-agnostic counting (CAC) task has recently been proposed to solve
the problem of counting all objects of an arbitrary class with several
exemplars given in the input image. To address this challenging task, existing
leading methods all resort to density map regression, which renders them
impractical for downstream tasks that require object locations and restricts
their ability to well explore the scale information of exemplars for
supervision. To address the limitations, we propose a novel localization-based
CAC approach, termed Scale-modulated Query and Localization Network (SQLNet).
It fully explores the scales of exemplars in both the query and localization
stages and achieves effective counting by accurately locating each object and
predicting its approximate size. Specifically, during the query stage, rich
discriminative representations of the target class are acquired by the
Hierarchical Exemplars Collaborative Enhancement (HECE) module from the few
exemplars through multi-scale exemplar cooperation with equifrequent size
prompt embedding. These representations are then fed into the Exemplars-Unified
Query Correlation (EUQC) module to interact with the query features in a
unified manner and produce the correlated query tensor. In the localization
stage, the Scale-aware Multi-head Localization (SAML) module utilizes the query
tensor to predict the confidence, location, and size of each potential object.
Moreover, a scale-aware localization loss is introduced, which exploits
flexible location associations and exemplar scales for supervision to optimize
the model performance. Extensive experiments demonstrate that SQLNet
outperforms state-of-the-art methods on popular CAC benchmarks, achieving
excellent performance not only in counting accuracy but also in localization
and bounding box generation. Our codes will be available at
https://github.com/HCPLab-SYSU/SQLNet
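The contrast the abstract draws between density map regression and localization-based counting can be illustrated with a minimal sketch. This is not the authors' implementation: the `Candidate` type, the function names, the square-box conversion, and the 0.5 confidence threshold are all hypothetical, chosen only to show why a localization head yields object positions and boxes as well as a count, while a density map yields only a total.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One potential object (hypothetical container, not from the paper)."""
    confidence: float  # predicted probability that an object is centred here
    x: float           # predicted centre, horizontal coordinate in pixels
    y: float           # predicted centre, vertical coordinate in pixels
    size: float        # predicted approximate side length in pixels


def count_by_localization(candidates, threshold=0.5):
    """Keep confident candidates; the count is the number that survive.

    The threshold value is an assumption made for illustration.
    Returns the count and one square box (x1, y1, x2, y2) per kept object.
    """
    kept = [c for c in candidates if c.confidence >= threshold]
    boxes = [
        (c.x - c.size / 2, c.y - c.size / 2, c.x + c.size / 2, c.y + c.size / 2)
        for c in kept
    ]
    return len(kept), boxes


def count_by_density(density_map):
    """Density-map counting: the count is the sum over all pixels.

    No per-object location or size can be read off the result.
    """
    return sum(sum(row) for row in density_map)


candidates = [
    Candidate(confidence=0.9, x=10, y=10, size=4),
    Candidate(confidence=0.2, x=30, y=30, size=4),  # below threshold, dropped
    Candidate(confidence=0.7, x=50, y=20, size=6),
]
count, boxes = count_by_localization(candidates)
density_count = count_by_density([[0.5, 0.5], [1.0, 0.0]])
```

In this toy input both routes report two objects, but only the localization route also returns a box for each of them, which is what downstream tasks that need object locations would consume.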
Related papers
- Few-shot Object Localization [37.347898735345574]
This paper defines a novel task named Few-Shot Object Localization (FSOL).
It aims to achieve precise localization with limited samples.
This task achieves generalized object localization by leveraging a small number of labeled support samples to query the positional information of objects within corresponding images.
Experimental results demonstrate a significant performance improvement of our approach in the FSOL task, establishing an efficient benchmark for further research.
arXiv Detail & Related papers (2024-03-19T05:50:48Z)
- Transferability Metrics for Object Detection [0.0]
Transfer learning aims to make the most of existing pre-trained models to achieve better performance on a new task in limited data scenarios.
We extend transferability metrics to object detection using ROI-Align and TLogME.
We show that TLogME provides a robust correlation with transfer performance and outperforms other transferability metrics on local and global level features.
arXiv Detail & Related papers (2023-06-27T08:49:31Z)
- GCNet: Probing Self-Similarity Learning for Generalized Counting Network [24.09746233447471]
Generalized Counting Network (GCNet) is developed to recognize adaptive exemplars within the whole images.
GCNet is capable of adaptively capturing them through a carefully-designed self-similarity learning strategy.
It performs on par with existing exemplar-dependent methods and shows stunning cross-dataset generality on crowd-specific datasets.
arXiv Detail & Related papers (2023-02-10T09:31:37Z)
- Learning to Evaluate Performance of Multi-modal Semantic Localization [9.584659231769416]
Semantic localization (SeLo) refers to the task of obtaining the most relevant locations in large-scale remote sensing (RS) images using semantic information such as text.
In this paper, we thoroughly study this field and provide a complete benchmark in terms of metrics and test data to advance the SeLo task.
arXiv Detail & Related papers (2022-09-14T09:39:03Z)
- Learning to Aggregate Multi-Scale Context for Instance Segmentation in Remote Sensing Images [28.560068780733342]
A novel context aggregation network (CATNet) is proposed to improve the feature extraction process.
The proposed model exploits three lightweight plug-and-play modules, namely dense feature pyramid network (DenseFPN), spatial context pyramid (SCP), and hierarchical region of interest extractor (HRoIE).
arXiv Detail & Related papers (2021-11-22T08:55:25Z)
- CREPO: An Open Repository to Benchmark Credal Network Algorithms [78.79752265884109]
Credal networks are imprecise probabilistic graphical models based on so-called credal sets of probability mass functions.
A Java library called CREMA has been recently released to model, process and query credal networks.
We present CREPO, an open repository of synthetic credal networks, provided together with the exact results of inference tasks on these models.
arXiv Detail & Related papers (2021-05-10T07:31:59Z)
- Region Comparison Network for Interpretable Few-shot Image Classification [97.97902360117368]
Few-shot image classification has been proposed to effectively use only a limited number of labeled examples to train models for new classes.
We propose a metric learning based method named Region Comparison Network (RCN), which is able to reveal how few-shot learning works.
We also present a new way to generalize the interpretability from the level of tasks to categories.
arXiv Detail & Related papers (2020-09-08T07:29:05Z)
- Scope Head for Accurate Localization in Object Detection [135.9979405835606]
We propose a novel detector coined as ScopeNet, which models anchors of each location as a mutually dependent relationship.
With our concise and effective design, the proposed ScopeNet achieves state-of-the-art results on COCO.
arXiv Detail & Related papers (2020-05-11T04:00:09Z)
- Crowd Counting via Hierarchical Scale Recalibration Network [61.09833400167511]
We propose a novel Hierarchical Scale Recalibration Network (HSRNet) to tackle the task of crowd counting.
HSRNet models rich contextual dependencies and recalibrates multiple scale-associated information.
Our approach can ignore various noises selectively and focus on appropriate crowd scales automatically.
arXiv Detail & Related papers (2020-03-07T10:06:47Z)
- Improving Few-shot Learning by Spatially-aware Matching and CrossTransformer [116.46533207849619]
We study the impact of scale and location mismatch in the few-shot learning scenario.
We propose a novel Spatially-aware Matching scheme to effectively perform matching across multiple scales and locations.
arXiv Detail & Related papers (2020-01-06T14:10:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.