Related papers: SQLNet: Scale-Modulated Query and Localization Network for Few-Shot Class-Agnostic Counting

URL: http://arxiv.org/abs/2311.10011v1
Date: Thu, 16 Nov 2023 16:50:56 GMT
Title: SQLNet: Scale-Modulated Query and Localization Network for Few-Shot Class-Agnostic Counting
Authors: Hefeng Wu, Yandong Chen, Lingbo Liu, Tianshui Chen, Keze Wang, Liang Lin
Abstract summary: The class-agnostic counting (CAC) task has recently been proposed to solve the problem of counting all objects of an arbitrary class with several exemplars given in the input image. We propose a novel localization-based CAC approach, termed Scale-modulated Query and Localization Network (SQLNet). It fully explores the scales of exemplars in both the query and localization stages and achieves effective counting by accurately locating each object and predicting its approximate size.
Score: 71.38754976584009
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The class-agnostic counting (CAC) task has recently been proposed to solve the problem of counting all objects of an arbitrary class with several exemplars given in the input image. To address this challenging task, existing leading methods all resort to density map regression, which renders them impractical for downstream tasks that require object locations and restricts their ability to well explore the scale information of exemplars for supervision. To address the limitations, we propose a novel localization-based CAC approach, termed Scale-modulated Query and Localization Network (SQLNet). It fully explores the scales of exemplars in both the query and localization stages and achieves effective counting by accurately locating each object and predicting its approximate size. Specifically, during the query stage, rich discriminative representations of the target class are acquired by the Hierarchical Exemplars Collaborative Enhancement (HECE) module from the few exemplars through multi-scale exemplar cooperation with equifrequent size prompt embedding. These representations are then fed into the Exemplars-Unified Query Correlation (EUQC) module to interact with the query features in a unified manner and produce the correlated query tensor. In the localization stage, the Scale-aware Multi-head Localization (SAML) module utilizes the query tensor to predict the confidence, location, and size of each potential object. Moreover, a scale-aware localization loss is introduced, which exploits flexible location associations and exemplar scales for supervision to optimize the model performance. Extensive experiments demonstrate that SQLNet outperforms state-of-the-art methods on popular CAC benchmarks, achieving excellent performance not only in counting accuracy but also in localization and bounding box generation. Our codes will be available at https://github.com/HCPLab-SYSU/SQLNet
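The abstract describes counting by localization: each potential object receives a predicted confidence, location, and size, and the count follows from the objects that are kept. A minimal sketch of that final step is given here, assuming candidates are filtered with a fixed confidence threshold and that the predicted size is a single side length. The names Candidate, count_objects, and threshold are hypothetical and are not taken from the SQLNet code; the paper's actual selection rule may differ.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


@dataclass
class Candidate:
    """One potential object (hypothetical container, not from the paper)."""
    confidence: float            # predicted score in [0, 1]
    center: Tuple[float, float]  # predicted (x, y) location in pixels
    size: float                  # predicted approximate side length in pixels


def count_objects(candidates: List[Candidate],
                  threshold: float = 0.5) -> Tuple[int, List[Box]]:
    """Keep candidates at or above the threshold; return the count and square boxes.

    The threshold rule is an assumption made for illustration only.
    """
    kept = [c for c in candidates if c.confidence >= threshold]
    boxes = []
    for c in kept:
        half = c.size / 2.0
        x, y = c.center
        # Build an approximate box from the predicted center and size.
        boxes.append((x - half, y - half, x + half, y + half))
    return len(kept), boxes


candidates = [
    Candidate(confidence=0.9, center=(10.0, 10.0), size=4.0),
    Candidate(confidence=0.2, center=(30.0, 12.0), size=5.0),
    Candidate(confidence=0.5, center=(50.0, 40.0), size=6.0),
]
count, boxes = count_objects(candidates)
```

Unlike density map regression, this kind of output yields object locations and approximate boxes in addition to the count, which is the practical advantage the abstract points to.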

Related papers

UniLoc: Towards Universal Place Recognition Using Any Single Modality [46.056160460726396]
We develop a universal solution to place recognition, UniLoc, that works with any single query modality. UniLoc learns by matching hierarchically at two levels: instance-level matching and scene-level matching. Experiments on the KITTI-360 dataset demonstrate the benefits of cross-modality for place recognition.
arXiv Detail & Related papers (2024-12-16T18:48:58Z)
Few-shot Object Localization [37.347898735345574]
This paper defines a novel task named Few-Shot Object Localization (FSOL), which aims to achieve precise localization with limited samples. This task achieves generalized object localization by leveraging a small number of labeled support samples to query the positional information of objects within corresponding images. Experimental results demonstrate a significant performance improvement of our approach in the FSOL task, establishing an efficient benchmark for further research.
arXiv Detail & Related papers (2024-03-19T05:50:48Z)
Transferability Metrics for Object Detection [0.0]
Transfer learning aims to make the most of existing pre-trained models to achieve better performance on a new task in limited data scenarios. We extend transferability metrics to object detection using ROI-Align and TLogME. We show that TLogME provides a robust correlation with transfer performance and outperforms other transferability metrics on local and global level features.
arXiv Detail & Related papers (2023-06-27T08:49:31Z)
GCNet: Probing Self-Similarity Learning for Generalized Counting Network [24.09746233447471]
Generalized Counting Network (GCNet) is developed to recognize adaptive exemplars within whole images. GCNet is capable of adaptively capturing them through a carefully designed self-similarity learning strategy. It performs on par with existing exemplar-dependent methods and shows stunning cross-dataset generality on crowd-specific datasets.
arXiv Detail & Related papers (2023-02-10T09:31:37Z)
Learning to Evaluate Performance of Multi-modal Semantic Localization [9.584659231769416]
Semantic localization (SeLo) refers to the task of obtaining the most relevant locations in large-scale remote sensing (RS) images using semantic information such as text. In this paper, we thoroughly study this field and provide a complete benchmark in terms of metrics and test data to advance the SeLo task.
arXiv Detail & Related papers (2022-09-14T09:39:03Z)
Learning to Aggregate Multi-Scale Context for Instance Segmentation in Remote Sensing Images [28.560068780733342]
A novel context aggregation network (CATNet) is proposed to improve the feature extraction process. The proposed model exploits three lightweight plug-and-play modules, namely dense feature pyramid network (DenseFPN), spatial context pyramid (SCP), and hierarchical region of interest extractor (HRoIE).
arXiv Detail & Related papers (2021-11-22T08:55:25Z)
CREPO: An Open Repository to Benchmark Credal Network Algorithms [78.79752265884109]
Credal networks are imprecise probabilistic graphical models based on, so-called credal, sets of probability mass functions. A Java library called CREMA has been recently released to model, process and query credal networks. We present CREPO, an open repository of synthetic credal networks, provided together with the exact results of inference tasks on these models.
arXiv Detail & Related papers (2021-05-10T07:31:59Z)
Region Comparison Network for Interpretable Few-shot Image Classification [97.97902360117368]
Few-shot image classification has been proposed to effectively use only a limited number of labeled examples to train models for new classes. We propose a metric learning based method named Region Comparison Network (RCN), which is able to reveal how few-shot learning works. We also present a new way to generalize the interpretability from the level of tasks to categories.
arXiv Detail & Related papers (2020-09-08T07:29:05Z)
Scope Head for Accurate Localization in Object Detection [135.9979405835606]
We propose a novel detector coined as ScopeNet, which models anchors of each location as a mutually dependent relationship. With our concise and effective design, the proposed ScopeNet achieves state-of-the-art results on COCO.
arXiv Detail & Related papers (2020-05-11T04:00:09Z)
Crowd Counting via Hierarchical Scale Recalibration Network [61.09833400167511]
We propose a novel Hierarchical Scale Recalibration Network (HSRNet) to tackle the task of crowd counting. HSRNet models rich contextual dependencies and recalibrates multiple scale-associated information. Our approach can ignore various noises selectively and focus on appropriate crowd scales automatically.
arXiv Detail & Related papers (2020-03-07T10:06:47Z)
Improving Few-shot Learning by Spatially-aware Matching and CrossTransformer [116.46533207849619]
We study the impact of scale and location mismatch in the few-shot learning scenario. We propose a novel Spatially-aware Matching scheme to effectively perform matching across multiple scales and locations.
arXiv Detail & Related papers (2020-01-06T14:10:20Z)

This list is automatically generated from the titles and abstracts of the papers on this site.
