Learning Spatial Similarity Distribution for Few-shot Object Counting
- URL: http://arxiv.org/abs/2405.11770v1
- Date: Mon, 20 May 2024 04:15:59 GMT
- Title: Learning Spatial Similarity Distribution for Few-shot Object Counting
- Authors: Yuanwu Xu, Feifan Song, Haofeng Zhang
- Abstract summary: Few-shot object counting aims to count the number of objects in a query image that belong to the same class as the given exemplar images.
Existing methods compute the similarity between the query image and exemplars in the 2D spatial domain and perform regression to obtain the counting number.
We propose a network learning Spatial Similarity Distribution (SSD) for few-shot object counting, which preserves the spatial structure of exemplar features.
- Score: 17.28147599627954
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Few-shot object counting aims to count the number of objects in a query image that belong to the same class as the given exemplar images. Existing methods compute the similarity between the query image and exemplars in the 2D spatial domain and perform regression to obtain the counting number. However, these methods overlook the rich information about the spatial distribution of similarity on the exemplar images, leading to significant impact on matching accuracy. To address this issue, we propose a network learning Spatial Similarity Distribution (SSD) for few-shot object counting, which preserves the spatial structure of exemplar features and calculates a 4D similarity pyramid point-to-point between the query features and exemplar features, capturing the complete distribution information for each point in the 4D similarity space. We propose a Similarity Learning Module (SLM) which applies the efficient center-pivot 4D convolutions on the similarity pyramid to map different similarity distributions to distinct predicted density values, thereby obtaining accurate count. Furthermore, we also introduce a Feature Cross Enhancement (FCE) module that enhances query and exemplar features mutually to improve the accuracy of feature matching. Our approach outperforms state-of-the-art methods on multiple datasets, including FSC-147 and CARPK. Code is available at https://github.com/CBalance/SSD.
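The core construction in the abstract is a point-to-point 4D similarity volume between query and exemplar features. The following is a minimal sketch of one level of that volume, assuming cosine similarity over L2-normalized feature descriptors and using numpy; the function name `similarity_4d` and the single-level formulation are illustrative assumptions, and the actual SSD method builds a multi-level pyramid of such volumes and processes it with center-pivot 4D convolutions.

```python
import numpy as np

def similarity_4d(query_feat, exemplar_feat):
    """Build a point-to-point 4D cosine-similarity volume (one pyramid level).

    query_feat:    (C, Hq, Wq) query feature map
    exemplar_feat: (C, He, We) exemplar feature map
    Returns a (Hq, Wq, He, We) array where entry [i, j, u, v] is the
    cosine similarity between query location (i, j) and exemplar
    location (u, v), preserving the exemplar's spatial structure.
    """
    C, Hq, Wq = query_feat.shape
    _, He, We = exemplar_feat.shape
    # L2-normalize the C-dim descriptor at each spatial location.
    q = query_feat.reshape(C, -1)
    q = q / (np.linalg.norm(q, axis=0, keepdims=True) + 1e-8)
    e = exemplar_feat.reshape(C, -1)
    e = e / (np.linalg.norm(e, axis=0, keepdims=True) + 1e-8)
    # All-pairs dot products, then restore the 4D spatial layout.
    sim = q.T @ e  # (Hq*Wq, He*We)
    return sim.reshape(Hq, Wq, He, We)
```

Each query location thus keeps a full 2D similarity map over the exemplar rather than a single pooled score, which is the distribution information the SLM then maps to density values.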
Related papers
- Layer-Wise Feature Metric of Semantic-Pixel Matching for Few-Shot Learning [14.627378118194933]
In Few-Shot Learning, traditional metric-based approaches often rely on global metrics to compute similarity.
In natural scenes, the spatial arrangement of key instances is often inconsistent across images.
We propose a novel method called the Layer-Wise Features Metric of Semantic-Pixel Matching to make finer comparisons.
arXiv Detail & Related papers (2024-11-10T05:12:24Z) - Out of Sight, Out of Mind: A Source-View-Wise Feature Aggregation for Multi-View Image-Based Rendering [26.866141260616793]
We propose a source-view-wise feature aggregation method, which enables us to find the consensus across source views in a robust way.
We validate the proposed method on various benchmark datasets, including synthetic and real image scenes.
arXiv Detail & Related papers (2022-06-10T07:06:05Z) - LEAD: Self-Supervised Landmark Estimation by Aligning Distributions of Feature Similarity [49.84167231111667]
Existing works in self-supervised landmark detection are based on learning dense (pixel-level) feature representations from an image.
We introduce an approach to enhance the learning of dense equivariant representations in a self-supervised fashion.
We show that having such a prior in the feature extractor helps in landmark detection, even with a drastically limited number of annotations.
arXiv Detail & Related papers (2022-04-06T17:48:18Z) - Attributable Visual Similarity Learning [90.69718495533144]
This paper proposes an attributable visual similarity learning (AVSL) framework for a more accurate and explainable similarity measure between images.
Motivated by the human semantic similarity cognition, we propose a generalized similarity learning paradigm to represent the similarity between two images with a graph.
Experiments on the CUB-200-2011, Cars196, and Stanford Online Products datasets demonstrate significant improvements over existing deep similarity learning methods.
arXiv Detail & Related papers (2022-03-28T17:35:31Z) - Deep Relational Metric Learning [84.95793654872399]
This paper presents a deep relational metric learning framework for image clustering and retrieval.
We learn an ensemble of features that characterizes an image from different aspects to model both interclass and intraclass distributions.
Experiments on the widely-used CUB-200-2011, Cars196, and Stanford Online Products datasets demonstrate that our framework improves existing deep metric learning methods and achieves very competitive results.
arXiv Detail & Related papers (2021-08-23T09:31:18Z) - Similarity-Aware Fusion Network for 3D Semantic Segmentation [87.51314162700315]
We propose a similarity-aware fusion network (SAFNet) to adaptively fuse 2D images and 3D point clouds for 3D semantic segmentation.
We employ a late fusion strategy where we first learn the geometric and contextual similarities between the input and back-projected (from 2D pixels) point clouds.
We show that SAFNet significantly outperforms existing state-of-the-art fusion-based approaches across various levels of data integrity.
arXiv Detail & Related papers (2021-07-04T09:28:18Z) - Semantic Distribution-aware Contrastive Adaptation for Semantic Segmentation [50.621269117524925]
Domain adaptive semantic segmentation refers to making predictions on a certain target domain with only annotations of a specific source domain.
We present a semantic distribution-aware contrastive adaptation algorithm that enables pixel-wise representation alignment.
We evaluate SDCA on multiple benchmarks, achieving considerable improvements over existing algorithms.
arXiv Detail & Related papers (2021-05-11T13:21:25Z) - Multi-level Metric Learning for Few-shot Image Recognition [5.861206243996454]
We argue that if query images can simultaneously be well classified via three levels of similarity metrics, the query images within a class will be distributed more tightly in a smaller feature space.
Motivated by this, we propose a novel Multi-level Metric Learning (MML) method for few-shot learning, which not only calculates the pixel-level similarity but also considers the similarity of part-level features and the similarity of distributions.
arXiv Detail & Related papers (2021-03-21T12:49:07Z) - BSNet: Bi-Similarity Network for Few-shot Fine-grained Image Classification [35.50808687239441]
We propose a Bi-Similarity Network (BSNet).
The bi-similarity module learns feature maps according to two similarity measures of diverse characteristics.
In this way, the model is enabled to learn more discriminative and less similarity-biased features from few shots of fine-grained images.
arXiv Detail & Related papers (2020-11-29T08:38:17Z) - SimPropNet: Improved Similarity Propagation for Few-shot Image Segmentation [14.419517737536706]
Recent deep neural network based few-shot segmentation (FSS) methods leverage high-dimensional feature similarity between the foreground features of the support images and the query image features.
We propose to jointly predict the support and query masks to force the support features to share characteristics with the query features.
Our method achieves state-of-the-art results for one-shot and five-shot segmentation on the PASCAL-5i dataset.
arXiv Detail & Related papers (2020-04-30T17:56:48Z) - Improving Few-shot Learning by Spatially-aware Matching and CrossTransformer [116.46533207849619]
We study the impact of scale and location mismatch in the few-shot learning scenario.
We propose a novel Spatially-aware Matching scheme to effectively perform matching across multiple scales and locations.
arXiv Detail & Related papers (2020-01-06T14:10:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.