Rethinking Spatial Invariance of Convolutional Networks for Object
Counting
- URL: http://arxiv.org/abs/2206.05253v1
- Date: Fri, 10 Jun 2022 17:51:25 GMT
- Title: Rethinking Spatial Invariance of Convolutional Networks for Object
Counting
- Authors: Zhi-Qi Cheng, Qi Dai, Hong Li, JingKuan Song, Xiao Wu, Alexander G.
Hauptmann
- Abstract summary: We try to use locally connected Gaussian kernels to replace the original convolution filter to estimate the spatial position in the density map.
Inspired by previous work, we propose a low-rank approximation accompanied with translation invariance to favorably implement the approximation of massive Gaussian convolution.
Our methods significantly outperform other state-of-the-art methods and achieve promising learning of the spatial position of objects.
- Score: 119.83017534355842
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Previous work generally believes that improving the spatial invariance of
convolutional networks is the key to object counting. However, after verifying
several mainstream counting networks, we surprisingly found too strict
pixel-level spatial invariance would cause overfit noise in the density map
generation. In this paper, we try to use locally connected Gaussian kernels to
replace the original convolution filter to estimate the spatial position in the
density map. The purpose of this is to allow the feature extraction process to
potentially stimulate the density map generation process to overcome the
annotation noise. Inspired by previous work, we propose a low-rank
approximation accompanied with translation invariance to favorably implement
the approximation of massive Gaussian convolution. Our work points a new
direction for follow-up research, which should investigate how to properly
relax the overly strict pixel-level spatial invariance for object counting. We
evaluate our methods on 4 mainstream object counting networks (i.e., MCNN,
CSRNet, SANet, and ResNet-50). Extensive experiments were conducted on 7
popular benchmarks for 3 applications (i.e., crowd, vehicle, and plant
counting). Experimental results show that our methods significantly outperform
other state-of-the-art methods and achieve promising learning of the spatial
position of objects.
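For readers unfamiliar with the density maps mentioned in the abstract, the following is a minimal sketch of the conventional way such maps are built from point annotations: one normalized Gaussian is placed at each annotated position, so the map sums to the object count. This is not the paper's locally connected or low-rank method. The function names `gaussian_kernel` and `density_map`, and the default `sigma` and `size` values, are illustrative assumptions.

```python
import numpy as np


def gaussian_kernel(size: int, sigma: float) -> np.ndarray:
    """Return a size x size Gaussian kernel normalized to sum to 1 (size should be odd)."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    kernel = np.outer(g, g)
    return kernel / kernel.sum()


def density_map(points, shape, sigma: float = 4.0, size: int = 15) -> np.ndarray:
    """Build a density map by adding one Gaussian per annotated (row, col) point.

    Assumption: points are integer pixel coordinates inside the image.
    Mass that falls outside the image border is cropped away, so the sum
    equals the count only when every point is at least size // 2 pixels
    from the border.
    """
    h, w = shape
    r = size // 2
    kernel = gaussian_kernel(size, sigma)
    # Pad by the kernel radius so kernels near the border can be added safely.
    padded = np.zeros((h + 2 * r, w + 2 * r))
    for row, col in points:
        # In padded coordinates the point sits at (row + r, col + r),
        # so the kernel window starts at (row, col).
        padded[row:row + size, col:col + size] += kernel
    return padded[r:r + h, c_start(r):c_start(r) + w]


def c_start(r: int) -> int:
    """Column offset of the unpadded image inside the padded buffer."""
    return r


# Hypothetical annotations for three objects in a 64 x 64 image.
annotations = [(20, 30), (40, 10), (41, 12)]
dm = density_map(annotations, (64, 64))
estimated_count = dm.sum()
```

Because each kernel is normalized, integrating the map recovers the number of annotated objects. A fixed `sigma` treats every annotation identically, which is one reason noisy point annotations carry straight into the training target.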
Related papers
- Deep Homography Estimation for Visual Place Recognition [49.235432979736395]
We propose a transformer-based deep homography estimation (DHE) network.
It takes the dense feature map extracted by a backbone network as input and fits homography for fast and learnable geometric verification.
Experiments on benchmark datasets show that our method can outperform several state-of-the-art methods.
arXiv Detail & Related papers (2024-02-25T13:22:17Z)
- Hyperspectral Target Detection Based on Low-Rank Background Subspace Learning and Graph Laplacian Regularization [2.9626402880497267]
Hyperspectral target detection is good at finding dim and small objects based on spectral characteristics.
Existing representation-based methods are hindered by the problem of the unknown background dictionary.
This paper proposes an efficient optimization approach based on low-rank representation (LRR) and graph Laplacian regularization (GLR).
arXiv Detail & Related papers (2023-06-01T13:51:08Z)
- Binarizing Sparse Convolutional Networks for Efficient Point Cloud Analysis [93.55896765176414]
We propose binary sparse convolutional networks called BSC-Net for efficient point cloud analysis.
We employ differentiable search strategies to discover the optimal positions for active site matching in the shifted sparse convolution.
Our BSC-Net achieves significant improvement upon our strong baseline and outperforms the state-of-the-art network binarization methods.
arXiv Detail & Related papers (2023-03-27T13:47:06Z) - Improved Counting and Localization from Density Maps for Object
Detection in 2D and 3D Microscopy Imaging [4.746727774540763]
We propose an alternative method to count and localize objects from the density map.
Our results show improved performance in counting and localization of objects in 2D and 3D microscopy data.
arXiv Detail & Related papers (2022-03-29T15:54:19Z) - Object-aware Monocular Depth Prediction with Instance Convolutions [72.98771405534937]
We propose a novel convolutional operator which is explicitly tailored to avoid feature aggregation.
Our method is based on estimating per-part depth values by means of superpixels.
Our evaluation with respect to the NYUv2 as well as the iBims dataset clearly demonstrates the superiority of Instance Convolutions.
arXiv Detail & Related papers (2021-12-02T18:59:48Z) - Point Cloud Upsampling via Disentangled Refinement [86.3641957163818]
Point clouds produced by 3D scanning are often sparse, non-uniform, and noisy.
Recent upsampling approaches aim to generate a dense point set, while achieving both distribution uniformity and proximity-to-surface.
We formulate two cascaded sub-networks, a dense generator and a spatial refiner.
arXiv Detail & Related papers (2021-06-09T02:58:42Z) - FFD: Fast Feature Detector [22.51804239092462]
We show that robust and accurate keypoints exist in the specific scale-space domain.
It is proved that setting the scale-space pyramid's smoothness ratio and blurring to 2 and 0.627, respectively, facilitates the detection of reliable keypoints.
arXiv Detail & Related papers (2020-12-01T21:56:35Z) - Making Affine Correspondences Work in Camera Geometry Computation [62.7633180470428]
Local features provide region-to-region rather than point-to-point correspondences.
We propose guidelines for effective use of region-to-region matches in the course of a full model estimation pipeline.
Experiments show that affine solvers can achieve accuracy comparable to point-based solvers at faster run-times.
arXiv Detail & Related papers (2020-07-20T12:07:48Z) - Learning Gaussian Maps for Dense Object Detection [1.8275108630751844]
We review common and highly accurate object detection methods on scenes where numerous similar-looking objects are placed in close proximity to each other.
We show that multi-task learning of Gaussian maps along with classification and bounding box regression gives us a significant boost in accuracy over the baseline.
Our method also achieves state-of-the-art accuracy on the SKU110K dataset.
arXiv Detail & Related papers (2020-04-24T17:01:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.