Interlayer and Intralayer Scale Aggregation for Scale-invariant Crowd
Counting
- URL: http://arxiv.org/abs/2005.11943v1
- Date: Mon, 25 May 2020 06:59:31 GMT
- Title: Interlayer and Intralayer Scale Aggregation for Scale-invariant Crowd
Counting
- Authors: Mingjie Wang and Hao Cai and Jun Zhou and Minglun Gong
- Abstract summary: Single-column Scale-invariant Network (ScSiNet) is presented in this paper.
It extracts sophisticated scale-invariant features by combining interlayer multi-scale integration with a novel intralayer scale-invariant transformation (SiT).
Experiments on public datasets demonstrate that the proposed method consistently outperforms state-of-the-art approaches in counting accuracy and scale invariance.
- Score: 19.42355176075503
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Crowd counting is an important vision task, which faces challenges on
continuous scale variation within a given scene and huge density shift both
within and across images. These challenges are typically addressed using
multi-column structures in existing methods. However, such approaches do not
deliver consistent improvements or transferability, owing to their limited
ability to capture multi-scale features, their sensitivity to large density
shifts, and the difficulty of training multi-branch models. To overcome these limitations, a
Single-column Scale-invariant Network (ScSiNet) is presented in this paper,
which extracts sophisticated scale-invariant features via the combination of
interlayer multi-scale integration and a novel intralayer scale-invariant
transformation (SiT). Furthermore, in order to enlarge the diversity of
densities, a randomly integrated loss is presented for training our
single-branch method. Extensive experiments on public datasets demonstrate that
the proposed method consistently outperforms state-of-the-art approaches in
counting accuracy and achieves remarkable transferability and scale
invariance.
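The abstract combines interlayer multi-scale integration with an intralayer scale-invariant transformation (SiT). As a rough illustration of the general idea only — not the paper's actual SiT, whose details are not given here — the following numpy sketch pools a feature map at several scales and averages the upsampled results, so a constant pattern produces the same response regardless of scale; all function names and the pooling scheme are assumptions:

```python
import numpy as np

def sit_block(feat, scales=(1, 2, 4)):
    """Toy intralayer scale-aggregation sketch (hypothetical simplification):
    average-pool the feature map at several window sizes, upsample each
    result back by nearest-neighbour repetition, and average them, so the
    output responds similarly to structures of different sizes."""
    h, w = feat.shape
    outs = []
    for s in scales:
        # average-pool with window/stride s (assumes h and w divisible by s)
        pooled = feat.reshape(h // s, s, w // s, s).mean(axis=(1, 3))
        # nearest-neighbour upsample back to (h, w)
        up = pooled.repeat(s, axis=0).repeat(s, axis=1)
        outs.append(up)
    return np.mean(outs, axis=0)

feat = np.arange(16, dtype=float).reshape(4, 4)
out = sit_block(feat)
print(out.shape)  # (4, 4)
```

A single-column network would apply such a block inside one branch, rather than running separate columns with different receptive fields.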
Related papers
- Multi-scale Unified Network for Image Classification [33.560003528712414]
CNNs face notable challenges in performance and computational efficiency when dealing with real-world, multi-scale image inputs.
We propose the Multi-scale Unified Network (MUSN), consisting of multi-scale subnets, a unified network, and a scale-invariance constraint.
MUSN yields an accuracy increase of up to 44.53% and reduces FLOPs by 7.01-16.13% in multi-scale scenarios.
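A "scale-invariance constraint" of the kind this summary names can be read, in the abstract, as an extra loss term that penalizes disagreement between features computed from inputs at different resolutions. A minimal numpy sketch under that reading (the names, the pooling choice, and the weighting are assumptions, not MUSN's actual formulation):

```python
import numpy as np

def avg_pool2(x):
    # 2x2 average pooling; assumes even height and width
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def scale_invariance_penalty(feat_full, feat_half):
    # Pool the full-scale feature map down first so both feature maps
    # live on the same grid, then take the mean squared difference.
    return float(np.mean((avg_pool2(feat_full) - feat_half) ** 2))

def total_loss(task_loss, feat_full, feat_half, lam=0.1):
    # Task loss plus the scale-invariance term, weighted by lam.
    return task_loss + lam * scale_invariance_penalty(feat_full, feat_half)
```

When the two feature maps agree exactly, the penalty vanishes and only the task loss remains.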
arXiv Detail & Related papers (2024-03-27T06:40:26Z)
- Boosting the Transferability of Adversarial Examples via Local Mixup and Adaptive Step Size [5.04766995613269]
Adversarial examples are a critical security threat to various visual applications: injected human-imperceptible perturbations can corrupt the model's output.
Existing input-diversity-based methods adopt different image transformations, but may be inefficient due to insufficient input diversity and a fixed perturbation step size.
This paper proposes a black-box adversarial generative framework by jointly designing enhanced input diversity and adaptive step sizes.
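To make the "adaptive step size" idea concrete in general terms — this is a generic hypothetical sketch, not the paper's algorithm — the following toy attack maximizes a simple quadratic loss with signed gradient updates, growing the step when the loss keeps increasing and shrinking it otherwise, while staying inside an epsilon-ball:

```python
import numpy as np

def attack(x, target, steps=20, step=0.5, eps=2.0):
    """Toy adaptive-step iterative attack on the loss L(x) = ||x - target||^2.
    (Hypothetical illustration; not the paper's method.)"""
    x0 = x.copy()
    prev_loss = -np.inf
    for _ in range(steps):
        grad = 2 * (x - target)             # gradient of the toy loss
        x = x + step * np.sign(grad)        # FGSM-style signed ascent step
        x = np.clip(x, x0 - eps, x0 + eps)  # project back into the eps-ball
        loss = float(np.sum((x - target) ** 2))
        # adapt: grow the step while progress continues, shrink when stuck
        step = step * 1.5 if loss > prev_loss else step * 0.5
        prev_loss = loss
    return x
```

The adaptation rule here is deliberately simple; the point is only that the step size reacts to per-iteration progress instead of staying identical throughout.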
arXiv Detail & Related papers (2024-01-24T03:26:34Z)
- A Novel Cross-Perturbation for Single Domain Generalization [54.612933105967606]
Single domain generalization aims to enhance the ability of the model to generalize to unknown domains when trained on a single source domain.
The limited diversity in the training data hampers the learning of domain-invariant features, resulting in compromised generalization performance.
We propose CPerb, a simple yet effective cross-perturbation method to enhance the diversity of the training data.
arXiv Detail & Related papers (2023-08-02T03:16:12Z)
- Deep Neural Networks with Efficient Guaranteed Invariances [77.99182201815763]
We address the problem of improving the performance and in particular the sample complexity of deep neural networks.
Group-equivariant convolutions are a popular approach to obtain equivariant representations.
We propose a multi-stream architecture, where each stream is invariant to a different transformation.
arXiv Detail & Related papers (2023-03-02T20:44:45Z)
- Self-similarity Driven Scale-invariant Learning for Weakly Supervised Person Search [66.95134080902717]
We propose a novel one-step framework, named Self-similarity driven Scale-invariant Learning (SSL).
We introduce a Multi-scale Exemplar Branch to guide the network in concentrating on the foreground and learning scale-invariant features.
Experiments on PRW and CUHK-SYSU databases demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2023-02-25T04:48:11Z)
- Auto-regressive Image Synthesis with Integrated Quantization [55.51231796778219]
This paper presents a versatile framework for conditional image generation.
It combines the inductive bias of CNNs with the powerful sequence modeling of auto-regression.
Our method achieves superior diverse image generation performance as compared with the state-of-the-art.
arXiv Detail & Related papers (2022-07-21T22:19:17Z)
- Resource-Efficient Invariant Networks: Exponential Gains by Unrolled Optimization [8.37077056358265]
We propose a new computational primitive for building invariant networks based instead on optimization.
We provide empirical and theoretical corroboration of the efficiency gains and soundness of our proposed method.
We demonstrate its utility in constructing an efficient invariant network for a simple hierarchical object detection task.
arXiv Detail & Related papers (2022-03-09T19:04:08Z)
- Consistency and Diversity induced Human Motion Segmentation [231.36289425663702]
We propose a novel Consistency and Diversity induced human Motion (CDMS) algorithm.
Our model factorizes the source and target data into distinct multi-layer feature spaces.
A multi-mutual learning strategy is carried out to reduce the domain gap between the source and target data.
arXiv Detail & Related papers (2022-02-10T06:23:56Z)
- Invariance-based Multi-Clustering of Latent Space Embeddings for Equivariant Learning [12.770012299379099]
We present an approach to disentangle equivariance feature maps in a Lie group manifold by enforcing deep, group-invariant learning.
Our experiments show that this model effectively learns to disentangle the invariant and equivariant representations, with significantly faster learning.
arXiv Detail & Related papers (2021-07-25T03:27:47Z)
- Weakly supervised segmentation with cross-modality equivariant constraints [7.757293476741071]
Weakly supervised learning has emerged as an appealing alternative to alleviate the need for large labeled datasets in semantic segmentation.
We present a novel learning strategy that leverages self-supervision in a multi-modal image scenario to significantly enhance original CAMs.
Our approach outperforms relevant recent literature under the same learning conditions.
arXiv Detail & Related papers (2021-04-06T13:14:20Z)
- Crowd Counting via Hierarchical Scale Recalibration Network [61.09833400167511]
We propose a novel Hierarchical Scale Recalibration Network (HSRNet) to tackle the task of crowd counting.
HSRNet models rich contextual dependencies and recalibrates scale-associated information across multiple levels.
Our approach selectively ignores various kinds of noise and automatically focuses on the appropriate crowd scales.
arXiv Detail & Related papers (2020-03-07T10:06:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.