Weighing Counts: Sequential Crowd Counting by Reinforcement Learning
- URL: http://arxiv.org/abs/2007.08260v1
- Date: Thu, 16 Jul 2020 11:16:12 GMT
- Title: Weighing Counts: Sequential Crowd Counting by Reinforcement Learning
- Authors: Liang Liu, Hao Lu, Hongwei Zou, Haipeng Xiong, Zhiguo Cao, Chunhua
Shen
- Abstract summary: We formulate counting as a sequential decision problem and present a novel crowd counting model solvable by deep reinforcement learning.
We propose a novel 'counting scale' termed LibraNet, where the count value is analogized to a weight.
We show that LibraNet exactly implements scale weighing by visualizing the decision process by which LibraNet chooses actions.
- Score: 84.39624429527987
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We formulate counting as a sequential decision problem and present a novel
crowd counting model solvable by deep reinforcement learning. In contrast to
existing counting models that directly output count values, we divide one-step
estimation into a sequence of much easier and more tractable sub-decision
problems. The sequential nature of this decision process corresponds exactly
to a physical process in reality: scale weighing. Inspired by scale weighing,
we propose a novel 'counting scale' termed LibraNet, where the count value is
analogized to a
weight. By virtually placing a crowd image on one side of a scale, LibraNet
(agent) sequentially learns to place appropriate weights on the other side to
match the crowd count. At each step, LibraNet chooses one weight (action) from
the weight box (the pre-defined action pool) according to the current crowd
image features and weights placed on the scale pan (state). LibraNet is
required to learn to balance the scale according to the feedback of the needle
(Q values). We show that LibraNet exactly implements scale weighing by
visualizing the decision process by which it chooses actions. Extensive
experiments demonstrate the effectiveness of our design choices and report
state-of-the-art results on a few crowd counting benchmarks. We also
demonstrate good cross-dataset generalization of LibraNet. Code and models are
made available at: https://git.io/libranet
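The weighing loop described in the abstract can be sketched in a few lines. Note this is an illustrative sketch only: the weight values in the box, the termination rule, and the greedy policy (a stand-in for the learned Q-function, which in LibraNet is computed from crowd image features and the current state) are assumptions, not LibraNet's actual design.

```python
# Hypothetical sketch of sequential counting as scale weighing.
# WEIGHT_BOX and END_ACTION are illustrative assumptions; LibraNet learns
# its action choices from image features via deep Q-learning.
WEIGHT_BOX = [100, 50, 10, 5, 1, -1, -5, -10]  # pre-defined action pool
END_ACTION = 0  # terminate when the scale is judged balanced

def weigh(true_count, max_steps=50):
    """Place weights on the pan until they balance the (hidden) crowd count."""
    placed = 0
    history = []
    for _ in range(max_steps):
        # In LibraNet, the imbalance is not observed directly; the agent
        # infers it from image features and the weights already placed.
        residual = true_count - placed
        if residual == 0:
            history.append(END_ACTION)  # scale balanced: stop weighing
            break
        # Greedily pick the weight that most reduces the imbalance
        # (a toy stand-in for argmax over learned Q values).
        action = min(WEIGHT_BOX, key=lambda w: abs(residual - w))
        placed += action
        history.append(action)
    return placed, history
```

For example, a count of 67 would be balanced by the sequence 50, 10, 5, 1, 1 followed by the termination action, mirroring how a physical scale is balanced with successively smaller weights.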
Related papers
- Stochastic Approximation Approach to Federated Machine Learning [0.0]
This paper examines Federated Learning (FL) in a Stochastic Approximation (SA) framework.
FL is a collaborative way to train neural network models across various participants or clients.
It is observed that the proposed algorithm is robust and gives more reliable estimates of the weights.
arXiv Detail & Related papers (2024-02-20T12:00:25Z)
- STEERER: Resolving Scale Variations for Counting and Localization via Selective Inheritance Learning [74.2343877907438]
Scale variation is a deep-rooted problem in object counting, which has not been effectively addressed by existing scale-aware algorithms.
We propose a novel method termed STEERER that addresses the issue of scale variations in object counting.
STEERER selects the most suitable scale for patch objects to boost feature extraction and only inherits discriminative features from lower to higher resolution progressively.
arXiv Detail & Related papers (2023-08-21T05:09:07Z)
- Self-similarity Driven Scale-invariant Learning for Weakly Supervised Person Search [66.95134080902717]
We propose a novel one-step framework, named Self-similarity driven Scale-invariant Learning (SSL).
We introduce a Multi-scale Exemplar Branch to guide the network in concentrating on the foreground and learning scale-invariant features.
Experiments on PRW and CUHK-SYSU databases demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2023-02-25T04:48:11Z)
- Scale Attention for Learning Deep Face Representation: A Study Against Visual Scale Variation [69.45176408639483]
We reform the conv layer by resorting to the scale-space theory.
We build a novel architecture named the SCale AttentioN Conv Neural Network (SCAN-CNN).
As a single-shot scheme, the inference is more efficient than multi-shot fusion.
arXiv Detail & Related papers (2022-09-19T06:35:04Z)
- ScaleNet: A Shallow Architecture for Scale Estimation [25.29257353644138]
We design a new architecture, ScaleNet, that exploits dilated convolutions and self and cross-correlation layers to predict the scale between images.
We show how ScaleNet can be combined with sparse local features and dense correspondence networks to improve camera pose estimation, 3D reconstruction, or dense geometric matching.
arXiv Detail & Related papers (2021-12-09T11:32:01Z)
- DISCO: accurate Discrete Scale Convolutions [2.1485350418225244]
Scale is often seen as a given, disturbing factor in many vision tasks. Treating it this way is one of the reasons more data are needed during learning.
We aim for accurate scale-equivariant convolutional neural networks (SE-CNNs) applicable for problems where high granularity of scale and small filter sizes are required.
arXiv Detail & Related papers (2021-06-04T21:48:09Z)
- Direct Quantization for Training Highly Accurate Low Bit-width Deep Neural Networks [73.29587731448345]
This paper proposes two novel techniques to train deep convolutional neural networks with low bit-width weights and activations.
First, to obtain low bit-width weights, most existing methods obtain the quantized weights by performing quantization on the full-precision network weights.
Second, to obtain low bit-width activations, existing works consider all channels equally.
arXiv Detail & Related papers (2020-12-26T15:21:18Z)
- Completely Self-Supervised Crowd Counting via Distribution Matching [92.09218454377395]
We propose a complete self-supervision approach to training models for dense crowd counting.
The only input required to train, apart from a large set of unlabeled crowd images, is the approximate upper limit of the crowd count.
Our method dwells on the idea that natural crowds follow a power law distribution, which could be leveraged to yield error signals for backpropagation.
arXiv Detail & Related papers (2020-09-14T13:20:12Z)
- FU-net: Multi-class Image Segmentation Using Feedback Weighted U-net [5.193724835939252]
We present a generic deep convolutional neural network (DCNN) for multi-class image segmentation.
It is based on a well-established supervised end-to-end DCNN model, known as U-net.
arXiv Detail & Related papers (2020-04-28T13:08:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.