Weighing Counts: Sequential Crowd Counting by Reinforcement Learning
- URL: http://arxiv.org/abs/2007.08260v1
- Date: Thu, 16 Jul 2020 11:16:12 GMT
- Title: Weighing Counts: Sequential Crowd Counting by Reinforcement Learning
- Authors: Liang Liu, Hao Lu, Hongwei Zou, Haipeng Xiong, Zhiguo Cao, Chunhua
Shen
- Abstract summary: We formulate counting as a sequential decision problem and present a novel crowd counting model solvable by deep reinforcement learning.
We propose a novel 'counting scale' termed LibraNet, where the count value is analogized to a weight.
We show that LibraNet exactly implements scale weighing by visualizing the decision process by which LibraNet chooses actions.
- Score: 84.39624429527987
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We formulate counting as a sequential decision problem and present a novel
crowd counting model solvable by deep reinforcement learning. In contrast to
existing counting models that directly output count values, we divide one-step
estimation into a sequence of much easier and more tractable sub-decision
problems. The sequential nature of this decision process corresponds exactly
to a physical process in reality: scale weighing. Inspired by scale weighing,
we propose a novel 'counting scale' termed LibraNet, where the count value is
analogized to a
weight. By virtually placing a crowd image on one side of a scale, LibraNet
(agent) sequentially learns to place appropriate weights on the other side to
match the crowd count. At each step, LibraNet chooses one weight (action) from
the weight box (the pre-defined action pool) according to the current crowd
image features and weights placed on the scale pan (state). LibraNet is
required to learn to balance the scale according to the feedback of the needle
(Q values). We show that LibraNet exactly implements scale weighing by
visualizing the decision process by which it chooses actions. Extensive
experiments demonstrate the effectiveness of our design choices and report
state-of-the-art results on a few crowd counting benchmarks. We also
demonstrate good cross-dataset generalization of LibraNet. Code and models are
made available at: https://git.io/libranet
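The weighing loop described in the abstract can be sketched in a few lines. Note this is an illustrative sketch only: the weight values in the box, the termination rule, and the greedy policy (a stand-in for the learned Q-function, which in LibraNet is computed from crowd image features and the current state) are assumptions, not LibraNet's actual design.

```python
# Hypothetical sketch of sequential counting as scale weighing.
# WEIGHT_BOX and END_ACTION are illustrative assumptions; LibraNet learns
# its action choices from image features via deep Q-learning.
WEIGHT_BOX = [100, 50, 10, 5, 1, -1, -5, -10]  # pre-defined action pool
END_ACTION = 0  # terminate when the scale is judged balanced

def weigh(true_count, max_steps=50):
    """Place weights on the pan until they balance the (hidden) crowd count."""
    placed = 0
    history = []
    for _ in range(max_steps):
        # In LibraNet, the imbalance is not observed directly; the agent
        # infers it from image features and the weights already placed.
        residual = true_count - placed
        if residual == 0:
            history.append(END_ACTION)  # scale balanced: stop weighing
            break
        # Greedily pick the weight that most reduces the imbalance
        # (a toy stand-in for argmax over learned Q values).
        action = min(WEIGHT_BOX, key=lambda w: abs(residual - w))
        placed += action
        history.append(action)
    return placed, history
```

For example, a count of 67 would be balanced by the sequence 50, 10, 5, 1, 1 followed by the termination action, mirroring how a physical scale is balanced with successively smaller weights.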
Related papers
- Stochastic Approximation Approach to Federated Machine Learning [0.0]
This paper examines Federated Learning (FL) in a Stochastic Approximation (SA) framework.
FL is a collaborative way to train neural network models across various participants or clients.
It is observed that the proposed algorithm is robust and gives more reliable estimates of the weights.
arXiv Detail & Related papers (2024-02-20T12:00:25Z)
- STEERER: Resolving Scale Variations for Counting and Localization via Selective Inheritance Learning [74.2343877907438]
Scale variation is a deep-rooted problem in object counting, which has not been effectively addressed by existing scale-aware algorithms.
We propose a novel method termed STEERER that addresses the issue of scale variations in object counting.
STEERER selects the most suitable scale for patch objects to boost feature extraction and only inherits discriminative features from lower to higher resolution progressively.
arXiv Detail & Related papers (2023-08-21T05:09:07Z)
- Self-similarity Driven Scale-invariant Learning for Weakly Supervised Person Search [66.95134080902717]
We propose a novel one-step framework, named Self-similarity driven Scale-invariant Learning (SSL).
We introduce a Multi-scale Exemplar Branch to guide the network in concentrating on the foreground and learning scale-invariant features.
Experiments on PRW and CUHK-SYSU databases demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2023-02-25T04:48:11Z)
- Scale Attention for Learning Deep Face Representation: A Study Against Visual Scale Variation [69.45176408639483]
We reform the conv layer by resorting to the scale-space theory.
We build a novel architecture named the SCale AttentioN Conv Neural Network (SCAN-CNN).
As a single-shot scheme, the inference is more efficient than multi-shot fusion.
arXiv Detail & Related papers (2022-09-19T06:35:04Z)
- ScaleNet: A Shallow Architecture for Scale Estimation [25.29257353644138]
We design a new architecture, ScaleNet, that exploits dilated convolutions and self and cross-correlation layers to predict the scale between images.
We show how ScaleNet can be combined with sparse local features and dense correspondence networks to improve camera pose estimation, 3D reconstruction, or dense geometric matching.
arXiv Detail & Related papers (2021-12-09T11:32:01Z)
- DISCO: accurate Discrete Scale Convolutions [2.1485350418225244]
Scale is often seen as a given, disturbing factor in many vision tasks. Treating it this way is one of the reasons more data are needed during learning.
We aim for accurate scale-equivariant convolutional neural networks (SE-CNNs) applicable for problems where high granularity of scale and small filter sizes are required.
arXiv Detail & Related papers (2021-06-04T21:48:09Z)
- Direct Quantization for Training Highly Accurate Low Bit-width Deep Neural Networks [73.29587731448345]
This paper proposes two novel techniques to train deep convolutional neural networks with low bit-width weights and activations.
First, to obtain low bit-width weights, most existing methods obtain the quantized weights by performing quantization on the full-precision network weights.
Second, to obtain low bit-width activations, existing works consider all channels equally.
arXiv Detail & Related papers (2020-12-26T15:21:18Z)
- Completely Self-Supervised Crowd Counting via Distribution Matching [92.09218454377395]
We propose a complete self-supervision approach to training models for dense crowd counting.
The only input required to train, apart from a large set of unlabeled crowd images, is the approximate upper limit of the crowd count.
Our method dwells on the idea that natural crowds follow a power law distribution, which could be leveraged to yield error signals for backpropagation.
arXiv Detail & Related papers (2020-09-14T13:20:12Z)
- FU-net: Multi-class Image Segmentation Using Feedback Weighted U-net [5.193724835939252]
We present a generic deep convolutional neural network (DCNN) for multi-class image segmentation.
It is based on a well-established supervised end-to-end DCNN model, known as U-net.
arXiv Detail & Related papers (2020-04-28T13:08:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.