Bound Tightening Network for Robust Crowd Counting
- URL: http://arxiv.org/abs/2409.19146v1
- Date: Fri, 27 Sep 2024 21:18:31 GMT
- Title: Bound Tightening Network for Robust Crowd Counting
- Authors: Qiming Wu
- Abstract summary: We propose a novel Bound Tightening Network (BTN) for Robust Crowd Counting.
It consists of three parts: base model, smooth regularization module and certify bound module.
Experiments on different benchmark datasets for counting demonstrate the effectiveness and efficiency of BTN.
- Score: 0.3626013617212667
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Crowd counting is a fundamental topic that aims to estimate the number of individuals in crowded images or videos from surveillance cameras. Recent works focus on improving counting accuracy while ignoring the certified robustness of counting models. In this paper, we propose a novel Bound Tightening Network (BTN) for Robust Crowd Counting. It consists of three parts: base model, smooth regularization module and certify bound module. The core idea is to propagate the interval bound through the base model (certify bound module) and utilize the layer weights (smooth regularization module) to guide the network learning. Experiments on different benchmark datasets for counting demonstrate the effectiveness and efficiency of BTN.
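The abstract mentions propagating an interval bound through the base model. The following is a minimal sketch of generic interval bound propagation through a small fully connected network, assuming an L-infinity perturbation of radius `eps` around the input. It is not the authors' BTN implementation: the function names (`linear_bounds`, `relu_bounds`, `certified_output_bounds`), the toy weights, and the use of plain Python lists are all illustrative assumptions.

```python
# Generic interval bound propagation (IBP) sketch in plain Python.
# Hypothetical helper names; not taken from the BTN paper.

def linear_bounds(lower, upper, weights, bias):
    """Propagate elementwise bounds through y = W x + b."""
    out_lower, out_upper = [], []
    for row, b in zip(weights, bias):
        lo = hi = b
        for w, l, u in zip(row, lower, upper):
            if w >= 0:
                # A non-negative weight keeps the order of the bounds.
                lo += w * l
                hi += w * u
            else:
                # A negative weight swaps which input bound is extreme.
                lo += w * u
                hi += w * l
        out_lower.append(lo)
        out_upper.append(hi)
    return out_lower, out_upper


def relu_bounds(lower, upper):
    """ReLU is monotone, so it can be applied to each bound directly."""
    return [max(0.0, l) for l in lower], [max(0.0, u) for u in upper]


def certified_output_bounds(x, eps, layers):
    """Bound the scalar output for every input within eps of x (L-infinity).

    `layers` is a list of (weights, bias) pairs; ReLU follows every layer
    except the last one.
    """
    lower = [v - eps for v in x]
    upper = [v + eps for v in x]
    for i, (weights, bias) in enumerate(layers):
        lower, upper = linear_bounds(lower, upper, weights, bias)
        if i < len(layers) - 1:
            lower, upper = relu_bounds(lower, upper)
    return lower[0], upper[0]


# Toy two-layer network (assumed weights, for illustration only).
toy_layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 1.0]], [0.0]),
]
count_lo, count_hi = certified_output_bounds([1.0, 0.5], 0.1, toy_layers)
print(count_lo, count_hi)
```

The gap between the two returned values is the quantity that a bound-tightening approach would try to shrink during training; with `eps` set to zero the two bounds collapse to the ordinary forward pass.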
Related papers
- SCVCNet: Sliding cross-vector convolution network for cross-task and inter-individual-set EEG-based cognitive workload recognition [15.537230343119875]
This paper presents a generic approach for applying the cognitive workload recognizer by exploiting common electroencephalogram (EEG) patterns across different human-machine tasks and individual sets.
We propose a neural network called SCVCNet, which eliminates task- and individual-set-related interferences in EEGs by analyzing finer-grained frequency structures in the power spectral densities.
arXiv Detail & Related papers (2023-09-21T13:06:30Z) - Part-Based Models Improve Adversarial Robustness [57.699029966800644]
We show that combining human prior knowledge with end-to-end learning can improve the robustness of deep neural networks.
Our model combines a part segmentation model with a tiny classifier and is trained end-to-end to simultaneously segment objects into parts.
Our experiments indicate that these models also reduce texture bias and yield better robustness against common corruptions and spurious correlations.
arXiv Detail & Related papers (2022-09-15T15:41:47Z) - Semi-supervised 3D Object Detection with Proficient Teachers [114.54835359657707]
Dominant point cloud-based 3D object detectors in autonomous driving scenarios rely heavily on huge amounts of accurately labeled samples.
Pseudo-labeling is commonly used in SSL frameworks; however, the low-quality predictions from the teacher model have seriously limited its performance.
We propose a new Pseudo-Labeling framework for semi-supervised 3D object detection, by enhancing the teacher model to a proficient one with several necessary designs.
arXiv Detail & Related papers (2022-07-26T04:54:03Z) - Compare learning: bi-attention network for few-shot learning [6.559037166322981]
Metric learning, one family of few-shot learning methods, addresses this challenge by first learning a deep distance metric to determine whether a pair of images belongs to the same category.
In this paper, we propose a novel approach named Bi-attention network to compare the instances, which can measure the similarity between embeddings of instances precisely, globally and efficiently.
arXiv Detail & Related papers (2022-03-25T07:39:10Z) - Joint CNN and Transformer Network via weakly supervised Learning for efficient crowd counting [22.040942519355628]
We propose a Joint CNN and Transformer Network (JCTNet) via weakly supervised learning for crowd counting.
JCTNet can effectively focus on the crowd regions and obtain superior weakly supervised counting performance on five mainstream datasets.
arXiv Detail & Related papers (2022-03-12T09:40:29Z) - TransCrowd: Weakly-Supervised Crowd Counting with Transformer [56.84516562735186]
We propose TransCrowd, which reformulates the weakly-supervised crowd counting problem from the perspective of sequence-to-count based on Transformer.
Experiments on five benchmark datasets demonstrate that the proposed TransCrowd achieves superior performance compared with all the weakly-supervised CNN-based counting methods.
arXiv Detail & Related papers (2021-04-19T08:12:50Z) - Group-Wise Semantic Mining for Weakly Supervised Semantic Segmentation [49.90178055521207]
This work addresses weakly supervised semantic segmentation (WSSS), with the goal of bridging the gap between image-level annotations and pixel-level segmentation.
We formulate WSSS as a novel group-wise learning task that explicitly models semantic dependencies in a group of images to estimate more reliable pseudo ground-truths.
In particular, we devise a graph neural network (GNN) for group-wise semantic mining, wherein input images are represented as graph nodes.
arXiv Detail & Related papers (2020-12-09T12:40:13Z) - Overcoming Statistical Shortcuts for Open-ended Visual Counting [54.858754825838865]
We aim to develop models that learn a proper mechanism of counting regardless of the output label.
First, we propose the Modifying Count Distribution protocol, which penalizes models that over-rely on statistical shortcuts.
Secondly, we introduce the Spatial Counting Network (SCN), which is dedicated to visual analysis and counting based on natural language questions.
arXiv Detail & Related papers (2020-06-17T18:02:01Z) - CRNet: Cross-Reference Networks for Few-Shot Segmentation [59.85183776573642]
Few-shot segmentation aims to learn a segmentation model that can be generalized to novel classes with only a few training images.
With a cross-reference mechanism, our network can better find the co-occurrent objects in the two images.
Experiments on the PASCAL VOC 2012 dataset show that our network achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-03-24T04:55:43Z) - Encoder-Decoder Based Convolutional Neural Networks with Multi-Scale-Aware Modules for Crowd Counting [6.893512627479196]
We propose two modified neural networks for accurate and efficient crowd counting.
The first model, named M-SFANet, is equipped with atrous spatial pyramid pooling (ASPP) and a context-aware module (CAN).
The second model is called M-SegNet, which is produced by replacing the bilinear upsampling in SFANet with max unpooling that is used in SegNet.
arXiv Detail & Related papers (2020-03-12T03:00:26Z)