Related papers: Robust Zero-Shot Crowd Counting and Localization With Adaptive Resolution SAM

URL: http://arxiv.org/abs/2402.17514v2
Date: Thu, 15 Aug 2024 09:38:42 GMT
Title: Robust Zero-Shot Crowd Counting and Localization With Adaptive Resolution SAM
Authors: Jia Wan, Qiangqiang Wu, Wei Lin, Antoni B. Chan
Abstract summary: We propose a simple yet effective crowd counting method by utilizing the Segment-Everything-Everywhere Model (SEEM). We show that SEEM's performance in dense crowd scenes is limited, primarily due to the omission of many persons in high-density areas. Our proposed method achieves the best unsupervised performance in crowd counting, while also being comparable to some supervised methods.
Score: 55.93697196726016
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Existing crowd counting models require extensive training data, which is time-consuming to annotate. To tackle this issue, we propose a simple yet effective crowd counting method that uses the Segment-Everything-Everywhere Model (SEEM), an adaptation of the Segment Anything Model (SAM), to generate pseudo-labels for training crowd counting models. However, our initial investigation reveals that SEEM's performance in dense crowd scenes is limited, primarily due to the omission of many persons in high-density areas. To overcome this limitation, we propose an adaptive resolution SEEM to handle the scale variations, occlusions, and overlapping of people within crowd scenes. Alongside this, we introduce a robust localization method, based on Gaussian Mixture Models, for predicting the head positions in the predicted people masks. Given the mask and point pseudo-labels, we propose a robust loss function, which is designed to exclude uncertain regions based on SEEM's predictions, thereby enhancing the training process of the counting networks. Finally, we propose an iterative method for generating pseudo-labels. This method aims to improve the quality of the segmentation masks by identifying more tiny persons in high-density regions, which are often missed in the first pseudo-labeling stage. Overall, our proposed method achieves the best unsupervised performance in crowd counting, while also achieving results comparable to some supervised methods. This makes it a highly effective and versatile tool for crowd counting, especially in situations where labeled data is not available.
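The abstract mentions a loss that excludes uncertain regions but does not give its form. As a rough illustration of the general idea only, the sketch below computes a mean squared error over the pixels that are not flagged as uncertain. The function name `masked_mse`, the nested-list inputs, and the choice of squared error are all assumptions made for this example; they are not taken from the paper.

```python
def masked_mse(pred, target, uncertain):
    """Mean squared error over pixels NOT flagged as uncertain.

    pred, target: 2D lists of floats (e.g. predicted and pseudo-label density maps).
    uncertain:    2D list of bools; True marks a pixel to exclude from the loss.
    All names and the squared-error form are illustrative assumptions.
    """
    total = 0.0
    kept = 0
    for p_row, t_row, u_row in zip(pred, target, uncertain):
        for p, t, u in zip(p_row, t_row, u_row):
            if u:
                # Skip pixels whose pseudo-label is considered unreliable.
                continue
            total += (p - t) ** 2
            kept += 1
    # If every pixel is excluded, report zero loss rather than divide by zero.
    return total / kept if kept else 0.0


if __name__ == "__main__":
    pred = [[1.0, 0.0], [0.0, 5.0]]
    target = [[1.0, 0.0], [1.0, 0.0]]
    uncertain = [[False, False], [False, True]]
    print(masked_mse(pred, target, uncertain))
```

In this toy input the large error at the bottom-right pixel is ignored because that pixel is marked uncertain, so only the three remaining pixels contribute to the average.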

Related papers

ProgRoCC: A Progressive Approach to Rough Crowd Counting [66.09510514180593]
We introduce a problem, labeled Rough Crowd Counting, that delivers better accuracy on the basis of training data that is easier to acquire. We propose an approach to the rough crowd counting problem based on CLIP, termed ProgRoCC. Specifically, we introduce a progressive estimation learning strategy that determines the object count through a coarse-to-fine approach.
arXiv Detail & Related papers (2025-04-18T01:57:42Z)
Embodied Crowd Counting [86.10533153162476]
Embodied Crowd Counting (ECC) is an interactive simulator that enables large-scale scenes and large object quantities. A prior probability distribution that approximates realistic crowd distribution is introduced to generate crowds. This method contains an MLLM-driven coarse-to-fine navigation mechanism, enabling active Z-axis exploration.
arXiv Detail & Related papers (2025-03-11T12:23:34Z)
Continual-MAE: Adaptive Distribution Masked Autoencoders for Continual Test-Time Adaptation [49.827306773992376]
Continual Test-Time Adaptation (CTTA) is proposed to migrate a source pre-trained model to continually changing target distributions. Our proposed method attains state-of-the-art performance in both classification and segmentation CTTA tasks.
arXiv Detail & Related papers (2023-12-19T15:34:52Z)
Semi-Supervised Crowd Counting with Contextual Modeling: Facilitating Holistic Understanding of Crowd Scenes [19.987151025364067]
This paper presents a new semi-supervised method for training a reliable crowd counting model. We foster the model's intrinsic 'subitizing' capability, which allows it to accurately estimate the count in regions. Our method achieves state-of-the-art performance, surpassing previous approaches by a large margin on challenging benchmarks.
arXiv Detail & Related papers (2023-10-16T12:42:43Z)
Multi-View Knowledge Distillation from Crowd Annotations for Out-of-Domain Generalization [53.24606510691877]
We propose new methods for acquiring soft-labels from crowd-annotations by aggregating the distributions produced by existing methods. We demonstrate that these aggregation methods lead to the most consistent performance across four NLP tasks on out-of-domain test sets.
arXiv Detail & Related papers (2022-12-19T12:40:18Z)
Rethinking Clustering-Based Pseudo-Labeling for Unsupervised Meta-Learning [146.11600461034746]
CACTUs, a method for unsupervised meta-learning, is a clustering-based approach with pseudo-labeling. This approach is model-agnostic and can be combined with supervised algorithms to learn from unlabeled data. We prove that the core reason for this is the lack of a clustering-friendly property in the embedding space.
arXiv Detail & Related papers (2022-09-27T19:04:36Z)
Wisdom of (Binned) Crowds: A Bayesian Stratification Paradigm for Crowd Counting [16.09823718637455]
We analyze the performance of crowd counting approaches across standard datasets at per strata level and in aggregate. Our contributions represent a nuanced, statistically balanced and fine-grained characterization of performance for crowd counting approaches.
arXiv Detail & Related papers (2021-08-19T16:50:31Z)
A Three-Stage Self-Training Framework for Semi-Supervised Semantic Segmentation [0.9786690381850356]
We propose a holistic solution framed as a three-stage self-training framework for semantic segmentation. The key idea of our technique is the extraction of the pseudo-masks' statistical information. We then decrease the uncertainty of the pseudo-masks using a multi-task model that enforces consistency.
arXiv Detail & Related papers (2020-12-01T21:00:27Z)
Completely Self-Supervised Crowd Counting via Distribution Matching [92.09218454377395]
We propose a complete self-supervision approach to training models for dense crowd counting. The only input required to train, apart from a large set of unlabeled crowd images, is the approximate upper limit of the crowd count. Our method dwells on the idea that natural crowds follow a power law distribution, which could be leveraged to yield error signals for backpropagation.
arXiv Detail & Related papers (2020-09-14T13:20:12Z)
Mask-guided sample selection for Semi-Supervised Instance Segmentation [13.091166009687058]
We propose a sample selection approach to decide which samples to annotate for semi-supervised instance segmentation. Our method consists in first predicting pseudo-masks for the unlabeled pool of samples, together with a score predicting the quality of the mask. We study which samples are better to annotate given the quality score, and show how our approach outperforms a random selection.
arXiv Detail & Related papers (2020-08-25T14:44:58Z)
Semi-Supervised Crowd Counting via Self-Training on Surrogate Tasks [50.78037828213118]
This paper tackles the semi-supervised crowd counting problem from the perspective of feature learning. We propose a novel semi-supervised crowd counting method which is built upon two innovative components.
arXiv Detail & Related papers (2020-07-07T05:30:53Z)

This list is automatically generated from the titles and abstracts of the papers on this site.