Regressor-Segmenter Mutual Prompt Learning for Crowd Counting
- URL: http://arxiv.org/abs/2312.01711v3
- Date: Wed, 3 Jan 2024 09:35:21 GMT
- Title: Regressor-Segmenter Mutual Prompt Learning for Crowd Counting
- Authors: Mingyue Guo, Li Yuan, Zhaoyi Yan, Binghui Chen, Yaowei Wang, Qixiang Ye
- Abstract summary: We propose mutual prompt learning (mPrompt) to solve the bias and inaccuracy caused by annotation variance. Experiments show that mPrompt significantly reduces the Mean Average Error (MAE).
- Score: 70.49246560246736
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Crowd counting has achieved significant progress by training regressors to predict instance positions. In heavily crowded scenarios, however, regressors are challenged by uncontrollable annotation variance, which causes density map bias and context information inaccuracy. In this study, we propose mutual prompt learning (mPrompt), which leverages a regressor and a segmenter as guidance for each other, solving the bias and inaccuracy caused by annotation variance while distinguishing foreground from background. Specifically, mPrompt leverages point annotations to tune the segmenter and predict pseudo head masks, a form of point prompt learning. It then uses the predicted segmentation masks, which serve as a spatial constraint, to rectify biased point annotations, a form of context prompt learning. mPrompt frames prompt learning as mutual information maximization, mitigating the impact of annotation variance while improving model accuracy. Experiments show that mPrompt significantly reduces the Mean Average Error (MAE), demonstrating its potential as a general framework for downstream vision tasks.
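To make the mutual guidance concrete, below is a minimal PyTorch sketch of a regressor-segmenter training step in the spirit of the abstract. The module definitions, loss terms, and weights (Regressor, Segmenter, train_step, the 0.1 coefficients) are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of a mutual-prompt training step.
# All names and loss weights are hypothetical stand-ins for the paper's design.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Regressor(nn.Module):
    """Predicts a density map from an image (stand-in for the paper's regressor)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 1), nn.ReLU(),
        )
    def forward(self, x):
        return self.net(x)

class Segmenter(nn.Module):
    """Predicts a foreground (head) mask, tuned with point annotations as prompts."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 1),
        )
    def forward(self, x):
        return torch.sigmoid(self.net(x))

def train_step(img, point_map, regressor, segmenter, opt):
    """One mutual-prompt step: points prompt the segmenter, masks prompt the regressor."""
    density = regressor(img)   # B x 1 x H x W density prediction
    mask = segmenter(img)      # B x 1 x H x W pseudo head mask

    # Point prompt learning: sparse point annotations supervise the segmenter.
    seg_loss = F.binary_cross_entropy(mask, (point_map > 0).float())

    # Context prompt learning: the predicted mask acts as a spatial constraint,
    # suppressing density responses that fall in the predicted background.
    rect_loss = (density * (1 - mask.detach())).mean()

    # Counting loss against the (possibly biased) point annotations.
    count_loss = F.mse_loss(density.sum(dim=(2, 3)), point_map.sum(dim=(2, 3)))

    loss = count_loss + 0.1 * seg_loss + 0.1 * rect_loss  # weights are guesses
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Usage (dummy shapes only; real training would use a crowd-counting dataset):
regressor, segmenter = Regressor(), Segmenter()
opt = torch.optim.Adam(list(regressor.parameters()) + list(segmenter.parameters()), lr=1e-4)
img = torch.rand(2, 3, 64, 64)
points = torch.zeros(2, 1, 64, 64)
points[:, :, 20, 20] = 1.0  # one annotated head per image
print(train_step(img, points, regressor, segmenter, opt))
```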
Related papers
- Enhancing Consistency and Mitigating Bias: A Data Replay Approach for Incremental Learning [100.7407460674153]
Deep learning systems are prone to catastrophic forgetting when learning from a sequence of tasks.
To mitigate the problem, a line of methods proposes to replay the data of experienced tasks when learning new tasks.
However, this is often infeasible in practice due to memory constraints or data privacy issues.
As a replacement, data-free data replay methods are proposed by inverting samples from the classification model.
arXiv Detail & Related papers (2024-01-12T12:51:12Z)
- Bayesian Prompt Learning for Image-Language Model Generalization [64.50204877434878]
We use the regularization ability of Bayesian methods to frame prompt learning as a variational inference problem.
Our approach regularizes the prompt space, reduces overfitting to the seen prompts and improves the prompt generalization on unseen prompts.
We demonstrate empirically on 15 benchmarks that Bayesian prompt learning provides an appropriate coverage of the prompt space.
arXiv Detail & Related papers (2022-10-05T17:05:56Z)
- End-to-End Label Uncertainty Modeling in Speech Emotion Recognition using Bayesian Neural Networks and Label Distribution Learning [0.0]
We propose an end-to-end Bayesian neural network capable of being trained on a distribution of annotations to capture the subjectivity-based label uncertainty.
We show that the proposed t-distribution based approach achieves state-of-the-art uncertainty modeling results in speech emotion recognition.
arXiv Detail & Related papers (2022-09-30T12:55:43Z)
- Rethinking the Learning Paradigm for Facial Expression Recognition [56.050738381526116]
We rethink the existing training paradigm and argue that it is better to use weakly supervised strategies to train FER models with the original ambiguous annotations.
arXiv Detail & Related papers (2022-09-30T12:00:54Z)
- Learning Debiased and Disentangled Representations for Semantic Segmentation [52.35766945827972]
We propose a model-agnostic training scheme for semantic segmentation.
By randomly eliminating certain class information in each training iteration, we effectively reduce feature dependencies among classes.
Models trained with our approach demonstrate strong results on multiple semantic segmentation benchmarks.
arXiv Detail & Related papers (2021-10-31T16:15:09Z)
- Variance-reduced Language Pretraining via a Mask Proposal Network [5.819397109258169]
Self-supervised learning, a.k.a., pretraining, is important in natural language processing.
In this paper, we tackle the problem from the view of gradient variance reduction.
To improve efficiency, we introduce a MAsk Proposal Network (MAPNet), which approximates the optimal mask proposal distribution.
arXiv Detail & Related papers (2020-08-12T14:12:32Z)
- Deep Semi-supervised Knowledge Distillation for Overlapping Cervical Cell Instance Segmentation [54.49894381464853]
We propose to leverage both labeled and unlabeled data for instance segmentation with improved accuracy by knowledge distillation.
We propose a novel Mask-guided Mean Teacher framework with Perturbation-sensitive Sample Mining.
Experiments show that the proposed method significantly improves performance compared with a supervised method trained on labeled data only.
arXiv Detail & Related papers (2020-07-21T13:27:09Z)
- PointMask: Towards Interpretable and Bias-Resilient Point Cloud Processing [16.470806722781333]
PointMask is a model-agnostic interpretable information-bottleneck approach for attribution in point cloud models.
We show that coupling a PointMask layer with an arbitrary model can discern the points in the input space which contribute the most to the prediction score.
arXiv Detail & Related papers (2020-07-09T03:06:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.