Leveraging Self-Supervision for Cross-Domain Crowd Counting
- URL: http://arxiv.org/abs/2103.16291v1
- Date: Tue, 30 Mar 2021 12:37:55 GMT
- Title: Leveraging Self-Supervision for Cross-Domain Crowd Counting
- Authors: Weizhe Liu, Nikita Durasov, Pascal Fua
- Abstract summary: State-of-the-art methods for counting people in crowded scenes rely on deep networks to estimate crowd density.
We train our network to distinguish upside-down real images from regular ones and incorporate into it the ability to predict its own uncertainty.
This yields an algorithm that consistently outperforms state-of-the-art cross-domain crowd counting methods without any extra computation at inference time.
- Score: 71.75102529797549
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: State-of-the-art methods for counting people in crowded scenes rely on deep networks to estimate crowd density. While effective, these data-driven approaches rely on large amounts of annotated data to achieve good performance, which prevents them from being deployed in emergencies, where annotations are either too costly or cannot be obtained fast enough.
One popular solution is to use synthetic data for training. Unfortunately, due to domain shift, the resulting models generalize poorly on real imagery. We remedy this shortcoming by training with both synthetic images, along with their associated labels, and unlabeled real images. To this end, we force our network to learn perspective-aware features by training it to distinguish upside-down real images from regular ones, and we incorporate into it the ability to predict its own uncertainty so that it can generate useful pseudo labels for fine-tuning. This yields an algorithm that consistently outperforms state-of-the-art cross-domain crowd counting methods without any extra computation at inference time.
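The abstract combines two mechanisms: a self-supervised pretext task (classifying whether a real image is upside-down, which forces perspective-aware features) and a per-pixel uncertainty estimate used to filter pseudo labels on unlabeled real images. Below is a minimal PyTorch sketch of how two such heads could sit on one backbone; the network, the uncertainty threshold, and the loss wiring are illustrative assumptions, not the paper's exact architecture.

```python
# Sketch of the two self-supervision ideas in the abstract:
# (1) a pretext head that classifies whether a real image is upside-down,
# (2) an uncertainty head used to keep only confident pseudo labels.
# Names and thresholds are illustrative, not from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrowdNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
        )
        self.density = nn.Conv2d(32, 1, 1)   # per-pixel crowd density
        self.log_var = nn.Conv2d(32, 1, 1)   # per-pixel log-variance (uncertainty)
        self.flip_head = nn.Linear(32, 2)    # upside-down vs. regular

    def forward(self, x):
        f = self.backbone(x)
        pooled = f.mean(dim=(2, 3))
        return self.density(f), self.log_var(f), self.flip_head(pooled)

net = CrowdNet()
real = torch.rand(4, 3, 64, 64)              # unlabeled real images

# Pretext task: flip half the batch vertically, predict which were flipped.
flipped = torch.flip(real, dims=[2])
batch = torch.cat([real, flipped])
labels = torch.cat([torch.zeros(4), torch.ones(4)]).long()
_, _, flip_logits = net(batch)
pretext_loss = F.cross_entropy(flip_logits, labels)

# Pseudo labels: keep only pixels whose predicted variance is low.
density, log_var, _ = net(real)
confident = log_var.exp() < 0.5              # illustrative threshold
pseudo = density.detach() * confident        # masked pseudo density map
```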
Related papers
- SYRAC: Synthesize, Rank, and Count [19.20599654208014]
We propose a novel approach to eliminate the annotation burden by leveraging latent diffusion models to generate synthetic data.
We report state-of-the-art results for unsupervised crowd counting.
arXiv Detail & Related papers (2023-10-02T21:52:47Z)
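The summary above does not spell out the ranking step, but the title suggests ordering synthetic images by crowd size. A hedged PyTorch sketch of one plausible ranking signal, assuming pairs where `denser` is known to contain at least as many people as `sparser` (the pairing and loss choice are assumptions, not the paper's exact recipe):

```python
import torch
import torch.nn.functional as F

def ranking_loss(count_denser, count_sparser, margin=1.0):
    # margin_ranking_loss(x1, x2, y=1) penalizes x1 < x2 + margin,
    # i.e. the model is punished when the "denser" count is not larger.
    y = torch.ones_like(count_denser)
    return F.margin_ranking_loss(count_denser, count_sparser, y, margin=margin)

counts_a = torch.tensor([10.0, 25.0])    # predicted counts on denser images
counts_b = torch.tensor([12.0, 20.0])    # predicted counts on sparser images
print(ranking_loss(counts_a, counts_b))  # first pair violates the ordering
```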
- Self-Supervised Pretraining for 2D Medical Image Segmentation [0.0]
Self-supervised learning offers a way to lower the need for manually annotated data by pretraining models for a specific domain on unlabelled data.
We find that self-supervised pretraining on natural images and target-domain-specific images leads to the fastest and most stable downstream convergence.
In low-data scenarios, supervised ImageNet pretraining achieves the best accuracy, requiring less than 100 annotated samples to realise close to minimal error.
arXiv Detail & Related papers (2022-09-01T09:25:22Z)
- A low-rank representation for unsupervised registration of medical images [10.499611180329804]
We propose a novel approach based on a low-rank representation, i.e., Regnet-LRR, to tackle noisy data registration scenarios.
We show that the low-rank representation can boost the ability and robustness of models and bring significant improvements in noisy data registration scenarios.
arXiv Detail & Related papers (2021-05-20T07:04:10Z)
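The entry names a low-rank representation (Regnet-LRR) as the source of robustness without detailing how it enters the registration pipeline. As a generic illustration only, the NumPy sketch below shows how a truncated SVD keeps a feature matrix's dominant structure while discarding noisy components:

```python
import numpy as np

rng = np.random.default_rng(0)
clean = rng.standard_normal((100, 5)) @ rng.standard_normal((5, 80))  # rank-5 signal
noisy = clean + 0.1 * rng.standard_normal((100, 80))

U, s, Vt = np.linalg.svd(noisy, full_matrices=False)
k = 5
low_rank = U[:, :k] * s[:k] @ Vt[:k]      # best rank-k approximation

print(np.linalg.norm(noisy - clean))      # error before projection
print(np.linalg.norm(low_rank - clean))   # smaller error after projection
```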
- A general approach to bridge the reality-gap [0.0]
A common approach to circumvent this is to leverage existing, similar datasets with large amounts of labelled data.
We propose learning a general transformation to bring arbitrary images towards a canonical distribution.
This transformation is trained in an unsupervised regime, leveraging data augmentation to generate off-canonical examples of images.
arXiv Detail & Related papers (2020-09-03T18:19:28Z)
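A minimal sketch of this training signal, assuming the setup implied above: augmentations perturb canonical images into off-canonical examples, and a network is trained to map them back toward the canonical distribution. The tiny convolutional net and brightness/contrast augmentation are stand-ins:

```python
import torch
import torch.nn as nn

transform = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 3, 3, padding=1),
)
opt = torch.optim.Adam(transform.parameters(), lr=1e-3)

canonical = torch.rand(8, 3, 32, 32)                 # "canonical" images
# Data augmentation generates off-canonical examples (brightness/contrast shift).
off_canonical = (canonical * 1.5 + 0.2).clamp(0, 1)

recovered = transform(off_canonical)
loss = nn.functional.mse_loss(recovered, canonical)  # pull back to canonical
loss.backward()
opt.step()
```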
- Deep Traffic Sign Detection and Recognition Without Target Domain Real Images [52.079665469286496]
We propose a novel database generation method that requires no real images from the target domain, only templates of the traffic signs.
The method does not aim to outperform training with real data, but to serve as a viable alternative when real data is unavailable.
On large datasets, training with a fully synthetic dataset almost matches the performance of training with a real one.
arXiv Detail & Related papers (2020-07-30T21:06:47Z)
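A sketch of the template-based generation idea under the assumptions above: composite a traffic-sign template onto an arbitrary background at a random position and scale, and record the bounding box as the label. The paper's pipeline is certainly richer; this only illustrates the principle:

```python
import numpy as np

rng = np.random.default_rng(0)

def synthesize(background, template):
    """Paste a randomly scaled template at a random position; return image and box."""
    h, w = background.shape[:2]
    th, tw = template.shape[:2]
    scale = rng.uniform(0.5, 1.0)
    th, tw = max(1, int(th * scale)), max(1, int(tw * scale))
    patch = template[:th, :tw]               # crude "resize" for the sketch
    y = rng.integers(0, h - th + 1)
    x = rng.integers(0, w - tw + 1)
    out = background.copy()
    out[y:y + th, x:x + tw] = patch
    return out, (x, y, tw, th)               # image + bounding-box label

bg = rng.random((128, 128, 3))               # arbitrary natural image
sign = np.ones((32, 32, 3))                  # stand-in sign template
image, box = synthesize(bg, sign)
print(box)
```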
- Syn2Real Transfer Learning for Image Deraining using Gaussian Processes [92.15895515035795]
CNN-based methods for image deraining have achieved excellent performance in terms of reconstruction error as well as visual quality.
Due to challenges in obtaining real world fully-labeled image deraining datasets, existing methods are trained only on synthetically generated data.
We propose a Gaussian Process-based semi-supervised learning framework which enables the network to learn to derain using a synthetic dataset.
arXiv Detail & Related papers (2020-06-10T00:33:18Z)
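One way to read this, sketched below with scikit-learn: fit a Gaussian process on features of labeled synthetic samples, then use its predictions on unlabeled real samples as pseudo targets, with the predictive standard deviation as a confidence signal. The toy feature vectors stand in for the paper's latent network features:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)
synthetic_feats = rng.standard_normal((50, 8))   # labeled synthetic features
synthetic_targets = synthetic_feats @ rng.standard_normal(8)

gp = GaussianProcessRegressor().fit(synthetic_feats, synthetic_targets)

real_feats = rng.standard_normal((10, 8))        # unlabeled real features
pseudo, std = gp.predict(real_feats, return_std=True)
confident = std < np.median(std)                 # keep the more certain half
print(pseudo[confident])
```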
- Improving Semantic Segmentation via Self-Training [75.07114899941095]
We show that we can obtain state-of-the-art results using a semi-supervised approach, specifically a self-training paradigm.
We first train a teacher model on labeled data, and then generate pseudo labels on a large set of unlabeled data.
Our robust training framework can digest human-annotated and pseudo labels jointly and achieve top performances on Cityscapes, CamVid and KITTI datasets.
arXiv Detail & Related papers (2020-04-30T17:09:17Z)
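The loop described above is the classic teacher-student recipe; here is a minimal PyTorch sketch with toy stand-ins for the models and data:

```python
import torch
import torch.nn as nn

def make_model():
    return nn.Conv2d(3, 21, 1)               # 21-class per-pixel classifier

teacher, student = make_model(), make_model()
labeled = torch.rand(4, 3, 32, 32)
labels = torch.randint(0, 21, (4, 32, 32))
unlabeled = torch.rand(8, 3, 32, 32)

# Step 1: (assume the teacher was already trained on `labeled`/`labels`.)
with torch.no_grad():
    pseudo = teacher(unlabeled).argmax(dim=1)  # hard pseudo labels

# Step 2: the student digests human and pseudo labels jointly.
images = torch.cat([labeled, unlabeled])
targets = torch.cat([labels, pseudo])
loss = nn.functional.cross_entropy(student(images), targets)
loss.backward()
```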
- Towards Achieving Adversarial Robustness by Enforcing Feature Consistency Across Bit Planes [51.31334977346847]
We train networks to form coarse impressions based on the information in higher bit planes, and use the lower bit planes only to refine their prediction.
We demonstrate that, by imposing consistency on the representations learned across differently quantized images, the adversarial robustness of networks improves significantly.
arXiv Detail & Related papers (2020-04-01T09:31:10Z)
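A short sketch of the bit-plane mechanism, assuming a generic consistency loss (the paper's exact formulation may differ): quantize an image to its higher bit planes and ask the encoder's representations of the coarse and full images to agree:

```python
import torch
import torch.nn as nn

def keep_high_bits(img_uint8, bits=4):
    """Zero out the lowest (8 - bits) bit planes of a uint8 image."""
    mask = 256 - (1 << (8 - bits))           # e.g. bits=4 -> 0b11110000
    return img_uint8 & mask

encoder = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())

img = torch.randint(0, 256, (2, 3, 32, 32), dtype=torch.uint8)
coarse = keep_high_bits(img, bits=4)

feat_full = encoder(img.float() / 255)
feat_coarse = encoder(coarse.float() / 255)
consistency = nn.functional.mse_loss(feat_full, feat_coarse)
```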
- Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields.
To better train this efficient generator, in addition to the frequently used VGG feature-matching loss, we design a novel self-guided regression loss.
We also employ a discriminator with local and global branches to ensure local-global contents consistency.
arXiv Detail & Related papers (2020-02-07T03:45:25Z)
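Of the components listed for this entry, the dilated-convolution generator is the easiest to make concrete. Below is a sketch of a dilated block whose stacked rates grow the receptive field while preserving spatial size; channel counts, rates, and the residual mixing are illustrative, not the paper's exact design:

```python
import torch
import torch.nn as nn

class DilatedBlock(nn.Module):
    def __init__(self, ch=32):
        super().__init__()
        # padding=dilation keeps the spatial size fixed for 3x3 kernels
        self.convs = nn.ModuleList(
            [nn.Conv2d(ch, ch, 3, padding=d, dilation=d) for d in (1, 2, 4, 8)]
        )

    def forward(self, x):
        for conv in self.convs:
            x = torch.relu(conv(x)) + x      # residual mixing across rates
        return x

x = torch.rand(1, 32, 64, 64)
print(DilatedBlock()(x).shape)               # spatial size preserved
```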
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.