SYRAC: Synthesize, Rank, and Count
- URL: http://arxiv.org/abs/2310.01662v3
- Date: Wed, 11 Oct 2023 19:56:13 GMT
- Title: SYRAC: Synthesize, Rank, and Count
- Authors: Adriano D'Alessandro, Ali Mahdavi-Amiri and Ghassan Hamarneh
- Abstract summary: We propose a novel approach to eliminate the annotation burden by leveraging latent diffusion models to generate synthetic data.
We report state-of-the-art results for unsupervised crowd counting.
- Score: 19.20599654208014
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Crowd counting is a critical task in computer vision, with several important
applications. However, existing counting methods rely on labor-intensive
density map annotations, necessitating the manual localization of each
individual pedestrian. While recent efforts have attempted to alleviate the
annotation burden through weakly or semi-supervised learning, these approaches
fall short of significantly reducing the workload. We propose a novel approach
to eliminate the annotation burden by leveraging latent diffusion models to
generate synthetic data. However, these models struggle to reliably understand
object quantities, leading to noisy annotations when prompted to produce images
with a specific quantity of objects. To address this, we use latent diffusion
models to create two types of synthetic data: one by removing pedestrians from
real images, which generates ranked image pairs with a weak but reliable object
quantity signal, and the other by generating synthetic images with a
predetermined number of objects, offering a strong but noisy counting signal.
Our method utilizes the ranking image pairs for pre-training and then fits a
linear layer to the noisy synthetic images using these crowd quantity features.
We report state-of-the-art results for unsupervised crowd counting.
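The two-stage recipe described in the abstract (pre-train on ranked pairs, then fit a linear layer to noisy synthetic counts) can be illustrated with a minimal sketch. It is not the authors' implementation. The function names `ranking_hinge_loss` and `fit_linear_head` are hypothetical, the hinge form of the ranking loss is an assumption (the abstract does not name a loss), and each image is reduced to a single scalar crowd-quantity feature to keep the example short.

```python
from statistics import fmean


def ranking_hinge_loss(scores_original, scores_removed, margin=1.0):
    """Mean pairwise hinge loss over ranked image pairs (assumed loss form).

    Each pair is (real image, same image with pedestrians removed), so the
    original should score at least `margin` higher than its edited version.
    """
    losses = [
        max(0.0, margin - (s_orig - s_rem))
        for s_orig, s_rem in zip(scores_original, scores_removed)
    ]
    return fmean(losses)


def fit_linear_head(features, noisy_counts):
    """Closed-form least squares for count ~ w * feature + b.

    Assumes one scalar feature per image, taken from a frozen network
    that was pre-trained with the ranking objective.
    """
    mean_x = fmean(features)
    mean_y = fmean(noisy_counts)
    var_x = sum((x - mean_x) ** 2 for x in features)
    if var_x == 0:
        raise ValueError("features must not all be identical")
    cov_xy = sum(
        (x - mean_x) * (y - mean_y) for x, y in zip(features, noisy_counts)
    )
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b


# Stage 1: the second pair is ranked the wrong way round, so it is penalized.
loss = ranking_hinge_loss([2.0, 0.5], [0.0, 1.0], margin=1.0)

# Stage 2: map frozen features to the counts requested in the prompts.
w, b = fit_linear_head([0.0, 1.0, 2.0, 3.0], [5.0, 7.0, 9.0, 11.0])
predicted_count = w * 4.0 + b
```

The split mirrors the abstract's reasoning: the ranking signal is weak but reliable, so it shapes the features, while the noisy count labels only have to determine the two parameters of the linear map.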
Related papers
- Unlocking Pre-trained Image Backbones for Semantic Image Synthesis [29.688029979801577]
We propose a new class of GAN discriminators for semantic image synthesis that generates highly realistic images.
Our model, which we dub DP-SIMS, achieves state-of-the-art results in terms of image quality and consistency with the input label maps on ADE-20K, COCO-Stuff, and Cityscapes.
arXiv Detail & Related papers (2023-12-20T09:39:19Z)
- Semantic Generative Augmentations for Few-Shot Counting [0.0]
We investigate how synthetic data can benefit few-shot class-agnostic counting.
We propose to rely on a double conditioning of Stable Diffusion with both a prompt and a density map.
Our experiments show that our diversified generation strategy significantly improves the counting accuracy of two recent, well-performing few-shot counting models.
arXiv Detail & Related papers (2023-10-26T11:42:48Z)
- Ultrasonic Image's Annotation Removal: A Self-supervised Noise2Noise Approach [6.459010811099552]
This study introduces an automated approach for detecting annotations in images.
It is achieved by treating the annotations as noise, creating a self-supervised pretext task and using a model trained under the Noise2Noise scheme to restore the image to a clean state.
Our results demonstrate that most models trained under the Noise2Noise scheme outperformed their counterparts trained with noisy-clean data pairs.
arXiv Detail & Related papers (2023-07-09T09:15:32Z)
- Counting Guidance for High Fidelity Text-to-Image Synthesis [2.6212127510234797]
Text-to-image diffusion models fail to generate high fidelity content with respect to the input prompt.
E.g. given a prompt "five apples and ten lemons on a table", diffusion-generated images usually contain the wrong number of objects.
We propose a method to improve diffusion models to focus on producing the correct object count.
arXiv Detail & Related papers (2023-06-30T11:40:35Z)
- Focus for Free in Density-Based Counting [56.961229110268036]
We introduce two methods that repurpose the available point annotations to enhance counting performance.
The first is a counting-specific augmentation that leverages point annotations to simulate occluded objects in both input and density images.
The second method, foreground distillation, generates foreground masks from the point annotations, from which we train an auxiliary network on images with blacked-out backgrounds.
arXiv Detail & Related papers (2023-06-08T11:54:37Z)
- CamDiff: Camouflage Image Augmentation via Diffusion Model [83.35960536063857]
CamDiff is a novel approach to synthesize salient objects in camouflaged scenes.
We leverage the latent diffusion model to synthesize salient objects in camouflaged scenes.
Our approach enables flexible editing and efficient large-scale dataset generation at a low cost.
arXiv Detail & Related papers (2023-04-11T19:37:47Z)
- Person Image Synthesis via Denoising Diffusion Model [116.34633988927429]
We show how denoising diffusion models can be applied for high-fidelity person image synthesis.
Our results on two large-scale benchmarks and a user study demonstrate the photorealism of our proposed approach under challenging scenarios.
arXiv Detail & Related papers (2022-11-22T18:59:50Z)
- A Deep Learning Generative Model Approach for Image Synthesis of Plant Leaves [62.997667081978825]
We use advanced Deep Learning (DL) techniques to generate artificial leaf images automatically.
We aim to provide a source of training samples for AI applications in modern crop management.
arXiv Detail & Related papers (2021-11-05T10:53:35Z)
- Leveraging Self-Supervision for Cross-Domain Crowd Counting [71.75102529797549]
State-of-the-art methods for counting people in crowded scenes rely on deep networks to estimate crowd density.
We train our network to distinguish upside-down real images from regular ones and give it the ability to predict its own uncertainty.
This yields an algorithm that consistently outperforms state-of-the-art cross-domain crowd counting methods without any extra computation at inference time.
arXiv Detail & Related papers (2021-03-30T12:37:55Z)
- Semi-Supervised Crowd Counting via Self-Training on Surrogate Tasks [50.78037828213118]
This paper tackles the semi-supervised crowd counting problem from the perspective of feature learning.
We propose a novel semi-supervised crowd counting method which is built upon two innovative components.
arXiv Detail & Related papers (2020-07-07T05:30:53Z)