DISCount: Counting in Large Image Collections with Detector-Based
Importance Sampling
- URL: http://arxiv.org/abs/2306.03151v1
- Date: Mon, 5 Jun 2023 18:04:57 GMT
- Title: DISCount: Counting in Large Image Collections with Detector-Based
Importance Sampling
- Authors: Gustavo Perez, Subhransu Maji, Daniel Sheldon
- Abstract summary: DISCount is a detector-based importance sampling framework for counting in large image collections.
It integrates an imperfect detector with human-in-the-loop screening to produce unbiased estimates of counts.
- Score: 32.522579550452484
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Many modern applications use computer vision to detect and count objects in
massive image collections. However, when the detection task is very difficult
or in the presence of domain shifts, the counts may be inaccurate even with
significant investments in training data and model development. We propose
DISCount -- a detector-based importance sampling framework for counting in
large image collections that integrates an imperfect detector with
human-in-the-loop screening to produce unbiased estimates of counts. We propose
techniques for solving counting problems over multiple spatial or temporal
regions using a small number of screened samples and estimate confidence
intervals. This enables end-users to stop screening when estimates are
sufficiently accurate, which is often the goal in a scientific study. On the
technical side we develop variance reduction techniques based on control
variates and prove the (conditional) unbiasedness of the estimators. DISCount
leads to a 9-12x reduction in the labeling costs over naive screening for tasks
we consider, such as counting birds in radar imagery or estimating damaged
buildings in satellite imagery, and also surpasses alternative covariate-based
screening approaches in efficiency.
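The core idea in the abstract, sampling images in proportion to detector output and reweighting the human-screened counts, can be illustrated with a minimal sketch. This is not necessarily the authors' exact estimator: the function name `discount_style_estimate`, the `screen` callback, the proposal proportional to raw detector counts, and the normal-approximation interval are all assumptions made for illustration.

```python
import math
import random


def discount_style_estimate(detector_counts, screen, n_samples, rng):
    """Importance-sampling estimate of a total count (illustrative sketch).

    detector_counts: per-image counts from an imperfect detector (assumed input).
    screen: hypothetical callback returning the human-verified count for image i.
    n_samples: number of images to screen (must be at least 2).
    """
    total_g = float(sum(detector_counts))
    # Proposal q(i) proportional to the detector count of image i.
    # Assumption: any image with a nonzero true count has a nonzero detector count.
    probs = [g / total_g for g in detector_counts]
    # Sample image indices with replacement from the proposal.
    idx = rng.choices(range(len(detector_counts)), weights=probs, k=n_samples)
    # Each reweighted screened count f(i) / q(i) is an unbiased estimate of the total.
    weighted = [screen(i) / probs[i] for i in idx]
    mean = sum(weighted) / n_samples
    var = sum((w - mean) ** 2 for w in weighted) / (n_samples - 1)
    # Normal-approximation 95% interval (an assumption, not taken from the paper).
    half_width = 1.96 * math.sqrt(var / n_samples)
    return mean, (mean - half_width, mean + half_width)


# Toy data: the detector undercounts every image by exactly half.
rng = random.Random(0)
detector_counts = [3, 0, 5, 2]
true_counts = [6, 0, 10, 4]
estimate, interval = discount_style_estimate(
    detector_counts, lambda i: true_counts[i], 50, rng
)
```

In this toy case the detector counts are exactly proportional to the true counts, so every reweighted sample equals the true total of 20 (up to floating-point rounding) and the interval collapses. With a less accurate detector the samples vary, and the interval width indicates when screening can stop.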
Related papers
- Unsupervised Few-Shot Continual Learning for Remote Sensing Image Scene Classification [14.758282519523744]
Proposes an unsupervised flat-wide learning approach (UNISA) for unsupervised few-shot continual learning in remote sensing image scene classification.
Our numerical study with remote sensing image scene datasets and a hyperspectral dataset confirms the advantages of our solution.
arXiv Detail & Related papers (2024-06-04T03:06:41Z)
- DAVE -- A Detect-and-Verify Paradigm for Low-Shot Counting [10.461109095311546]
Low-shot counters estimate the number of objects corresponding to a selected category, based on only a few or no exemplars in the image.
The current state of the art estimates the total count as the sum over the object location density map, but does not provide individual object locations and sizes.
We propose DAVE, a low-shot counter based on a detect-and-verify paradigm, that avoids the aforementioned issues by first generating a high-recall detection set and then verifying the detections to identify and remove the outliers.
arXiv Detail & Related papers (2024-04-25T14:07:52Z)
- Graph Spatiotemporal Process for Multivariate Time Series Anomaly Detection with Missing Values [67.76168547245237]
We introduce a novel framework called GST-Pro, which utilizes a graph spatiotemporal process and an anomaly scorer to detect anomalies.
Our experimental results show that the GST-Pro method can effectively detect anomalies in time series data and outperforms state-of-the-art methods.
arXiv Detail & Related papers (2024-01-11T10:10:16Z)
- MMNet: Multi-Collaboration and Multi-Supervision Network for Sequential Deepfake Detection [81.59191603867586]
Sequential deepfake detection aims to identify forged facial regions with the correct sequence for recovery.
The recovery of forged images requires knowledge of the manipulation model to implement inverse transformations.
We propose Multi-Collaboration and Multi-Supervision Network (MMNet) that handles various spatial scales and sequential permutations in forged face images.
arXiv Detail & Related papers (2023-07-06T02:32:08Z)
- ReDFeat: Recoupling Detection and Description for Multimodal Feature Learning [51.07496081296863]
We recouple independent constraints of detection and description of multimodal feature learning with a mutual weighting strategy.
We propose a detector that possesses a large receptive field and is equipped with learnable non-maximum suppression layers.
We build a benchmark that contains cross visible, infrared, near-infrared and synthetic aperture radar image pairs for evaluating the performance of features in feature matching and image registration tasks.
arXiv Detail & Related papers (2022-05-16T04:24:22Z)
- Revisiting Consistency Regularization for Semi-supervised Change Detection in Remote Sensing Images [60.89777029184023]
We propose a semi-supervised CD model in which we formulate an unsupervised CD loss in addition to the supervised Cross-Entropy (CE) loss.
Experiments conducted on two publicly available CD datasets show that the proposed semi-supervised CD method can reach closer to the performance of supervised CD.
arXiv Detail & Related papers (2022-04-18T17:59:01Z)
- IS-COUNT: Large-scale Object Counting from Satellite Images with Covariate-based Importance Sampling [90.97859312029615]
We propose an approach to estimate object count statistics over large geographies through sampling.
We show empirically that the proposed framework achieves strong performance on estimating the number of buildings in the United States and Africa, cars in Kenya, brick kilns in Bangladesh, and swimming pools in the U.S.
arXiv Detail & Related papers (2021-12-16T18:59:29Z)
- You Better Look Twice: a new perspective for designing accurate detectors with reduced computations [56.34005280792013]
BLT-net is a new low-computation two-stage object detection architecture.
It reduces computation by separating objects from background using a very lightweight first stage.
The resulting image proposals are then processed in the second stage by a highly accurate model.
arXiv Detail & Related papers (2021-07-21T12:39:51Z)
- Filtering Empty Camera Trap Images in Embedded Systems [0.0]
We present a comparative study on animal recognition models to analyze the trade-off between precision and inference latency on edge devices.
The experiments show that, when using the same set of images for training, detectors achieve superior performance.
Considering the high cost of generating labels for the detection problem, when there is a massive number of images labeled for classification, classifiers are able to reach results comparable to detectors but with half the latency.
arXiv Detail & Related papers (2021-04-18T13:56:22Z)
- Localizing Grouped Instances for Efficient Detection in Low-Resource Scenarios [27.920304852537534]
We propose a novel flexible detection scheme that efficiently adapts to variable object sizes and densities.
We rely on a sequence of detection stages, each of which has the ability to predict groups of objects as well as individuals.
We report experimental results on two aerial image datasets, and show that the proposed method is as accurate as, yet computationally more efficient than, standard single-shot detectors.
arXiv Detail & Related papers (2020-04-27T07:56:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.