DISCount: Counting in Large Image Collections with Detector-Based
Importance Sampling
- URL: http://arxiv.org/abs/2306.03151v1
- Date: Mon, 5 Jun 2023 18:04:57 GMT
- Authors: Gustavo Perez, Subhransu Maji, Daniel Sheldon
- Abstract summary: DISCount is a detector-based importance sampling framework for counting in large image collections.
It integrates an imperfect detector with human-in-the-loop screening to produce unbiased estimates of counts.
- Score: 32.522579550452484
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Many modern applications use computer vision to detect and count objects in
massive image collections. However, when the detection task is very difficult
or in the presence of domain shifts, the counts may be inaccurate even with
significant investments in training data and model development. We propose
DISCount -- a detector-based importance sampling framework for counting in
large image collections that integrates an imperfect detector with
human-in-the-loop screening to produce unbiased estimates of counts. We propose
techniques for solving counting problems over multiple spatial or temporal
regions using a small number of screened samples and estimate confidence
intervals. This enables end-users to stop screening when estimates are
sufficiently accurate, which is often the goal in a scientific study. On the
technical side we develop variance reduction techniques based on control
variates and prove the (conditional) unbiasedness of the estimators. DISCount
leads to a 9-12x reduction in the labeling costs over naive screening for tasks
we consider, such as counting birds in radar imagery or estimating damaged
buildings in satellite imagery, and also surpasses alternative covariate-based
screening approaches in efficiency.
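The abstract's core idea, sampling images in proportion to an imperfect detector's counts and reweighting the human-screened counts, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `importance_sampling_count`, the `screen` callback standing in for human screening, the `eps` smoothing term, and the normal-approximation 95% interval are all assumptions made for illustration, and the control-variate variance reduction mentioned in the abstract is omitted.

```python
import math
import random


def importance_sampling_count(detector_counts, screen, n_samples, eps=1e-3, seed=0):
    """Estimate the total object count over an image collection.

    detector_counts: per-image counts from an imperfect detector.
    screen: hypothetical callable; screen(i) returns the human-verified
        count for image i (stands in for human-in-the-loop screening).
    eps: assumed smoothing so images with zero detections can still be drawn.
    Returns (estimate, (ci_low, ci_high)) with a normal-approximation 95% CI.
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    weights = [g + eps for g in detector_counts]
    z = sum(weights)
    if z <= 0:
        raise ValueError("proposal weights must have a positive sum")
    # Proposal distribution q(i), proportional to the smoothed detector count.
    q = [w / z for w in weights]

    rng = random.Random(seed)
    # Draw images to screen, with replacement, according to q.
    sampled = rng.choices(range(len(q)), weights=q, k=n_samples)

    # Each screened image contributes true_count / q(i); the mean of these
    # ratios is an unbiased estimate of the total whenever q(i) > 0
    # for every image whose true count is positive.
    ratios = [screen(i) / q[i] for i in sampled]
    estimate = sum(ratios) / n_samples

    if n_samples > 1:
        var = sum((r - estimate) ** 2 for r in ratios) / (n_samples - 1)
        half_width = 1.96 * math.sqrt(var / n_samples)
    else:
        half_width = float("inf")
    return estimate, (estimate - half_width, estimate + half_width)


if __name__ == "__main__":
    # Toy data (made up for illustration): true counts vs. detector counts.
    true_counts = [0, 2, 5, 1, 0, 7, 3, 4]
    detected = [1, 2, 4, 2, 0, 6, 3, 5]
    est, ci = importance_sampling_count(
        detected, lambda i: true_counts[i], n_samples=200, eps=0.5
    )
    print(f"estimated total: {est:.2f}, 95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
```

In this sketch the interval narrows as more images are screened, which mirrors the abstract's point that users can stop screening once the estimate is accurate enough. When the detector counts are exactly proportional to the true counts, every ratio equals the true total and the variance is zero.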
Related papers
- Uncertainty Estimation for 3D Object Detection via Evidential Learning [63.61283174146648]
We introduce a framework for quantifying uncertainty in 3D object detection by leveraging an evidential learning loss on Bird's Eye View representations in the 3D detector.
We demonstrate both the efficacy and importance of these uncertainty estimates on identifying out-of-distribution scenes, poorly localized objects, and missing (false negative) detections.
arXiv Detail & Related papers (2024-10-31T13:13:32Z)
- A Novel Unified Architecture for Low-Shot Counting by Detection and Segmentation [10.461109095311546]
Low-shot object counters estimate the number of objects in an image using few or no annotated exemplars.
The existing approaches often lead to overgeneralization and false positive detections.
We introduce GeCo, a novel low-shot counter that achieves accurate object detection, segmentation, and count estimation.
arXiv Detail & Related papers (2024-09-27T12:20:29Z)
- A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection [52.228708947607636]
This paper introduces a comprehensive visual anomaly detection benchmark, ADer, which is a modular framework for new methods.
The benchmark includes multiple datasets from industrial and medical domains, implementing fifteen state-of-the-art methods and nine comprehensive metrics.
We objectively reveal the strengths and weaknesses of different methods and provide insights into the challenges and future directions of multi-class visual anomaly detection.
arXiv Detail & Related papers (2024-06-05T13:40:07Z)
- DAVE -- A Detect-and-Verify Paradigm for Low-Shot Counting [10.461109095311546]
Low-shot counters estimate the number of objects corresponding to a selected category, based on only few or no exemplars in the image.
Current state-of-the-art estimates the total counts as the sum over the object location density map, but does not provide individual object locations and sizes.
We propose DAVE, a low-shot counter based on a detect-and-verify paradigm, that avoids the aforementioned issues by first generating a high-recall detection set and then verifying the detections to identify and remove the outliers.
arXiv Detail & Related papers (2024-04-25T14:07:52Z)
- Graph Spatiotemporal Process for Multivariate Time Series Anomaly Detection with Missing Values [67.76168547245237]
We introduce a novel framework called GST-Pro, which utilizes a graph spatiotemporal process and an anomaly scorer to detect anomalies.
Our experimental results show that the GST-Pro method can effectively detect anomalies in time series data and outperforms state-of-the-art methods.
arXiv Detail & Related papers (2024-01-11T10:10:16Z)
- MMNet: Multi-Collaboration and Multi-Supervision Network for Sequential Deepfake Detection [81.59191603867586]
Sequential deepfake detection aims to identify forged facial regions with the correct sequence for recovery.
The recovery of forged images requires knowledge of the manipulation model to implement inverse transformations.
We propose Multi-Collaboration and Multi-Supervision Network (MMNet) that handles various spatial scales and sequential permutations in forged face images.
arXiv Detail & Related papers (2023-07-06T02:32:08Z)
- Revisiting Consistency Regularization for Semi-supervised Change Detection in Remote Sensing Images [60.89777029184023]
We propose a semi-supervised CD model in which we formulate an unsupervised CD loss in addition to the supervised Cross-Entropy (CE) loss.
Experiments conducted on two publicly available CD datasets show that the proposed semi-supervised CD method can reach closer to the performance of supervised CD.
arXiv Detail & Related papers (2022-04-18T17:59:01Z)
- IS-COUNT: Large-scale Object Counting from Satellite Images with Covariate-based Importance Sampling [90.97859312029615]
We propose an approach to estimate object count statistics over large geographies through sampling.
We show empirically that the proposed framework achieves strong performance on estimating the number of buildings in the United States and Africa, cars in Kenya, brick kilns in Bangladesh, and swimming pools in the U.S.
arXiv Detail & Related papers (2021-12-16T18:59:29Z)
- Filtering Empty Camera Trap Images in Embedded Systems [0.0]
We present a comparative study on animal recognition models to analyze the trade-off between precision and inference latency on edge devices.
The experiments show that, when using the same set of images for training, detectors achieve superior performance.
Considering the high cost of generating labels for the detection problem, when a massive number of images is labeled for classification, classifiers can reach results comparable to detectors at half the latency.
arXiv Detail & Related papers (2021-04-18T13:56:22Z)
- Localizing Grouped Instances for Efficient Detection in Low-Resource Scenarios [27.920304852537534]
We propose a novel flexible detection scheme that efficiently adapts to variable object sizes and densities.
We rely on a sequence of detection stages, each of which has the ability to predict groups of objects as well as individuals.
We report experimental results on two aerial image datasets, and show that the proposed method is as accurate as, yet computationally more efficient than, standard single-shot detectors.
arXiv Detail & Related papers (2020-04-27T07:56:53Z)