Shift Equivariance in Object Detection
- URL: http://arxiv.org/abs/2008.05787v1
- Date: Thu, 13 Aug 2020 10:02:02 GMT
- Title: Shift Equivariance in Object Detection
- Authors: Marco Manfredi and Yu Wang
- Abstract summary: Recent works have shown that CNN-based classifiers are not shift invariant.
It is unclear to what extent this could impact object detection, mainly because of the architectural differences between the two and the dimensionality of the prediction space of modern detectors.
We propose an evaluation metric, built upon a greedy search of the lower and upper bounds of the mean average precision on a shifted image set.
- Score: 8.03777903218606
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Robustness to small image translations is a highly desirable property for
object detectors. However, recent works have shown that CNN-based classifiers
are not shift invariant. It is unclear to what extent this could impact object
detection, mainly because of the architectural differences between the two and
the dimensionality of the prediction space of modern detectors. To assess shift
equivariance of object detection models end-to-end, in this paper we propose an
evaluation metric, built upon a greedy search of the lower and upper bounds of
the mean average precision on a shifted image set. Our new metric shows that
modern object detection architectures, whether one-stage or two-stage,
anchor-based or anchor-free, are sensitive to even a one-pixel shift of the input
images. Furthermore, we investigate several possible solutions to this problem,
both taken from the literature and newly proposed, quantifying the
effectiveness of each one with the suggested metric. Our results indicate that
none of these methods can provide full shift equivariance. Measuring and
analyzing the extent of shift variance of different models and the
contributions of possible factors is a first step towards being able to devise
methods that mitigate or even leverage such variabilities.
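The abstract does not spell out the search procedure, so the following is only a minimal sketch of how a greedy search for lower and upper bounds of average precision over a shifted image set could look. It assumes a single object class, a simplified non-interpolated AP, and detections already matched to ground truth. All names (`average_precision`, `greedy_bound`, the toy data) are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

Detection = Tuple[float, bool]   # (confidence, is_true_positive)
Shift = Tuple[int, int]          # (dx, dy) in pixels


def average_precision(dets: List[Detection], num_gt: int) -> float:
    """Simplified, non-interpolated AP for one class over pooled detections."""
    if num_gt == 0:
        return 0.0
    ranked = sorted(dets, key=lambda d: d[0], reverse=True)
    true_positives = 0
    precision_sum = 0.0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            true_positives += 1
            precision_sum += true_positives / rank
    return precision_sum / num_gt


def greedy_bound(per_image: List[Dict[Shift, List[Detection]]],
                 num_gt: int, maximize: bool):
    """Pick one shift per image, one image at a time, to push AP up or down.

    This is a coordinate-wise greedy pass (an assumption of this sketch),
    so it yields an estimate of the bound, not a guaranteed optimum.
    """
    # Start from the first listed shift of every image.
    choice = [next(iter(shifts)) for shifts in per_image]

    def score(selection: List[Shift]) -> float:
        pooled = [d for img, s in zip(per_image, selection) for d in img[s]]
        return average_precision(pooled, num_gt)

    best = score(choice)
    for i, shifts in enumerate(per_image):
        for s in shifts:
            trial = choice[:i] + [s] + choice[i + 1:]
            value = score(trial)
            improved = value > best if maximize else value < best
            if improved:
                best, choice = value, trial
    return best, choice


# Toy data: two images, each evaluated unshifted and shifted by one pixel.
toy = [
    {(0, 0): [(0.9, True)], (1, 0): [(0.9, False)]},
    {(0, 0): [(0.8, True)], (1, 0): [(0.8, True), (0.95, False)]},
]
lower, lower_shifts = greedy_bound(toy, num_gt=2, maximize=False)
upper, upper_shifts = greedy_bound(toy, num_gt=2, maximize=True)
print(f"AP lower bound: {lower:.3f}, upper bound: {upper:.3f}")
```

The gap between the two values gives a rough sense of how much the score can move purely because of which shifted version of each image is evaluated. A fully shift-equivariant detector would make the two bounds coincide.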
Related papers
- A Novel Unified Architecture for Low-Shot Counting by Detection and Segmentation [10.461109095311546]
Low-shot object counters estimate the number of objects in an image using few or no annotated exemplars.
The existing approaches often lead to overgeneralization and false positive detections.
We introduce GeCo, a novel low-shot counter that achieves accurate object detection, segmentation, and count estimation.
arXiv Detail & Related papers (2024-09-27T12:20:29Z)
- Unsupervised Object Detection with Theoretical Guarantees [15.779730667509915]
We develop an unsupervised object detection architecture and prove that the learned variables correspond to the true object positions up to small shifts.
We validate our theoretical predictions up to a precision of individual pixels.
Unlike current SOTA object detection methods, our method's prediction errors always lie within our theoretical bounds.
arXiv Detail & Related papers (2024-06-11T14:12:31Z)
- Size-invariance Matters: Rethinking Metrics and Losses for Imbalanced Multi-object Salient Object Detection [133.66006666465447]
Current metrics are size-sensitive: larger objects dominate the evaluation, and smaller ones tend to be ignored.
We argue that the evaluation should be size-invariant because bias based on size is unjustified without additional semantic information.
We develop an optimization framework tailored to this goal, achieving considerable improvements in detecting objects of different sizes.
arXiv Detail & Related papers (2024-05-16T03:01:06Z)
- Fast and Accurate Object Detection on Asymmetrical Receptive Field [0.0]
This article proposes methods for improving object detection accuracy from the perspective of changing receptive fields.
The structure of the head part of YOLOv5 is modified by adding asymmetrical pooling layers.
The performance of the new model is compared with that of the original YOLOv5 model and analyzed with respect to several parameters.
arXiv Detail & Related papers (2023-03-15T23:59:18Z)
- Learning Transformations To Reduce the Geometric Shift in Object Detection [60.20931827772482]
We tackle geometric shifts emerging from variations in the image capture process.
We introduce a self-training approach that learns a set of geometric transformations to minimize these shifts.
We evaluate our method on two different shifts, i.e., a camera's field of view (FoV) change and a viewpoint change.
arXiv Detail & Related papers (2023-01-13T11:55:30Z)
- Suitability of Different Metric Choices for Concept Drift Detection [9.76294323004155]
Many unsupervised approaches for drift detection rely on measuring the discrepancy between the samples of two time windows.
Most drift detection methods can be distinguished by which metric they use, how this metric is estimated, and how the decision threshold is found.
We compare different types of estimators and metrics theoretically and empirically and investigate the relevance of the single metric components.
arXiv Detail & Related papers (2022-02-19T01:11:32Z)
- Meta Learning Low Rank Covariance Factors for Energy-Based Deterministic Uncertainty [58.144520501201995]
Bi-Lipschitz regularization of neural network layers preserves relative distances between data instances in the feature spaces of each layer.
With the use of an attentive set encoder, we propose to meta learn either diagonal or diagonal plus low-rank factors to efficiently construct task specific covariance matrices.
We also propose an inference procedure which utilizes scaled energy to achieve a final predictive distribution.
arXiv Detail & Related papers (2021-10-12T22:04:19Z)
- Decoupled Adaptation for Cross-Domain Object Detection [69.5852335091519]
Cross-domain object detection is more challenging than object classification.
D-adapt achieves state-of-the-art results on four cross-domain object detection tasks.
arXiv Detail & Related papers (2021-10-06T08:43:59Z)
- Slender Object Detection: Diagnoses and Improvements [74.40792217534]
In this paper, we are concerned with the detection of a particular type of object with extreme aspect ratios, namely slender objects.
For a classical object detection method, a drastic drop of 18.9% mAP on COCO is observed when it is evaluated solely on slender objects.
arXiv Detail & Related papers (2020-11-17T09:39:42Z)
- Multi-scale Interactive Network for Salient Object Detection [91.43066633305662]
We propose the aggregate interaction modules to integrate the features from adjacent levels.
To obtain more efficient multi-scale features, the self-interaction modules are embedded in each decoder unit.
Experimental results on five benchmark datasets demonstrate that the proposed method without any post-processing performs favorably against 23 state-of-the-art approaches.
arXiv Detail & Related papers (2020-07-17T15:41:37Z)