Query-based Hard-Image Retrieval for Object Detection at Test Time
- URL: http://arxiv.org/abs/2209.11559v2
- Date: Thu, 29 Jun 2023 11:55:58 GMT
- Authors: Edward Ayers, Jonathan Sadeghi, John Redford, Romain Mueller, Puneet K. Dokania
- Abstract summary: We reformulate the problem of finding "hard" images as a query-based hard image retrieval task.
Our method is entirely post-hoc, does not require ground-truth annotations, and relies on an efficient Monte Carlo estimation.
We provide results on ranking and classification tasks using the widely used RetinaNet, Faster-RCNN, Mask-RCNN, and Cascade Mask-RCNN object detectors.
- Score: 10.63460618121976
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There is a longstanding interest in capturing the error behaviour of object
detectors by finding images where their performance is likely to be
unsatisfactory. In real-world applications such as autonomous driving, it is
also crucial to characterise potential failures beyond simple requirements of
detection performance. For example, a missed detection of a pedestrian close to
an ego vehicle will generally require closer inspection than a missed detection
of a car in the distance. The problem of predicting such potential failures at
test time has largely been overlooked in the literature and conventional
approaches based on detection uncertainty fall short in that they are agnostic
to such fine-grained characterisation of errors. In this work, we propose to
reformulate the problem of finding "hard" images as a query-based hard image
retrieval task, where queries are specific definitions of "hardness", and offer
a simple and intuitive method that can solve this task for a large family of
queries. Our method is entirely post-hoc, does not require ground-truth
annotations, is independent of the choice of a detector, and relies on an
efficient Monte Carlo estimation that uses a simple stochastic model in place
of the ground-truth. We show experimentally that it can be applied successfully
to a wide variety of queries for which it can reliably identify hard images for
a given detector without any labelled data. We provide results on ranking and
classification tasks using the widely used RetinaNet, Faster-RCNN, Mask-RCNN,
and Cascade Mask-RCNN object detectors. The code for this project is available
at https://github.com/fiveai/hardest.
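The abstract's idea of replacing the ground truth with a simple stochastic model and estimating a query by Monte Carlo can be sketched as follows. This is a minimal illustration and not the code from the repository above. It assumes that each detection's confidence score is the probability that the detection corresponds to a real object. The function names (`estimate_hardness`, `count_false_negatives`, `count_false_positives`) and the confidence threshold are hypothetical and chosen only for this example.

```python
import random


def count_false_negatives(real, kept):
    """Query: number of sampled real objects that the detector discarded."""
    return sum(r and not k for r, k in zip(real, kept))


def count_false_positives(real, kept):
    """Query: number of kept detections that are not real in the sample."""
    return sum(k and not r for r, k in zip(real, kept))


def estimate_hardness(scores, query, threshold=0.5, n_samples=1000, seed=0):
    """Monte Carlo estimate of the expected value of `query` for one image.

    scores: detection confidences in [0, 1] (assumed to act as probabilities).
    query: function (real, kept) -> number, one definition of "hardness".
    """
    rng = random.Random(seed)
    # Detections the detector would output at this confidence threshold.
    kept = [s >= threshold for s in scores]
    total = 0.0
    for _ in range(n_samples):
        # Sample a pseudo ground truth: each detection is real with
        # probability equal to its confidence (illustrative assumption).
        real = [rng.random() < s for s in scores]
        total += query(real, kept)
    return total / n_samples


# Rank two hypothetical images by expected number of missed objects.
images = {"confident": [0.9, 0.95], "uncertain": [0.4, 0.45, 0.3]}
ranking = sorted(
    images,
    key=lambda name: estimate_hardness(images[name], count_false_negatives),
    reverse=True,
)
print(ranking)
```

Swapping `count_false_negatives` for another query function changes the definition of "hard" without touching the sampling loop, which is the appeal of the query-based formulation. No labelled data is used at any point in this sketch.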
Related papers
- Open-Set Semantic Uncertainty Aware Metric-Semantic Graph Matching [10.439907158831303]
A metric of semantic uncertainty for open-set object detections is calculated and incorporated into an object-level uncertainty tracking framework.
The proposed methods are feasible for real-time use in marine environments for the robust, open-set, multi-object, semantic-uncertainty-aware loop closure detection.
arXiv Detail & Related papers (2024-09-17T20:53:47Z)
- Towards Building Self-Aware Object Detectors via Reliable Uncertainty Quantification and Calibration [17.461451218469062]
In this work, we introduce the Self-Aware Object Detection (SAOD) task.
The SAOD task respects and adheres to the challenges that object detectors face in safety-critical environments such as autonomous driving.
We extensively use our framework, which introduces novel metrics and large scale test datasets, to test numerous object detectors.
arXiv Detail & Related papers (2023-07-03T11:16:39Z)
- Semi-Supervised and Long-Tailed Object Detection with CascadeMatch [91.86787064083012]
We propose a novel pseudo-labeling-based detector called CascadeMatch.
Our detector features a cascade network architecture, which has multi-stage detection heads with progressive confidence thresholds.
We show that CascadeMatch surpasses existing state-of-the-art semi-supervised approaches in handling long-tailed object detection.
arXiv Detail & Related papers (2023-05-24T07:09:25Z)
- SalienDet: A Saliency-based Feature Enhancement Algorithm for Object Detection for Autonomous Driving [160.57870373052577]
We propose a saliency-based OD algorithm (SalienDet) to detect unknown objects.
Our SalienDet utilizes a saliency-based algorithm to enhance image features for object proposal generation.
We design a dataset relabeling approach to differentiate the unknown objects from all objects in the training sample set to achieve Open-World Detection.
arXiv Detail & Related papers (2023-05-11T16:19:44Z)
- Generalized Few-Shot 3D Object Detection of LiDAR Point Cloud for Autonomous Driving [91.39625612027386]
We propose a novel task, called generalized few-shot 3D object detection, where we have a large amount of training data for common (base) objects, but only a few samples for rare (novel) classes.
Specifically, we analyze in-depth differences between images and point clouds, and then present a practical principle for the few-shot setting in the 3D LiDAR dataset.
To solve this task, we propose an incremental fine-tuning method to extend existing 3D detection models to recognize both common and rare objects.
arXiv Detail & Related papers (2023-02-08T07:11:36Z)
- Towards Hard-Positive Query Mining for DETR-based Human-Object Interaction Detection [20.809479387186506]
Human-Object Interaction (HOI) detection is a core task for high-level image understanding.
In this paper, we propose to enhance Detection Transformer (DETR)-based HOI detectors by mining hard-positive queries.
Experimental results show that our proposed approach can be widely applied to existing DETR-based HOI detectors.
arXiv Detail & Related papers (2022-07-12T04:03:12Z)
- Reference-based Defect Detection Network [57.89399576743665]
The first issue is texture shift, which means a trained defect detector model will be easily affected by unseen textures.
The second issue is partial visual confusion, which indicates that a partial defect box is visually similar to a complete box.
We propose a Reference-based Defect Detection Network (RDDN) to tackle these two problems.
arXiv Detail & Related papers (2021-08-10T05:44:23Z)
- Dynamic Relevance Learning for Few-Shot Object Detection [6.550840743803705]
We propose a dynamic relevance learning model, which utilizes the relationship between all support images and Region of Interest (RoI) on the query images to construct a dynamic graph convolutional network (GCN).
The proposed model achieves the best overall performance, which shows its effectiveness in learning more generalized features.
arXiv Detail & Related papers (2021-08-04T18:29:42Z)
- AutoOD: Automated Outlier Detection via Curiosity-guided Search and Self-imitation Learning [72.99415402575886]
Outlier detection is an important data mining task with numerous practical applications.
We propose AutoOD, an automated outlier detection framework, which aims to search for an optimal neural network model.
Experimental results on various real-world benchmark datasets demonstrate that the deep model identified by AutoOD achieves the best performance.
arXiv Detail & Related papers (2020-06-19T18:57:51Z)
- FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking [92.48078680697311]
Multi-object tracking (MOT) is an important problem in computer vision.
We present a simple yet effective approach termed FairMOT, based on the anchor-free object detection architecture CenterNet.
The approach achieves high accuracy for both detection and tracking.
arXiv Detail & Related papers (2020-04-04T08:18:00Z)
- Anomaly Detection by One Class Latent Regularized Networks [36.67420338535258]
Semi-supervised Generative Adversarial Network (GAN)-based methods have recently been gaining popularity in anomaly detection tasks.
A novel adversarial dual autoencoder network is proposed, in which the underlying structure of training data is captured in latent feature space.
Experiments show that our model achieves state-of-the-art results on the MNIST and CIFAR10 datasets as well as the GTSRB stop signs dataset.
arXiv Detail & Related papers (2020-02-05T02:21:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.