Correlation of Object Detection Performance with Visual Saliency and Depth Estimation
- URL: http://arxiv.org/abs/2411.02844v1
- Date: Tue, 05 Nov 2024 06:34:19 GMT
- Title: Correlation of Object Detection Performance with Visual Saliency and Depth Estimation
- Authors: Matthias Bartolo, Dylan Seychell,
- Abstract summary: This paper investigates the correlations between object detection accuracy and two fundamental visual tasks: depth prediction and visual saliency prediction.
Our analysis reveals significant variations in these correlations across object categories, with larger objects showing correlation values up to three times higher than smaller objects.
These findings suggest incorporating visual saliency features into object detection architectures could be more beneficial than depth information.
- Score: 0.09208007322096533
- License:
- Abstract: As object detection techniques continue to evolve, understanding their relationships with complementary visual tasks becomes crucial for optimising model architectures and computational resources. This paper investigates the correlations between object detection accuracy and two fundamental visual tasks: depth prediction and visual saliency prediction. Through comprehensive experiments using state-of-the-art models (DeepGaze IIE, Depth Anything, DPT-Large, and Itti's model) on COCO and Pascal VOC datasets, we find that visual saliency shows consistently stronger correlations with object detection accuracy (mA$\rho$ up to 0.459 on Pascal VOC) compared to depth prediction (mA$\rho$ up to 0.283). Our analysis reveals significant variations in these correlations across object categories, with larger objects showing correlation values up to three times higher than smaller objects. These findings suggest incorporating visual saliency features into object detection architectures could be more beneficial than depth information, particularly for specific object categories. The observed category-specific variations also provide insights for targeted feature engineering and dataset design improvements, potentially leading to more efficient and accurate object detection systems.
Related papers
- Innovative Horizons in Aerial Imagery: LSKNet Meets DiffusionDet for
Advanced Object Detection [55.2480439325792]
We present an in-depth evaluation of an object detection model that integrates the LSKNet backbone with the DiffusionDet head.
The proposed model achieves a mean average precision (MAP) of approximately 45.7%, which is a significant improvement.
This advancement underscores the effectiveness of the proposed modifications and sets a new benchmark in aerial image analysis.
arXiv Detail & Related papers (2023-11-21T19:49:13Z) - On the Importance of Large Objects in CNN Based Object Detection
Algorithms [0.0]
We highlight the importance of large objects in learning features that are critical for all sizes.
We show that giving more weight to large objects leads to improved detection scores across all object sizes.
arXiv Detail & Related papers (2023-11-20T12:32:32Z) - The Impact of Different Backbone Architecture on Autonomous Vehicle
Dataset [120.08736654413637]
The quality of the features extracted by the backbone architecture can have a significant impact on the overall detection performance.
Our study evaluates three well-known autonomous vehicle datasets, namely KITTI, NuScenes, and BDD, to compare the performance of different backbone architectures on object detection tasks.
arXiv Detail & Related papers (2023-09-15T17:32:15Z) - Exploring Predicate Visual Context in Detecting Human-Object
Interactions [44.937383506126274]
We study how best to re-introduce image features via cross-attention.
Our model with enhanced predicate visual context (PViC) outperforms state-of-the-art methods on the HICO-DET and V-COCO benchmarks.
arXiv Detail & Related papers (2023-08-11T15:57:45Z) - Unified Visual Relationship Detection with Vision and Language Models [89.77838890788638]
This work focuses on training a single visual relationship detector predicting over the union of label spaces from multiple datasets.
We propose UniVRD, a novel bottom-up method for Unified Visual Relationship Detection by leveraging vision and language models.
Empirical results on both human-object interaction detection and scene-graph generation demonstrate the competitive performance of our model.
arXiv Detail & Related papers (2023-03-16T00:06:28Z) - Object Detection in Aerial Images with Uncertainty-Aware Graph Network [61.02591506040606]
We propose a novel uncertainty-aware object detection framework with a structured-graph, where nodes and edges are denoted by objects.
We refer to our model as Uncertainty-Aware Graph network for object DETection (UAGDet)
arXiv Detail & Related papers (2022-08-23T07:29:03Z) - Knowledge Distillation for Oriented Object Detection on Aerial Images [1.827510863075184]
We present a model compression method for rotated object detection on aerial images by knowledge distillation, namely KD-RNet.
The experimental result on a large-scale aerial object detection dataset (DOTA) demonstrates that the proposed KD-RNet model can achieve improved mean-average precision (mAP) with reduced number of parameters, at the same time, KD-RNet boost the performance on providing high quality detections with higher overlap with groundtruth annotations.
arXiv Detail & Related papers (2022-06-20T14:24:16Z) - Comprehensive Analysis of the Object Detection Pipeline on UAVs [16.071349046409885]
We first empirically analyze the influence of seven parameters (quantization, compression, resolution, color model, image distortion, gamma correction, additional channels) in remote sensing applications.
We show that not all parameters have an equal impact on detection accuracy and data throughput, and that by using a suitable compromise between parameters we are able to improve detection accuracy for lightweight object detection models.
arXiv Detail & Related papers (2022-03-01T09:30:01Z) - Contrastive Object Detection Using Knowledge Graph Embeddings [72.17159795485915]
We compare the error statistics of the class embeddings learned from a one-hot approach with semantically structured embeddings from natural language processing or knowledge graphs.
We propose a knowledge-embedded design for keypoint-based and transformer-based object detection architectures.
arXiv Detail & Related papers (2021-12-21T17:10:21Z) - Salient Objects in Clutter [130.63976772770368]
This paper identifies and addresses a serious design bias of existing salient object detection (SOD) datasets.
This design bias has led to a saturation in performance for state-of-the-art SOD models when evaluated on existing datasets.
We propose a new high-quality dataset and update the previous saliency benchmark.
arXiv Detail & Related papers (2021-05-07T03:49:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.