Enhancing Evaluation Methods for Infrared Small-Target Detection in Real-world Scenarios
- URL: http://arxiv.org/abs/2301.03796v2
- Date: Sun, 25 Aug 2024 21:45:24 GMT
- Title: Enhancing Evaluation Methods for Infrared Small-Target Detection in Real-world Scenarios
- Authors: Saed Moradi, Alireza Memarmoghadam, Payman Moallem, Mohamad Farzan Sabahi,
- Abstract summary: Infrared small target detection (IRSTD) poses a significant challenge in the field of computer vision.
There has been a lack of extensive investigation into the evaluation metrics used for assessing their performance.
We employ a systematic approach to address this issue by first evaluating the effectiveness of existing metrics and then proposing new metrics to overcome the limitations of conventional ones.
- Score: 2.6723845245975064
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Infrared small target detection (IRSTD) poses a significant challenge in the field of computer vision. While substantial efforts have been made over the past two decades to improve the detection capabilities of IRSTD algorithms, there has been a lack of extensive investigation into the evaluation metrics used for assessing their performance. In this paper, we employ a systematic approach to address this issue by first evaluating the effectiveness of existing metrics and then proposing new metrics to overcome the limitations of conventional ones. To achieve this, we carefully analyze the necessary conditions for successful detection and identify the shortcomings of current evaluation metrics, including both pre-thresholding and post-thresholding metrics. We then introduce new metrics that are designed to align with the requirements of real-world systems. Furthermore, we utilize these newly proposed metrics to compare and evaluate the performance of five widely recognized small infrared target detection algorithms. The results demonstrate that the new metrics provide consistent and meaningful quantitative assessments, aligning with qualitative observations.
Related papers
- Rethinking Metrics and Benchmarks of Video Anomaly Detection [12.500876355560184]
Video Anomaly Detection (VAD) aims to detect anomalies that deviate from expectation.<n>In this paper, we rethink VAD evaluation protocols through comprehensive experimental analyses.<n>We propose three novel evaluation methods to address these limitations.
arXiv Detail & Related papers (2025-05-25T08:09:42Z) - Open-set object detection: towards unified problem formulation and benchmarking [2.4374097382908477]
We introduce two benchmarks: a unified VOC-COCO evaluation, and the new OpenImagesRoad benchmark which provides clear hierarchical object definition besides new evaluation metrics.
State-of-the-art methods are extensively evaluated on the proposed benchmarks.
This study provides a clear problem definition, ensures consistent evaluations, and draws new conclusions about effectiveness of OSOD strategies.
arXiv Detail & Related papers (2024-11-08T13:40:01Z) - Dissecting Out-of-Distribution Detection and Open-Set Recognition: A Critical Analysis of Methods and Benchmarks [17.520137576423593]
We aim to provide a consolidated view of the two largest sub-fields within the community: out-of-distribution (OOD) detection and open-set recognition (OSR)
We perform rigorous cross-evaluation between state-of-the-art methods in the OOD detection and OSR settings and identify a strong correlation between the performances of methods for them.
We propose a new, large-scale benchmark setting which we suggest better disentangles the problem tackled by OOD detection and OSR.
arXiv Detail & Related papers (2024-08-29T17:55:07Z) - Refined Infrared Small Target Detection Scheme with Single-Point Supervision [2.661766509317245]
We propose an innovative refined infrared small target detection scheme with single-point supervision.
The proposed scheme achieves state-of-the-art (SOTA) performance.
Notably, the proposed scheme won the third place in the "ICPR 2024 Resource-Limited Infrared Small Target Detection Challenge Track 1: Weakly Supervised Infrared Small Target Detection"
arXiv Detail & Related papers (2024-08-05T18:49:58Z) - Towards Effective Evaluations and Comparisons for LLM Unlearning Methods [97.2995389188179]
This paper seeks to refine the evaluation of machine unlearning for large language models.
It addresses two key challenges -- the robustness of evaluation metrics and the trade-offs between competing goals.
arXiv Detail & Related papers (2024-06-13T14:41:00Z) - A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection [52.228708947607636]
This paper introduces a comprehensive visual anomaly detection benchmark, ADer, which is a modular framework for new methods.
The benchmark includes multiple datasets from industrial and medical domains, implementing fifteen state-of-the-art methods and nine comprehensive metrics.
We objectively reveal the strengths and weaknesses of different methods and provide insights into the challenges and future directions of multi-class visual anomaly detection.
arXiv Detail & Related papers (2024-06-05T13:40:07Z) - On Pixel-level Performance Assessment in Anomaly Detection [87.7131059062292]
Anomaly detection methods have demonstrated remarkable success across various applications.
However, assessing their performance, particularly at the pixel-level, presents a complex challenge.
In this paper, we dissect the intricacies of this challenge, underscored by visual evidence and statistical analysis.
arXiv Detail & Related papers (2023-10-25T08:02:27Z) - Beyond AUROC & co. for evaluating out-of-distribution detection
performance [50.88341818412508]
Given their relevance for safe(r) AI, it is important to examine whether the basis for comparing OOD detection methods is consistent with practical needs.
We propose a new metric - Area Under the Threshold Curve (AUTC), which explicitly penalizes poor separation between ID and OOD samples.
arXiv Detail & Related papers (2023-06-26T12:51:32Z) - From Static Benchmarks to Adaptive Testing: Psychometrics in AI Evaluation [60.14902811624433]
We discuss a paradigm shift from static evaluation methods to adaptive testing.
This involves estimating the characteristics and value of each test item in the benchmark and dynamically adjusting items in real-time.
We analyze the current approaches, advantages, and underlying reasons for adopting psychometrics in AI evaluation.
arXiv Detail & Related papers (2023-06-18T09:54:33Z) - One-Stage Cascade Refinement Networks for Infrared Small Target
Detection [21.28595135499812]
Single-frame InfraRed Small Target (SIRST) detection has been a challenging task due to a lack of inherent characteristics.
We present a new research benchmark for infrared small target detection consisting of the SIRST-V2 dataset of real-world, high-resolution single-frame targets.
arXiv Detail & Related papers (2022-12-16T13:37:23Z) - GO FIGURE: A Meta Evaluation of Factuality in Summarization [131.1087461486504]
We introduce GO FIGURE, a meta-evaluation framework for evaluating factuality evaluation metrics.
Our benchmark analysis on ten factuality metrics reveals that our framework provides a robust and efficient evaluation.
It also reveals that while QA metrics generally improve over standard metrics that measure factuality across domains, performance is highly dependent on the way in which questions are generated.
arXiv Detail & Related papers (2020-10-24T08:30:20Z) - PONE: A Novel Automatic Evaluation Metric for Open-Domain Generative
Dialogue Systems [48.99561874529323]
There are three kinds of automatic methods to evaluate the open-domain generative dialogue systems.
Due to the lack of systematic comparison, it is not clear which kind of metrics are more effective.
We propose a novel and feasible learning-based metric that can significantly improve the correlation with human judgments.
arXiv Detail & Related papers (2020-04-06T04:36:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.