Fugu-MT Paper Translation (Abstract): Image Thresholding: Understanding Bias of Evaluation Metrics towards Specific Evaluation Functions

Paper overview: Image Thresholding: Understanding Bias of Evaluation Metrics towards Specific Evaluation Functions

arxiv url: http://arxiv.org/abs/2605.27132v1
Date: Tue, 26 May 2026 15:02:36 GMT
Status: Translation complete
Last updated in system: 2026-05-27 17:51:42.287496
Title: Image Thresholding: Understanding Bias of Evaluation Metrics towards Specific Evaluation Functions
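Title (reference translation): Image Thresholding: Understanding the Bias of Evaluation Metrics toward Specific Evaluation Functions
Authors: Eslam Hegazy, Mohamed Gabr
Abstract summary: Multilevel image thresholding is widely used for segmentation in applications ranging from medical imaging to remote sensing. Classical objective functions such as Otsu's between-class variance and Kapur's entropy are often optimized using metaheuristic algorithms. The paper analyzes the correlation between thresholding objective functions and quality metrics across thresholds for images in the BSDS500 dataset.
Reference score (independently computed attention score): 0.8250374560598496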
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Multilevel image thresholding is widely used for segmentation in applications ranging from medical imaging to remote sensing. Classical objective functions, such as Otsu's between-class variance and Kapur's entropy, are often optimized using metaheuristic algorithms, with performance evaluated via metrics like Structural Similarity Index (SSIM) and Peak Signal-to-Noise Ratio (PSNR). These evaluations implicitly assume that SSIM and PSNR provide unbiased measures of segmentation quality. In this study, we examine this assumption by analyzing the correlation between thresholding objective functions and quality metrics across all possible thresholds for images in the BSDS500 dataset. Results show that Otsu's criterion consistently exhibits high correlation with both SSIM and PSNR, while Kapur's entropy demonstrates weaker and more variable correlation. Otsu outperforms Kapur in correlation with PSNR for all images and with SSIM for over 91%. Our findings reveal an inherent metric-objective-function bias. This work highlights the need for more neutral evaluation frameworks and motivates extending the analysis to additional thresholding criteria and domains. Source code of this paper can be found at https://w3id.org/met-dp/icpr26-95
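Abstract (reference translation): Multilevel image thresholding is widely used for segmentation in applications ranging from medical imaging to remote sensing. Classical objective functions such as Otsu's between-class variance and Kapur's entropy are often optimized with metaheuristic algorithms and evaluated with metrics such as the Structural Similarity Index (SSIM) and Peak Signal-to-Noise Ratio (PSNR). These evaluations implicitly assume that SSIM and PSNR provide unbiased measures of segmentation quality. This study examines that assumption by analyzing the correlation between thresholding objective functions and quality metrics across all possible thresholds for images in the BSDS500 dataset. The results show that Otsu's criterion is consistently highly correlated with both SSIM and PSNR, whereas Kapur's entropy shows weaker and more variable correlation. Otsu exceeds Kapur in correlation with PSNR for all images and in correlation with SSIM for over 91% of them. The findings reveal an inherent bias between metric and objective function. The work highlights the need for more neutral evaluation frameworks and motivates extending the analysis to additional thresholding criteria and domains. The source code for the paper is available at https://w3id.org/met-dp/icpr26-95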
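The kind of analysis the abstract describes can be sketched for a single image as follows. This is a minimal illustration under stated assumptions, not the authors' code (which is at the URL above): it uses a synthetic bimodal image instead of BSDS500, handles only one threshold (bilevel), rebuilds the thresholded image by replacing each class with its mean gray level, and reports PSNR only. All function and variable names are hypothetical.

```python
import numpy as np

def histogram_probs(img):
    """Normalized 256-bin gray-level histogram of a uint8 image."""
    hist = np.bincount(img.ravel(), minlength=256).astype(float)
    return hist / hist.sum()

def otsu_between_class_variance(p):
    """Otsu's between-class variance for every threshold t (class 0: levels <= t)."""
    levels = np.arange(256)
    omega0 = np.cumsum(p)               # probability of class 0
    omega1 = 1.0 - omega0               # probability of class 1
    mu_cum = np.cumsum(p * levels)      # cumulative first moment
    mu_total = mu_cum[-1]
    valid = (omega0 > 0) & (omega1 > 0)  # both classes non-empty
    sigma_b = np.zeros(256)
    sigma_b[valid] = (mu_total * omega0[valid] - mu_cum[valid]) ** 2 / (
        omega0[valid] * omega1[valid]
    )
    return sigma_b, valid

def kapur_entropy(p):
    """Kapur's criterion: sum of the two class entropies for every threshold t."""
    out = np.zeros(256)
    for t in range(256):
        total = 0.0
        for part in (p[: t + 1], p[t + 1 :]):
            w = part.sum()
            if w > 0:
                q = part[part > 0] / w
                total -= np.sum(q * np.log(q))
        out[t] = total
    return out

def psnr_per_threshold(img, valid):
    """PSNR between the image and its two-level, class-mean reconstruction."""
    x = img.astype(float)
    out = np.full(256, np.nan)
    for t in np.flatnonzero(valid):
        mask = img <= t
        recon = np.where(mask, x[mask].mean(), x[~mask].mean())
        mse = np.mean((x - recon) ** 2)
        out[t] = 10.0 * np.log10(255.0 ** 2 / mse) if mse > 0 else np.inf
    return out

# Synthetic bimodal "image" (assumption: stands in for a BSDS500 image).
rng = np.random.default_rng(0)
pixels = np.concatenate([rng.normal(80, 15, 5000), rng.normal(170, 20, 5000)])
img = np.clip(pixels, 0, 255).astype(np.uint8).reshape(100, 100)

p = histogram_probs(img)
otsu, valid = otsu_between_class_variance(p)
kapur = kapur_entropy(p)
psnr = psnr_per_threshold(img, valid)

# Pearson correlation of each objective with PSNR over the valid thresholds.
keep = valid & np.isfinite(psnr)
r_otsu = np.corrcoef(otsu[keep], psnr[keep])[0, 1]
r_kapur = np.corrcoef(kapur[keep], psnr[keep])[0, 1]
print(f"corr(Otsu, PSNR)  = {r_otsu:.3f}")
print(f"corr(Kapur, PSNR) = {r_kapur:.3f}")
```

Under the class-mean reconstruction assumed here, the mean squared error equals the within-class variance, which is the total variance minus Otsu's between-class variance. PSNR is therefore a monotone function of Otsu's criterion in this sketch, so a high correlation is built in by construction. Whether the paper uses the same reconstruction is not stated in the abstract.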


Related papers