Fugu-MT Paper Translation (Summary): Beyond Synthetic Augmentation: Group-Aware Threshold Calibration for Robust Balanced Accuracy in Imbalanced Learning

Paper summary: Beyond Synthetic Augmentation: Group-Aware Threshold Calibration for Robust Balanced Accuracy in Imbalanced Learning

arxiv url: http://arxiv.org/abs/2509.02592v1
Date: Fri, 29 Aug 2025 05:57:17 GMT
Status: Translation complete
Last updated in system: 2025-09-04 21:40:46.234788
Title: Beyond Synthetic Augmentation: Group-Aware Threshold Calibration for Robust Balanced Accuracy in Imbalanced Learning
Authors: Hunter Gittlin
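Abstract summary: Class imbalance is a fundamental challenge in machine learning. The paper shows that group-aware threshold calibration provides a simpler, more interpretable, and more effective solution to class imbalance.
Reference score (independently computed attention score): 0.0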
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Class imbalance remains a fundamental challenge in machine learning, with traditional solutions often creating as many problems as they solve. We demonstrate that group-aware threshold calibration--setting different decision thresholds for different demographic groups--provides superior robustness compared to synthetic data generation methods. Through extensive experiments, we show that group-specific thresholds achieve 1.5-4% higher balanced accuracy than SMOTE and CT-GAN augmented models while improving worst-group balanced accuracy. Unlike single-threshold approaches that apply one cutoff across all groups, our group-aware method optimizes the Pareto frontier between balanced accuracy and worst-group balanced accuracy, enabling fine-grained control over group-level performance. Critically, we find that applying group thresholds to synthetically augmented data yields minimal additional benefit, suggesting these approaches are fundamentally redundant. Our results span seven model families including linear, tree-based, instance-based, and boosting methods, confirming that group-aware threshold calibration offers a simpler, more interpretable, and more effective solution to class imbalance.
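The abstract describes group-aware threshold calibration only at a high level, so the following is a minimal sketch of the general idea and not the author's implementation. It assumes binary labels, a model that already outputs scores in [0, 1], and a simple grid search that picks, for each group, the threshold maximizing that group's balanced accuracy. The function names (`fit_group_thresholds`, `predict`, `worst_group_balanced_accuracy`), the grid, and the toy data are all hypothetical. The paper's trade-off between overall and worst-group balanced accuracy is not modeled here.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of true-positive rate and true-negative rate for binary labels."""
    pos = sum(1 for y in y_true if y == 1)
    neg = len(y_true) - pos
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    tpr = tp / pos if pos else 0.0
    tnr = tn / neg if neg else 0.0
    return 0.5 * (tpr + tnr)


def fit_group_thresholds(scores, y_true, groups, grid=None):
    """Assumed procedure: per group, pick the grid threshold with the best balanced accuracy."""
    grid = grid or [i / 100 for i in range(1, 100)]
    by_group = defaultdict(list)
    for s, y, g in zip(scores, y_true, groups):
        by_group[g].append((s, y))
    thresholds = {}
    for g, rows in by_group.items():
        ys = [y for _, y in rows]

        def score_at(t):
            return balanced_accuracy(ys, [int(s >= t) for s, _ in rows])

        # max() keeps the first (lowest) threshold among ties.
        thresholds[g] = max(grid, key=score_at)
    return thresholds


def predict(scores, groups, thresholds):
    """Apply each example's group-specific cutoff."""
    return [int(s >= thresholds[g]) for s, g in zip(scores, groups)]


def worst_group_balanced_accuracy(y_true, y_pred, groups):
    """Minimum balanced accuracy over the groups."""
    results = []
    for g in set(groups):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        results.append(balanced_accuracy([y_true[i] for i in idx],
                                         [y_pred[i] for i in idx]))
    return min(results)


# Toy data (hypothetical): group A's scores are shifted low, group B's high.
scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.35,
          0.55, 0.60, 0.65, 0.70, 0.85, 0.90]
labels = [0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1]
groups = ["A"] * 6 + ["B"] * 6

thresholds = fit_group_thresholds(scores, labels, groups)
single = [int(s >= 0.5) for s in scores]        # one cutoff for everyone
grouped = predict(scores, groups, thresholds)   # one cutoff per group

print("thresholds:", thresholds)
print("single-threshold balanced accuracy:", balanced_accuracy(labels, single))
print("group-threshold balanced accuracy:", balanced_accuracy(labels, grouped))
print("worst-group balanced accuracy:",
      worst_group_balanced_accuracy(labels, grouped, groups))
```

For brevity the toy example fits and evaluates thresholds on the same data. In practice the thresholds would presumably be chosen on a held-out validation split and then applied unchanged to test data.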


Related papers