Fugu-MT Paper Translation (Abstract): Training Dynamics of Neural Software Defect Predictors under Coupled Data-Quality Issues

Paper abstract: Training Dynamics of Neural Software Defect Predictors under Coupled Data-Quality Issues

arxiv url: http://arxiv.org/abs/2606.24968v1
Date: Tue, 23 Jun 2026 10:08:55 GMT
Status: Translation complete
Last updated in system: 2026-06-25 17:05:30.075338
Title: Training Dynamics of Neural Software Defect Predictors under Coupled Data-Quality Issues
Authors: Emmanuel Charleson Dapaah, Philip Makedonski, Jens Grabowski
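Abstract summary: This study investigates how training-dynamics patterns arising from class imbalance, class overlap, and their coupling can be characterized under interaction-aware conditions. It will produce an interaction-aware empirical protocol and a candidate taxonomy of training-dynamics patterns for coupled data-quality issues in metric-based SDP.
Reference score (independently computed attention measure): 0.34410212782758043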
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Context: Software defect prediction supports maintenance decisions such as testing prioritization, release-risk assessment, and quality monitoring. However, metric-based SDP datasets often contain coupled data-quality issues, especially class imbalance and class overlap. Prior work has mainly measured their impact through endpoint performance, while recent evidence suggests that such issues may also appear in neural training dynamics (gradients, weights, biases, error trajectories). However, these studies examine issues in isolation, leaving open how internal neural network training patterns manifest when data quality issues are coupled. Objective: We investigate how training-dynamics patterns from class imbalance, overlap, and their coupling can be characterized under interaction-aware conditions in deep learning-based SDP. Method: We conduct a controlled intervention study on class-level UBD datasets, training a fixed MLP under imbalance-only, overlap-only, and joint conditions across five seeds. Training dynamics are logged per epoch; fidelity is monitored via coupling ratios. Patterns are characterized using effect sizes, trajectories, sensitivity analyses, and rule-based classification. Expected contribution: The study will produce an interaction-aware empirical protocol and a candidate taxonomy of training-dynamics patterns for coupled data-quality issues in metric-based SDP.
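The Method part of the abstract (a fixed MLP trained under imbalance-only, overlap-only, and joint conditions across five seeds, with per-epoch logging) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration and not taken from the paper: the synthetic two-feature Gaussian data (the study uses class-level UBD datasets), the `minority_frac` and `separation` values, the hidden size, the learning rate, and the names `make_data`, `train_and_log`, and `run_all`. The paper's coupling-ratio fidelity check and rule-based classification are not reproduced.

```python
import math
import random

# Hypothetical condition settings (assumed values, not from the paper).
# Lower minority_frac = stronger imbalance; lower separation = more overlap.
CONDITIONS = {
    "imbalance_only": {"minority_frac": 0.1, "separation": 4.0},
    "overlap_only": {"minority_frac": 0.5, "separation": 1.0},
    "joint": {"minority_frac": 0.1, "separation": 1.0},
}

def make_data(n, minority_frac, separation, rng):
    """Synthetic binary data: two Gaussian clusters, `separation` apart per feature."""
    n_pos = round(n * minority_frac)
    labels = [1] * n_pos + [0] * (n - n_pos)
    data = []
    for y in labels:
        center = separation / 2 if y == 1 else -separation / 2
        data.append(([rng.gauss(center, 1.0), rng.gauss(center, 1.0)], y))
    return data

def train_and_log(data, hidden=4, epochs=20, lr=0.1, seed=0):
    """Full-batch gradient descent on a one-hidden-layer MLP; returns per-epoch logs."""
    rng = random.Random(seed)
    d, n = len(data[0][0]), len(data)
    W1 = [[rng.gauss(0, 0.5) for _ in range(d)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [rng.gauss(0, 0.5) for _ in range(hidden)]
    b2 = 0.0
    log = []
    for epoch in range(epochs):
        gW1 = [[0.0] * d for _ in range(hidden)]
        gb1, gW2, gb2, loss = [0.0] * hidden, [0.0] * hidden, 0.0, 0.0
        for x, y in data:
            # Forward pass: tanh hidden layer, sigmoid output.
            h = [math.tanh(sum(w * xi for w, xi in zip(W1[j], x)) + b1[j])
                 for j in range(hidden)]
            z = sum(w * hj for w, hj in zip(W2, h)) + b2
            p = min(max(1.0 / (1.0 + math.exp(-z)), 1e-12), 1 - 1e-12)
            loss -= (y * math.log(p) + (1 - y) * math.log(1 - p)) / n
            # Backward pass for mean binary cross-entropy.
            dz = (p - y) / n
            gb2 += dz
            for j in range(hidden):
                gW2[j] += dz * h[j]
                dh = dz * W2[j] * (1 - h[j] ** 2)
                gb1[j] += dh
                for i in range(d):
                    gW1[j][i] += dh * x[i]
        grad_norm = math.sqrt(sum(g * g for row in gW1 for g in row)
                              + sum(g * g for g in gb1)
                              + sum(g * g for g in gW2) + gb2 ** 2)
        # Values are recorded before this epoch's parameter update.
        log.append({"epoch": epoch, "loss": loss,
                    "grad_norm": grad_norm, "output_bias": b2})
        b2 -= lr * gb2
        for j in range(hidden):
            W2[j] -= lr * gW2[j]
            b1[j] -= lr * gb1[j]
            for i in range(d):
                W1[j][i] -= lr * gW1[j][i]
    return log

def run_all(seeds=range(5), n=200):
    """Train the same architecture under every condition and seed."""
    results = {}
    for name, cfg in CONDITIONS.items():
        for seed in seeds:
            data = make_data(n, cfg["minority_frac"], cfg["separation"],
                             random.Random(seed))
            results[(name, seed)] = train_and_log(data, seed=seed)
    return results

if __name__ == "__main__":
    for (name, seed), log in run_all().items():
        print(name, seed, round(log[-1]["loss"], 4), round(log[-1]["grad_norm"], 4))
```

Keeping the architecture and optimizer fixed while varying only the data condition means differences in the logged loss, gradient norm, and output bias can be compared across conditions and seeds, which is the kind of trajectory comparison the abstract describes.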
