Fugu-MT Paper Translation (Abstract): BAMI: Training-Free Bias Mitigation in GUI Grounding

Paper overview: BAMI: Training-Free Bias Mitigation in GUI Grounding

arxiv url: http://arxiv.org/abs/2605.06664v1
Date: Thu, 07 May 2026 17:59:31 GMT
Status: Translation complete
Last updated in system: 2026-05-08 22:27:12.078394
Title: BAMI: Training-Free Bias Mitigation in GUI Grounding
Authors: Borui Zhang, Bo Zhang, Bo Wang, Wenzhao Zheng, Yuhao Cheng, Liang Tang, Yiqiang Yan, Jie Zhou, Jiwen Lu
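Abstract summary: The authors introduce Bias-Aware Manipulation Inference (BAMI), which incorporates two key manipulations, coarse-to-fine focus and candidate selection, to mitigate precision bias and ambiguity bias in GUI grounding. BAMI significantly improves the accuracy of various GUI grounding models in a training-free setting.
Reference score (attention score computed independently by the site): 72.1023567399117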
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: GUI grounding is a critical capability for enabling GUI agents to execute tasks such as clicking and dragging. However, in complex scenarios like the ScreenSpot-Pro benchmark, existing models often suffer from suboptimal performance. Utilizing the proposed Masked Prediction Distribution (MPD) attribution method, we identify that the primary sources of errors are twofold: high image resolution (leading to precision bias) and intricate interface elements (resulting in ambiguity bias). To address these challenges, we introduce Bias-Aware Manipulation Inference (BAMI), which incorporates two key manipulations, coarse-to-fine focus and candidate selection, to effectively mitigate these biases. Our extensive experimental results demonstrate that BAMI significantly enhances the accuracy of various GUI grounding models in a training-free setting. For instance, applying our method to the TianXi-Action-7B model boosts its accuracy on the ScreenSpot-Pro benchmark from 51.9% to 57.8%. Furthermore, ablation studies confirm the robustness of the BAMI approach across diverse parameter configurations, highlighting its stability and effectiveness. Code is available at https://github.com/Neur-IO/BAMI.
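
Illustration (not from the paper): the abstract names a coarse-to-fine focus manipulation but does not say how it is implemented. The sketch below shows one generic way such a zoom-and-re-predict loop could look, purely as an assumption. The names `crop_around`, `coarse_to_fine`, `make_toy_model`, and the parameters `steps`, `shrink`, and `grid` are hypothetical and do not come from the BAMI repository. The toy model stands in for a real grounding model and imitates precision bias by resolving only a fixed lattice over whatever region it is shown. Candidate selection is not covered here.

```python
from typing import Callable, Tuple

Box = Tuple[float, float, float, float]  # (left, top, width, height) in screen pixels
Point = Tuple[float, float]              # (x, y) in screen pixels


def crop_around(point: Point, box: Box, shrink: float, screen: Box) -> Box:
    """Return a window `shrink` times the size of `box`, centred on `point`
    and clamped so that it stays inside `screen`."""
    sx, sy, sw, sh = screen
    w, h = box[2] * shrink, box[3] * shrink
    left = min(max(point[0] - w / 2, sx), sx + sw - w)
    top = min(max(point[1] - h / 2, sy), sy + sh - h)
    return (left, top, w, h)


def coarse_to_fine(predict: Callable[[Box], Point], screen: Box,
                   steps: int = 2, shrink: float = 0.5) -> Point:
    """Predict on the full screen, then repeatedly zoom in and re-predict.
    `predict` is a hypothetical model wrapper that receives a region and
    returns a point already mapped back to screen coordinates."""
    box = screen
    point = predict(box)
    for _ in range(steps):
        box = crop_around(point, box, shrink, screen)
        point = predict(box)
    return point


def make_toy_model(target: Point, grid: int = 10) -> Callable[[Box], Point]:
    """Toy stand-in for a grounding model (an assumption for this example):
    it can only answer with the centre of a cell in a grid x grid lattice
    laid over the region it sees, so larger regions give coarser answers."""
    def predict(box: Box) -> Point:
        left, top, w, h = box
        # Find the lattice cell containing the target, clamped to the region.
        col = max(min(int((target[0] - left) / w * grid), grid - 1), 0)
        row = max(min(int((target[1] - top) / h * grid), grid - 1), 0)
        return (left + (col + 0.5) * w / grid, top + (row + 0.5) * h / grid)
    return predict


if __name__ == "__main__":
    screen = (0.0, 0.0, 3840.0, 2160.0)
    target = (1234.0, 567.0)
    model = make_toy_model(target)
    print("single pass:   ", model(screen))
    print("coarse-to-fine:", coarse_to_fine(model, screen))
```

In this toy setup a single pass over a 3840x2160 screen can miss by up to half a lattice cell (192 px horizontally), while two halving steps shrink that bound to 48 px. The point is only to show why zooming can reduce a resolution-driven error; the actual BAMI procedure should be taken from the paper and its code.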


Related papers