Fugu-MT Paper Translation (Abstract): GUI-ReRank: Enhancing GUI Retrieval with Multi-Modal LLM-based Reranking

Paper abstract: GUI-ReRank: Enhancing GUI Retrieval with Multi-Modal LLM-based Reranking

arxiv url: http://arxiv.org/abs/2508.03298v1
Date: Tue, 05 Aug 2025 10:17:38 GMT
Status: Translation complete
Last updated in system: 2025-08-06 18:18:55.913019
Title: GUI-ReRank: Enhancing GUI Retrieval with Multi-Modal LLM-based Reranking
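Title (reference translation): GUI-ReRank: Enhancing GUI Retrieval with Multi-Modal LLMs
Authors: Kristian Kolthoff, Felix Kretzer, Christian Bartelt, Alexander Maedche, Simone Paolo Ponzetto
Abstract summary: GUI-ReRank is a novel framework that integrates rapid embedding-based constrained retrieval models with highly effective MLLM-based reranking techniques. The approach was evaluated on an established NL-based GUI retrieval benchmark.
Reference score (independently computed attention score): 55.762798168494726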
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: GUI prototyping is a fundamental component in the development of modern interactive systems, which are now ubiquitous across diverse application domains. GUI prototypes play a critical role in requirements elicitation by enabling stakeholders to visualize, assess, and refine system concepts collaboratively. Moreover, prototypes serve as effective tools for early testing, iterative evaluation, and validation of design ideas with both end users and development teams. Despite these advantages, the process of constructing GUI prototypes remains resource-intensive and time-consuming, frequently demanding substantial effort and expertise. Recent research has sought to alleviate this burden through NL-based GUI retrieval approaches, which typically rely on embedding-based retrieval or tailored ranking models for specific GUI repositories. However, these methods often suffer from limited retrieval performance and struggle to generalize across arbitrary GUI datasets. In this work, we present GUI-ReRank, a novel framework that integrates rapid embedding-based constrained retrieval models with highly effective MLLM-based reranking techniques. GUI-ReRank further introduces a fully customizable GUI repository annotation and embedding pipeline, enabling users to effortlessly make their own GUI repositories searchable, which allows for rapid discovery of relevant GUIs for inspiration or seamless integration into customized LLM-based RAG workflows. We evaluated our approach on an established NL-based GUI retrieval benchmark, demonstrating that GUI-ReRank significantly outperforms SOTA tailored LTR models in both retrieval accuracy and generalizability. Additionally, we conducted a comprehensive cost and efficiency analysis of employing MLLMs for reranking, providing valuable insights regarding the trade-offs between retrieval effectiveness and computational resources. Video: https://youtu.be/_7x9UCh82ug
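Abstract (reference translation): GUI prototyping is a fundamental component in the development of modern interactive systems across diverse application domains. GUI prototypes play an important role in requirements elicitation by letting stakeholders collaboratively visualize, assess, and refine system concepts. Prototypes also serve as effective tools for early testing, iterative evaluation, and validation of design ideas with both end users and development teams. Despite these advantages, constructing GUI prototypes remains resource-intensive and time-consuming, and requires considerable effort and expertise. Recent research has aimed to reduce this burden through NL-based GUI retrieval approaches that rely on embedding-based retrieval or on ranking models tailored to specific GUI repositories. However, these methods often suffer from limited retrieval performance and struggle to generalize across arbitrary GUI datasets. This paper proposes GUI-ReRank, which combines rapid embedding-based constrained retrieval models with highly effective MLLM-based reranking techniques. GUI-ReRank further introduces a fully customizable GUI repository annotation and embedding pipeline that lets users make their own GUI repositories searchable. The approach was evaluated on an established NL-based GUI retrieval benchmark, where GUI-ReRank showed higher accuracy and generalizability than SOTA tailored LTR models. In addition, a comprehensive cost and efficiency analysis of using MLLMs for reranking examines the trade-offs between retrieval effectiveness and computational resources. Video: https://youtu.be/_7x9UCh82ug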
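
The abstract describes a two-stage design: a fast embedding-based retrieval step followed by a more expensive MLLM-based reranking step. The sketch below illustrates only that general retrieve-then-rerank pattern. It is not the authors' implementation: the function names, the toy vectors, and the `rerank_score` callback (a stand-in for a call to a multi-modal LLM) are all assumptions made for illustration.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_then_rerank(
    query_vec: Vector,
    gui_vecs: Dict[str, Vector],
    rerank_score: Callable[[str], float],  # hypothetical stand-in for an MLLM scorer
    k: int,
) -> List[str]:
    """Shortlist the top-k GUIs by embedding similarity, then rerank the shortlist."""
    # Stage 1: cheap similarity search over every GUI in the repository.
    shortlist = sorted(
        gui_vecs,
        key=lambda gui_id: cosine(query_vec, gui_vecs[gui_id]),
        reverse=True,
    )[:k]
    # Stage 2: the expensive scorer only sees the k shortlisted candidates.
    return sorted(shortlist, key=rerank_score, reverse=True)


# Toy data: three GUIs with made-up 2-d embeddings and made-up reranker scores.
gui_vecs = {"login": [1.0, 0.0], "signup": [0.9, 0.1], "settings": [0.0, 1.0]}
stub_scores = {"login": 0.4, "signup": 0.9, "settings": 1.0}

result = retrieve_then_rerank(
    query_vec=[1.0, 0.0],
    gui_vecs=gui_vecs,
    rerank_score=lambda gui_id: stub_scores[gui_id],
    k=2,
)
print(result)  # ['signup', 'login']
```

In this toy run "settings" has the highest reranker score but never reaches stage 2, because stage 1 filters it out. That is the trade-off the abstract points to: a smaller `k` lowers reranking cost, at the risk of dropping candidates the reranker would have preferred.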

Related papers