Fugu-MT Paper Translation (Abstract): Towards Scalable and Interpretable Mobile App Risk Analysis via Large Language Models

Paper overview: Towards Scalable and Interpretable Mobile App Risk Analysis via Large Language Models

arxiv url: http://arxiv.org/abs/2508.15606v1
Date: Thu, 21 Aug 2025 14:33:13 GMT
Status: Translation complete
Last updated in system: 2025-08-22 16:26:46.365136
Title: Towards Scalable and Interpretable Mobile App Risk Analysis via Large Language Models
Authors: Yu Yang, Zhenyuan Li, Xiandong Ran, Jiahao Liu, Jiahui Wang, Bo Yu, Shouling Ji
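Abstract summary: Mobile application marketplaces are responsible for vetting apps to identify and mitigate security risks. Current vetting processes are labor-intensive, relying on manual analysis by security professionals aided by semi-automated tools. The authors propose Mars, a system that leverages Large Language Models (LLMs) for automated risk identification and profiling.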
Reference score (independently calculated attention level): 36.98842280350961
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Mobile application marketplaces are responsible for vetting apps to identify and mitigate security risks. Current vetting processes are labor-intensive, relying on manual analysis by security professionals aided by semi-automated tools. To address this inefficiency, we propose Mars, a system that leverages Large Language Models (LLMs) for automated risk identification and profiling. Mars is designed to concurrently analyze multiple applications across diverse risk categories with minimal human intervention. To enhance analytical precision and operational efficiency, Mars leverages a pre-constructed risk identification tree to extract relevant indicators from high-dimensional application features. This initial step filters the data, reducing the input volume for the LLM and mitigating the potential for model hallucination induced by irrelevant features. The extracted indicators are then subjected to LLM analysis for final risk determination. Furthermore, Mars automatically generates a comprehensive evidence chain for each assessment, documenting the analytical process to provide transparent justification. These chains are designed to facilitate subsequent manual review and to inform enforcement decisions, such as application delisting. The performance of Mars was evaluated on a real-world dataset from a partner Android marketplace. The results demonstrate that Mars attained an F1-score of 0.838 in risk identification and an F1-score of 0.934 in evidence retrieval. To assess its practical applicability, a user study involving 20 expert analysts was conducted, which indicated that Mars yielded a substantial efficiency gain, ranging from 60% to 90%, over conventional manual analysis.
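The filtering step described in the abstract (a pre-constructed risk identification tree selects relevant indicators before anything reaches the LLM) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not specify the tree's structure, so `RiskNode`, `extract_indicators`, `build_prompt`, and the short permission-style feature keys are hypothetical names, and the LLM call itself is left out. The path recorded for each kept feature stands in, very loosely, for the evidence chain.

```python
from dataclasses import dataclass, field


@dataclass
class RiskNode:
    """One node of a hypothetical risk identification tree (not the paper's)."""
    name: str
    indicators: frozenset = frozenset()  # feature keys this node treats as relevant
    children: list = field(default_factory=list)


def extract_indicators(node, features, path=()):
    """Walk the tree and keep only features that some node lists as an indicator.

    Returns (path, key, value) tuples; the path records which branch of the
    tree caused the feature to be kept.
    """
    here = path + (node.name,)
    hits = [(here, key, features[key])
            for key in sorted(node.indicators) if key in features]
    for child in node.children:
        hits.extend(extract_indicators(child, features, here))
    return hits


def build_prompt(hits):
    """Render the filtered indicators as the text that would be sent to an LLM."""
    lines = [f"{' > '.join(path)}: {key} = {value}" for path, key, value in hits]
    return "Assess the risk given these indicators:\n" + "\n".join(lines)


# Hypothetical tree and app features; "ui_theme" is irrelevant and gets dropped.
tree = RiskNode("root", children=[
    RiskNode("privacy", frozenset({"READ_SMS", "READ_CONTACTS"})),
    RiskNode("fraud", frozenset({"SEND_SMS"})),
])
features = {"READ_SMS": True, "ui_theme": "dark", "SEND_SMS": True}

hits = extract_indicators(tree, features)
print(build_prompt(hits))
```

Only the features named by some tree node survive, so the prompt is shorter and carries no unrelated fields, which is the effect the abstract attributes to this step.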


Related papers