Fugu-MT Paper Translation (Abstract): (How) Do Large Language Models Understand High-Level Message Sequence Charts?

Paper overview: (How) Do Large Language Models Understand High-Level Message Sequence Charts?

arxiv url: http://arxiv.org/abs/2605.13773v2
Date: Thu, 14 May 2026 04:48:10 GMT
Status: translation complete
System update date: 2026-05-15 15:19:49.925687
Title: (How) Do Large Language Models Understand High-Level Message Sequence Charts?
Authors: Mohammad Reza Mousavi
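Abstract summary: Large Language Models (LLMs) are widely used to automate tasks across the software development life-cycle. It is, however, unclear whether these tasks are performed consistently with respect to the semantics of the artefacts being handled. The paper examines whether LLMs "understand" the semantics of HMSCs by evaluating how three LLMs perform 129 semantic tasks.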
Reference score (attention measure computed by this site): 0.23689955632456086
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large Language Models (LLMs) are being employed widely to automate tasks across the software development life-cycle. It is, however, unclear whether these tasks are performed consistently with respect to the semantics of the artefacts being handled. This question is particularly under-researched concerning architectural design specification. In this paper, we address this question for High-Level Message Sequence Charts (HMSCs). These are visual models with a rigorous formal semantics that have been used for various purposes, including as a foundation for Sequence Diagrams in the Unified Modelling Language (UML). We examine whether LLMs "understand" the semantics of HMSCs by examining three LLMs (Gemini-3, GPT-5.4, and Qwen-3.6) on how they perform 129 semantic tasks ranging from querying basic semantic constructs in HMSCs (i.e., events and their ordering) to semantic-preserving abstractions and compositions, and calculating the set of traces and trace-equivalent labelled transition systems. The results show that LLMs only have a modest understanding of the formal semantics of HMSCs (ca. 52% overall accuracy), with great variability across different semantic concepts: while LLMs seem to understand the basic semantic concepts of MSCs (ca. 88% accuracy), they struggle with semantic reasoning in tasks involving abstraction and composition (ca. 36% accuracy) and traces and LTSs (ca. 42% accuracy). In particular, all three LLMs struggle with the notions of co-region and explicit causal dependencies and never employed them in semantic-preserving transformations.
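Illustration (not from the paper): the "set of traces" mentioned in the abstract can be made concrete for a single basic MSC. The sketch below assumes the usual partial-order reading, in which events on one instance are ordered top to bottom and every send precedes its matching receive; the traces are then the linearisations of that order. The chart, the event names, and the helper functions `causal_order` and `traces` are hypothetical, and co-regions, explicit causal dependencies, and HMSC composition are deliberately left out.

```python
from itertools import permutations

# Hypothetical basic MSC with three instances A, B and C.
# Each instance lists its events top to bottom; "X!m" sends m, "X?m" receives m.
instances = {
    "A": ["A!m1"],
    "B": ["B?m1", "B?m2"],
    "C": ["C!m2"],
}
# Each message pairs its send event with its receive event.
messages = [("A!m1", "B?m1"), ("C!m2", "B?m2")]


def causal_order(instances, messages):
    """Return the (earlier, later) pairs that generate the partial order."""
    pairs = set(messages)  # a send precedes its matching receive
    for events in instances.values():
        # consecutive events on the same instance are ordered top to bottom
        pairs.update(zip(events, events[1:]))
    return pairs


def traces(instances, messages):
    """Enumerate every linearisation of the events that respects the order."""
    events = [e for evs in instances.values() for e in evs]
    order = causal_order(instances, messages)
    result = []
    for perm in permutations(events):
        pos = {e: i for i, e in enumerate(perm)}
        if all(pos[a] < pos[b] for a, b in order):
            result.append(perm)
    return result


if __name__ == "__main__":
    for t in traces(instances, messages):
        print(" . ".join(t))
```

Brute-force enumeration over permutations is only workable for very small charts; it is used here because it follows the definition directly. In this chart, `C!m2` is unordered with respect to `A!m1` and `B?m1`, which is why more than one trace exists.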
