Fugu-MT Paper Translation (Summary): Agentic Jackal: Live Execution and Semantic Value Grounding for Text-to-JQL

Paper summary: Agentic Jackal: Live Execution and Semantic Value Grounding for Text-to-JQL

arxiv url: http://arxiv.org/abs/2604.09470v1
Date: Fri, 10 Apr 2026 16:27:31 GMT
Status: Translation complete
Last updated in system: 2026-04-13 17:57:53.961216
Title: Agentic Jackal: Live Execution and Semantic Value Grounding for Text-to-JQL
Authors: Vishnu Murali, Anmol Gulati, Elias Lumer, Kevin Frank, Sindy Campagna, Vamse Kumar Subbiah
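Abstract summary: No open, execution-based benchmark exists for mapping natural language to Jira Query Language (JQL). Jackal is the first large-scale, execution-based text-to-JQL benchmark, comprising 100,000 validated NL-JQL pairs on a live Jira instance with over 200,000 issues.
Reference score (attention score computed independently by this site): 1.5773713958458309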
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Translating natural language into Jira Query Language (JQL) requires resolving ambiguous field references, instance-specific categorical values, and complex Boolean predicates. Single-pass LLMs cannot discover which categorical values (e.g., component names or fix versions) actually exist in a given Jira instance, nor can they verify generated queries against a live data source, limiting accuracy on paraphrased or ambiguous requests. No open, execution-based benchmark exists for mapping natural language to JQL. We introduce Jackal, the first large-scale, execution-based text-to-JQL benchmark comprising 100,000 validated NL-JQL pairs on a live Jira instance with over 200,000 issues. To establish baselines on Jackal, we propose Agentic Jackal, a tool-augmented agent that equips LLMs with live query execution via the Jira MCP server and JiraAnchor, a semantic retrieval tool that resolves natural-language mentions of categorical values through embedding-based similarity search. Among 9 frontier LLMs evaluated, single-pass models average only 43.4% execution accuracy on short natural-language queries, highlighting that text-to-JQL remains an open challenge. The agentic approach improves 7 of 9 models, with a 9.0% relative gain on the most linguistically challenging variant; in a controlled ablation isolating JiraAnchor, categorical-value accuracy rises from 48.7% to 71.7%, with component-field accuracy jumping from 16.9% to 66.2%. Our analysis identifies inherent semantic ambiguities, such as issue-type disambiguation and text-field selection, as the dominant failure modes rather than value-resolution errors, pointing to concrete directions for future work. We publicly release the benchmark, all agent transcripts, and evaluation code to support reproducibility.
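The value-grounding idea in the abstract (mapping a natural-language mention to a categorical value that actually exists in the instance) can be illustrated with a minimal sketch. This is not the paper's JiraAnchor implementation: the abstract says JiraAnchor uses embedding-based similarity search, whereas the sketch below substitutes character-trigram counts for learned embeddings so that it runs with the standard library alone. The function names and the component list are hypothetical.

```python
import math
from collections import Counter


def trigram_vector(text: str) -> Counter:
    """Toy stand-in for an embedding: counts of character trigrams."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def ground_value(mention: str, candidates: list[str]) -> tuple[str, float]:
    """Return the candidate most similar to the mention, with its score."""
    query = trigram_vector(mention)
    scored = [(cosine(query, trigram_vector(c)), c) for c in candidates]
    score, best = max(scored)
    return best, score


# Hypothetical component names for one Jira instance (not from the paper).
components = ["Authentication Service", "Billing API", "Search Indexer"]
best, score = ground_value("auth service", components)
print(best, round(score, 2))
```

In an agentic setting, the grounded value would then be substituted into the generated JQL (for example as the right-hand side of a `component = "..."` clause) before the query is executed against the live instance.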
