Fugu-MT Paper Translation (Abstract): ReViSQL: Achieving Human-Level Text-to-SQL

Paper summary: ReViSQL: Achieving Human-Level Text-to-SQL

arxiv url: http://arxiv.org/abs/2603.20004v1
Date: Fri, 20 Mar 2026 14:49:27 GMT
Status: Translation complete
Last updated in system: 2026-03-23 19:48:39.188277
Title: ReViSQL: Achieving Human-Level Text-to-SQL
Title (reference translation): ReViSQL: Reaching Human-Level Text-to-SQL
Authors: Yuxuan Zhu, Tengjun Jin, Yoojin Choi, Daniel Kang
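Abstract summary: This paper introduces ReViSQL, a streamlined framework that targets human-level accuracy on the BIRD benchmark. Instead of complex AI agents, ReViSQL leverages reinforcement learning with verifiable rewards (RLVR) on BIRD-Verified, a dataset curated by the authors. The authors identified and corrected data errors in 61.1% of a subset of BIRD Train, and show that improving data quality alone boosts single-generation accuracy by 8.2-13.9%.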
Reference score (independently calculated attention score): 8.94428202485629
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Translating natural language to SQL (Text-to-SQL) is a critical challenge in both database research and data analytics applications. Recent efforts have focused on enhancing SQL reasoning by developing large language models and AI agents that decompose Text-to-SQL tasks into manually designed, step-by-step pipelines. However, despite these extensive architectural engineering efforts, a significant gap remains: even state-of-the-art (SOTA) AI agents have not yet achieved the human-level accuracy on the BIRD benchmark. In this paper, we show that closing this gap does not require further architectural complexity, but rather clean training data to improve SQL reasoning of the underlying models. We introduce ReViSQL, a streamlined framework that achieves human-level accuracy on BIRD for the first time. Instead of complex AI agents, ReViSQL leverages reinforcement learning with verifiable rewards (RLVR) on BIRD-Verified, a dataset we curated comprising 2.5k verified Text-to-SQL instances based on the BIRD Train set. To construct BIRD-Verified, we design a data correction and verification workflow involving SQL experts. We identified and corrected data errors in 61.1% of a subset of BIRD Train. By training on BIRD-Verified, we show that improving data quality alone boosts the single-generation accuracy by 8.2-13.9% under the same RLVR algorithm. To further enhance performance, ReViSQL performs inference-time scaling via execution-based reconciliation and majority voting. Empirically, we demonstrate the superiority of our framework with two model scales: ReViSQL-235B-A22B and ReViSQL-30B-A3B. On an expert-verified BIRD Mini-Dev set, ReViSQL-235B-A22B achieves 93.2% execution accuracy, exceeding the proxy human-level accuracy (92.96%) and outperforming the prior open-source SOTA method by 9.8%. Our lightweight ReViSQL-30B-A3B matches the prior SOTA at a 7.5$\times$ lower per-query cost.
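Abstract (reference translation): Translating natural language into SQL (Text-to-SQL) is an important challenge in database research and data analytics applications. Recent efforts have focused on strengthening SQL reasoning by developing large language models and AI agents that decompose Text-to-SQL tasks into manually designed, step-by-step pipelines. Even so, state-of-the-art (SOTA) AI agents have not yet achieved human-level accuracy on the BIRD benchmark. This paper shows that closing this gap does not require more architectural complexity, but rather clean training data that improves the SQL reasoning of the underlying models. We introduce ReViSQL, a streamlined framework that achieves human-level accuracy on BIRD for the first time. Instead of complex AI agents, ReViSQL leverages reinforcement learning with verifiable rewards (RLVR) on BIRD-Verified. To construct BIRD-Verified, we design a data correction and verification workflow involving SQL experts. We identified and corrected data errors in 61.1% of a subset of BIRD Train. By training on BIRD-Verified, we show that improving data quality alone boosts single-generation accuracy by 8.2-13.9% under the same RLVR algorithm. To further improve performance, ReViSQL performs inference-time scaling via execution-based reconciliation and majority voting. We evaluate the framework at two model scales: ReViSQL-235B-A22B and ReViSQL-30B-A3B. On an expert-verified BIRD Mini-Dev set, ReViSQL-235B-A22B achieves 93.2% execution accuracy, exceeding the proxy human-level accuracy (92.96%) and outperforming the prior open-source SOTA method by 9.8%. Our lightweight ReViSQL-30B-A3B matches the prior SOTA at a 7.5$\times$ lower per-query cost.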
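
The abstract mentions execution accuracy and majority voting over executed candidates but does not give implementation details. The following is a minimal sketch of the general idea only, not the authors' method: it assumes a SQLite database, treats two queries as equivalent when their result rows match regardless of order, and omits the reconciliation step, which the abstract does not describe. The helper names `run_query`, `majority_vote`, and `execution_match` are hypothetical.

```python
import sqlite3
from collections import Counter


def run_query(conn, sql):
    """Execute sql; return an order-insensitive, hashable result, or None on error."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    # Sort by repr so rows containing NULLs or mixed types still compare safely.
    return tuple(sorted(rows, key=repr))


def majority_vote(conn, candidates):
    """Return the first candidate whose execution result is the most common one.

    Candidates that fail to execute are ignored (an assumption of this sketch).
    Returns None if no candidate executes successfully.
    """
    results = {}
    for sql in candidates:
        result = run_query(conn, sql)
        if result is not None:
            results[sql] = result
    if not results:
        return None
    winner, _ = Counter(results.values()).most_common(1)[0]
    for sql in candidates:
        if results.get(sql) == winner:
            return sql


def execution_match(conn, predicted_sql, gold_sql):
    """True if both queries execute and return the same rows (order-insensitive)."""
    predicted = run_query(conn, predicted_sql)
    gold = run_query(conn, gold_sql)
    return predicted is not None and predicted == gold


if __name__ == "__main__":
    # Toy in-memory database; the table and queries are illustrative only.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")])
    candidates = [
        "SELECT COUNT(*) FROM t",
        "SELECT COUNT(id) FROM t",
        "SELECT MAX(id) FROM t WHERE id < 3",
        "SELEC broken",
    ]
    print(majority_vote(conn, candidates))
```

A check like `execution_match` could also serve as a binary verifiable reward during training, although the abstract does not state which reward ReViSQL uses.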
