Fugu-MT Paper Translation (Abstract): Declarative Outcome-Conformant Synthesis: Exact, Closed-Form Specification Satisfaction and a Conformance Benchmark

Paper overview: Declarative Outcome-Conformant Synthesis: Exact, Closed-Form Specification Satisfaction and a Conformance Benchmark

arxiv url: http://arxiv.org/abs/2606.08736v1
Date: Sun, 07 Jun 2026 17:10:40 GMT
Status: Translation complete
Last updated in system: 2026-06-09 14:42:06.421451
Title: Declarative Outcome-Conformant Synthesis: Exact, Closed-Form Specification Satisfaction and a Conformance Benchmark
Authors: Muhammed Rasin
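Abstract summary: Imitation methods learn a real distribution and sample from it, and are judged on fidelity to real data. Off-the-shelf imitation tools offer no interface for declared outcome targets, and no sampler can hit an exact aggregate. We name this task outcome-conformant synthesis, argue its evaluation axis is conformance rather than fidelity, and show the two axes are orthogonal.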
Reference score (independently calculated attention score): 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We study a capability the dominant paradigm in synthetic tabular data does not provide: exact satisfaction of a declared analytical outcome with no source data. Imitation methods (copulas, GANs, diffusion) learn a real distribution and sample from it, and are judged on fidelity to real data. A large, practical class of needs is different: generating data with no source data ("cold start") that reproduces a declared outcome (a revenue curve, a churn rate, a group share) across a relational schema. Off-the-shelf imitation tools offer no interface for such targets, and no sampler can hit an exact aggregate, because sampling has variance. On a real public dataset, off-the-shelf learned synthesizers trained on that very data miss the declared monthly aggregate by 74 to 86 percent; a per-period steelman cuts the miss to about 19 percent and still cannot reach 0; a closed-form generator reaches exactly 0. We name this task outcome-conformant synthesis, argue its evaluation axis is conformance rather than fidelity, and show the two axes are orthogonal. We contribute: (1) a formal account showing a widely-used family of exact-aggregate generators is exactly conditional-sum sampling of a Gamma population (via Lukacs' characterization), with closed-form exactness, a closed-form marginal CV, and scale-invariance; a controlled experiment maps the boundary, enforcing the exact aggregate costs at most 0.006 in 1-Wasserstein distance to an arbitrary external marginal, the rest being shape-family mismatch; (2) SpecBench, to our knowledge the first benchmark to measure conformance to analytical outcomes for cold-start relational synthesis; and (3) a closed-form, deterministic reference system. Exact aggregation alone is trivial; the contribution is conformance jointly with closed-form marginals, integrity, determinism, and zero source data. We concede fidelity to imitation where real data exists.
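The abstract's claim that an exact-aggregate generator amounts to conditional-sum sampling of a Gamma population can be illustrated with a small sketch. This is not the paper's reference system. It assumes i.i.d. Gamma(shape, 1) weights normalized by their sum (so the shares are Dirichlet-distributed and do not depend on the Gamma scale), and it adds largest-remainder rounding so the integer amounts hit the declared total exactly. The function names `exact_total_rows` and `independent_rows`, the parameter values, and the rounding step are all hypothetical choices made for illustration. The CV formula printed at the end, sqrt((n - 1) / (n k + 1)), is the standard Dirichlet result for equal shape k and is offered as a plausible reading of "closed-form marginal CV", not as the paper's stated formula.

```python
import math
import random


def exact_total_rows(total_cents, n_rows, shape=2.0, seed=0):
    """Split an integer total into n_rows non-negative integers summing to it exactly.

    Assumptions (not from the paper): weights are i.i.d. Gamma(shape, 1) draws
    normalized by their sum; rounding uses the largest-remainder method.
    """
    rng = random.Random(seed)  # a fixed seed makes the output deterministic
    weights = [rng.gammavariate(shape, 1.0) for _ in range(n_rows)]
    weight_sum = math.fsum(weights)
    # Real-valued targets: declared total times each normalized share.
    ideal = [total_cents * w / weight_sum for w in weights]
    amounts = [math.floor(x) for x in ideal]
    # Hand back the units lost to flooring, largest fractional part first.
    shortfall = total_cents - sum(amounts)
    by_remainder = sorted(
        range(n_rows), key=lambda i: ideal[i] - amounts[i], reverse=True
    )
    for i in by_remainder[:shortfall]:
        amounts[i] += 1
    return amounts


def independent_rows(total_cents, n_rows, shape=2.0, seed=0):
    """Baseline: i.i.d. Gamma rows whose expected sum is total_cents."""
    rng = random.Random(seed)
    scale = total_cents / (n_rows * shape)
    return [rng.gammavariate(shape, scale) for _ in range(n_rows)]


if __name__ == "__main__":
    total, n, k = 1_000_000, 500, 2.0

    rows = exact_total_rows(total, n, shape=k, seed=42)
    print("conformant miss:", abs(sum(rows) - total))  # 0 by construction

    sampled = independent_rows(total, n, shape=k, seed=42)
    print("sampled miss:", abs(math.fsum(sampled) - total))

    # Compare the within-sample CV to the Dirichlet closed form.
    mean = total / n
    empirical_cv = math.sqrt(sum((r - mean) ** 2 for r in rows) / n) / mean
    print("empirical CV:", empirical_cv)
    print("Dirichlet CV:", math.sqrt((n - 1) / (n * k + 1)))
```

The baseline only matches the declared total in expectation, which is the variance argument the abstract makes against samplers; the normalized version matches it on every run regardless of seed.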
