Fugu-MT Paper Translation (Abstract): Testing Deep Learning Libraries via Neurosymbolic Constraint Learning

Paper summary: Testing Deep Learning Libraries via Neurosymbolic Constraint Learning

arxiv url: http://arxiv.org/abs/2601.15493v1
Date: Wed, 21 Jan 2026 21:54:41 GMT
Status: Translation complete
Last updated in system: 2026-01-23 21:37:20.42819
Title: Testing Deep Learning Libraries via Neurosymbolic Constraint Learning
Authors: M M Abid Naziri, Shinhae Kim, Feiran Qin, Marcelo d'Amorim, Saikat Dutta
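Abstract summary: Deep learning (DL) libraries (e.g., PyTorch) are popular in AI development. A key challenge in testing DL libraries is the lack of API specifications. We develop Centaur, the first neurosymbolic technique to test DL library APIs using dynamically learned input constraints.
Reference score (independently computed attention measure): 3.491101173753068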
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Deep Learning (DL) libraries (e.g., PyTorch) are popular in AI development. These libraries are complex and contain bugs. Researchers have proposed various bug-finding techniques for such libraries, yet there is much room for improvement. A key challenge in testing DL libraries is the lack of API specifications. Prior testing approaches often model the input specifications of DL APIs inaccurately, so they either miss valid inputs that could reveal bugs or raise false alarms on invalid inputs. To address this challenge, we develop Centaur, the first neurosymbolic technique to test DL library APIs using dynamically learned input constraints. Centaur leverages the key idea that formal API constraints can be learned from a small number of automatically generated seed inputs, and that the learned constraints can be solved using SMT solvers to generate valid and diverse test inputs. We develop a novel grammar that represents first-order logic formulae over API parameters and expresses tensor-related properties (e.g., shape, data types) as well as relational properties between parameters. We use the grammar to guide a Large Language Model (LLM) to enumerate syntactically correct candidate rules, which are then validated using seed inputs. Further, we develop a custom refinement strategy that prunes the set of learned rules to eliminate spurious or redundant rules. We use the learned constraints to systematically generate valid and diverse inputs by integrating SMT-based solving with randomized sampling. We evaluate Centaur on PyTorch and TensorFlow. Our results show that Centaur's constraints have a recall of 94.0% and a precision of 94.0% on average. In terms of coverage, Centaur covers 203, 150, and 9,608 more branches than TitanFuzz, ACETest, and Pathfinder, respectively. Using Centaur, we also detect 26 new bugs in PyTorch and TensorFlow, 18 of which are confirmed.
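
The learn-then-generate idea in the abstract can be illustrated with a minimal sketch. This is not Centaur's implementation. The API check `api_accepts`, the hand-written candidate rules, and the seed list are all hypothetical, and rejection sampling stands in for the SMT solver so that the example needs only the Python standard library. Candidate rules that hold on every accepted seed are kept, then used to filter randomly sampled shapes.

```python
import random

def api_accepts(a, b):
    """Hypothetical API under test: a 2-D matmul-style shape check."""
    return len(a) == 2 and len(b) == 2 and a[1] == b[0]

# Candidate rules over the two shape parameters. They are hand-written here;
# in the approach described above an LLM guided by a grammar enumerates them.
CANDIDATES = {
    "rank(a) == 2": lambda a, b: len(a) == 2,
    "rank(b) == 2": lambda a, b: len(b) == 2,
    "a[-1] == b[0]": lambda a, b: a[-1] == b[0],
    "a[0] == b[-1]": lambda a, b: a[0] == b[-1],  # spurious candidate
}

# Hypothetical seed inputs (pairs of shapes); the API accepts the first three.
SEEDS = [
    ((2, 3), (3, 4)),
    ((1, 5), (5, 1)),
    ((4, 4), (4, 2)),
    ((2, 3), (4, 4)),
    ((3,), (3, 2)),
]

def learn_rules(candidates, seeds, oracle):
    """Keep only the candidates that hold on every seed the oracle accepts."""
    valid = [seed for seed in seeds if oracle(*seed)]
    return {name: rule for name, rule in candidates.items()
            if all(rule(a, b) for a, b in valid)}

def random_shape(rng):
    """Sample a shape of rank 1 to 3 with each dimension in 1..4."""
    return tuple(rng.randint(1, 4) for _ in range(rng.randint(1, 3)))

def generate_inputs(rules, n, rng, max_tries=100_000):
    """Rejection sampling (a stand-in for SMT solving): keep random shape
    pairs that satisfy every learned rule, up to n pairs."""
    found = []
    for _ in range(max_tries):
        if len(found) == n:
            break
        a, b = random_shape(rng), random_shape(rng)
        if all(rule(a, b) for rule in rules.values()):
            found.append((a, b))
    return found

if __name__ == "__main__":
    rules = learn_rules(CANDIDATES, SEEDS, api_accepts)
    print(sorted(rules))
    for a, b in generate_inputs(rules, 3, random.Random(0)):
        print(a, b, api_accepts(a, b))
```

In this toy setting the spurious rule is discarded because an accepted seed violates it. A solver-based generator would replace `generate_inputs` by encoding the surviving rules as formulae, which avoids the wasted samples that rejection sampling incurs when valid inputs are rare.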
