Optimization-Aware Test Generation for Deep Learning Compilers
- URL: http://arxiv.org/abs/2511.18918v1
- Date: Mon, 24 Nov 2025 09:27:59 GMT
- Title: Optimization-Aware Test Generation for Deep Learning Compilers
- Authors: Qingchao Shen, Zan Wang, Haoyang Ma, Yongqiang Tian, Lili Huang, Zibo Xiao, Junjie Chen, Shing-Chi Cheung
- Abstract summary: OATest is a novel approach for synthesizing optimization-aware computational graphs. It detects more bugs and achieves higher code coverage in TVM and ONNXRuntime than the state of the art. OATest uncovers 58 previously unknown bugs, 36 of which have been confirmed or fixed by developers.
- Score: 18.99078574014009
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep Learning (DL) compilers have been widely used to optimize DL models for efficient deployment across various hardware. Given their vital role in the DL ecosystem, ensuring their reliability and security is critical. However, existing approaches fall short in testing the optimization stages, the core functionality of DL compilers, because optimization-aware tests are difficult to generate. In this paper, we propose OATest, a novel approach for synthesizing optimization-aware computational graphs. OATest extracts patterns from documented optimization tests and incorporates them into seed computational graphs, enabling broader exploration of optimization paths. To guarantee the optimization-awareness of the generated graphs, OATest introduces an edge-reusing strategy that establishes strong connections between patterns and their contexts. Additionally, to address the validity challenge for the generated graphs, OATest employs an auxiliary-layer addition strategy to repair broken constraints. Equipped with two distinct test oracles, OATest applies differential testing to two widely used DL compilers, TVM and ONNXRuntime. Our experimental results show that OATest outperforms the state-of-the-art method, detecting more bugs and achieving higher code coverage in both TVM and ONNXRuntime. Additionally, OATest uncovers 58 previously unknown bugs, 36 of which have been confirmed or fixed by developers.
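To make the differential-testing setup concrete, here is a hedged sketch (not the paper's implementation; the model file, input shape, and tolerances are assumptions) that runs one ONNX graph through ONNX Runtime and through TVM at a high optimization level and compares the outputs:

```python
# Hedged sketch of a differential-testing oracle in the spirit of OATest:
# execute one ONNX graph with ONNX Runtime and with TVM (opt_level=3),
# then compare outputs. Crashes are one oracle, divergence the other.
import numpy as np
import onnx
import onnxruntime as ort
import tvm
from tvm import relay
from tvm.contrib import graph_executor

model = onnx.load("graph_with_injected_pattern.onnx")  # hypothetical test graph
inp = np.random.rand(1, 3, 224, 224).astype("float32")  # assumed input shape

# Reference execution with ONNX Runtime.
sess = ort.InferenceSession(model.SerializeToString())
name = sess.get_inputs()[0].name
ref = sess.run(None, {name: inp})[0]

# Compiled execution with TVM; opt_level=3 exercises high-level passes.
mod, params = relay.frontend.from_onnx(model, shape={name: inp.shape})
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target="llvm", params=params)
rt = graph_executor.GraphModule(lib["default"](tvm.cpu()))
rt.set_input(name, inp)
rt.run()
out = rt.get_output(0).numpy()

# Numerical divergence beyond tolerance flags a potential optimization bug.
if not np.allclose(ref, out, rtol=1e-3, atol=1e-3):
    print("Potential optimization bug: TVM and ONNX Runtime outputs diverge")
```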
Related papers
- Prism: Efficient Test-Time Scaling via Hierarchical Search and Self-Verification for Discrete Diffusion Language Models [96.0074341403456]
Inference-time compute has re-emerged as a practical way to improve LLM reasoning. Most test-time scaling (TTS) algorithms rely on autoregressive decoding. We propose Prism, an efficient TTS framework for dLLMs.
arXiv Detail & Related papers (2026-02-02T09:14:51Z)
- Data-driven Test Generation for Fuzzing AI Compiler [0.4441866681085516]
We present a unified data-driven testing framework that addresses stage-specific challenges in AI compilers. OPERA migrates tests for AI libraries to exercise the operator conversion logic in the model loading stage. OATest synthesizes diverse optimization-aware computational graphs for testing high-level optimizations. HarmONY mutates diverse low-level IR seeds to produce hardware-optimization-aware tests.
arXiv Detail & Related papers (2026-01-24T12:56:40Z)
- Synthesizing Performance Constraints for Evaluating and Improving Code Efficiency [4.292737608159482]
We present WEDGE, a framework for generating performance-stressing inputs for a program under test. WEDGE synthesizes explicit performance-characterizing constraints, in the form of branch conditions, that partition the program's execution space into performance-specific regions. Our evaluation shows that WEDGE induces significant slowdowns compared to the tests in CodeContests and to inputs claimed to be optimized by existing approaches.
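As a hedged toy sketch of the idea (not WEDGE's actual implementation), a branch condition can act as a performance-characterizing constraint, and input generation keeps sampling until the constraint holds, steering execution into the slow region:

```python
import random
import time

# Toy program whose branch condition partitions the execution space into
# performance-specific regions (illustrative, not from the paper).
def program(xs):
    if len(set(xs)) < len(xs):                      # constraint: duplicates exist
        return [x for x in xs if xs.count(x) > 1]   # slow, quadratic region
    return sorted(xs)                               # fast region

def stress_input(n):
    # Sample until the performance-characterizing constraint is satisfied.
    while True:
        xs = [random.randrange(n) for _ in range(n)]
        if len(set(xs)) < len(xs):
            return xs

xs = stress_input(2000)
start = time.perf_counter()
program(xs)
print(f"slow-path time: {time.perf_counter() - start:.3f}s")
```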
arXiv Detail & Related papers (2025-05-29T14:26:22Z)
- Compiler Optimization Testing Based on Optimization-Guided Equivalence Transformations [3.2987550056134873]
We propose a metamorphic testing approach inspired by compiler optimizations. Our approach first employs tailored code-construction strategies to generate input programs that satisfy optimization conditions. By comparing the outputs of pre- and post-transformation programs, the approach effectively identifies incorrect-optimization bugs.
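A minimal sketch of that oracle, under assumed details (gcc, -O2, and a toy strength-reduction rewrite; none of this is the paper's tooling): compile a program and its optimization-guided equivalent transformation at the same optimization level, and flag any behavioral difference.

```python
import os
import subprocess
import tempfile

# A program and a semantics-preserving, optimization-style rewrite of it
# (strength reduction: x*8 -> x<<3). Both are illustrative toys.
PRE = "int main(void){volatile int x = 7; return x * 8;}"
POST = "int main(void){volatile int x = 7; return x << 3;}"

def compile_and_run(src):
    with tempfile.TemporaryDirectory() as d:
        c_file, exe = os.path.join(d, "t.c"), os.path.join(d, "t")
        with open(c_file, "w") as f:
            f.write(src)
        subprocess.run(["gcc", "-O2", c_file, "-o", exe], check=True)
        return subprocess.run([exe]).returncode

# Metamorphic oracle: equivalent programs must behave identically.
assert compile_and_run(PRE) == compile_and_run(POST), \
    "possible incorrect-optimization bug"
```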
arXiv Detail & Related papers (2025-04-06T01:37:57Z)
- Discovering Preference Optimization Algorithms with and for Large Language Models [50.843710797024805]
Offline preference optimization is a key method for enhancing and controlling the quality of Large Language Model (LLM) outputs.
We perform objective discovery to automatically find new state-of-the-art preference optimization algorithms without (expert) human intervention.
Experiments demonstrate the state-of-the-art performance of DiscoPOP, a novel algorithm that adaptively blends logistic and exponential losses.
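As a hedged sketch of that blending idea (the exact discovered loss differs; the adaptive blend weight below is an assumption), one can interpolate the logistic (DPO-style) and exponential preference losses with an input-dependent weight:

```python
import torch
import torch.nn.functional as F

def blended_preference_loss(logratio_diff, beta=0.05):
    # rho: scaled difference of policy/reference log-ratios between the
    # chosen and rejected responses (the standard DPO quantity).
    rho = beta * logratio_diff
    logistic = -F.logsigmoid(rho)        # logistic (DPO) loss
    exponential = torch.exp(-rho)        # exponential loss
    mix = torch.sigmoid(rho)             # illustrative adaptive blend weight
    return (mix * logistic + (1 - mix) * exponential).mean()

loss = blended_preference_loss(torch.randn(8))
```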
arXiv Detail & Related papers (2024-06-12T16:58:41Z)
- Learning Performance-Improving Code Edits [107.21538852090208]
We introduce a framework for adapting large language models (LLMs) to high-level program optimization.
First, we curate a dataset of over 77,000 competitive C++ programming submission pairs, capturing performance-improving edits made by human programmers.
For prompting, we propose retrieval-based few-shot prompting and chain-of-thought; for finetuning, we use performance-conditioned generation and synthetic data augmentation based on self-play.
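For instance, a minimal sketch of retrieval-based few-shot prompting (the retriever, pair data, and prompt template are illustrative assumptions): retrieve the most similar slow/fast edit pairs and prepend them to the prompt.

```python
from difflib import SequenceMatcher

# Toy stand-ins for a corpus of (slower, optimized) program pairs.
PAIRS = [
    ("s = 0\nfor i in range(len(a)):\n    s += a[i]", "s = sum(a)"),
    ("out = ''\nfor w in ws:\n    out += w", "out = ''.join(ws)"),
]

def retrieve(query, k=1):
    # Rank stored pairs by textual similarity to the query program.
    return sorted(PAIRS,
                  key=lambda p: SequenceMatcher(None, query, p[0]).ratio(),
                  reverse=True)[:k]

def build_prompt(slow_code):
    shots = "\n\n".join(f"# slower version:\n{s}\n# optimized version:\n{f}"
                        for s, f in retrieve(slow_code))
    return f"{shots}\n\n# slower version:\n{slow_code}\n# optimized version:\n"

print(build_prompt("t = 0\nfor x in xs:\n    t += x"))
```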
arXiv Detail & Related papers (2023-02-15T18:59:21Z)
- An Empirical Evaluation of Zeroth-Order Optimization Methods on AI-driven Molecule Optimization [78.36413169647408]
We study the effectiveness of various ZO optimization methods for optimizing molecular objectives.
We show the advantages of ZO sign-based gradient descent (ZO-signGD).
We demonstrate the potential effectiveness of ZO optimization methods on widely used benchmark tasks from the Guacamol suite.
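A minimal sketch of ZO-signGD on a toy objective (the two-point gradient estimator is standard; the smoothing parameter, step size, and quadratic objective are illustrative):

```python
import numpy as np

def f(x):
    # Black-box objective standing in for a molecular score (toy quadratic).
    return np.sum((x - 1.0) ** 2)

rng = np.random.default_rng(0)
x, mu, eta = np.zeros(10), 1e-3, 0.05
for _ in range(300):
    u = rng.standard_normal(x.shape)
    # Two-point zeroth-order gradient estimate along random direction u.
    g = (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u
    x -= eta * np.sign(g)          # sign-based update (ZO-signGD)
print(f"final objective: {f(x):.4f}")
```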
arXiv Detail & Related papers (2022-10-27T01:58:10Z)
- Learning to Optimize: A Primer and A Benchmark [94.29436694770953]
Learning to optimize (L2O) is an emerging approach that leverages machine learning to develop optimization methods.
This article is poised to be the first comprehensive survey and benchmark of L2O for continuous optimization.
arXiv Detail & Related papers (2021-03-23T20:46:20Z)
- Bilevel Optimization: Convergence Analysis and Enhanced Design [63.64636047748605]
Bilevel optimization is a tool for many machine learning problems.
We propose stocBiO, a novel stochastic bilevel optimization algorithm built on a sample-efficient hypergradient estimator.
arXiv Detail & Related papers (2020-10-15T18:09:48Z)
- Static Neural Compiler Optimization via Deep Reinforcement Learning [1.458855293397494]
In this paper, we employ a deep reinforcement learning approach to the phase-ordering problem.
Provided with sub-sequences constituting LLVM's O3 sequence, our agent learns to outperform the O3 sequence on the set of source codes used for training.
We believe that the models trained using our approach can be integrated into modern compilers as neural optimization agents.
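A hedged sketch of the underlying setup (the pass names, the `opt` invocation, and the size-based reward are illustrative assumptions; a real agent would replace the random policy): treat a pass sequence as the action and score the optimized module.

```python
import random
import subprocess

# Candidate LLVM passes; a real agent would order sub-sequences of O3.
PASSES = ["mem2reg", "instcombine", "gvn", "licm", "loop-unroll"]

def reward(sequence, ir_file="prog.ll"):
    # Apply the chosen pass order with LLVM's opt (new pass-manager syntax)
    # and use negated bitcode size as a cheap stand-in for a runtime reward.
    pipeline = ",".join(sequence)
    subprocess.run(["opt", f"-passes={pipeline}", ir_file, "-o", "out.bc"],
                   check=True)
    with open("out.bc", "rb") as f:
        return -len(f.read())

episode = random.sample(PASSES, k=len(PASSES))  # stand-in for a learned policy
print(episode, reward(episode))
```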
arXiv Detail & Related papers (2020-08-20T13:16:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.