Constraint-Guided Unit Test Generation for Machine Learning Libraries
- URL: http://arxiv.org/abs/2510.09108v1
- Date: Fri, 10 Oct 2025 08:02:15 GMT
- Title: Constraint-Guided Unit Test Generation for Machine Learning Libraries
- Authors: Lukas Krodinger, Altin Hajdari, Stephan Lukasczyk, Gordon Fraser,
- Abstract summary: Machine learning (ML) libraries such as PyTorch and tensors are essential for a wide range of modern applications.<n> Ensuring the correctness of ML libraries through testing is crucial.<n>In this paper, we present PynguinML, an approach that improves the Pynguin test generator to leverage these constraints.
- Score: 8.883254370291256
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning (ML) libraries such as PyTorch and TensorFlow are essential for a wide range of modern applications. Ensuring the correctness of ML libraries through testing is crucial. However, ML APIs often impose strict input constraints involving complex data structures such as tensors. Automated test generation tools such as Pynguin are not aware of these constraints and often create non-compliant inputs. This leads to early test failures and limited code coverage. Prior work has investigated extracting constraints from official API documentation. In this paper, we present PynguinML, an approach that improves the Pynguin test generator to leverage these constraints to generate compliant inputs for ML APIs, enabling more thorough testing and higher code coverage. Our evaluation is based on 165 modules from PyTorch and TensorFlow, comparing PynguinML against Pynguin. The results show that PynguinML significantly improves test effectiveness, achieving up to 63.9 % higher code coverage.
Related papers
- ATTest: Agent-Driven Tensor Testing for Deep Learning Library Modules [19.355376741404267]
Unit testing of Deep Learning (DL) libraries is challenging due to complex numerical semantics and implicit tensor constraints.<n>This paper proposes ATTest, an agent-driven testing framework for module-level unit test generation.
arXiv Detail & Related papers (2026-02-15T04:47:58Z) - Improving Deep Learning Library Testing with Machine Learning [40.21709249669499]
We explore using machine learning (ML) to determine input validity.<n>Shape relationships are a precise abstraction to encode concrete inputs and capture of the data.<n>We show that ML-enhanced input classification is an important aid to scale DL library testing.
arXiv Detail & Related papers (2026-02-03T17:19:01Z) - Testing Deep Learning Libraries via Neurosymbolic Constraint Learning [3.491101173753068]
Deep Learning (DL) libraries (e.g., PyTorch) are popular in AI development.<n>A key challenge in testing DL libraries is the lack of API specifications.<n>We develop Centaur -- the first neurosymbolic technique to test DL library APIs using dynamically learned input constraints.
arXiv Detail & Related papers (2026-01-21T21:54:41Z) - LlamaRestTest: Effective REST API Testing with Small Language Models [50.058600784556816]
We present LlamaRestTest, a novel approach that employs two custom Large Language Models (LLMs) to generate realistic test inputs.<n>We evaluate it against several state-of-the-art REST API testing tools, including RESTGPT, a GPT-powered specification-enhancement tool.<n>Our study shows that small language models can perform as well as, or better than, large language models in REST API testing.
arXiv Detail & Related papers (2025-01-15T05:51:20Z) - Your Fix Is My Exploit: Enabling Comprehensive DL Library API Fuzzing with Large Language Models [49.214291813478695]
Deep learning (DL) libraries, widely used in AI applications, often contain vulnerabilities like overflows and use buffer-free errors.<n>Traditional fuzzing struggles with the complexity and API diversity of DL libraries.<n>We propose DFUZZ, an LLM-driven fuzzing approach for DL libraries.
arXiv Detail & Related papers (2025-01-08T07:07:22Z) - Subgraph-Oriented Testing for Deep Learning Libraries [9.78188667672054]
We propose SORT (Subgraph-Oriented Realistic Testing) to test Deep Learning (DL) libraries on different hardware platforms.<n>SORT takes popular API interaction patterns, represented as frequent subgraphs of model graphs, as test subjects.<n>SORT achieves a 100% valid input generation rate, detects more precision bugs than existing methods, and reveals interaction-related bugs missed by single-API testing.
arXiv Detail & Related papers (2024-12-09T12:10:48Z) - PyPulse: A Python Library for Biosignal Imputation [58.35269251730328]
We introduce PyPulse, a Python package for imputation of biosignals in both clinical and wearable sensor settings.<n>PyPulse's framework provides a modular and extendable framework with high ease-of-use for a broad userbase, including non-machine-learning bioresearchers.<n>We released PyPulse under the MIT License on Github and PyPI.
arXiv Detail & Related papers (2024-12-09T11:00:55Z) - Enhancing Differential Testing With LLMs For Testing Deep Learning Libraries [8.779035160734523]
This paper introduces an LLM-enhanced differential testing technique for DL libraries.<n>It addresses the challenges of finding alternative implementations for a given API and generating diverse test inputs.<n>It synthesizes counterparts for 1.84 times as many APIs as those found by state-of-the-art techniques.
arXiv Detail & Related papers (2024-06-12T07:06:38Z) - Automatic Unit Test Generation for Deep Learning Frameworks based on API
Knowledge [11.523398693942413]
We propose MUTester to generate unit test cases for APIs of deep learning frameworks.
We first propose a set of 18 rules for mining API constraints from the API documents.
We then use the frequent itemset mining technique to mine the API usage patterns from a large corpus of machine learning API related code fragments.
arXiv Detail & Related papers (2023-07-01T18:34:56Z) - PyGOD: A Python Library for Graph Outlier Detection [56.33769221859135]
PyGOD is an open-source library for detecting outliers in graph data.
It supports a wide array of leading graph-based methods for outlier detection.
PyGOD is released under a BSD 2-Clause license at https://pygod.org and at the Python Package Index (PyPI)
arXiv Detail & Related papers (2022-04-26T06:15:21Z) - PyHHMM: A Python Library for Heterogeneous Hidden Markov Models [63.01207205641885]
PyHHMM is an object-oriented Python implementation of Heterogeneous-Hidden Markov Models (HHMMs)
PyHHMM emphasizes features not supported in similar available frameworks: a heterogeneous observation model, missing data inference, different model order selection criterias, and semi-supervised training.
PyHHMM relies on the numpy, scipy, scikit-learn, and seaborn Python packages, and is distributed under the Apache-2.0 License.
arXiv Detail & Related papers (2022-01-12T07:32:36Z) - MOGPTK: The Multi-Output Gaussian Process Toolkit [71.08576457371433]
We present MOGPTK, a Python package for multi-channel data modelling using Gaussian processes (GP)
The aim of this toolkit is to make multi-output GP (MOGP) models accessible to researchers, data scientists, and practitioners alike.
arXiv Detail & Related papers (2020-02-09T23:34:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.