TorchProbe: Fuzzing Dynamic Deep Learning Compilers
- URL: http://arxiv.org/abs/2310.20078v1
- Date: Mon, 30 Oct 2023 23:20:47 GMT
- Title: TorchProbe: Fuzzing Dynamic Deep Learning Compilers
- Authors: Qidong Su, Chuqin Geng, Gennady Pekhimenko, Xujie Si
- Abstract summary: PyTorch 2.0 supports compiling arbitrary deep learning programs in Python.
We propose code transformations to generate test cases involving dynamic features.
We have successfully identified twenty previously unknown bugs in the PyTorch compiler and its underlying tensor compiler Triton.
- Score: 9.324205843411352
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Static and dynamic computational graphs represent two distinct approaches to
constructing deep learning frameworks. The former prioritizes compiler-based
optimizations, while the latter focuses on programmability and
user-friendliness. The recent release of PyTorch 2.0, which supports compiling
arbitrary deep learning programs in Python, signifies a new direction in the
evolution of deep learning infrastructure to incorporate compiler techniques in
a more dynamic manner and support more dynamic language features like dynamic
control flow and closures. Given PyTorch's seamless integration with Python,
its compiler aims to support arbitrary deep learning code written in Python.
However, the inherent dynamism of Python poses challenges to the completeness
and robustness of the compiler. While recent research has introduced fuzzing to
test deep learning compilers, there is still a lack of comprehensive analysis
on how to test dynamic features. To address this issue, we propose several code
transformations to generate test cases involving dynamic features. These
transformations preserve the program's semantics, ensuring that any discrepancy
between the transformed and original programs indicates the presence of a bug.
Through our approach, we have successfully identified twenty previously unknown
bugs in the PyTorch compiler and its underlying tensor compiler Triton.
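To make the equivalence-based testing idea concrete, below is a minimal, hypothetical sketch rather than the paper's actual tool: a seed program is paired with a semantics-preserving transformation that introduces a dynamic feature (here, a closure), and eager execution serves as the oracle against torch.compile. Names such as `seed_fn` and `wrap_in_closure` are illustrative only.

```python
# Minimal sketch (illustrative, not the paper's tool): equivalence-based
# differential testing of torch.compile. A seed program and a
# semantics-preserving variant that adds a dynamic feature (a closure)
# must agree with eager execution; any mismatch signals a compiler bug.
import torch

def seed_fn(x):
    # Seed test case with data-dependent control flow.
    if x.sum() > 0:
        return torch.sin(x) * 2
    return torch.cos(x)

def wrap_in_closure(fn):
    # Hypothetical semantics-preserving transformation: route the call
    # through a nested closure, a dynamic Python feature the compiler
    # must handle without changing the program's result.
    def transformed(x):
        inner = lambda t: fn(t)
        return inner(x)
    return transformed

x = torch.randn(8)
reference = seed_fn(x)  # eager execution serves as the oracle
for f in (seed_fn, wrap_in_closure(seed_fn)):
    out = torch.compile(f)(x)
    assert torch.allclose(out, reference, atol=1e-6), f"discrepancy in {f}"
```

In the paper's setting, such transformations are applied repeatedly to mutate seed programs; this is the process that surfaced the twenty bugs in the PyTorch compiler and Triton.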
Related papers
- PoTo: A Hybrid Andersen's Points-to Analysis for Python [3.6793233203143743]
PoTo is an Andersen-style context-insensitive and flow-insensitive points-to analysis for Python.
PoTo+ is a static type inference for Python built on the points-to analysis.
arXiv Detail & Related papers (2024-09-05T21:26:25Z)
- depyf: Open the Opaque Box of PyTorch Compiler for Machine Learning Researchers [92.13613958373628]
depyf is a tool designed to demystify the inner workings of the PyTorch compiler.
depyf decompiles bytecode generated by PyTorch back into equivalent source code.
arXiv Detail & Related papers (2024-03-14T16:17:14Z)
- pyvene: A Library for Understanding and Improving PyTorch Models via Interventions [79.72930339711478]
pyvene is an open-source library that supports customizable interventions on a range of different PyTorch modules.
We show how pyvene provides a unified framework for performing interventions on neural models and sharing the intervened-upon models with others.
arXiv Detail & Related papers (2024-03-12T16:46:54Z)
- DyPyBench: A Benchmark of Executable Python Software [18.129031749321058]
We present DyPyBench, the first benchmark of Python projects that is large scale, diverse, ready to run and ready to analyze.
The benchmark encompasses 50 popular open-source projects from various application domains, with a total of 681k lines of Python code and 30k test cases.
We envision DyPyBench to provide a basis for other dynamic analyses and for studying the runtime behavior of Python code.
arXiv Detail & Related papers (2024-03-01T13:53:15Z)
- Dictionary Learning Improves Patch-Free Circuit Discovery in Mechanistic Interpretability: A Case Study on Othello-GPT [59.245414547751636]
We propose a circuit discovery framework alternative to activation patching.
Our framework suffers less from out-of-distribution issues and is more efficient in terms of complexity.
We dig into a small transformer trained on the synthetic game Othello and find a number of human-understandable fine-grained circuits inside it.
arXiv Detail & Related papers (2024-02-19T15:04:53Z)
- Outline, Then Details: Syntactically Guided Coarse-To-Fine Code Generation [61.50286000143233]
ChainCoder is a program synthesis language model that generates Python code progressively.
A tailored transformer architecture is leveraged to jointly encode the natural language descriptions and syntactically aligned I/O data samples.
arXiv Detail & Related papers (2023-04-28T01:47:09Z)
- Serenity: Library Based Python Code Analysis for Code Completion and Automated Machine Learning [8.362734311902278]
We present Serenity, a framework for static analysis of Python that turns out to be sufficient for several practical tasks.
Serenity exploits two basic mechanisms: (a) reliance on dynamic dispatch at the core of language translation, and (b) extreme abstraction of libraries.
We demonstrate the efficiency and usefulness of Serenity's analysis in two applications: code completion and automated machine learning.
arXiv Detail & Related papers (2023-01-05T02:09:08Z)
- ReACC: A Retrieval-Augmented Code Completion Framework [53.49707123661763]
We propose a retrieval-augmented code completion framework, leveraging both lexical copying and referring to code with similar semantics by retrieval.
We evaluate our approach on the code completion task in the Python and Java programming languages, achieving state-of-the-art performance on the CodeXGLUE benchmark.
arXiv Detail & Related papers (2022-03-15T08:25:08Z)
- torch.fx: Practical Program Capture and Transformation for Deep Learning in Python [0.0]
We study the different designs for program capture and transformation used in deep learning.
By designing for typical deep learning use cases rather than long tail ones, it is possible to create a simpler framework for program capture and transformation.
We apply this principle in torch.fx, a program capture and transformation library for PyTorch written entirely in Python and optimized for high developer productivity by ML practitioners (a short usage sketch follows after this list).
arXiv Detail & Related papers (2021-12-15T19:16:29Z)
- OPFython: A Python-Inspired Optimum-Path Forest Classifier [68.8204255655161]
This paper proposes a Python-based Optimum-Path Forest framework, denoted as OPFython.
As OPFython is a Python-based library, it provides a friendlier environment and a faster prototyping workspace than the C language.
arXiv Detail & Related papers (2020-01-28T15:46:19Z)
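As referenced in the torch.fx entry above, the sketch below shows what "program capture and transformation" looks like with the public torch.fx API. It is a generic usage example, not code from that paper; the module `M` and the activation rewrite are illustrative.

```python
import torch
import torch.fx

class M(torch.nn.Module):
    def forward(self, x):
        return torch.relu(x) + 1.0

# Program capture: symbolic tracing records the forward pass as a graph IR.
gm = torch.fx.symbolic_trace(M())
print(gm.graph)  # the captured intermediate representation
print(gm.code)   # Python source regenerated from the graph

# Program transformation: rewrite nodes of the captured graph in place.
for node in gm.graph.nodes:
    if node.op == "call_function" and node.target is torch.relu:
        node.target = torch.sigmoid  # swap the activation as a toy rewrite
gm.recompile()  # regenerate the module's code from the edited graph
print(gm(torch.randn(4)))
```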
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.