The Deep Learning Compiler: A Comprehensive Survey
- URL: http://arxiv.org/abs/2002.03794v4
- Date: Fri, 28 Aug 2020 09:19:43 GMT
- Title: The Deep Learning Compiler: A Comprehensive Survey
- Authors: Mingzhen Li, Yi Liu, Xiaoyan Liu, Qingxiao Sun, Xin You, Hailong Yang,
Zhongzhi Luan, Lin Gan, Guangwen Yang, Depei Qian
- Abstract summary: We perform a comprehensive survey of existing DL compilers by dissecting their commonly adopted designs in detail.
Specifically, we provide a comprehensive comparison among existing DL compilers across various aspects.
- Score: 16.19025439622745
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The difficulty of deploying various deep learning (DL) models on diverse DL
hardware has boosted the research and development of DL compilers in the
community. Several DL compilers have been proposed by both industry and
academia, such as TensorFlow XLA and TVM. Like traditional compilers, DL
compilers take DL models described in different DL frameworks as input and
generate optimized code for diverse DL hardware as output. However, no
existing survey has comprehensively analyzed the unique design architecture
of DL compilers. In this paper, we perform a comprehensive survey of existing
DL compilers by dissecting their commonly adopted designs in detail, with
emphasis on the DL-oriented multi-level IRs and frontend/backend
optimizations. Specifically, we provide a comprehensive comparison among
existing DL compilers across various aspects. In addition, we present a
detailed analysis of the design of multi-level IRs and illustrate the commonly adopted
optimization techniques. Finally, several insights are highlighted as
potential research directions for DL compilers. This is the first survey
paper focusing on the design architecture of DL compilers, and we hope it can
pave the way for future research on DL compilers.
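One of the frontend optimizations surveyed here is graph-level operator fusion on the high-level IR. The following is a minimal sketch of the idea in plain Python; it is our own illustration, not code from any surveyed compiler, and all names (`Node`, `fuse_dense_relu`) are made up for the example.

```python
# Toy illustration of a graph-level operator-fusion pass, one of the
# frontend optimizations DL compilers apply on their high-level IR.
# All names are illustrative; real compilers (TVM, XLA, ...) do far more.
from dataclasses import dataclass

@dataclass
class Node:
    op: str    # operator name, e.g. "dense" or "relu"
    inp: int   # index of the producer node in the graph (-1 = graph input)

def fuse_dense_relu(graph):
    """Rewrite the graph so each relu fed by a dense becomes one fused op.

    For brevity this assumes each dense has a single consumer; a real pass
    would check use counts before fusing.
    """
    out, remap = [], {}            # remap: old node index -> new node index
    for i, node in enumerate(graph):
        if node.op == "relu" and node.inp >= 0 and graph[node.inp].op == "dense":
            j = remap[node.inp]    # fold the relu into the emitted dense
            out[j] = Node("dense_relu", out[j].inp)
            remap[i] = j
        else:
            remap[i] = len(out)
            out.append(Node(node.op, remap.get(node.inp, -1)))
    return out

# input -> dense -> relu -> softmax  becomes  input -> dense_relu -> softmax
g = [Node("input", -1), Node("dense", 0), Node("relu", 1), Node("softmax", 2)]
print([n.op for n in fuse_dense_relu(g)])  # ['input', 'dense_relu', 'softmax']
```

Fusing the element-wise op into its producer avoids materializing the intermediate tensor, which is why passes like this are a staple of the frontend optimizations the survey discusses.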
Related papers
- A Tale of Two DL Cities: When Library Tests Meet Compiler [12.751626834965231]
We propose OPERA, which extracts domain knowledge from the test inputs of DL libraries.
OPERA constructs diverse tests from the various library test inputs and
incorporates a diversity-based test prioritization strategy to migrate and execute them.
arXiv Detail & Related papers (2024-07-23T16:35:45Z)
- What's Wrong with Your Code Generated by Large Language Models? An Extensive Study [80.18342600996601]
Large language models (LLMs) produce code that is shorter yet more complicated than canonical solutions.
We develop a taxonomy of bugs for incorrect codes that includes three categories and 12 sub-categories, and analyze the root cause for common bug types.
We propose a novel training-free iterative method that introduces self-critique, enabling LLMs to critique and correct their generated code based on bug types and compiler feedback.
arXiv Detail & Related papers (2024-07-08T17:27:17Z)
- IMDL-BenCo: A Comprehensive Benchmark and Codebase for Image Manipulation Detection & Localization [58.32394109377374]
IMDL-BenCo is the first comprehensive and modular IMDL benchmark.
It decomposes the IMDL framework into standardized, reusable components and revises the model construction pipeline.
It includes 8 state-of-the-art IMDL models (1 of which is reproduced from scratch), 2 sets of standard training and evaluation protocols, 15 GPU-accelerated evaluation metrics, and 3 kinds of robustness evaluation.
arXiv Detail & Related papers (2024-06-15T09:44:54Z)
- A Survey of Deep Learning Library Testing Methods [33.62859142913532]
Deep learning (DL) libraries undertake the underlying optimization and computation.
DL libraries are not immune to bugs, which can pose serious threats to users' personal property and safety.
This paper provides an overview of the testing research related to various DL libraries.
arXiv Detail & Related papers (2024-04-27T11:42:13Z)
- Serving Deep Learning Model in Relational Databases [72.72372281808694]
Serving deep learning (DL) models on relational data has become a critical requirement across diverse commercial and scientific domains.
We highlight three pivotal paradigms: the state-of-the-art DL-Centric architecture offloads DL computations to dedicated DL frameworks.
The potential UDF-Centric architecture encapsulates one or more tensor computations into User Defined Functions (UDFs) within the database system.
The potential Relation-Centric architecture aims to represent a large-scale tensor computation through operators.
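The UDF-Centric paradigm can be sketched concretely with Python's built-in sqlite3 module: a tiny "model" registered as a User Defined Function so scoring runs inside the SQL query, next to the data. This is our own illustration with made-up weights, not code from the paper.

```python
# Minimal sketch of the UDF-Centric idea: a toy model wrapped as a UDF
# inside a relational database. Weights are invented for illustration.
import math
import sqlite3

def predict(x1, x2):
    """Toy logistic-regression scoring with hard-coded weights."""
    z = 0.8 * x1 - 0.5 * x2 + 0.1
    return 1.0 / (1.0 + math.exp(-z))

conn = sqlite3.connect(":memory:")
conn.create_function("predict", 2, predict)   # register the UDF
conn.execute("CREATE TABLE samples (x1 REAL, x2 REAL)")
conn.executemany("INSERT INTO samples VALUES (?, ?)", [(1.0, 0.0), (0.0, 2.0)])
rows = conn.execute("SELECT predict(x1, x2) FROM samples").fetchall()
print(rows)   # one score per row, computed inside the SQL query
```

Keeping inference in the database avoids moving the relational data out to a separate DL serving system, which is the trade-off the paper's three paradigms explore.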
arXiv Detail & Related papers (2023-10-07T06:01:35Z)
- A Comprehensive Benchmark of Deep Learning Libraries on Mobile Devices [12.342282138576348]
We build a benchmark that includes 6 representative DL libs and 15 diversified DL models.
We then perform extensive experiments on 10 mobile devices, which help reveal a complete landscape of the current mobile DL libs ecosystem.
We find that the best-performing DL lib varies severely across different models and hardware.
arXiv Detail & Related papers (2022-02-14T07:00:31Z)
- Design Smells in Deep Learning Programs: An Empirical Study [9.112172220055431]
Design smells in Deep Learning (DL) programs are poor design and/or configuration decisions taken during the development of DL components.
We present a catalogue of 8 design smells for a popular DL architecture, namely deep Feedforward Neural Networks.
arXiv Detail & Related papers (2021-07-05T21:26:05Z)
- Tensor Processing Primitives: A Programming Abstraction for Efficiency and Portability in Deep Learning Workloads [86.62083829086393]
This work introduces Tensor Processing Primitives (TPP), a programming abstraction striving for efficient, portable implementation of Deep Learning workloads with high productivity.
TPPs define a compact, yet versatile set of 2D-tensor operators (or a virtual ISA), which can be utilized as building-blocks to construct complex operators on high-dimensional tensors.
We demonstrate the efficacy of our approach using standalone kernels and end-to-end DL-workloads expressed entirely via TPPs that outperform state-of-the-art implementations on multiple platforms.
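The compositional structure of TPPs can be mirrored in a pure-Python sketch: a small 2D-tensor primitive reused as a building block for a higher-dimensional operator. This is only our illustration of the idea; real TPPs are highly optimized virtual-ISA implementations.

```python
# Pure-Python sketch of the TPP composition idea: a 2D primitive reused
# to build a higher-dimensional operator. Illustrative only.

def matmul2d(A, B):
    """The 2D 'primitive': plain matrix multiply on nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def batched_matmul(As, Bs):
    """A 'complex' batched operator built purely from the 2D primitive."""
    return [matmul2d(A, B) for A, B in zip(As, Bs)]

print(matmul2d([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```

Because only the primitive touches the data layout, porting the whole operator to a new platform reduces to porting `matmul2d`, which is the portability argument TPP makes.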
arXiv Detail & Related papers (2021-04-12T18:35:49Z)
- A Survey of Deep Active Learning [54.376820959917005]
Active learning (AL) attempts to maximize a model's performance gain while labeling the fewest samples.
Deep learning (DL) is greedy for data and requires a large data supply to optimize its massive number of parameters.
Combining the two, deep active learning (DAL) has emerged.
arXiv Detail & Related papers (2020-08-30T04:28:31Z)
- PolyDL: Polyhedral Optimizations for Creation of High Performance DL primitives [55.79741270235602]
We present compiler algorithms to automatically generate high performance implementations of Deep Learning primitives.
We develop novel data reuse analysis algorithms using the polyhedral model.
We also show that such a hybrid compiler plus a minimal library-use approach results in state-of-the-art performance.
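A representative transformation that polyhedral data-reuse analysis drives is loop tiling. The sketch below is our own toy example, not PolyDL's generated code: the three matmul loops are split into blocks so a tile of `B` stays hot in cache while it is reused.

```python
# Illustrative sketch of loop tiling, the kind of transformation chosen
# by polyhedral data-reuse analysis. Toy example, not PolyDL output.

def tiled_matmul(A, B, tile=2):
    """Matmul with all three loops tiled to improve data reuse."""
    n, m, p = len(A), len(B), len(B[0])
    C = [[0.0] * p for _ in range(n)]
    for ii in range(0, n, tile):                # tile the row loop
        for kk in range(0, m, tile):            # tile the reduction loop
            for jj in range(0, p, tile):        # tile the column loop
                for i in range(ii, min(ii + tile, n)):
                    for k in range(kk, min(kk + tile, m)):
                        a = A[i][k]             # reused across the whole j-tile
                        for j in range(jj, min(jj + tile, p)):
                            C[i][j] += a * B[k][j]
    return C
```

Re-tiling or interchanging these loops changes only performance, never the result; proving that such reorderings are legal is exactly what the polyhedral model provides.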
arXiv Detail & Related papers (2020-06-02T06:44:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.