OpEvo: An Evolutionary Method for Tensor Operator Optimization
- URL: http://arxiv.org/abs/2006.05664v2
- Date: Mon, 21 Dec 2020 08:02:18 GMT
- Title: OpEvo: An Evolutionary Method for Tensor Operator Optimization
- Authors: Xiaotian Gao, Cui Wei, Lintao Zhang and Mao Yang
- Abstract summary: We propose a novel evolutionary method, OpEvo, which efficiently explores the search spaces of tensor operators.
Our comprehensive experiment results show that OpEvo can find the best configuration with the lowest variance and the least effort in terms of number of trials and wall-clock time.
- Score: 6.273446055072434
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training and inference efficiency of deep neural networks highly rely on the
performance of tensor operators on hardware platforms. Manually optimizing
tensor operators has limitations in terms of supporting new operators or
hardware platforms. Therefore, automatically optimizing device code
configurations of tensor operators is getting increasingly attractive. However,
current methods for tensor operator optimization usually suffer from poor
sample-efficiency due to the combinatorial search space. In this work, we
propose a novel evolutionary method, OpEvo, which efficiently explores the
search spaces of tensor operators by introducing a topology-aware mutation
operation based on q-random walk to leverage the topological structures over
the search spaces. Our comprehensive experiment results show that, compared with
state-of-the-art (SOTA) methods, OpEvo can find the best configuration with the
lowest variance and the least effort in terms of number of trials and wall-clock time.
All code of this work is available online.
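The core idea above is a topology-aware mutation that perturbs a configuration by a q-random walk over the ordered values of each tuning knob, so that topologically nearby configurations are sampled more often while distant ones remain reachable. The following is a minimal illustrative sketch of that idea, not the authors' implementation; the knob names, the geometric stop-or-step rule, and the value of q are assumptions made for the example.

```python
import random

# Hypothetical tuning knobs for a tiled kernel: each knob's values are ordered,
# so adjacent entries tend to behave similarly on hardware (the "topology").
KNOBS = {
    "tile_x": [1, 2, 4, 8, 16, 32, 64],
    "tile_y": [1, 2, 4, 8, 16, 32, 64],
    "unroll": [0, 1],
}

def q_random_walk(idx, size, q=0.7):
    """Random walk on the 1-D neighbor graph {0, ..., size-1} starting at idx.

    At every step the walker stops with probability (1 - q), otherwise it moves
    to a uniformly chosen neighbor.  Small q keeps mutations local; q close to 1
    lets them occasionally travel far along the axis.  (Assumed stopping rule.)
    """
    while random.random() < q:
        neighbors = [j for j in (idx - 1, idx + 1) if 0 <= j < size]
        idx = random.choice(neighbors)
    return idx

def mutate(config, q=0.7):
    """Topology-aware mutation: perturb one knob of a configuration via a q-random walk."""
    child = dict(config)
    knob = random.choice(list(KNOBS))
    values = KNOBS[knob]
    i = values.index(child[knob])
    child[knob] = values[q_random_walk(i, len(values), q)]
    return child

if __name__ == "__main__":
    parent = {"tile_x": 8, "tile_y": 16, "unroll": 1}
    print(mutate(parent))  # e.g. {'tile_x': 8, 'tile_y': 32, 'unroll': 1}
```

In an evolutionary loop, a mutation of this kind would be applied to the best-measured configurations before compiling and benchmarking the resulting kernels on the target device.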
Related papers
- Syno: Structured Synthesis for Neural Operators [1.5826646053411249]
We develop an end-to-end framework Syno, to realize practical neural operator synthesis.
We demonstrate that Syno discovers better operators with an average of $2.06\times$ speedup and less than $1\%$ accuracy loss, even on NAS-optimized models.
arXiv Detail & Related papers (2024-10-31T09:00:24Z) - Use Your INSTINCT: INSTruction optimization for LLMs usIng Neural bandits Coupled with Transformers [66.823588073584]
Large language models (LLMs) have shown remarkable instruction-following capabilities and achieved impressive performances in various applications.
Recent work has used the query-efficient Bayesian optimization (BO) algorithm to automatically optimize the instructions given to black-box LLMs.
We propose a neural bandit algorithm which replaces the GP in BO by an NN surrogate to optimize instructions for black-box LLMs.
arXiv Detail & Related papers (2023-10-02T02:01:16Z) - Beyond Regular Grids: Fourier-Based Neural Operators on Arbitrary Domains [13.56018270837999]
We propose a simple method to extend neural operators to arbitrary domains.
An efficient implementation of such direct spectral evaluations is coupled with existing neural operator models.
We demonstrate that the proposed method allows us to extend neural operators to arbitrary point distributions with significant gains in training speed over baselines.
arXiv Detail & Related papers (2023-05-31T09:01:20Z) - Performance Embeddings: A Similarity-based Approach to Automatic
Performance Optimization [71.69092462147292]
Performance embeddings enable knowledge transfer of performance tuning between applications.
We demonstrate this transfer tuning approach on case studies in deep neural networks, dense and sparse linear algebra compositions, and numerical weather prediction stencils.
arXiv Detail & Related papers (2023-03-14T15:51:35Z) - Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural-network(NN)-based active learning algorithms for the non-parametric streaming setting.
We introduce two regret metrics, defined through the population loss, that are better suited to active learning than the metric used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z) - OLLIE: Derivation-based Tensor Program Optimizer [13.23204410403652]
We propose OLLIE, the first derivation-based tensor program optimizer.
We show that OLLIE can outperform existing tensor program optimizers by up to $2.73\times$ ($1.46\times$ on average) on an A100 GPU and by up to $2.68\times$ on a V100 GPU.
arXiv Detail & Related papers (2022-08-02T14:38:58Z) - How much progress have we made in neural network training? A New
Evaluation Protocol for Benchmarking Optimizers [86.36020260204302]
We propose a new benchmarking protocol to evaluate both end-to-end efficiency and data-addition training efficiency.
A human study is conducted to show that our evaluation protocol matches human tuning behavior better than the random search.
We then apply the proposed benchmarking framework to 7 optimizers and various tasks, including computer vision, natural language processing, reinforcement learning, and graph mining.
arXiv Detail & Related papers (2020-10-19T21:46:39Z) - AdaLead: A simple and robust adaptive greedy search algorithm for
sequence design [55.41644538483948]
We develop an easy-to-implement, scalable, and robust evolutionary greedy algorithm (AdaLead).
AdaLead is a remarkably strong benchmark that out-competes more complex state-of-the-art approaches in a variety of biologically motivated sequence design challenges.
arXiv Detail & Related papers (2020-10-05T16:40:38Z) - Adaptive Learning of Tensor Network Structures [6.407946291544721]
We leverage the TN formalism to develop a generic and efficient adaptive algorithm to learn the structure and the parameters of a TN from data.
Our algorithm can adaptively identify TN structures with a small number of parameters that effectively optimize any differentiable objective function.
arXiv Detail & Related papers (2020-08-12T16:41:56Z) - Woodpecker-DL: Accelerating Deep Neural Networks via Hardware-Aware
Multifaceted Optimizations [15.659251804042748]
Woodpecker-DL (WPK) is a hardware-aware deep learning framework.
WPK uses graph optimization, automated searches, a domain-specific language (DSL) and system-level exploration to accelerate inference.
We show that on a P100 GPU we can achieve a maximum speedup of 5.40 over cuDNN and 1.63 over TVM on individual operators, and run up to 1.18 times faster than TensorRT for end-to-end model inference.
arXiv Detail & Related papers (2020-08-11T07:50:34Z) - Differentiable Top-k Operator with Optimal Transport [135.36099648554054]
The SOFT top-k operator approximates the output of the top-k operation as the solution of an Entropic Optimal Transport (EOT) problem.
We apply the proposed operator to the k-nearest neighbors and beam search algorithms, and demonstrate improved performance.
arXiv Detail & Related papers (2020-02-16T04:57:52Z)
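The SOFT top-k entry above casts top-k selection as an entropic optimal transport problem between the n scores and two anchor points, so the regularized transport plan yields a smooth surrogate for the hard top-k indicator. Below is a minimal NumPy sketch of that formulation; the min/max anchors, squared-distance cost, marginals, and Sinkhorn iteration count are assumptions made for illustration rather than the paper's exact construction.

```python
import numpy as np

def soft_topk(scores, k, eps=0.1, n_iters=200):
    """Smoothed top-k membership via entropic optimal transport (Sinkhorn).

    Transport n unit-mass scores onto two anchors ("bottom", "top") with target
    marginals ((n - k)/n, k/n); the mass each score sends to the "top" anchor,
    rescaled by n, acts as a soft indicator of membership in the top-k set.
    """
    scores = np.asarray(scores, dtype=float)
    n = scores.size
    anchors = np.array([scores.min(), scores.max()])   # assumed bottom/top targets
    C = (scores[:, None] - anchors[None, :]) ** 2       # squared-distance cost
    K = np.exp(-C / eps)                                # Gibbs kernel

    a = np.full(n, 1.0 / n)                             # source marginal
    b = np.array([(n - k) / n, k / n])                  # target marginal
    u, v = np.ones(n), np.ones(2)
    for _ in range(n_iters):                            # Sinkhorn iterations
        u = a / (K @ v)
        v = b / (K.T @ u)
    plan = u[:, None] * K * v[None, :]                  # regularized transport plan
    return n * plan[:, 1]                               # soft top-k indicator

if __name__ == "__main__":
    x = np.array([0.1, 2.0, -0.5, 1.5, 0.3])
    print(np.round(soft_topk(x, k=2), 3))  # close to 1 for the 2 largest scores
```

Smaller eps sharpens the output toward the hard top-k indicator at the cost of a less smooth (and numerically stiffer) problem.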