Mind Mappings: Enabling Efficient Algorithm-Accelerator Mapping Space
Search
- URL: http://arxiv.org/abs/2103.01489v1
- Date: Tue, 2 Mar 2021 06:11:58 GMT
- Title: Mind Mappings: Enabling Efficient Algorithm-Accelerator Mapping Space
Search
- Authors: Kartik Hegde, Po-An Tsai, Sitao Huang, Vikas Chandra, Angshuman
Parashar, and Christopher W. Fletcher
- Abstract summary: This paper presents Mind Mappings, a novel gradient-based search method for algorithm-accelerator mapping space search.
We derive a smooth, differentiable approximation to the otherwise non-smooth, non-convex mapping space.
With a differentiable approximation, we can leverage efficient gradient-based search algorithms to find high-quality mappings.
- Score: 7.596028906226877
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Modern day computing increasingly relies on specialization to satiate growing
performance and efficiency requirements. A core challenge in designing such
specialized hardware architectures is how to perform mapping space search,
i.e., search for an optimal mapping from algorithm to hardware. Prior work
shows that choosing an inefficient mapping can lead to multiplicative-factor
efficiency overheads. Additionally, the search space is not only large but also
non-convex and non-smooth, precluding advanced search techniques. As a result,
previous works are forced to implement mapping space search using expert
choices or sub-optimal search heuristics.
This work proposes Mind Mappings, a novel gradient-based search method for
algorithm-accelerator mapping space search. The key idea is to derive a smooth,
differentiable approximation to the otherwise non-smooth, non-convex search
space. With a smooth, differentiable approximation, we can leverage efficient
gradient-based search algorithms to find high-quality mappings. We extensively
compare Mind Mappings to black-box optimization schemes used in prior work.
When tasked to find mappings for two important workloads (CNN and MTTKRP), the
proposed search finds mappings that achieve an average $1.40\times$,
$1.76\times$, and $1.29\times$ (when run for a fixed number of steps) and
$3.16\times$, $4.19\times$, and $2.90\times$ (when run for a fixed amount of
time) better energy-delay product (EDP) relative to Simulated Annealing,
Genetic Algorithms and Reinforcement Learning, respectively. Meanwhile, Mind
Mappings returns mappings with only $5.32\times$ higher EDP than a possibly
unachievable theoretical lower-bound, indicating proximity to the global
optima.
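To make the key idea concrete, the sketch below shows gradient-based mapping search over a learned, differentiable surrogate of the accelerator cost model. This is a minimal illustration under assumed names (SurrogateCost, project_to_valid_mapping, the optimizer settings), not the authors' released implementation; in the paper's setting the surrogate would first be trained on sampled (mapping, cost) pairs before driving the search.

```python
# Minimal sketch (not the authors' code): gradient descent over a learned,
# differentiable surrogate of the mapping cost (e.g. predicted EDP).
import torch
import torch.nn as nn

class SurrogateCost(nn.Module):
    """Differentiable approximation of the accelerator cost model:
    maps a relaxed (continuous) mapping vector to a predicted cost."""
    def __init__(self, mapping_dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(mapping_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, mapping: torch.Tensor) -> torch.Tensor:
        return self.net(mapping)

def project_to_valid_mapping(mapping: torch.Tensor) -> torch.Tensor:
    """Placeholder: snap the continuous relaxation back onto a discrete,
    hardware-legal mapping (e.g. integer tile sizes, loop orders)."""
    return mapping.round().clamp(min=1.0)

def gradient_mapping_search(surrogate: SurrogateCost, mapping_dim: int,
                            num_steps: int = 500, lr: float = 0.05) -> torch.Tensor:
    # Start from a random point in the relaxed mapping space.
    mapping = torch.rand(mapping_dim, requires_grad=True)
    opt = torch.optim.Adam([mapping], lr=lr)
    for _ in range(num_steps):
        opt.zero_grad()
        predicted_cost = surrogate(mapping).squeeze()  # differentiable cost estimate
        predicted_cost.backward()                      # gradient w.r.t. the mapping itself
        opt.step()
    return project_to_valid_mapping(mapping.detach())
```

The essential point is that gradients are taken with respect to the mapping, not the surrogate's weights, so cheap first-order steps replace black-box sampling of the non-smooth, non-convex space.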
Related papers
- SLOPE: Search with Learned Optimal Pruning-based Expansion [2.0618817976970103]
We present the Search with Learned Optimal Pruning-based Expansion (SLOPE).
It learns the distance of a node from a possible optimal path, which in turn reduces the size of the open list.
This ensures that the search explores only the region close to optimal paths while lowering memory and computational costs.
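As a rough illustration of this idea (not the SLOPE implementation), the sketch below wires a learned "distance from an optimal path" predictor into a best-first search as a pruning test; the graph callbacks and the predictor (neighbors, step_cost, heuristic, distance_from_optimal_path) are hypothetical placeholders supplied by the caller.

```python
# Illustrative sketch: best-first search that prunes successors a learned
# model predicts to lie far from any optimal path, keeping the open list small.
import heapq
import itertools

def pruned_search(start, goal, neighbors, step_cost, heuristic,
                  distance_from_optimal_path, prune_threshold=1.0):
    tie = itertools.count()  # tie-breaker so the heap never compares nodes directly
    open_list = [(heuristic(start, goal), next(tie), 0.0, start, [start])]
    closed = set()
    while open_list:
        _, _, g, node, path = heapq.heappop(open_list)
        if node == goal:
            return path
        if node in closed:
            continue
        closed.add(node)
        for nxt in neighbors(node):
            if nxt in closed:
                continue
            # Learned pruning: never push successors predicted to be far
            # from an optimal path.
            if distance_from_optimal_path(nxt, goal) > prune_threshold:
                continue
            g_next = g + step_cost(node, nxt)
            heapq.heappush(open_list, (g_next + heuristic(nxt, goal), next(tie),
                                       g_next, nxt, path + [nxt]))
    return None
```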
arXiv Detail & Related papers (2024-06-07T13:42:15Z)
- Minuet: Accelerating 3D Sparse Convolutions on GPUs [9.54287796030519]
Sparse Convolution (SC) is widely used for processing 3D point clouds that are inherently sparse.
In this work, we analyze the shortcomings of prior state-of-the-art SC engines, and propose Minuet, a novel memory-efficient SC engine tailored for modern GPUs.
Our evaluations show that Minuet significantly outperforms prior SC engines by on average $1.74\times$ (up to $2.22\times$) for end-to-end point cloud network executions.
arXiv Detail & Related papers (2023-12-01T05:09:02Z)
- Demystifying Map Space Exploration for NPUs [4.817475305740601]
Map Space Exploration is the problem of finding optimized mappings of a Deep Neural Network (DNN) model.
We do a first-of-its-kind apples-to-apples comparison of search techniques leveraged by different mappers.
Next, we propose two new techniques that can augment existing mappers.
arXiv Detail & Related papers (2022-10-07T17:58:45Z)
- Efficient Non-Parametric Optimizer Search for Diverse Tasks [93.64739408827604]
We present the first efficient, scalable, and general framework that can directly search on the tasks of interest.
Inspired by the innate tree structure of the underlying math expressions, we re-arrange the spaces into a super-tree.
We adopt an adaptation of the Monte Carlo method to tree search, equipped with rejection sampling and equivalent-form detection.
arXiv Detail & Related papers (2022-09-27T17:51:31Z)
- Searching a High-Performance Feature Extractor for Text Recognition Network [92.12492627169108]
We design a domain-specific search space by exploring principles for having good feature extractors.
As the space is huge and complexly structured, no existing NAS algorithms can be applied.
We propose a two-stage algorithm to effectively search in the space.
arXiv Detail & Related papers (2022-09-27T03:49:04Z)
- Lazy and Fast Greedy MAP Inference for Determinantal Point Process [17.50810164319995]
This paper presents how to combine the ideas of "lazy" and "fast", which have been considered incompatible in the literature.
Our lazy and fast greedy algorithm achieves almost the same time complexity as the current best one and runs faster in practice.
arXiv Detail & Related papers (2022-06-13T07:33:32Z)
- Waypoint Planning Networks [66.72790309889432]
We propose waypoint planning networks (WPN), a hybrid algorithm based on LSTMs with a local kernel (a classic algorithm such as A*) and a global kernel using a learned algorithm.
We compare WPN against A*, as well as related works including motion planning networks (MPNet) and value networks (VIN).
It is shown that WPN's search space is considerably smaller than A*'s, while it is able to generate near-optimal results.
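A minimal sketch of this hybrid structure follows, under the assumption that the learned global kernel can be treated as a waypoint predictor and the local kernel as grid A*; predict_waypoints is a stub and the grid interface is a toy placeholder, not the WPN implementation.

```python
# Illustrative sketch: a learned global model proposes waypoints, and a
# classic local planner (grid A*) connects consecutive waypoints.
import heapq

def astar(grid, start, goal):
    """4-connected grid A* with Manhattan heuristic; grid[r][c] == 1 is an obstacle."""
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    open_list = [(h(start), 0, start, [start])]
    seen = set()
    while open_list:
        _, g, node, path = heapq.heappop(open_list)
        if node == goal:
            return path
        if node in seen:
            continue
        seen.add(node)
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0 and (nr, nc) not in seen:
                heapq.heappush(open_list,
                               (g + 1 + h((nr, nc)), g + 1, (nr, nc), path + [(nr, nc)]))
    return None

def predict_waypoints(grid, start, goal):
    """Placeholder for the learned global kernel; here it simply returns the goal."""
    return [goal]

def plan_with_waypoints(grid, start, goal):
    path, current = [start], start
    for wp in predict_waypoints(grid, start, goal):
        segment = astar(grid, current, wp)
        if segment is None:
            return None
        path += segment[1:]   # the local kernel connects consecutive waypoints
        current = wp
    return path
```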
arXiv Detail & Related papers (2021-05-01T18:02:01Z)
- Towards Improving the Consistency, Efficiency, and Flexibility of Differentiable Neural Architecture Search [84.4140192638394]
Most differentiable neural architecture search methods construct a super-net for search and derive a target-net as its sub-graph for evaluation.
In this paper, we introduce EnTranNAS that is composed of Engine-cells and Transit-cells.
Our method also spares much memory and computation cost, which speeds up the search process.
arXiv Detail & Related papers (2021-01-27T12:16:47Z)
- Towards Optimally Efficient Tree Search with Deep Learning [76.64632985696237]
This paper investigates the classical integer least-squares problem, which estimates integer signals from linear models.
The problem is NP-hard and often arises in diverse applications such as signal processing, bioinformatics, communications and machine learning.
We propose a general hyper-accelerated tree search (HATS) algorithm by employing a deep neural network to estimate the optimal heuristic for the underlying simplified memory-bounded A* algorithm.
arXiv Detail & Related papers (2021-01-07T08:00:02Z)
- ISTA-NAS: Efficient and Consistent Neural Architecture Search by Sparse Coding [86.40042104698792]
We formulate neural architecture search as a sparse coding problem.
In experiments, our two-stage method on CIFAR-10 requires only 0.05 GPU-day for search.
Our one-stage method produces state-of-the-art performances on both CIFAR-10 and ImageNet at the cost of only evaluation time.
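For background on the formulation (not the ISTA-NAS method itself), here is a minimal sketch of plain ISTA on a generic sparse coding objective $\frac{1}{2}\|x - Dz\|_2^2 + \lambda\|z\|_1$; the dictionary and signal are random placeholders.

```python
# Illustrative sketch: iterative shrinkage-thresholding (ISTA) for sparse coding.
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(D, x, lam=0.1, num_iters=200):
    step = 1.0 / np.linalg.norm(D, ord=2) ** 2   # 1 / Lipschitz constant of the gradient
    z = np.zeros(D.shape[1])
    for _ in range(num_iters):
        grad = D.T @ (D @ z - x)                 # gradient of the smooth data term
        z = soft_threshold(z - step * grad, step * lam)
    return z

# Usage: recover a sparse code for a synthetic signal.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 256))
x = D @ (rng.standard_normal(256) * (rng.random(256) < 0.05))
z = ista(D, x)
```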
arXiv Detail & Related papers (2020-10-13T04:34:24Z)