UniSparse: An Intermediate Language for General Sparse Format Customization
- URL: http://arxiv.org/abs/2403.05802v1
- Date: Sat, 9 Mar 2024 05:38:45 GMT
- Title: UniSparse: An Intermediate Language for General Sparse Format Customization
- Authors: Jie Liu, Zhongyuan Zhao, Zijian Ding, Benjamin Brock, Hongbo Rong, Zhiru Zhang
- Abstract summary: We propose UniSparse, an intermediate language that provides a unified abstraction for representing and customizing sparse formats.
Unlike the existing attribute-based frameworks, UniSparse decouples the logical representation of the sparse tensor from its low-level memory layout.
As a result, a rich set of format customizations can be succinctly expressed in a small set of well-defined query, mutation, and layout primitives.
- Score: 13.132033187592349
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The ongoing trend of hardware specialization has led to a growing use of
custom data formats when processing sparse workloads, which are typically
memory-bound. These formats facilitate optimized software/hardware
implementations by utilizing sparsity pattern- or target-aware data structures
and layouts to improve memory access latency and bandwidth utilization.
However, existing sparse tensor programming models and compilers offer little
or no support for productively customizing the sparse formats. Additionally,
because these frameworks represent formats using a limited set of per-dimension
attributes, they lack the flexibility to accommodate numerous new variations of
custom sparse data structures and layouts. To overcome this deficiency, we
propose UniSparse, an intermediate language that provides a unified abstraction
for representing and customizing sparse formats. Unlike the existing
attribute-based frameworks, UniSparse decouples the logical representation of
the sparse tensor (i.e., the data structure) from its low-level memory layout,
enabling the customization of both. As a result, a rich set of format
customizations can be succinctly expressed in a small set of well-defined
query, mutation, and layout primitives. We also develop a compiler leveraging
the MLIR infrastructure, which supports adaptive customization of formats, and
automatic code generation of format conversion and compute operations for
heterogeneous architectures. We demonstrate the efficacy of our approach
through experiments running commonly-used sparse linear algebra operations with
specialized formats on multiple different hardware targets, including an Intel
CPU, an NVIDIA GPU, an AMD Xilinx FPGA, and a simulated processing-in-memory
(PIM) device.
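To make the distinction between a tensor's logical contents and its memory layout concrete, here is a minimal sketch in plain Python. It is not UniSparse syntax and does not use the paper's primitives; the function names `coo_to_csr` and `csr_to_coords` are hypothetical, chosen only for illustration. It stores the same nonzeros in two common layouts, coordinate (COO) and compressed sparse row (CSR), and checks that both describe the same logical matrix.

```python
def coo_to_csr(n_rows, rows, cols, vals):
    """Convert COO triples into CSR arrays (row_ptr, col_idx, values)."""
    # Order the nonzeros by (row, column) so each row is contiguous.
    order = sorted(range(len(vals)), key=lambda i: (rows[i], cols[i]))
    # Count nonzeros per row, then prefix-sum the counts into row offsets.
    row_ptr = [0] * (n_rows + 1)
    for r in rows:
        row_ptr[r + 1] += 1
    for r in range(n_rows):
        row_ptr[r + 1] += row_ptr[r]
    col_idx = [cols[i] for i in order]
    values = [vals[i] for i in order]
    return row_ptr, col_idx, values


def csr_to_coords(row_ptr, col_idx, values):
    """Recover the logical {(row, col): value} mapping from CSR arrays."""
    return {
        (r, col_idx[k]): values[k]
        for r in range(len(row_ptr) - 1)
        for k in range(row_ptr[r], row_ptr[r + 1])
    }


# A 3x3 matrix with three nonzeros, given as unsorted COO triples.
rows, cols, vals = [2, 0, 0], [1, 2, 0], [3.0, 2.0, 1.0]
row_ptr, col_idx, values = coo_to_csr(3, rows, cols, vals)
# row_ptr == [0, 2, 2, 3], col_idx == [0, 2, 1], values == [1.0, 2.0, 3.0]

# Both layouts encode the same logical tensor.
logical = {(r, c): v for r, c, v in zip(rows, cols, vals)}
same = csr_to_coords(row_ptr, col_idx, values) == logical
```

The conversion here is hand-written for one pair of formats. The abstract's claim is that a compiler can generate such conversions automatically once the format is described with query, mutation, and layout primitives.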
Related papers
- B'MOJO: Hybrid State Space Realizations of Foundation Models with Eidetic and Fading Memory [91.81390121042192]
We develop a class of models called B'MOJO to seamlessly combine eidetic and fading memory within a composable module.
B'MOJO's ability to modulate eidetic and fading memory results in better inference on longer sequences tested up to 32K tokens.
arXiv Detail & Related papers (2024-07-08T18:41:01Z)
- Exploring FPGA designs for MX and beyond [6.843913224130847]
We describe and evaluate the first open-source FPGA implementation of the arithmetic defined in the Open Compute Project MX standard.
Our designs fully support all the standard's concrete formats for conversion into and out of MX formats.
We release an open-source Pytorch library for quantization into the new standard, integrated with the Brevitas library.
arXiv Detail & Related papers (2024-07-01T17:07:33Z)
- PosterLLaVa: Constructing a Unified Multi-modal Layout Generator with LLM [58.67882997399021]
Our research introduces a unified framework for automated graphic layout generation.
Our data-driven method employs structured text (JSON format) and visual instruction tuning to generate layouts.
We conduct extensive experiments and achieved state-of-the-art (SOTA) performance on public multi-modal layout generation benchmarks.
arXiv Detail & Related papers (2024-06-05T03:05:52Z)
- Quantifying Language Models' Sensitivity to Spurious Features in Prompt Design or: How I learned to start worrying about prompt formatting [68.19544657508509]
Large language models (LLMs) are adopted as a fundamental component of language technologies.
We find that several widely used open-source LLMs are extremely sensitive to subtle changes in prompt format in few-shot settings.
We propose an algorithm that rapidly evaluates a sampled set of plausible prompt formats for a given task, and reports the interval of expected performance without accessing model weights.
arXiv Detail & Related papers (2023-10-17T15:03:30Z)
- Exploring Format Consistency for Instruction Tuning [79.0698403613366]
In this work, we propose a framework named Unified Instruction Tuning (UIT).
UIT calls OpenAI APIs for automatic format transfer among different instruction tuning datasets such as PromptSource, FLAN and CrossFit.
With the framework, we (1) demonstrate the necessity of maintaining format consistency in instruction tuning; (2) improve the generalization performance on unseen instructions on T5-LM-xl; and (3) provide a novel perplexity-based denoising method to reduce the noise of automatic format transfer.
arXiv Detail & Related papers (2023-07-28T12:00:13Z)
- Energy-efficient Task Adaptation for NLP Edge Inference Leveraging Heterogeneous Memory Architectures [68.91874045918112]
adapter-ALBERT is an efficient model optimization for maximal data reuse across different tasks.
We demonstrate the advantage of mapping the model to a heterogeneous on-chip memory architecture by performing simulations on a validated NLP edge accelerator.
arXiv Detail & Related papers (2023-03-25T14:40:59Z)
- Machine Learning-Driven Adaptive OpenMP For Portable Performance on Heterogeneous Systems [1.885335997132172]
Adapting a program to a new heterogeneous platform is laborious and requires developers to manually explore a vast space of execution parameters.
This paper proposes extensions to OpenMP for autonomous, machine learning-driven adaptation.
Our solution includes a set of novel language constructs, compiler transformations, and runtime support.
arXiv Detail & Related papers (2023-03-15T18:37:18Z)
- Frame Averaging for Equivariant Shape Space Learning [85.42901997467754]
A natural way to incorporate symmetries in shape space learning is to ask that the mapping to the shape space (encoder) and mapping from the shape space (decoder) are equivariant to the relevant symmetries.
We present a framework for incorporating equivariance in encoders and decoders by introducing two contributions.
arXiv Detail & Related papers (2021-12-03T06:41:19Z)
- Efficient Multi-Organ Segmentation Using SpatialConfiguration-Net with Low GPU Memory Requirements [8.967700713755281]
In this work, we employ a multi-organ segmentation model based on the SpatialConfiguration-Net (SCN).
We modified the architecture of the segmentation model to reduce its memory footprint without drastically impacting the quality of the predictions.
Lastly, we implemented a minimal inference script for which we optimized both execution time and required GPU memory.
arXiv Detail & Related papers (2021-11-26T17:47:10Z)
- Memory-based Semantic Segmentation for Off-road Unstructured Natural Environments [29.498304237783763]
We propose a built-in memory module for semantic segmentation.
The memory module stores significant representations of training images as memory items.
We conduct experiments on the Robot Unstructured Ground Driving dataset and RELLIS dataset.
arXiv Detail & Related papers (2021-08-12T10:04:47Z)