GRAPHSPY: Fused Program Semantic-Level Embedding via Graph Neural
Networks for Dead Store Detection
- URL: http://arxiv.org/abs/2011.09501v1
- Date: Wed, 18 Nov 2020 19:17:15 GMT
- Title: GRAPHSPY: Fused Program Semantic-Level Embedding via Graph Neural
Networks for Dead Store Detection
- Authors: Yixin Guo, Pengcheng Li, Yingwei Luo, Xiaolin Wang, Zhenlin Wang
- Abstract summary: We propose a learning-aided approach to identify unnecessary memory operations intelligently with low overhead.
By applying several prevalent graph neural network models to extract program semantics, we present a novel, hybrid program embedding approach.
Results show that our model achieves 90% accuracy and incurs only around half the time overhead of the state-of-the-art tool.
- Score: 4.82596017481926
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Production software oftentimes suffers from the issue of performance
inefficiencies caused by inappropriate use of data structures, programming
abstractions, and conservative compiler optimizations. It is desirable to avoid
unnecessary memory operations. However, existing works often use a
whole-program fine-grained monitoring method with incredibly high overhead. To
this end, we propose a learning-aided approach to identify unnecessary memory
operations intelligently with low overhead. By applying several prevalent graph
neural network models to extract program semantics with respect to program
structure, execution order and dynamic states, we present a novel, hybrid
program embedding approach that derives unnecessary memory operations
through the embedding. We train our model with tens of thousands of samples
acquired from a set of real-world benchmarks. Results show that our model
achieves 90% accuracy and incurs only around half the time overhead of the
state-of-the-art tool.
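As an illustration of the term in the title, and not of the GRAPHSPY model itself: a dead store is a write to a memory location that is overwritten before any read of that location. The minimal sketch below scans a memory access trace for such writes, which is the kind of whole-program fine-grained monitoring the abstract describes as costly. The trace format and the function name `find_dead_stores` are assumptions made for this example only.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical trace format: (operation, address), operation is "load" or "store".
Access = Tuple[str, int]


def find_dead_stores(trace: Iterable[Access]) -> List[int]:
    """Return trace indices of stores that are overwritten before being read."""
    pending: Dict[int, int] = {}  # address -> index of the latest unread store
    dead: List[int] = []
    for index, (op, addr) in enumerate(trace):
        if op == "store":
            if addr in pending:
                # The earlier store was never read: it is dead.
                dead.append(pending[addr])
            pending[addr] = index
        elif op == "load":
            # The latest store to this address has now been used.
            pending.pop(addr, None)
        else:
            raise ValueError(f"unknown operation: {op!r}")
    return dead


trace = [
    ("store", 0x10),  # index 0: dead, overwritten at index 1
    ("store", 0x10),  # index 1: read at index 2
    ("load", 0x10),
    ("store", 0x20),  # index 3: read at index 4
    ("load", 0x20),
    ("store", 0x10),  # index 5: dead, overwritten at index 6
    ("store", 0x10),  # index 6: never overwritten
]
print(find_dead_stores(trace))
```

Checking every access this way is exact but touches each load and store, which is why the abstract motivates predicting such operations from a learned program embedding instead.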
Related papers
- Tracing Optimization for Performance Modeling and Regression Detection [15.99435412859094]
A performance model analytically describes the relationship between the performance of a system and its runtime activities.
We propose statistical approaches to reduce tracing overhead by identifying and excluding performance-insensitive code regions.
Our approach is fully automated, making it ready to be used in production environments with minimal human effort.
arXiv Detail & Related papers (2024-11-26T16:11:55Z)
- CItruS: Chunked Instruction-aware State Eviction for Long Sequence Modeling [52.404072802235234]
We introduce Chunked Instruction-aware State Eviction (CItruS), a novel modeling technique that integrates the attention preferences useful for a downstream task into the eviction process of hidden states.
Our training-free method exhibits superior performance on long sequence comprehension and retrieval tasks over several strong baselines under the same memory budget.
arXiv Detail & Related papers (2024-06-17T18:34:58Z)
- FuzzyFlow: Leveraging Dataflow To Find and Squash Program Optimization Bugs [92.47146416628965]
FuzzyFlow is a fault localization and test case extraction framework designed to test program optimizations.
We leverage dataflow program representations to capture a fully reproducible system state and area-of-effect for optimizations.
To reduce testing time, we design an algorithm for minimizing test inputs, trading off memory for recomputation.
arXiv Detail & Related papers (2023-06-28T13:00:17Z)
- Benchmarking Node Outlier Detection on Graphs [90.29966986023403]
Graph outlier detection is an emerging but crucial machine learning task with numerous applications.
We present the first comprehensive unsupervised node outlier detection benchmark for graphs called UNOD.
arXiv Detail & Related papers (2022-06-21T01:46:38Z)
- DCT-Former: Efficient Self-Attention with Discrete Cosine Transform [4.622165486890318]
An intrinsic limitation of the Transformer architectures arises from the computation of the dot-product attention.
Our idea takes inspiration from the world of lossy data compression (such as the JPEG algorithm) to derive an approximation of the attention module.
An extensive section of experiments shows that our method takes up less memory for the same performance, while also drastically reducing inference time.
arXiv Detail & Related papers (2022-03-02T15:25:27Z)
- Efficient Multi-Organ Segmentation Using SpatialConfiguration-Net with Low GPU Memory Requirements [8.967700713755281]
In this work, we employ a multi-organ segmentation model based on the SpatialConfiguration-Net (SCN).
We modified the architecture of the segmentation model to reduce its memory footprint without drastically impacting the quality of the predictions.
Lastly, we implemented a minimal inference script for which we optimized both execution time and required GPU memory.
arXiv Detail & Related papers (2021-11-26T17:47:10Z)
- Software Vulnerability Detection via Deep Learning over Disaggregated Code Graph Representation [57.92972327649165]
This work explores a deep learning approach to automatically learn the insecure patterns from code corpora.
Because code naturally admits graph structures with parsing, we develop a novel graph neural network (GNN) to exploit both the semantic context and structural regularity of a program.
arXiv Detail & Related papers (2021-09-07T21:24:36Z)
- Using Graph Neural Networks to model the performance of Deep Neural Networks [2.1151356984322307]
We develop a novel performance model that adopts a graph representation.
Experimental evaluation shows a 7.75x and 12x reduction in prediction error compared to the Halide and TVM models, respectively.
arXiv Detail & Related papers (2021-08-27T20:20:17Z)
- Neural Language Modeling for Contextualized Temporal Graph Generation [49.21890450444187]
This paper presents the first study on using large-scale pre-trained language models for automated generation of an event-level temporal graph for a document.
arXiv Detail & Related papers (2020-10-20T07:08:00Z)
- Improving Semantic Segmentation via Self-Training [75.07114899941095]
We show that we can obtain state-of-the-art results using a semi-supervised approach, specifically a self-training paradigm.
We first train a teacher model on labeled data, and then generate pseudo labels on a large set of unlabeled data.
Our robust training framework can digest human-annotated and pseudo labels jointly and achieve top performances on Cityscapes, CamVid and KITTI datasets.
arXiv Detail & Related papers (2020-04-30T17:09:17Z)
- Ordering Chaos: Memory-Aware Scheduling of Irregularly Wired Neural Networks for Edge Devices [10.876317610988059]
We present a memory-aware compiler, dubbed SERENITY, that finds a schedule with optimal memory footprint.
Our solution also comprises a graph rewriting technique that allows further reduction beyond the optimum.
arXiv Detail & Related papers (2020-03-04T23:38:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.