An Empirical Study: MEMS as a Static Performance Metric
- URL: http://arxiv.org/abs/2505.07208v1
- Date: Mon, 12 May 2025 03:31:33 GMT
- Title: An Empirical Study: MEMS as a Static Performance Metric
- Authors: Liwei Zhang, Baoquan Cui, Xutong Ma, Jian Zhang
- Abstract summary: We investigate mems, the number of memory accesses, as a static and architecture-independent performance metric. We develop a Clang-based automated instrumentation tool that rewrites source code to insert path tracing and mems counting logic.
- Score: 5.417296778663869
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Static performance estimation is essential during compile-time analysis, yet traditional runtime-based methods are costly and platform-dependent. We investigate mems, the number of memory accesses, as a static and architecture-independent performance metric. We develop a Clang-based automated instrumentation tool that rewrites source code to insert path tracing and mems counting logic. This allows us to evaluate mems-based performance estimation across ten classical algorithm programs. Experimental results show that within the same program, execution paths with higher mems values consistently exhibit longer runtime. However, this correlation weakens between different programs, suggesting that mems is best suited for comparing performance of different execution paths in a program.
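The idea of counting mems along different execution paths of one program can be illustrated with a small sketch. This is not the authors' Clang-based tool, which rewrites C source code; it is a hypothetical Python stand-in in which a list wrapper (`CountingList`, a name assumed here) counts every element read and write, and insertion sort is run on two inputs that drive it down different paths.

```python
class CountingList:
    """List wrapper that counts element reads and writes.

    The counter is a rough stand-in for mems; names are illustrative only.
    """

    def __init__(self, values):
        self._data = list(values)
        self.mems = 0

    def __len__(self):
        return len(self._data)

    def __getitem__(self, index):
        self.mems += 1  # one memory read
        return self._data[index]

    def __setitem__(self, index, value):
        self.mems += 1  # one memory write
        self._data[index] = value

    def to_list(self):
        return list(self._data)


def insertion_sort(values):
    """Sort values and return (sorted list, number of counted accesses)."""
    arr = CountingList(values)
    for i in range(1, len(arr)):
        key = arr[i]
        j = i - 1
        while j >= 0:
            current = arr[j]
            if current <= key:
                break
            arr[j + 1] = current  # shift the larger element right
            j -= 1
        arr[j + 1] = key
    return arr.to_list(), arr.mems


# Two inputs of the same size that take different paths through the loop.
sorted_result, mems_sorted = insertion_sort(range(8))
reversed_result, mems_reversed = insertion_sort(range(7, -1, -1))
print("already sorted input:", mems_sorted, "accesses")
print("reversed input:", mems_reversed, "accesses")
```

In this sketch the reversed input triggers many more shifts, so its access count is higher than that of the already sorted input. Whether the higher count also means longer runtime is the empirical question the paper studies; the sketch only shows how a per-path count can be obtained.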
Related papers
- MIB: A Mechanistic Interpretability Benchmark [77.35046700898326]
We propose MIB, a benchmark with two tracks spanning four tasks and five models. Using MIB, we find that attribution and mask optimization methods perform best on circuit localization. For causal variable localization, we find that the supervised DAS method performs best, while SAE features are not better than neurons.
arXiv Detail & Related papers (2025-04-17T17:55:45Z) - Path-optimal symbolic execution of heap-manipulating programs [5.639904484784126]
This paper introduces POSE, path-optimal symbolic execution, a symbolic execution algorithm that originally accomplishes path optimality against heap-manipulating programs.
We formalize the POSE algorithm for a tiny but representative object-oriented programming language, and implement the formalization in a prototype symbolic executor to evaluate the algorithm on a benchmark of sample programs that take data structures as inputs.
arXiv Detail & Related papers (2024-07-23T20:35:33Z) - Parallel Program Analysis on Path Ranges [3.018638214344819]
Ranged symbolic execution performs symbolic execution on program parts, so-called path ranges, in parallel.
We present a verification approach that splits programs into path ranges and then runs arbitrary analyses on the ranges in parallel.
arXiv Detail & Related papers (2024-02-19T08:26:52Z) - LLVM Static Analysis for Program Characterization and Memory Reuse
Profile Estimation [0.0]
This paper presents an LLVM-based probabilistic static analysis method.
It accurately predicts different program characteristics and estimates the reuse distance profile of a program.
The results show that our approach can predict application characteristics accurately compared to another LLVM-based dynamic code analysis tool, Byfl.
arXiv Detail & Related papers (2023-11-20T23:05:06Z) - Constant Memory Attention Block [74.38724530521277]
Constant Memory Attention Block (CMAB) is a novel general-purpose attention block that computes its output in constant memory and performs updates in constant computation.
We show our proposed methods achieve results competitive with state-of-the-art while being significantly more memory efficient.
arXiv Detail & Related papers (2023-06-21T22:41:58Z) - Performance Embeddings: A Similarity-based Approach to Automatic Performance Optimization [71.69092462147292]
Performance embeddings enable knowledge transfer of performance tuning between applications.
We demonstrate this transfer tuning approach on case studies in deep neural networks, dense and sparse linear algebra compositions, and numerical weather prediction stencils.
arXiv Detail & Related papers (2023-03-14T15:51:35Z) - Exploring Techniques for the Analysis of Spontaneous Asynchronicity in MPI-Parallel Applications [0.8889304968879161]
We run microbenchmarks and realistic proxy applications with the regular compute-communicate structure on two different supercomputing platforms.
We show how desynchronization patterns can be readily identified from a data set that is much smaller than a full MPI trace.
arXiv Detail & Related papers (2022-05-27T13:19:07Z) - Memory-Based Semantic Parsing [79.48882899104997]
We present a memory-based model for context-dependent semantic parsing.
We learn a context memory controller that manages the memory by maintaining the cumulative meaning of sequential user utterances.
arXiv Detail & Related papers (2021-09-07T16:15:13Z) - Recall@k Surrogate Loss with Large Batches and Similarity Mixup [62.67458021725227]
Direct optimization, by gradient descent, of an evaluation metric is not possible when it is non-differentiable.
In this work, a differentiable surrogate loss for the recall is proposed.
The proposed method achieves state-of-the-art results in several image retrieval benchmarks.
arXiv Detail & Related papers (2021-08-25T11:09:11Z) - The Benchmark Lottery [114.43978017484893]
"A benchmark lottery" describes the overall fragility of the machine learning benchmarking process.
We show that the relative performance of algorithms may be altered significantly simply by choosing different benchmark tasks.
arXiv Detail & Related papers (2021-07-14T21:08:30Z) - Runtime Performances Benchmark for Knowledge Graph Embedding Methods [0.0]
This paper focuses on providing a characterization of the runtime performances of state-of-the-art implementations of KGE algorithms.
arXiv Detail & Related papers (2020-11-05T21:58:11Z)