Pinpointing the Memory Behaviors of DNN Training
- URL: http://arxiv.org/abs/2104.00258v1
- Date: Thu, 1 Apr 2021 05:30:03 GMT
- Title: Pinpointing the Memory Behaviors of DNN Training
- Authors: Jiansong Li, Xiao Dong, Guangli Li, Peng Zhao, Xueying Wang, Xiaobing
Chen, Xianzhi Yu, Yongxin Yang, Zihan Jiang, Wei Cao, Lei Liu, Xiaobing Feng
- Abstract summary: Training of deep neural networks (DNNs) is usually memory-hungry due to the limited device memory capacity of accelerators.
In this work, we pinpoint the memory behaviors of each device memory block of GPU during training by instrumenting the memory allocators of the runtime system.
- Score: 37.78973307051419
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The training of deep neural networks (DNNs) is usually memory-hungry due to
the limited device memory capacity of DNN accelerators. Characterizing the
memory behaviors of DNN training is critical for optimizing device memory
pressure. In this work, we pinpoint the memory behaviors of each device memory
block of GPU during training by instrumenting the memory allocators of the
runtime system. Our results show that the memory access patterns of device
memory blocks are stable and follow an iterative fashion. These observations
are useful for the future optimization of memory-efficient training from the
perspective of raw memory access patterns.
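The instrumentation approach lends itself to a short sketch. Below is a minimal Python illustration of a tracing wrapper around a device allocator; the names TracingAllocator, backend_malloc, and backend_free are hypothetical, and this shows only the general technique of logging per-block alloc/free events grouped by training iteration, not the paper's actual tooling.

```python
# Illustrative sketch only: a wrapper that records per-block alloc/free
# events, in the spirit of instrumenting the runtime's memory allocator.
# backend_malloc/backend_free stand in for the real device allocator.
import itertools

class TracingAllocator:
    def __init__(self, backend_malloc, backend_free):
        self._malloc = backend_malloc
        self._free = backend_free
        self._ids = itertools.count()
        self._live = {}    # address -> (block_id, size)
        self.trace = []    # (iteration, event, block_id, size)
        self.iteration = 0

    def start_iteration(self):
        # Call once per training step so events can be grouped by iteration.
        self.iteration += 1

    def malloc(self, size):
        addr = self._malloc(size)
        block_id = next(self._ids)
        self._live[addr] = (block_id, size)
        self.trace.append((self.iteration, "alloc", block_id, size))
        return addr

    def free(self, addr):
        block_id, size = self._live.pop(addr)
        self.trace.append((self.iteration, "free", block_id, size))
        self._free(addr)

def iteration_signature(trace, iteration):
    # The (event, size) sequence of one iteration; if training is iterative,
    # steady-state iterations should produce identical signatures.
    return [(ev, size) for it, ev, _, size in trace if it == iteration]
```

Comparing iteration_signature(trace, i) across iterations is one way to test the paper's observation that block-level access patterns stabilize into a repeating, iteration-aligned pattern.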
Related papers
- Stable Hadamard Memory: Revitalizing Memory-Augmented Agents for Reinforcement Learning [64.93848182403116] (arXiv, 2024-10-14)
Current deep-learning memory models struggle in partially observable reinforcement learning environments with long-term dependencies.
We introduce the Stable Hadamard Memory, a novel memory model for reinforcement learning agents.
Our approach significantly outperforms state-of-the-art memory-based methods on challenging partially observable benchmarks.
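The summary gives no equations, but the name suggests a memory updated through elementwise (Hadamard) products. A toy sketch of that general shape, with a made-up calibration gate and update term rather than the paper's actual rule:

```python
import numpy as np

rng = np.random.default_rng(0)

def hadamard_memory_step(memory, calibration, update):
    # Generic sketch: M_t = M_{t-1} * C_t + U_t, where * is the elementwise
    # (Hadamard) product. The paper's actual update rule may differ.
    return memory * calibration + update

memory = np.zeros((8, 16))  # 8 slots of width 16
for _ in range(100):
    calibration = rng.uniform(0.9, 1.0, memory.shape)  # near-1 gate decays old content slowly
    update = 0.1 * rng.standard_normal(memory.shape)   # stand-in for written content
    memory = hadamard_memory_step(memory, calibration, update)
```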
- Topology-aware Embedding Memory for Continual Learning on Expanding Networks [63.35819388164267] (arXiv, 2024-01-24)
We present a framework to tackle the memory explosion problem using memory replay techniques.
PDGNNs with Topology-aware Embedding Memory (TEM) significantly outperform state-of-the-art techniques.
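Memory replay generally means retraining on a bounded buffer of stored examples alongside new data. A minimal reservoir-sampling buffer over embeddings is one generic realization (a sketch, not the paper's TEM design):

```python
import random

class ReplayBuffer:
    """Fixed-capacity reservoir over (embedding, label) pairs."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self._seen = 0
        self._rng = random.Random(seed)

    def add(self, embedding, label):
        self._seen += 1
        if len(self.items) < self.capacity:
            self.items.append((embedding, label))
        else:
            # Reservoir sampling: every item seen so far is kept with
            # equal probability capacity / seen.
            j = self._rng.randrange(self._seen)
            if j < self.capacity:
                self.items[j] = (embedding, label)

    def sample(self, k):
        return self._rng.sample(self.items, min(k, len(self.items)))
```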
- Recurrent Dynamic Embedding for Video Object Segmentation [54.52527157232795] (arXiv, 2022-05-08)
We propose a Recurrent Dynamic Embedding (RDE) to build a memory bank of constant size.
We propose an unbiased guidance loss during the training stage, which makes SAM more robust in long videos.
We also design a novel self-correction strategy so that the network can repair the embeddings of masks with different qualities in the memory bank.
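A constant-size memory bank typically merges each new frame embedding into an existing slot instead of appending it. A toy update rule in that spirit (not the paper's RDE module):

```python
import numpy as np

def update_bank(bank, embedding, lr=0.1):
    # bank: (slots, d); embedding: (d,). Merge the new embedding into its
    # most similar slot so the bank never grows with video length.
    i = int(np.argmax(bank @ embedding))
    bank[i] = (1 - lr) * bank[i] + lr * embedding
    return bank
```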
- Memory Planning for Deep Neural Networks [0.0] (arXiv, 2022-02-23)
We study memory allocation patterns in DNNs during inference.
Latencies incurred by mutex contention in the allocator produce undesirable bottlenecks in user-facing services.
We present an implementation of MemoMalloc in the PyTorch deep learning framework.
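Memory planning of this kind generally replaces run-time allocation (and its lock) with a precomputed layout: record each tensor's size and lifetime in a profiling pass, then assign fixed offsets in a single arena offline. A greedy best-fit sketch of the general technique (not MemoMalloc's actual algorithm):

```python
def plan_offsets(lifetimes):
    """Assign arena offsets to tensors given (start, end, size) lifetimes.

    Two tensors may share address ranges only if their lifetimes are
    disjoint. Greedy best-fit sketch of the general technique, not the
    paper's MemoMalloc algorithm.
    """
    placed = []  # (offset, size, start, end)
    plan = []    # ((start, end, size), offset)
    arena = 0
    for start, end, size in sorted(lifetimes, key=lambda t: -t[2]):
        # Candidate offsets: 0 and the end of every placed block.
        candidates = sorted({0} | {off + sz for off, sz, _, _ in placed})
        for off in candidates:
            conflict = any(
                off < p_off + p_sz and p_off < off + size  # ranges overlap
                and start <= p_end and p_start <= end      # lifetimes overlap
                for p_off, p_sz, p_start, p_end in placed
            )
            if not conflict:
                break
        placed.append((off, size, start, end))
        plan.append(((start, end, size), off))
        arena = max(arena, off + size)
    return plan, arena

# Tensors alive over steps 0-3 and 4-5 can share offset 0, so the arena
# only needs to cover the peak of simultaneously live bytes.
plan, arena_size = plan_offsets([(0, 3, 512), (1, 2, 256), (4, 5, 512)])
```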
- Hierarchical Memory Matching Network for Video Object Segmentation [38.24999776705497] (arXiv, 2021-09-23)
We propose two advanced memory read modules that enable memory reading at multiple scales while exploiting temporal smoothness.
We first propose a guided memory matching module that replaces the non-local dense memory read, commonly adopted in previous memory-based methods.
We introduce a hierarchical memory matching scheme and propose a top-k guided memory matching module in which the memory read at a fine scale is guided by the read at a coarse scale.
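Top-k guided matching generically means scoring all memory locations but attending only to the k best, which sparsifies the read and lets a coarse scale restrict the candidates considered at a fine scale. A NumPy sketch of the basic top-k read (names and shapes are illustrative, not the paper's module):

```python
import numpy as np

def topk_memory_read(query, keys, values, k=8):
    """Attend only to the k best-matching memory slots.

    query: (d,), keys: (n, d), values: (n, d_v), with k <= n.
    Generic sketch of top-k guided matching, not the paper's exact module.
    """
    scores = keys @ query / np.sqrt(query.shape[0])  # similarity per slot
    top = np.argpartition(scores, -k)[-k:]           # indices of k largest scores
    weights = np.exp(scores[top] - scores[top].max())
    weights /= weights.sum()                         # softmax over top-k only
    return weights @ values[top]                     # (d_v,)
```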
- Space Time Recurrent Memory Network [35.06536468525509] (arXiv, 2021-09-14)
We propose a novel visual memory network architecture for learning and inference in the spatial-temporal domain.
This architecture is benchmarked on the video object segmentation and video prediction problems.
We show that our memory architecture achieves results competitive with the state of the art while maintaining constant memory capacity.
- Kanerva++: extending the Kanerva Machine with differentiable, locally block allocated latent memory [75.65949969000596] (arXiv, 2021-02-20)
Episodic and semantic memory are critical components of the human memory model.
We develop a new principled Bayesian memory allocation scheme that bridges the gap between episodic and semantic memory.
We demonstrate that this allocation scheme improves performance in memory conditional image generation.
- Neural Storage: A New Paradigm of Elastic Memory [4.307341575886927] (arXiv, 2021-01-07)
Storage and retrieval of data in a computer memory play a major role in system performance.
We introduce Neural Storage (NS), a brain-inspired learning memory paradigm that organizes the memory as a flexible neural memory network.
NS achieves an order of magnitude improvement in memory access performance for two representative applications.
- Memformer: A Memory-Augmented Transformer for Sequence Modeling [55.780849185884996] (arXiv, 2020-10-14)
We present Memformer, an efficient neural network for sequence modeling.
Our model achieves linear time complexity and constant memory space complexity when processing long sequences.
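Linear time with constant memory usually comes from segment-level recurrence: process the sequence in fixed-size chunks, let each chunk read from a fixed set of memory slots, and write the chunk back into those slots. A toy NumPy loop in that spirit (not Memformer's actual architecture):

```python
import numpy as np

def attend(queries, keys, values):
    # Scaled dot-product attention over small 2-D arrays.
    scores = queries @ keys.T / np.sqrt(queries.shape[1])
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ values

def process_sequence(x, num_slots=4, segment=16):
    # x: (seq_len, d). Memory stays (num_slots, d) regardless of seq_len,
    # so time is linear in seq_len and memory use is constant.
    memory = np.zeros((num_slots, x.shape[1]))
    outputs = []
    for i in range(0, len(x), segment):
        chunk = x[i:i + segment]
        outputs.append(chunk + attend(chunk, memory, memory))  # read memory
        memory = attend(memory, chunk, chunk)                  # write/update
    return np.concatenate(outputs), memory
```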
- Robust High-dimensional Memory-augmented Neural Networks [13.82206983716435] (arXiv, 2020-10-05)
Memory-augmented neural networks extend a neural network with an explicit memory to overcome the limits of storing knowledge only in the network weights.
Access to this explicit memory occurs via soft read and write operations involving every individual memory entry.
We propose a robust architecture that employs a computational memory unit as the explicit memory performing analog in-memory computation on high-dimensional (HD) vectors.
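Computing with high-dimensional (HD) vectors typically encodes items as long random bipolar vectors: bind a key to a value with an elementwise product, bundle bound pairs by summation, and retrieve by unbinding with the key and taking the most similar stored vector. A compact sketch of that classic scheme (generic HD computing, not the paper's analog in-memory hardware):

```python
import numpy as np

rng = np.random.default_rng(0)
D = 10_000  # HD dimensionality; random vectors are nearly orthogonal

def hd_vector():
    return rng.choice([-1, 1], size=D)  # random bipolar vector

pairs = [("cat", "meow"), ("dog", "woof"), ("bird", "tweet")]
keys = {k: hd_vector() for k, _ in pairs}
vals = {v: hd_vector() for _, v in pairs}

# Bind each key to its value (elementwise product), then bundle by summing.
memory = sum(keys[k] * vals[v] for k, v in pairs)

# Retrieve: unbind with the key, then pick the most similar stored value.
noisy = keys["dog"] * memory  # approx. vals["woof"] plus pseudo-orthogonal noise
best = max(vals, key=lambda v: np.dot(noisy, vals[v]))
print(best)  # "woof" with overwhelming probability
```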