Benchmark Assessment for DeepSpeed Optimization Library
- URL: http://arxiv.org/abs/2202.12831v1
- Date: Sat, 12 Feb 2022 04:52:28 GMT
- Title: Benchmark Assessment for DeepSpeed Optimization Library
- Authors: Gongbo Liang and Izzat Alsmadi
- Abstract summary: Deep Learning (DL) models are widely used in machine learning due to their performance and ability to deal with large datasets.
The size of such datasets and the complexity of DL models make such models expensive to train, consuming large amounts of resources and time.
Many recent libraries and applications have been introduced to deal with DL complexity and efficiency issues.
- Score: 1.7839986996686321
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep Learning (DL) models are widely used in machine learning due to their
performance and ability to deal with large datasets while producing high
accuracy. The size of such datasets and the complexity of DL models make such
models expensive to train, consuming large amounts of resources and time. Many
recent libraries and applications have been introduced to deal with DL
complexity and efficiency issues. In this paper, we evaluated one example, the
Microsoft DeepSpeed library, through classification tasks. DeepSpeed public
sources report classification performance metrics on the LeNet architecture.
We extended this by evaluating the library on several modern neural network
architectures, including convolutional neural networks (CNNs) and a Vision
Transformer (ViT). Results indicated that while DeepSpeed can yield
improvements in some of these cases, it has no or even a negative impact in
others.
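The evaluation centers on training standard image classifiers under DeepSpeed. As a rough illustration of the workflow involved, the sketch below wraps a stand-in CNN (a torchvision ResNet-18 on CIFAR-10; the paper also covers ViT) with DeepSpeed's initialize call. The configuration values and model choice are illustrative assumptions, not the exact settings used in the paper.

```python
# Minimal sketch of DeepSpeed-managed training, assuming a single-GPU run
# launched with the "deepspeed" launcher. The ResNet-18/CIFAR-10 choice and
# the config values are illustrative, not the paper's exact setup.
import torch
import torchvision
import torchvision.transforms as transforms
import deepspeed

ds_config = {
    "train_batch_size": 64,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-3}},
}

# Stand-in CNN classifier; the paper also evaluates a Vision Transformer.
model = torchvision.models.resnet18(num_classes=10)

train_set = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True,
    transform=transforms.ToTensor(),
)

# deepspeed.initialize wraps the model, builds the optimizer from the config,
# and returns an engine plus a data loader over the given training set.
engine, optimizer, train_loader, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    training_data=train_set,
    config=ds_config,
)

criterion = torch.nn.CrossEntropyLoss()
for inputs, labels in train_loader:
    inputs, labels = inputs.to(engine.device), labels.to(engine.device)
    outputs = engine(inputs)        # forward pass through the DeepSpeed engine
    loss = criterion(outputs, labels)
    engine.backward(loss)           # DeepSpeed-managed backward pass
    engine.step()                   # optimizer step and gradient zeroing
```

Comparing the wall-clock time and accuracy of a loop like this against a plain PyTorch training loop is the kind of per-architecture comparison the paper reports.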
Related papers
- Sparse-DySta: Sparsity-Aware Dynamic and Static Scheduling for Sparse
Multi-DNN Workloads [65.47816359465155]
Running multiple deep neural networks (DNNs) in parallel has become an emerging workload in both edge devices and data centers.
We propose Dysta, a novel scheduler that utilizes both static sparsity patterns and dynamic sparsity information for sparse multi-DNN scheduling.
Our proposed approach outperforms the state-of-the-art methods with up to 10% decrease in latency constraint violation rate and nearly 4X reduction in average normalized turnaround time.
arXiv Detail & Related papers (2023-10-17T09:25:17Z) - A serial dual-channel library occupancy detection system based on Faster
RCNN [14.922479331766368]
Existing solutions, such as software-based seat reservations and sensor-based occupancy detection, have proven inadequate for effectively addressing this problem.
We propose a novel approach: a serial dual-channel object detection model based on Faster RCNN.
This model is designed to discern all instances of occupied seats within the library and continuously update real-time information regarding seat occupancy status.
arXiv Detail & Related papers (2023-06-28T10:27:17Z) - Deep Cellular Recurrent Network for Efficient Analysis of Time-Series
Data with Spatial Information [52.635997570873194]
This work proposes a novel deep cellular recurrent neural network (DCRNN) architecture to process complex multi-dimensional time series data with spatial information.
The proposed architecture achieves state-of-the-art performance while using substantially fewer trainable parameters than comparable methods in the literature.
arXiv Detail & Related papers (2021-01-12T20:08:18Z) - Deep Structure Learning using Feature Extraction in Trained Projection
Space [0.0]
We introduce a network architecture using a self-adjusting and data-dependent version of the Radon transform (linear data projection), also known as x-ray projection, to enable feature extraction via convolutions in lower-dimensional space.
The resulting framework, named PiNet, can be trained end-to-end and shows promising performance on volumetric segmentation tasks.
arXiv Detail & Related papers (2020-09-01T12:16:55Z) - PolyDL: Polyhedral Optimizations for Creation of High Performance DL
primitives [55.79741270235602]
We present compiler algorithms to automatically generate high performance implementations of Deep Learning primitives.
We develop novel data reuse analysis algorithms using the polyhedral model.
We also show that such a hybrid compiler plus a minimal library-use approach results in state-of-the-art performance.
arXiv Detail & Related papers (2020-06-02T06:44:09Z) - RadixSpline: A Single-Pass Learned Index [84.84747738666263]
We introduce RadixSpline (RS), a learned index that can be built in a single pass over the data.
RS achieves competitive results on all datasets, despite the fact that it only has two parameters.
arXiv Detail & Related papers (2020-04-30T01:56:54Z) - A Close Look at Deep Learning with Small Data [0.0]
We show that model complexity is a critical factor when only a few samples per class are available.
We also show that even standard data augmentation can boost recognition performance by large margins.
arXiv Detail & Related papers (2020-03-28T17:11:29Z) - On the performance of deep learning models for time series
classification in streaming [0.0]
This work assesses the performance of different types of deep architectures for data streaming classification.
We evaluate models such as multi-layer perceptrons, recurrent, convolutional and temporal convolutional neural networks over several time-series datasets.
arXiv Detail & Related papers (2020-03-05T11:41:29Z) - Deep Learning for Ultra-Reliable and Low-Latency Communications in 6G
Networks [84.2155885234293]
We first summarize how to apply data-driven supervised deep learning and deep reinforcement learning in URLLC.
To address the open problems in this area, we develop a multi-level architecture that enables device intelligence, edge intelligence, and cloud intelligence for URLLC.
arXiv Detail & Related papers (2020-02-22T14:38:11Z) - Large-Scale Gradient-Free Deep Learning with Recursive Local
Representation Alignment [84.57874289554839]
Training deep neural networks on large-scale datasets requires significant hardware resources.
Backpropagation, the workhorse for training these networks, is an inherently sequential process that is difficult to parallelize.
We propose a neuro-biologically-plausible alternative to backprop that can be used to train deep networks.
arXiv Detail & Related papers (2020-02-10T16:20:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.