FORTALESA: Fault-Tolerant Reconfigurable Systolic Array for DNN Inference
- URL: http://arxiv.org/abs/2503.04426v1
- Date: Thu, 06 Mar 2025 13:35:59 GMT
- Title: FORTALESA: Fault-Tolerant Reconfigurable Systolic Array for DNN Inference
- Authors: Natalia Cherezova, Artur Jutman, Maksim Jenihhin
- Abstract summary: Deep Neural Networks (DNNs) in mission- and safety-critical applications bring their reliability to the forefront. This work presents a run-time reconfigurable systolic array architecture with three execution modes and four implementation options. The proposed architecture efficiently protects registers and MAC units of systolic array PEs from transient and permanent faults.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The emergence of Deep Neural Networks (DNNs) in mission- and safety-critical applications brings their reliability to the forefront. The high performance demands of DNNs require the use of specialized hardware accelerators. The systolic array architecture is widely used in DNN accelerators due to its parallelism and regular structure. This work presents a run-time reconfigurable systolic array architecture with three execution modes and four implementation options. All four implementations are evaluated in terms of resource utilization, throughput, and fault tolerance improvement. The proposed architecture is used to enhance the reliability of DNN inference on a systolic array through heterogeneous mapping of different network layers to different execution modes. The approach is supported by a novel reliability assessment method based on fault propagation analysis, which is used to explore the appropriate execution mode-layer mapping for DNN inference. The proposed architecture efficiently protects the registers and MAC units of systolic array PEs from transient and permanent faults. The reconfigurability feature enables a speedup of up to $3\times$, depending on layer vulnerability. Furthermore, it requires $6\times$ fewer resources compared to static redundancy and $2.5\times$ fewer resources compared to the previously proposed solution for transient faults.
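As a rough illustration of the heterogeneous layer-to-mode mapping described above, the sketch below assigns each network layer an execution mode based on an estimated per-layer vulnerability. It is a minimal, assumption-laden example rather than the paper's implementation: the mode names, relative speeds, thresholds, and vulnerability metric are hypothetical placeholders.

```python
# Minimal sketch (not the FORTALESA implementation): pick an execution mode
# per DNN layer from an estimated per-layer vulnerability.  Mode names,
# relative speeds, thresholds, and the vulnerability metric are assumptions.
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    vulnerability: float  # assumed metric: fraction of faults that corrupt the layer output


# Hypothetical execution modes: more redundancy means more protection, less throughput.
MODES = {
    "high_throughput": {"redundancy": 1, "relative_speed": 3.0},
    "duplication":     {"redundancy": 2, "relative_speed": 1.5},
    "triplication":    {"redundancy": 3, "relative_speed": 1.0},
}


def select_mode(layer: Layer, low: float = 0.01, high: float = 0.05) -> str:
    """Map a layer to an execution mode according to its vulnerability estimate."""
    if layer.vulnerability < low:
        return "high_throughput"
    if layer.vulnerability < high:
        return "duplication"
    return "triplication"


if __name__ == "__main__":
    layers = [Layer("conv1", 0.002), Layer("conv2", 0.03), Layer("fc", 0.12)]
    for layer in layers:
        mode = select_mode(layer)
        print(f"{layer.name}: {mode} ({MODES[mode]['relative_speed']}x relative speed)")
```

In the paper itself, the mode-layer mapping is explored with the proposed fault-propagation-based reliability assessment rather than fixed thresholds like the ones above.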
Related papers
- FusionLLM: A Decentralized LLM Training System on Geo-distributed GPUs with Adaptive Compression [55.992528247880685]
Decentralized training faces significant challenges regarding system design and efficiency.
We present FusionLLM, a decentralized training system designed and implemented for training large deep neural networks (DNNs).
We show that our system and method can achieve a 1.45-9.39x speedup compared to baseline methods while ensuring convergence.
arXiv Detail & Related papers (2024-10-16T16:13:19Z) - HYDRA: Hybrid Data Multiplexing and Run-time Layer Configurable DNN Accelerator [0.0]
The article proposes a layer-multiplexed approach that reuses a single activation function within the execution of a single layer, together with improved Fused-Multiply-Accumulate (FMA) units.
The proposed architectures achieve over 90% reduction in power consumption and improved resource utilization, delivering 35.21 TOPS/W.
arXiv Detail & Related papers (2024-09-08T05:10:02Z) - SAFFIRA: a Framework for Assessing the Reliability of
Systolic-Array-Based DNN Accelerators [0.4391603054571586]
This paper introduces a novel hierarchical software-based hardware-aware fault injection strategy tailored for systolic array-based Deep Neural Network (DNN) accelerators.
arXiv Detail & Related papers (2024-03-05T13:17:09Z) - REDS: Resource-Efficient Deep Subnetworks for Dynamic Resource Constraints [2.9209462960232235]
State-of-the-art machine learning pipelines generate resource-agnostic models that cannot adapt at runtime.
We introduce Resource-Efficient Deep Subnetworks (REDS) to tackle model adaptation to variable resources.
We provide theoretical results and empirical evidence of REDS' outstanding performance in terms of the submodels' test set accuracy.
arXiv Detail & Related papers (2023-11-22T12:34:51Z) - Detecting train driveshaft damages using accelerometer signals and
Differential Convolutional Neural Networks [67.60224656603823]
This paper proposes the development of a railway axle condition monitoring system based on advanced 2D-Convolutional Neural Network (CNN) architectures.
The resultant system converts the railway axle vibration signals into time-frequency domain representations, i.e., spectrograms, and then trains a two-dimensional CNN to classify them according to the axle's crack condition.
arXiv Detail & Related papers (2022-11-15T15:04:06Z) - enpheeph: A Fault Injection Framework for Spiking and Compressed Deep
Neural Networks [10.757663798809144]
We present enpheeph, a Fault Injection Framework for Spiking and Compressed Deep Neural Networks (DNNs).
By injecting a random and increasing number of faults, we show that DNNs can suffer an accuracy drop of more than 40% at a fault rate as low as $7 \times 10^{-7}$ faults per parameter (a minimal fault-injection sketch in this spirit is given after this list).
arXiv Detail & Related papers (2022-07-31T00:30:59Z) - Automatic Mapping of the Best-Suited DNN Pruning Schemes for Real-Time
Mobile Acceleration [71.80326738527734]
We propose a general, fine-grained structured pruning scheme and corresponding compiler optimizations.
We show that our pruning scheme mapping methods, together with the general fine-grained structured pruning scheme, outperform the state-of-the-art DNN optimization framework.
arXiv Detail & Related papers (2021-11-22T23:53:14Z) - Neural Architecture Search For LF-MMI Trained Time Delay Neural Networks [61.76338096980383]
A range of neural architecture search (NAS) techniques are used to automatically learn two types of hyper-parameters of state-of-the-art factored time delay neural networks (TDNNs).
These include the DARTS method integrating architecture selection with lattice-free MMI (LF-MMI) TDNN training.
Experiments conducted on a 300-hour Switchboard corpus suggest the auto-configured systems consistently outperform the baseline LF-MMI TDNN systems.
arXiv Detail & Related papers (2020-07-17T08:32:11Z) - BLK-REW: A Unified Block-based DNN Pruning Framework using Reweighted
Regularization Method [69.49386965992464]
We propose a new block-based pruning framework that comprises a general and flexible structured pruning dimension as well as a powerful and efficient reweighted regularization method.
Our framework is universal: it can be applied to both CNNs and RNNs, implying complete support for the two major kinds of intensive computation layers.
It is the first time that a weight pruning framework achieves universal coverage for both CNNs and RNNs with real-time mobile acceleration and no accuracy compromise.
arXiv Detail & Related papers (2020-01-23T03:30:56Z) - PatDNN: Achieving Real-Time DNN Execution on Mobile Devices with
Pattern-based Weight Pruning [57.20262984116752]
We introduce a new dimension, fine-grained pruning patterns inside the coarse-grained structures, revealing a previously unknown point in design space.
With the higher accuracy enabled by fine-grained pruning patterns, the unique insight is to use the compiler to re-gain and guarantee high hardware efficiency.
arXiv Detail & Related papers (2020-01-01T04:52:07Z)
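Several entries above (SAFFIRA, enpheeph) revolve around software-level fault injection into DNN parameters. The sketch below is a minimal, framework-agnostic illustration, not the enpheeph or SAFFIRA API: it flips random single bits in a float32 weight tensor at a configurable faults-per-parameter rate, the metric quoted in the enpheeph summary.

```python
# Minimal fault-injection sketch (not the enpheeph or SAFFIRA API):
# flip random single bits in a float32 weight tensor at a given
# faults-per-parameter rate, then count how many values changed.
import numpy as np


def inject_bit_flips(weights: np.ndarray, fault_rate: float, rng=None) -> np.ndarray:
    """Return a copy of `weights` with random single-bit flips.

    `fault_rate` is the expected number of faults per parameter,
    e.g. 7e-7 as in the enpheeph experiment summarized above.
    """
    rng = np.random.default_rng() if rng is None else rng
    flat = weights.astype(np.float32).ravel().copy()
    n_faults = rng.binomial(flat.size, min(fault_rate, 1.0))
    if n_faults:
        idx = rng.choice(flat.size, size=n_faults, replace=False)
        bits = flat.view(np.uint32)           # reinterpret floats as raw bit patterns
        bit_pos = rng.integers(0, 32, size=n_faults).astype(np.uint32)
        bits[idx] ^= np.uint32(1) << bit_pos  # flip one random bit per selected parameter
    return flat.reshape(weights.shape)


if __name__ == "__main__":
    w = np.random.randn(4096, 4096).astype(np.float32)
    faulty = inject_bit_flips(w, fault_rate=7e-7)
    print("parameters changed:", int(np.count_nonzero(w != faulty)))
```

A real campaign would reload the faulty weights into the model and re-run inference to measure the resulting accuracy drop; the hierarchical and hardware-aware aspects of SAFFIRA are beyond this sketch.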