Related papers: Gradient Backpropagation based Feature Attribution to Enable Explainable-AI on the Edge

Gradient Backpropagation based Feature Attribution to Enable Explainable-AI on the Edge

URL: http://arxiv.org/abs/2210.10922v1
Date: Wed, 19 Oct 2022 22:58:59 GMT
Title: Gradient Backpropagation based Feature Attribution to Enable Explainable-AI on the Edge
Authors: Ashwin Bhat, Adou Sangbone Assoa, Arijit Raychowdhury
Abstract summary: In this work, we analyze the dataflow of gradient backpropagation based feature attribution algorithms to determine the resource overhead required over inference. We develop a High-Level Synthesis (HLS) based FPGA design that is targeted for edge devices and supports three feature attribution algorithms. Our design methodology demonstrates a pathway to repurpose inference accelerators to support feature attribution with minimal overhead, thereby enabling real-time XAI on the edge.
Score: 1.7338677787507768
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: There has been a recent surge in the field of Explainable AI (XAI) which tackles the problem of providing insights into the behavior of black-box machine learning models. Within this field, \textit{feature attribution} encompasses methods which assign relevance scores to input features and visualize them as a heatmap. Designing flexible accelerators for multiple such algorithms is challenging since the hardware mapping of these algorithms has not been studied yet. In this work, we first analyze the dataflow of gradient backpropagation based feature attribution algorithms to determine the resource overhead required over inference. The gradient computation is optimized to minimize the memory overhead. Second, we develop a High-Level Synthesis (HLS) based configurable FPGA design that is targeted for edge devices and supports three feature attribution algorithms. Tile based computation is employed to maximally use on-chip resources while adhering to the resource constraints. Representative CNNs are trained on CIFAR-10 dataset and implemented on multiple Xilinx FPGAs using 16-bit fixed-point precision demonstrating flexibility of our library. Finally, through efficient reuse of allocated hardware resources, our design methodology demonstrates a pathway to repurpose inference accelerators to support feature attribution with minimal overhead, thereby enabling real-time XAI on the edge.

Related papers

IAMAP: Unlocking Deep Learning in QGIS for non-coders and limited computing resources [0.0]
IAMAP builds on recent advancements in self-supervised learning strategies.<n>It enables non-AI specialists to leverage the high-quality features provided by recent deep learning approaches.
arXiv Detail & Related papers (2025-08-01T13:39:43Z)
Accelerating 3D Gaussian Splatting with Neural Sorting and Axis-Oriented Rasterization [14.87046071090259]
3D Gaussian Splatting (3DGS) has recently gained significant attention for high-quality and efficient view synthesis.<n>Despite its impressive algorithmic performance, real-time rendering on resource-constrained devices remains a major challenge due to tight power and area budgets.
arXiv Detail & Related papers (2025-06-08T10:14:54Z)
A projection-based framework for gradient-free and parallel learning [50.96641619247761]
We introduce PJAX, a JAX-based software framework that enables this paradigm.<n>PJAX composes projection operators for elementary operations, automatically deriving the solution operators for the feasibility problems.<n>We train diverse architectures (MLPs, CNNs, RNNs) on standard benchmarks using PJAX, demonstrating its generality.
arXiv Detail & Related papers (2025-06-06T08:44:56Z)
ApproXAI: Energy-Efficient Hardware Acceleration of Explainable AI using Approximate Computing [0.0]
XAIedge is a novel framework that leverages approximate computing techniques into XAI algorithms, including integrated gradients, model distillation, and Shapley analysis. XAIedge translates these algorithms into approximate matrix computations and exploits the synergy between convolution, Fourier transform, and approximate computing paradigms. Our comprehensive evaluation demonstrates that XAIedge achieves a $2times$ improvement in energy efficiency compared to existing accurate XAI hardware acceleration techniques.
arXiv Detail & Related papers (2025-04-24T20:40:29Z)
On Accelerating Edge AI: Optimizing Resource-Constrained Environments [1.7355861031903428]
Resource-constrained edge deployments demand AI solutions that balance high performance with stringent compute, memory, and energy limitations. We present a comprehensive overview of the primary strategies for accelerating deep learning models under such constraints.
arXiv Detail & Related papers (2025-01-25T01:37:03Z)
Front-propagation Algorithm: Explainable AI Technique for Extracting Linear Function Approximations from Neural Networks [0.0]
This paper introduces the front-propagation algorithm, a novel AI technique designed to elucidate the decision-making logic of deep neural networks. Unlike other popular explainability algorithms such as Integrated Gradients or Shapley Values, the proposed algorithm is able to extract an accurate and consistent linear function explanation of the network. We demonstrate its efficacy in providing accurate linear functions with three different neural network architectures trained on publicly available benchmark datasets.
arXiv Detail & Related papers (2024-05-25T14:50:23Z)
GloptiNets: Scalable Non-Convex Optimization with Certificates [61.50835040805378]
We present a novel approach to non-cube optimization with certificates, which handles smooth functions on the hypercube or on the torus. By exploiting the regularity of the target function intrinsic in the decay of its spectrum, we allow at the same time to obtain precise certificates and leverage the advanced and powerful neural networks.
arXiv Detail & Related papers (2023-06-26T09:42:59Z)
Symbolic Regression on FPGAs for Fast Machine Learning Inference [2.0920303420933273]
High-energy physics community is investigating the potential of deploying machine-learning-based solutions on Field-Programmable Gate Arrays (FPGAs) We introduce a novel end-to-end procedure that utilizes a machine learning technique called symbolic regression (SR) We show that our approach can approximate a 3-layer neural network using an inference model that achieves up to a 13-fold decrease in execution time, down to 5 ns, while still preserving more than 90% approximation accuracy.
arXiv Detail & Related papers (2023-05-06T17:04:02Z)
Task-Oriented Sensing, Computation, and Communication Integration for Multi-Device Edge AI [108.08079323459822]
This paper studies a new multi-intelligent edge artificial-latency (AI) system, which jointly exploits the AI model split inference and integrated sensing and communication (ISAC) We measure the inference accuracy by adopting an approximate but tractable metric, namely discriminant gain.
arXiv Detail & Related papers (2022-07-03T06:57:07Z)
Learning Implicit Feature Alignment Function for Semantic Segmentation [51.36809814890326]
Implicit Feature Alignment function (IFA) is inspired by the rapidly expanding topic of implicit neural representations. We show that IFA implicitly aligns the feature maps at different levels and is capable of producing segmentation maps in arbitrary resolutions. Our method can be combined with improvement on various architectures, and it achieves state-of-the-art accuracy trade-off on common benchmarks.
arXiv Detail & Related papers (2022-06-17T09:40:14Z)
Real-time Hyper-Dimensional Reconfiguration at the Edge using Hardware Accelerators [12.599871451119538]
HyDRATE can perform real-time reconfiguration at the edge using deep neural nets (DNN) combined with hyperdimensional (HD) computing accelerators. We describe the algorithm, trained quantized model generation, and simulated performance of a feature extractor free of multiply-accumulates. We show that reconfigurability in the field is achieved by retraining only the feed-forward HD classifier without descent gradient backpropagation.
arXiv Detail & Related papers (2022-06-10T14:08:41Z)
Inducing Gaussian Process Networks [80.40892394020797]
We propose inducing Gaussian process networks (IGN), a simple framework for simultaneously learning the feature space as well as the inducing points. The inducing points, in particular, are learned directly in the feature space, enabling a seamless representation of complex structured domains. We report on experimental results for real-world data sets showing that IGNs provide significant advances over state-of-the-art methods.
arXiv Detail & Related papers (2022-04-21T05:27:09Z)
Hardware-Efficient Deconvolution-Based GAN for Edge Computing [1.5229257192293197]
Generative Adversarial Networks (GAN) are cutting-edge algorithms for generating new data samples based on the learned data distribution. We proposed an HW/SW co-design approach for training quantized deconvolution GAN (QDCGAN) implemented on FPGA using a scalable streaming dataflow architecture. Various precisions, datasets, and network scalability were analyzed for low-power inference on resource-constrained platforms.
arXiv Detail & Related papers (2022-01-18T11:16:59Z)
Lightweight Single-Image Super-Resolution Network with Attentive Auxiliary Feature Learning [73.75457731689858]
We develop a computation efficient yet accurate network based on the proposed attentive auxiliary features (A$2$F) for SISR. Experimental results on large-scale dataset demonstrate the effectiveness of the proposed model against the state-of-the-art (SOTA) SR methods.
arXiv Detail & Related papers (2020-11-13T06:01:46Z)
ConfuciuX: Autonomous Hardware Resource Assignment for DNN Accelerators using Reinforcement Learning [5.251940442946459]
We propose an autonomous strategy called ConfuciuX to find optimized HW resource assignments for a given model and dataflow style. It converges to the optimized hardware configuration 4.7 to 24 times faster than alternate techniques.
arXiv Detail & Related papers (2020-09-04T04:59:26Z)
Heuristic Semi-Supervised Learning for Graph Generation Inspired by Electoral College [80.67842220664231]
We propose a novel pre-processing technique, namely ELectoral COllege (ELCO), which automatically expands new nodes and edges to refine the label similarity within a dense subgraph. In all setups tested, our method boosts the average score of base models by a large margin of 4.7 points, as well as consistently outperforms the state-of-the-art.
arXiv Detail & Related papers (2020-06-10T14:48:48Z)

This list is automatically generated from the titles and abstracts of the papers in this site.