HDReason: Algorithm-Hardware Codesign for Hyperdimensional Knowledge
Graph Reasoning
- URL: http://arxiv.org/abs/2403.05763v1
- Date: Sat, 9 Mar 2024 02:17:43 GMT
- Title: HDReason: Algorithm-Hardware Codesign for Hyperdimensional Knowledge
Graph Reasoning
- Authors: Hanning Chen, Yang Ni, Ali Zakeri, Zhuowen Zou, Sanggeon Yun, Fei Wen,
Behnam Khaleghi, Narayan Srinivasa, Hugo Latapie, and Mohsen Imani
- Abstract summary: Brain-inspired HyperDimensional Computing (HDC) has been introduced as a promising solution for lightweight machine learning.
In this paper, we leverage HDC for an intrinsically more efficient and acceleration-friendly Knowledge Graph Completion (KGC) algorithm.
We also co-design an acceleration framework named HDReason targeting FPGA platforms.
- Score: 18.790512589967875
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, a plethora of hardware accelerators have been proposed
for graph learning applications such as vertex classification and graph
classification. However, previous works have paid little attention to Knowledge
Graph Completion (KGC), a task known for its significantly higher algorithmic
complexity. State-of-the-art KGC solutions based on graph convolutional neural
networks (GCNs) involve extensive vertex/relation embedding updates and
complicated score functions, which are inherently cumbersome to accelerate. As a
result, existing accelerator designs are no longer optimal, and a novel
algorithm-hardware co-design for KG reasoning is needed.
Recently, brain-inspired HyperDimensional Computing (HDC) has been introduced
as a promising solution for lightweight machine learning, particularly for
graph learning applications. In this paper, we leverage HDC for an
intrinsically more efficient and acceleration-friendly KGC algorithm. We also
co-design an acceleration framework named HDReason targeting FPGA platforms. At
the algorithm level, HDReason balances high reasoning accuracy, strong model
interpretability, and low computational complexity. At the architecture level,
HDReason offers reconfigurability, high training throughput, and low energy
consumption. Compared with an NVIDIA RTX 4090 GPU, the proposed accelerator
achieves an average 10.6x speedup and a 65x energy-efficiency improvement. In
cross-model and cross-platform comparisons, HDReason yields on average 4.2x
higher performance and 3.4x better energy efficiency at similar accuracy versus
the state-of-the-art FPGA-based GCN training platform.
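The abstract argues that HDC replaces GCN-style embedding updates and score
functions with wide element-wise arithmetic, which is what makes it
acceleration-friendly. The sketch below illustrates the generic HDC bind/unbind
primitives on a knowledge-graph triple in NumPy. It is a minimal illustration of
this style of computation, not HDReason's published encoding, and the entity and
relation names in it are made up for the example.

```python
import numpy as np

# A minimal sketch of generic HDC primitives (not HDReason's algorithm).
# Everything is element-wise arithmetic on wide bipolar vectors, which is
# why this style of computation maps well onto FPGA fabric.

D = 10_000                      # hypervector dimensionality (typically ~10^4)
rng = np.random.default_rng(0)

def hv():
    """Random bipolar hypervector in {-1, +1}^D."""
    return rng.choice([-1, 1], size=D)

def bind(a, b):
    """Binding (element-wise product) associates two concepts.
    Self-inverse for bipolar vectors: bind(bind(a, b), b) == a."""
    return a * b

def cos(a, b):
    """Similarity in [-1, 1]; near 0 for unrelated hypervectors."""
    return float(a @ b) / D

# Hypothetical entities and relation, purely for illustration.
paris, france, capital_of = hv(), hv(), hv()

# Encode the triple (paris, capital_of, france) as one hypervector.
fact = bind(bind(paris, capital_of), france)

# Query "(paris, capital_of, ?)": unbind the head and relation, then
# compare the result against candidate tail entities.
recovered = bind(fact, bind(paris, capital_of))
print(cos(recovered, france))   # 1.0: the stored tail is recovered exactly
print(cos(recovered, hv()))     # ~0.0: unrelated entities stay dissimilar
```

The score here is a single multiply-accumulate pass over a hypervector, in
contrast to the multi-layer message passing a GCN-based scorer requires; that
contrast is the intuition behind the abstract's efficiency claims.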
Related papers
- Accel-GCN: High-Performance GPU Accelerator Design for Graph Convolution
Networks [12.181052673940465]
Graph Convolutional Networks (GCNs) are pivotal in extracting latent information from graph data across various domains.
We present Accel-GCN, a GPU accelerator architecture for GCNs.
Evaluation of Accel-GCN across 18 benchmark graphs reveals that it outperforms cuSPARSE, GNNAdvisor, and graph-BLAST by factors of 1.17x, 1.86x, and 2.94x, respectively.
arXiv Detail & Related papers (2023-08-22T23:12:17Z)
- INR-Arch: A Dataflow Architecture and Compiler for Arbitrary-Order Gradient Computations in Implicit Neural Representation Processing [66.00729477511219]
Given a function represented as a computation graph, traditional architectures face challenges in efficiently computing its nth-order gradient.
We introduce INR-Arch, a framework that transforms the computation graph of an nth-order gradient into a hardware-optimized dataflow architecture.
We present results demonstrating 1.8-4.8x and 1.5-3.6x speedups over CPU and GPU baselines, respectively.
arXiv Detail & Related papers (2023-08-11T04:24:39Z)
- Scalable Graph Convolutional Network Training on Distributed-Memory Systems [5.169989177779801]
Graph Convolutional Networks (GCNs) are extensively utilized for deep learning on graphs.
Since the convolution operation on graphs induces irregular memory access patterns, designing a memory- and communication-efficient parallel algorithm for GCN training poses unique challenges.
We propose a highly parallel training algorithm that scales to large processor counts.
arXiv Detail & Related papers (2022-12-09T17:51:13Z)
- A Comprehensive Study on Large-Scale Graph Training: Benchmarking and Rethinking [124.21408098724551]
Large-scale graph training is a notoriously challenging problem for graph neural networks (GNNs).
We present a new ensembling training manner, named EnGCN, to address the existing issues.
Our proposed method has achieved new state-of-the-art (SOTA) performance on large-scale datasets.
arXiv Detail & Related papers (2022-10-14T03:43:05Z)
- Learning to Solve Combinatorial Graph Partitioning Problems via Efficient Exploration [72.15369769265398]
Experimentally, ECORD achieves a new SOTA for RL algorithms on the Maximum Cut problem.
Compared to the nearest competitor, ECORD reduces the optimality gap by up to 73%.
arXiv Detail & Related papers (2022-05-27T17:13:10Z)
- GraphHD: Efficient graph classification using hyperdimensional computing [58.720142291102135]
We present a baseline approach for graph classification with HDC.
We evaluate GraphHD on real-world graph classification problems.
Our results show that the proposed model achieves accuracy comparable to state-of-the-art Graph Neural Networks (GNNs); a sketch of the general HDC classification recipe appears after this list.
arXiv Detail & Related papers (2022-05-16T17:32:58Z)
- FPGA-optimized Hardware acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task.
The design targets a Xilinx Artix-7 FPGA, using around 40% of the available hardware resources in total.
Compared to its full-precision software counterpart, it reduces classification time by three orders of magnitude, with a small 4.5% impact on accuracy.
arXiv Detail & Related papers (2022-01-18T13:59:22Z)
- SPA-GCN: Efficient and Flexible GCN Accelerator with an Application for Graph Similarity Computation [7.54579279348595]
We propose a flexible architecture called SPA-GCN for accelerating Graph Convolutional Networks (GCNs).
We show that SPA-GCN can deliver a high speedup compared to a multi-core CPU implementation and a GPU implementation.
arXiv Detail & Related papers (2021-11-10T20:47:57Z)
- Optimizing Memory Placement using Evolutionary Graph Reinforcement Learning [56.83172249278467]
We introduce Evolutionary Graph Reinforcement Learning (EGRL), a method designed for large search spaces.
We train and validate our approach directly on the Intel NNP-I chip for inference.
We additionally achieve 28-78% speed-up compared to the native NNP-I compiler on all three workloads.
arXiv Detail & Related papers (2020-07-14T18:50:12Z)
- GraphACT: Accelerating GCN Training on CPU-FPGA Heterogeneous Platforms [1.2183405753834562]
Graph Convolutional Networks (GCNs) have emerged as the state-of-the-art deep learning model for representation learning on graphs.
It is challenging to accelerate training of GCNs due to substantial and irregular data communication.
We design a novel accelerator for training GCNs on CPU-FPGA heterogeneous systems.
arXiv Detail & Related papers (2019-12-31T21:19:01Z)
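For the GraphHD entry above, the following rough sketch shows the general HDC
graph-classification recipe: encode each graph as the bundled bindings of its
edges, build one prototype hypervector per class, and classify by nearest
prototype. GraphHD's actual encoding additionally assigns node hypervectors by
PageRank rank; this sketch simply indexes a random codebook by node id, so treat
it as illustrative only.

```python
import numpy as np

# Illustrative HDC graph classification in the spirit of GraphHD (the
# paper's exact encoding, which orders nodes by PageRank, is not shown).

D = 10_000
rng = np.random.default_rng(1)
codebook = rng.choice([-1, 1], size=(64, D))    # one hypervector per node id

def encode_graph(edges):
    """Bundle (sum) the bindings of all edges into one hypervector."""
    acc = np.zeros(D)
    for u, v in edges:
        acc += codebook[u] * codebook[v]        # bind the two endpoints
    return np.sign(acc)                         # re-binarize the bundle

def train(graphs, labels, n_classes):
    """A class prototype is the bundle of its training-graph encodings."""
    protos = np.zeros((n_classes, D))
    for g, y in zip(graphs, labels):
        protos[y] += encode_graph(g)
    return np.sign(protos)

def predict(protos, g):
    """Nearest-prototype classification by dot-product similarity."""
    return int(np.argmax(protos @ encode_graph(g)))

# Toy usage: a triangle (class 0) vs. a path (class 1), then a triangle
# with one extra pendant edge as the query; expected prediction is 0.
triangle = [(0, 1), (1, 2), (2, 0)]
path = [(0, 1), (1, 2), (2, 3)]
protos = train([triangle, path], [0, 1], n_classes=2)
print(predict(protos, [(0, 1), (1, 2), (2, 0), (1, 3)]))
```

Both training and inference are bundle-and-compare passes with no gradient
computation, which is what makes HDC attractive as a lightweight baseline
against GNNs.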