Design and Implementation of a RISC-V SoC with Custom DSP Accelerators for Edge Computing
- URL: http://arxiv.org/abs/2506.06693v1
- Date: Sat, 07 Jun 2025 07:17:40 GMT
- Title: Design and Implementation of a RISC-V SoC with Custom DSP Accelerators for Edge Computing
- Authors: Priyanshu Yadav
- Abstract summary: We examine the RV32I base instruction set with extensions for multiplication (M) and atomic operations (A). Our results demonstrate RISC-V's advantages in embedded systems and its scalability for custom accelerators.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents a comprehensive analysis of the RISC-V instruction set architecture, focusing on its modular design, implementation challenges, and performance characteristics. We examine the RV32I base instruction set with extensions for multiplication (M) and atomic operations (A). Through cycle-accurate simulation of a pipelined implementation, we evaluate performance metrics including CPI (cycles per instruction) and power efficiency. Our results demonstrate RISC-V's advantages in embedded systems and its scalability for custom accelerators. Comparative analysis shows a 17% reduction in power consumption compared to ARM Cortex-M0 implementations in similar process nodes. The open-standard nature of RISC-V provides significant flexibility for domain-specific optimizations.
Related papers
- Tensor Program Optimization for the RISC-V Vector Extension Using Probabilistic Programs [0.6242215470795112]
We present a workflow based on the TVM compiler to efficiently map AI workloads onto RISC-V vector units. Our proposal shows a mean improvement of 46% in execution latency when compared against the autovectorization feature of GCC. We open-sourced our proposal for the community to expand it to target other RISC-V extensions.
arXiv Detail & Related papers (2025-07-02T08:15:33Z) - ShadowBinding: Realizing Effective Microarchitectures for In-Core Secure Speculation Schemes [1.359473465752453]
We present effective microarchitectures for two state-of-the-art secure schemes. We find that the IPC impact of in-core secure schemes is higher than previously estimated.
arXiv Detail & Related papers (2025-04-09T16:33:42Z) - Efficient Risk-sensitive Planning via Entropic Risk Measures [51.42922439693624]
We show that only Entropic Risk Measures (EntRM) can be efficiently optimized through dynamic programming. We prove that this optimality front can be computed effectively thanks to a novel structural analysis and smoothness properties of entropic risks.
arXiv Detail & Related papers (2025-02-27T09:56:51Z) - Graph Structure Refinement with Energy-based Contrastive Learning [56.957793274727514]
We introduce an unsupervised method based on a joint of generative training and discriminative training to learn graph structure and representation. We propose an Energy-based Contrastive Learning (ECL) guided Graph Structure Refinement (GSR) framework, denoted as ECL-GSR. ECL-GSR achieves faster training with fewer samples and memories against the leading baseline, highlighting its simplicity and efficiency in downstream tasks.
arXiv Detail & Related papers (2024-12-20T04:05:09Z) - AUCSeg: AUC-oriented Pixel-level Long-tail Semantic Segmentation [88.50256898176269]
We develop a pixel-level AUC loss function and conduct a dependency-graph-based theoretical analysis of the algorithm's generalization ability.
We also design a Tail-Classes Memory Bank to manage the significant memory demand.
arXiv Detail & Related papers (2024-09-30T15:31:02Z) - Mixed-precision Neural Networks on RISC-V Cores: ISA extensions for Multi-Pumped Soft SIMD Operations [5.847997723738113]
Modern embedded microprocessors provide very limited support for mixed-precision NNs.
We introduce a hardware-software co-design framework that enables cooperative hardware design, mixed-precision quantization, ISA extensions and inference.
Our framework can achieve, on average, 15x energy reduction for less than 1% accuracy loss and outperforms the ISA-agnostic state-of-the-art RISC-V cores.
arXiv Detail & Related papers (2024-07-19T12:54:04Z) - RISC-V R-Extension: Advancing Efficiency with Rented-Pipeline for Edge DNN Processing [0.8192907805418583]
This paper introduces the RISC-V R-extension, a novel approach to enhancing deep neural network (DNN) process efficiency on edge devices.
The extension features rented-pipeline stages and architectural pipeline registers (APR), which optimize critical operation execution, thereby reducing latency and memory access frequency.
arXiv Detail & Related papers (2024-07-02T19:25:05Z) - Reconfigurable Distributed FPGA Cluster Design for Deep Learning Accelerators [59.11160990637615]
We propose a distributed system based on low-power embedded FPGAs designed for edge computing applications.
The proposed system can simultaneously execute diverse Neural Network (NN) models, arrange the graph in a pipeline structure, and manually allocate greater resources to the most computationally intensive layers of the NN graph.
arXiv Detail & Related papers (2023-05-24T16:08:55Z) - Performance Embeddings: A Similarity-based Approach to Automatic Performance Optimization [71.69092462147292]
Performance embeddings enable knowledge transfer of performance tuning between applications.
We demonstrate this transfer tuning approach on case studies in deep neural networks, dense and sparse linear algebra compositions, and numerical weather prediction stencils.
arXiv Detail & Related papers (2023-03-14T15:51:35Z) - Evolving Pareto-Optimal Actor-Critic Algorithms for Generalizability and Stability [67.8426046908398]
Generalizability and stability are two key objectives for operating reinforcement learning (RL) agents in the real world.
This paper presents MetaPG, an evolutionary method for automated design of actor-critic loss functions.
arXiv Detail & Related papers (2022-04-08T20:46:16Z) - Simulation platform for pattern recognition based on reservoir computing with memristor networks [1.5664378826358722]
We develop a simulation platform for reservoir computing (RC) with memristor device networks.
We show that the memristor-network-based RC systems can yield high computational performance comparable to that of state-of-the-art methods in three time series classification tasks.
arXiv Detail & Related papers (2021-12-01T03:06:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.