A Scalable and Reproducible System-on-Chip Simulation for Reinforcement
Learning
- URL: http://arxiv.org/abs/2104.13187v1
- Date: Tue, 27 Apr 2021 13:46:57 GMT
- Title: A Scalable and Reproducible System-on-Chip Simulation for Reinforcement
Learning
- Authors: Tegg Taekyong Sung, Bo Ryu
- Abstract summary: This paper presents gym-ds3, a scalable and reproducible open environment tailored for a high-fidelity Domain-Specific System-on-Chip (DSSoC) application.
The simulation schedules hierarchical jobs onto heterogeneous System-on-Chip (SoC) processors and bridges the system to reinforcement learning research.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Reinforcement Learning (DRL) operates in a simulated environment and
optimizes objective goals. By extending the conventional interaction scheme,
this paper presents gym-ds3, a scalable and reproducible open environment
tailored for a high-fidelity Domain-Specific System-on-Chip (DSSoC)
application. The simulation schedules hierarchical jobs onto
heterogeneous System-on-Chip (SoC) processors and bridges the system to
reinforcement learning research. We systematically analyze the representative
SoC simulator and discuss its primary challenges: the system (1)
continuously generates jobs indefinitely at a rapid injection rate, (2) optimizes
complex objectives, and (3) operates in steady-state scheduling. We provide
exemplary snippets and experimentally demonstrate that the run-time performance of
different schedulers successfully mimics results achieved from the standard
DS3 framework and real-world embedded systems.
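The interaction scheme described in the abstract can be pictured with a small toy loop. The sketch below is not the gym-ds3 API: the class `ToySoCEnv`, the task names, the execution-time table, and the reward (negative job latency) are all assumptions made for illustration. It only shows the general shape of a reset/step environment in which jobs are injected at random, each job is a chain of dependent tasks, and a scheduler assigns each ready task to one of several heterogeneous processors.

```python
import random

# Hypothetical execution times (in time units) per task type and processor.
EXEC_TIME = {
    "fft":    {"cpu": 8, "accel": 2},
    "decode": {"cpu": 4, "accel": 6},
}
PROCESSORS = ["cpu", "accel"]
JOB_TEMPLATE = ["fft", "decode"]  # a job is a chain: fft must finish before decode


class ToySoCEnv:
    """Toy scheduling environment with a reset/step interface (not gym-ds3)."""

    def __init__(self, injection_prob=0.5, horizon=200, seed=0):
        self.injection_prob = injection_prob  # chance of a new job per step
        self.horizon = horizon                # number of steps per episode
        self.seed = seed                      # fixed seed for reproducible runs

    def reset(self):
        self.rng = random.Random(self.seed)
        self.now = 0
        self.free_at = {p: 0 for p in PROCESSORS}
        self.pending = []  # entries: [remaining_tasks, ready_time, arrival_time]
        self._inject(force=True)
        return self._observe()

    def _inject(self, force=False):
        # Add a fresh job, either unconditionally or with probability injection_prob.
        if force or self.rng.random() < self.injection_prob:
            self.pending.append([list(JOB_TEMPLATE), self.now, self.now])

    def _observe(self):
        head = self.pending[0][0][0] if self.pending else None
        backlog = {p: max(0, t - self.now) for p, t in self.free_at.items()}
        return {"head_task": head, "backlog": backlog}

    def step(self, action):
        """Assign the next task of the job at the front of the queue to `action`."""
        reward = 0.0
        if self.pending:
            tasks, ready, arrival = self.pending.pop(0)
            task = tasks.pop(0)
            start = max(self.now, ready, self.free_at[action])
            finish = start + EXEC_TIME[task][action]
            self.free_at[action] = finish
            if tasks:
                # The successor task becomes ready once this one finishes.
                self.pending.append([tasks, finish, arrival])
            else:
                # Job complete: reward is its negative end-to-end latency.
                reward = -float(finish - arrival)
        self.now += 1
        self._inject()
        done = self.now >= self.horizon
        return self._observe(), reward, done, {}


def run(policy, **kwargs):
    """Roll out one episode and return the summed reward."""
    env = ToySoCEnv(**kwargs)
    obs, total, done = env.reset(), 0.0, False
    while not done:
        obs, reward, done, _ = env.step(policy(obs))
        total += reward
    return total


def fastest_processor(obs):
    """Greedy heuristic: minimize processor backlog plus execution time."""
    task = obs["head_task"]
    if task is None:
        return PROCESSORS[0]
    return min(PROCESSORS, key=lambda p: obs["backlog"][p] + EXEC_TIME[task][p])


def always_cpu(obs):
    """Baseline that ignores the accelerator."""
    return "cpu"
```

With injection disabled, `run(fastest_processor, injection_prob=0.0, horizon=3)` returns -6.0 and `run(always_cpu, injection_prob=0.0, horizon=3)` returns -12.0, since the single job finishes at time 6 and time 12 respectively. A learned policy would take the place of these hand-written heuristics.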
Related papers
- Digital Twin-Enabled Real-Time Control in Robotic Additive Manufacturing via Soft Actor-Critic Reinforcement Learning [2.5709786140685633]
This research presents a novel approach integrating Soft Actor-Critic (SAC) reinforcement learning with digital twin technology.
We demonstrate our methodology using a Viper X300s robot arm, implementing two distinct control scenarios.
Results show rapid policy convergence and robust task execution in both simulated and physical environments.
arXiv Detail & Related papers (2025-01-29T22:06:53Z) - Entropy-Regularized Token-Level Policy Optimization for Language Agent Reinforcement [67.1393112206885]
Large Language Models (LLMs) have shown promise as intelligent agents in interactive decision-making tasks.
We introduce Entropy-Regularized Token-level Policy Optimization (ETPO), an entropy-augmented RL method tailored for optimizing LLMs at the token level.
We assess the effectiveness of ETPO within a simulated environment that models data science code generation as a series of multi-step interactive tasks.
arXiv Detail & Related papers (2024-02-09T07:45:26Z) - Robust-MBDL: A Robust Multi-branch Deep Learning Based Model for
Remaining Useful Life Prediction and Operational Condition Identification of
Rotating Machines [1.2593669712329136]
The proposed system comprises three main components: (1) an LSTM-Autoencoder to denoise the vibration data; (2) a feature extraction step to generate time-domain, frequency-domain, and time-frequency based features from the denoised data; and (3) a novel and robust multi-branch deep learning network architecture to exploit the multiple features.
The performance of our proposed system was evaluated and compared to the state-of-the-art systems on two benchmark datasets of XJTU-SY and PRONOSTIA.
arXiv Detail & Related papers (2023-09-12T11:58:53Z) - CONSTRUCT: A Program Synthesis Approach for Reconstructing Control
Algorithms from Embedded System Binaries in Cyber-Physical Systems [39.78288224911617]
We introduce a novel approach to automatically synthesize a mathematical representation of the control algorithms implemented in industrial cyber-physical systems.
The output model can be used by subject matter experts to assess the system's compliance with the expected behavior.
arXiv Detail & Related papers (2023-08-01T03:10:55Z) - Optimising Highly-Parallel Simulation-Based Verification of
Cyber-Physical Systems [0.0]
Cyber-Physical Systems (CPSs) arise in many industry-relevant domains and are often mission- or safety-critical.
System-Level Verification (SLV) of CPSs aims at certifying that given (e.g. safety or liveness) specifications are met or at estimating the value of some.
arXiv Detail & Related papers (2023-07-28T08:08:27Z) - ETLP: Event-based Three-factor Local Plasticity for online learning with
neuromorphic hardware [105.54048699217668]
We show that Event-Based Three-factor Local Plasticity (ETLP) achieves competitive accuracy with a clear advantage in computational complexity.
We also show that when using local plasticity, threshold adaptation in spiking neurons and a recurrent topology are necessary to learn temporal patterns with a rich temporal structure.
arXiv Detail & Related papers (2023-01-19T19:45:42Z) - SimVPv2: Towards Simple yet Powerful Spatiotemporal Predictive Learning [61.419914155985886]
We propose SimVPv2, a streamlined model that eliminates the need for Unet architectures for spatial and temporal modeling.
SimVPv2 not only simplifies the model architecture but also improves both performance and computational efficiency.
On the standard Moving MNIST benchmark, SimVPv2 achieves superior performance compared to SimVP, with fewer FLOPs, about half the training time, and 60% faster inference.
arXiv Detail & Related papers (2022-11-22T08:01:33Z) - Deep Bayesian Active Learning for Accelerating Stochastic Simulation [74.58219903138301]
Interactive Neural Process (INP) is a deep active learning framework for stochastic simulations.
For active learning, we propose a novel acquisition function, Latent Information Gain (LIG), calculated in the latent space of NP based models.
The results demonstrate STNP outperforms the baselines in the learning setting and LIG achieves the state-of-the-art for active learning.
arXiv Detail & Related papers (2021-06-05T01:31:51Z) - Computational framework for real-time diagnostics and prognostics of
aircraft actuation systems [0.0]
This work addresses the three phases of the prognostic flow: signal acquisition, Fault Detection and Identification, and Remaining Useful Life estimation.
To achieve this goal, we propose to combine information from physical models of different fidelity with machine learning techniques.
The methodology is assessed for the FDI and RUL estimation of an aircraft electromechanical actuator for secondary flight controls.
arXiv Detail & Related papers (2020-09-30T12:53:07Z) - Combining Machine Learning with Knowledge-Based Modeling for Scalable
Forecasting and Subgrid-Scale Closure of Large, Complex, Spatiotemporal
Systems [48.7576911714538]
We attempt to utilize machine learning as the essential tool for integrating past temporal data into predictions.
We propose combining two approaches: (i) a parallel machine learning prediction scheme; and (ii) a hybrid technique, for a composite prediction system composed of a knowledge-based component and a machine-learning-based component.
We demonstrate that not only can this method combining (i) and (ii) be scaled to give excellent performance for very large systems, but also that the length of time series data needed to train our multiple, parallel machine learning components is dramatically less than that necessary without parallelization.
arXiv Detail & Related papers (2020-02-10T23:21:50Z) - Information Theoretic Model Predictive Q-Learning [64.74041985237105]
We present a novel theoretical connection between information theoretic MPC and entropy regularized RL.
We develop a Q-learning algorithm that can leverage biased models.
arXiv Detail & Related papers (2019-12-31T00:29:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.