Theta-Resonance: A Single-Step Reinforcement Learning Method for Design Space Exploration
- URL: http://arxiv.org/abs/2211.02052v1
- Date: Thu, 3 Nov 2022 16:08:40 GMT
- Title: Theta-Resonance: A Single-Step Reinforcement Learning Method for Design Space Exploration
- Authors: Masood S. Mortazavi, Tiancheng Qin, Ning Yan
- Abstract summary: We use Theta-Resonance to train an intelligent agent producing progressively more optimal samples.
We specialize existing policy gradient algorithms in deep reinforcement learning (D-RL) to update our policy network.
Although we present only categorical design spaces, we also outline how to use Theta-Resonance to explore continuous and mixed continuous-discrete design spaces.
- Score: 10.184056098238766
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Given an environment (e.g., a simulator) for evaluating samples in a
specified design space and a set of weighted evaluation metrics -- one can use
Theta-Resonance, a single-step Markov Decision Process (MDP), to train an
intelligent agent producing progressively more optimal samples. In
Theta-Resonance, a neural network consumes a constant input tensor and produces
a policy as a set of conditional probability density functions (PDFs) for
sampling each design dimension. We specialize existing policy gradient
algorithms in deep reinforcement learning (D-RL) in order to use evaluation
feedback (in terms of cost, penalty or reward) to update our policy network
with robust algorithmic stability and minimal design evaluations. We study
multiple neural architectures (for our policy network) within the context of a
simple SoC design space and propose a method of constructing synthetic space
exploration problems to compare and improve design space exploration (DSE)
algorithms. Although we only present categorical design spaces, we also outline
how to use Theta-Resonance in order to explore continuous and mixed
continuous-discrete design spaces.
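To make the single-step setup concrete, here is a minimal sketch in the spirit of the abstract: a policy network consumes a constant input tensor, emits categorical logits for each design dimension, and is updated with REINFORCE from a scalar evaluation cost. The simulator stub `evaluate_design`, the dimension sizes, and the architecture are illustrative assumptions, and the sketch factorizes the per-dimension distributions independently, whereas the paper describes conditional PDFs.

```python
# Minimal single-step policy-gradient sketch in the spirit of Theta-Resonance.
# All sizes, the architecture, and `evaluate_design` are placeholders, not the
# paper's actual SoC design space, policy network, or update rule.
import torch
import torch.nn as nn

DIM_SIZES = [4, 8, 3]             # categorical choices per design dimension (assumed)
CONST_INPUT = torch.ones(1, 16)   # the constant input tensor the policy consumes

class PolicyNet(nn.Module):
    def __init__(self, in_dim=16, hidden=64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        # one categorical head (a set of logits) per design dimension
        self.heads = nn.ModuleList(nn.Linear(hidden, k) for k in DIM_SIZES)

    def forward(self, x):
        h = self.body(x)
        return [head(h) for head in self.heads]

def evaluate_design(design):
    """Simulator stand-in: returns a scalar cost, lower is better (dummy)."""
    return float(sum(design))

policy = PolicyNet()
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)
baseline = 0.0  # running-mean baseline for variance reduction (our choice)

for _ in range(200):
    dists = [torch.distributions.Categorical(logits=l)
             for l in policy(CONST_INPUT)]
    choices = [d.sample() for d in dists]
    log_prob = sum(d.log_prob(c) for d, c in zip(dists, choices)).sum()
    cost = evaluate_design([int(c) for c in choices])
    baseline = 0.9 * baseline + 0.1 * cost
    loss = (cost - baseline) * log_prob   # REINFORCE on a one-step episode
    opt.zero_grad(); loss.backward(); opt.step()
```

Because each episode is a single step, the policy gradient reduces to weighting the sampled design's log-probability by its baselined cost.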
Related papers
- CORE: Constraint-Aware One-Step Reinforcement Learning for Simulation-Guided Neural Network Accelerator Design [3.549422886703227]
CORE is a constraint-aware, one-step reinforcement learning method for simulation-guided DSE.
We instantiate CORE for hardware-mapping co-design of neural network accelerators; a generic constraint-masking sketch follows this entry.
arXiv Detail & Related papers (2025-06-04T01:08:34Z)
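CORE's precise constraint mechanism is not spelled out in this summary; one generic way to make a one-step categorical policy constraint-aware is to mask infeasible choices before sampling. The mask semantics below are our own illustration, not CORE's actual method.

```python
# Generic constraint masking for a categorical design policy: infeasible
# choices get -inf logits, so they receive zero probability. An illustration
# of the general idea only, not CORE's actual mechanism.
import torch

def masked_sample(logits, feasible):
    """logits: (k,) raw scores; feasible: (k,) bool mask, True = allowed."""
    masked = logits.masked_fill(~feasible, float("-inf"))
    dist = torch.distributions.Categorical(logits=masked)
    choice = dist.sample()
    return choice, dist.log_prob(choice)

choice, logp = masked_sample(
    torch.randn(6),
    torch.tensor([True, True, False, True, False, True]),
)  # never selects index 2 or 4
```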
- Inverse Design in Distributed Circuits Using Single-Step Reinforcement Learning [10.495642893440351]
DCIDA is a design exploration framework that learns a near-optimal design sampling policy for a target transfer function.
Our experiments demonstrate that DCIDA's Transformer-based policy network achieves significant reductions in design error.
arXiv Detail & Related papers (2025-06-02T02:31:52Z)
- Self-Supervised Learning-Based Path Planning and Obstacle Avoidance Using PPO and B-Splines in Unknown Environments [0.0]
Smart BSP is an advanced self-supervised learning framework for real-time path planning and obstacle avoidance in autonomous robotics.
The proposed system integrates Proximal Policy Optimization (PPO) with Convolutional Neural Networks (CNNs) and an Actor-Critic architecture.
During training, a cost function is minimized that accounts for path curvature, endpoint proximity, and obstacle avoidance; a sketch of such a cost follows this entry.
arXiv Detail & Related papers (2024-12-03T05:20:29Z)
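As referenced above, a composite path cost of the kind the summary describes might look as follows. The weights, the discrete curvature estimate, and the hinge-style obstacle penalty are all assumptions, not the paper's exact formulation.

```python
# Illustrative composite path cost: curvature, endpoint proximity, and
# obstacle-clearance terms over points sampled along a spline.
import numpy as np

def path_cost(pts, goal, obstacles, r_safe=0.5, w=(1.0, 1.0, 10.0)):
    """pts: (N,2) points along the path; obstacles: (M,2) obstacle centers."""
    d1 = np.gradient(pts, axis=0)                  # first derivative (velocity)
    d2 = np.gradient(d1, axis=0)                   # second derivative
    cross = d1[:, 0] * d2[:, 1] - d1[:, 1] * d2[:, 0]
    speed = np.linalg.norm(d1, axis=1) + 1e-8
    curvature = np.abs(cross) / speed**3           # planar curvature estimate
    c_curv = curvature.mean()
    c_goal = np.linalg.norm(pts[-1] - goal)        # endpoint proximity
    dists = np.linalg.norm(pts[:, None, :] - obstacles[None, :, :], axis=2)
    c_obs = np.maximum(0.0, r_safe - dists).sum()  # safety-margin violations
    return w[0] * c_curv + w[1] * c_goal + w[2] * c_obs

pts = np.linspace([0, 0], [5, 5], 30) + 0.1 * np.random.randn(30, 2)
cost = path_cost(pts, goal=np.array([5.0, 5.0]), obstacles=np.array([[2.5, 2.5]]))
```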
- Single-Trajectory Distributionally Robust Reinforcement Learning [21.955807398493334]
We propose Distributionally Robust RL (DRRL) to enhance performance across a range of environments.
Existing DRRL algorithms are either model-based or fail to learn from a single sample trajectory.
We design the first fully model-free DRRL algorithm, called distributionally robust Q-learning with single trajectory (DRQ); a standard dual identity behind such worst-case objectives is sketched after this entry.
arXiv Detail & Related papers (2023-01-27T14:08:09Z)
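For context on such worst-case objectives: when the uncertainty set is a KL ball of radius rho around the nominal transition kernel p, the inner minimization admits a standard dual with a scalar multiplier beta, which is what makes sample-based, model-free estimation plausible. The identity below is the standard result; DRQ's actual single-trajectory estimator is the paper's own construction.

```latex
\[
\inf_{q:\,\mathrm{KL}(q\,\|\,p)\le\rho} \mathbb{E}_{s'\sim q}\big[V(s')\big]
\;=\;
\sup_{\beta\ge 0}\Big\{-\beta\,\log \mathbb{E}_{s'\sim p}\big[e^{-V(s')/\beta}\big]\;-\;\beta\rho\Big\}
\]
```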
- Exploiting Temporal Structures of Cyclostationary Signals for Data-Driven Single-Channel Source Separation [98.95383921866096]
We study the problem of single-channel source separation (SCSS).
We focus on cyclostationary signals, which are particularly suitable in a variety of application domains.
We propose a deep learning approach using a U-Net architecture, which is competitive with the minimum MSE estimator.
arXiv Detail & Related papers (2022-08-22T14:04:56Z)
- Active Exploration via Experiment Design in Markov Chains [86.41407938210193]
A key challenge in science and engineering is to design experiments to learn about some unknown quantity of interest.
We propose an algorithm that efficiently selects policies whose measurement allocation converges to the optimal one.
In addition to our theoretical analysis, we showcase our framework on applications in ecological surveillance and pharmacology.
arXiv Detail & Related papers (2022-06-29T00:04:40Z)
- Few-Shot Audio-Visual Learning of Environment Acoustics [89.16560042178523]
Room impulse response (RIR) functions capture how the surrounding physical environment transforms the sounds heard by a listener.
We explore how to infer RIRs based on a sparse set of images and echoes observed in the space.
In experiments using a state-of-the-art audio-visual simulator for 3D environments, we demonstrate that our method successfully generates arbitrary RIRs.
arXiv Detail & Related papers (2022-06-08T16:38:24Z)
- RID-Noise: Towards Robust Inverse Design under Noisy Environments [30.58112077143225]
We propose Robust Inverse Design under Noise (RID-Noise) to train a conditional invertible neural network (cINN).
We estimate the robustness of a design parameter by its predictability, measured by the prediction error of a forward neural network.
Visual results from our experiments illustrate how RID-Noise works by learning the distribution and robustness from data; a sketch of the robustness-weighting idea follows this entry.
arXiv Detail & Related papers (2021-12-07T06:32:27Z)
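As referenced in the entry above, the robustness-as-predictability idea can be sketched as sample weights derived from a forward model's prediction error. The exponential form and the temperature are our assumptions, not RID-Noise's actual weighting scheme.

```python
# Sketch: weight each training design by how predictable its response is
# under a forward model; low prediction error means high robustness weight.
import torch

def robustness_weights(forward_net, x, y, temperature=1.0):
    """x: (B, d_x) designs, y: (B, d_y) measured responses."""
    with torch.no_grad():
        err = ((forward_net(x) - y) ** 2).mean(dim=1)  # per-sample prediction error
    w = torch.exp(-err / temperature)                   # low error => high weight
    return w / w.sum()                                  # normalized sample weights
```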
- Bayesian Sequential Optimal Experimental Design for Nonlinear Models Using Policy Gradient Reinforcement Learning [0.0]
We formulate this sequential optimal experimental design (sOED) problem as a finite-horizon partially observable Markov decision process (POMDP).
It is built to accommodate continuous random variables, general non-Gaussian posteriors, and expensive nonlinear forward models.
We solve for the sOED policy numerically via policy gradient (PG) methods from reinforcement learning, and derive and prove the PG expression for sOED.
The overall PG-sOED method is validated on a linear-Gaussian benchmark, and its advantages over batch and greedy designs are demonstrated through a contaminant source inversion problem.
arXiv Detail & Related papers (2021-10-28T17:47:31Z)
- Robust Topology Optimization Using Multi-Fidelity Variational Autoencoders [1.0124625066746595]
A robust topology optimization (RTO) problem identifies a design with the best average performance.
A neural network method based on multi-fidelity variational autoencoders is proposed that offers computational efficiency.
Numerical application of the method is shown on the robust design of an L-bracket structure with a single point load as well as multiple point loads.
arXiv Detail & Related papers (2021-07-19T20:40:51Z)
- An AI-Assisted Design Method for Topology Optimization Without Pre-Optimized Training Data [68.8204255655161]
An AI-assisted design method based on topology optimization is presented, which is able to obtain optimized designs directly.
Designs are provided by an artificial neural network, the predictor, on the basis of boundary conditions and degree of filling as input data; a schematic of such a predictor follows this entry.
arXiv Detail & Related papers (2020-12-11T14:33:27Z)
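A schematic of such a predictor, as referenced above: boundary conditions enter as spatial channels, the degree of filling as a broadcast scalar channel, and the output is a density field in [0, 1]. Every size and layer choice here is a placeholder, not the paper's architecture.

```python
# Illustrative "predictor" network: boundary-condition maps plus a scalar
# fill fraction in, a material-density field out. Placeholder architecture.
import torch
import torch.nn as nn

class Predictor(nn.Module):
    def __init__(self, bc_channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(bc_channels + 1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1), nn.Sigmoid(),  # density in [0, 1]
        )

    def forward(self, bc_maps, fill):
        # broadcast the scalar fill fraction to a constant spatial channel
        fill_map = fill.view(-1, 1, 1, 1).expand(-1, 1, *bc_maps.shape[-2:])
        return self.net(torch.cat([bc_maps, fill_map], dim=1))

design = Predictor()(torch.randn(2, 3, 64, 64), torch.tensor([0.4, 0.6]))
```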
- Sinkhorn Natural Gradient for Generative Models [125.89871274202439]
We propose a novel Sinkhorn Natural Gradient (SiNG) algorithm which acts as a steepest descent method on the probability space endowed with the Sinkhorn divergence.
We show that the Sinkhorn information matrix (SIM), a key component of SiNG, has an explicit expression and can be evaluated accurately in complexity that scales logarithmically with the desired accuracy.
In our experiments, we quantitatively compare SiNG with state-of-the-art SGD-type solvers on generative tasks to demonstrate the efficiency and efficacy of our method; a schematic form of the SiNG update follows this entry.
arXiv Detail & Related papers (2020-11-09T02:51:17Z)
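Schematically, and by analogy with natural gradient (the Fisher matrix replaced by the SIM, written H_S below), the update referenced above takes a preconditioned form; the exact way SiNG evaluates H_S is the paper's contribution.

```latex
\[
\theta_{k+1} \;=\; \theta_k \;-\; \eta\,\big[H_S(\theta_k)\big]^{-1}\,\nabla_\theta F(\theta_k)
\]
```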
- Robust Reinforcement Learning with Wasserstein Constraint [49.86490922809473]
We show the existence of optimal robust policies, provide a sensitivity analysis for the perturbations, and then design a novel robust learning algorithm.
The effectiveness of the proposed algorithm is verified in the Cart-Pole environment.
arXiv Detail & Related papers (2020-06-01T13:48:59Z)
- Learning a Probabilistic Strategy for Computational Imaging Sensor Selection [16.553234762932938]
We propose a physics-constrained, fully differentiable, autoencoder that learns a probabilistic sensor-sampling strategy for optimized sensor design.
The proposed method learns a system's preferred sampling distribution that characterizes the correlations between different sensor selections as a binary, fully-connected Ising model; a generic form of such a model follows this entry.
arXiv Detail & Related papers (2020-03-23T17:52:17Z)
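A generic form of such a binary, fully connected Ising model over a sensor-selection vector s in {-1,+1}^N, with couplings J capturing pairwise correlations between selections and biases b (the paper's exact parameterization may differ):

```latex
\[
p(s) \;=\; \frac{1}{Z}\,\exp\!\Big(\tfrac{1}{2}\,s^{\top} J\, s \;+\; b^{\top} s\Big),
\qquad
Z \;=\; \sum_{s\in\{-1,+1\}^N}\exp\!\Big(\tfrac{1}{2}\,s^{\top} J\, s \;+\; b^{\top} s\Big)
\]
```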
This list is automatically generated from the titles and abstracts of papers on this site.