A Reinforcement Learning Approach for Process Parameter Optimization in
Additive Manufacturing
- URL: http://arxiv.org/abs/2211.09545v1
- Date: Thu, 17 Nov 2022 14:05:51 GMT
- Title: A Reinforcement Learning Approach for Process Parameter Optimization in
Additive Manufacturing
- Authors: Susheel Dharmadhikari, Nandana Menon, Amrita Basak
- Abstract summary: The article introduces a Reinforcement Learning (RL) methodology that recasts process parameter selection in metal additive manufacturing as an optimization problem.
An experimentally validated Eagar-Tsai formulation is used to emulate the Laser-Directed Energy Deposition environment.
The framework, therefore, provides a model-free approach to learning without any prior observations.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Process optimization for metal additive manufacturing (AM) is crucial to
ensure repeatability, control microstructure, and minimize defects. Despite
efforts to address this via the traditional design of experiments and
statistical process mapping, there is limited insight into on-the-fly
optimization frameworks that can be integrated into a metal AM system.
Additionally, most of these methods are data-intensive and cannot be supported
for every metal AM alloy or system due to budget restrictions. To tackle this issue,
the article introduces a Reinforcement Learning (RL) methodology that recasts
process parameter selection in metal AM as an optimization problem. An off-policy RL
framework based on Q-learning is proposed to find optimal laser power ($P$) -
scan velocity ($v$) combinations with the objective of maintaining steady-state
melt pool depth. For this, an experimentally validated Eagar-Tsai formulation
is used to emulate the Laser-Directed Energy Deposition environment, where the
laser operates as the agent across the $P-v$ space such that it maximizes
rewards for a melt pool depth closer to the optimum. The culmination of the
training process yields a Q-table where the state ($P,v$) with the highest
Q-value corresponds to the optimized process parameter. The resultant melt pool
depths and the mapping of Q-values to the $P-v$ space show congruence with
experimental observations. The framework, therefore, provides a model-free
approach to learning without any prior observations.
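To make the training loop concrete, here is a minimal tabular Q-learning sketch over a discretized $P$-$v$ grid, using the standard off-policy update $Q(s,a) \leftarrow Q(s,a) + \alpha\,[r + \gamma \max_{a'} Q(s',a') - Q(s,a)]$ that the abstract's Q-table construction implies. It is an illustration, not the authors' code: the grid bounds, target depth, hyperparameters, and the `depth` surrogate (a simple monotone function of linear energy density standing in for the experimentally validated Eagar-Tsai emulator) are all assumptions made for the example.

```python
# Minimal tabular Q-learning sketch for (P, v) selection. The Eagar-Tsai
# emulator is NOT reproduced here: `depth` is a hypothetical monotone
# surrogate, and all grid bounds / hyperparameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)

powers = np.linspace(100.0, 500.0, 9)   # laser power grid P [W] (assumed range)
speeds = np.linspace(0.2, 1.0, 9)       # scan velocity grid v [m/s] (assumed range)
TARGET = 0.30                           # target steady-state depth [mm] (assumed)

def depth(p: float, v: float) -> float:
    """Hypothetical stand-in for the Eagar-Tsai melt pool model:
    depth grows with linear energy density P/v (illustrative only)."""
    return 0.02 * p / np.sqrt(v * 1000.0)

def reward(p: float, v: float) -> float:
    # Reward is highest when the depth is closest to the optimum.
    return -abs(depth(p, v) - TARGET)

# The "laser agent" moves one grid step at a time: stay, P+/-, v+/-.
ACTIONS = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]
Q = np.zeros((powers.size, speeds.size, len(ACTIONS)))
alpha, gamma, eps = 0.1, 0.9, 0.2       # learning rate, discount, exploration

i, j = rng.integers(powers.size), rng.integers(speeds.size)
for _ in range(50_000):
    # Epsilon-greedy behavior policy (off-policy w.r.t. the greedy target).
    a = rng.integers(len(ACTIONS)) if rng.random() < eps else int(Q[i, j].argmax())
    di, dj = ACTIONS[a]
    ni = int(np.clip(i + di, 0, powers.size - 1))
    nj = int(np.clip(j + dj, 0, speeds.size - 1))
    r = reward(powers[ni], speeds[nj])
    # Q-learning update: bootstrap on the greedy value of the next state.
    Q[i, j, a] += alpha * (r + gamma * Q[ni, nj].max() - Q[i, j, a])
    i, j = ni, nj

# The state with the highest Q-value is read off as the optimized parameter.
bi, bj = np.unravel_index(Q.max(axis=2).argmax(), (powers.size, speeds.size))
print(f"optimized P = {powers[bi]:.0f} W, v = {speeds[bj]:.2f} m/s")
```

After training, reading off the state with the highest Q-value recovers the optimized $(P, v)$ combination, which mirrors how the abstract describes extracting the process parameter from the Q-table.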
Related papers
- How to Set the Learning Rate for Large-Scale Pre-training? [73.03133634525635]
We formalize this investigation into two distinct research paradigms: Fitting and Transfer. Within the Fitting Paradigm, we introduce a Scaling Law for the search factor, effectively reducing the search complexity from $O(n^3)$ to $O(n \cdot C_D \cdot C_)$ via predictive modeling. We extend the principles of Transfer to the Mixture of Experts (MoE) architecture, broadening its applicability to encompass model depth, weight decay, and token horizons.
arXiv Detail & Related papers (2026-01-08T15:55:13Z) - AI-Driven Optimization under Uncertainty for Mineral Processing Operations [0.7340017786387767]
We introduce an AI-driven approach that formulates mineral processing as a Partially Observable Markov Decision Process (POMDP). We show that this approach has the potential to consistently perform better than traditional approaches at maximizing an overall objective, such as net present value (NPV). Our methodological demonstration of this optimization-under-uncertainty approach for a synthetic case provides a mathematical and computational framework for later real-world application.
arXiv Detail & Related papers (2025-12-01T18:35:54Z) - Intersection of Reinforcement Learning and Bayesian Optimization for Intelligent Control of Industrial Processes: A Safe MPC-based DPG using Multi-Objective BO [0.0]
Model Predictive Control (MPC)-based Reinforcement Learning (RL) offers a structured and interpretable alternative to Deep Neural Network (DNN)-based RL methods. Standard MPC-RL approaches often suffer from slow convergence, suboptimal policy learning due to limited parameterization, and safety issues during online adaptation. We propose a novel framework that integrates MPC-RL with Multi-Objective Bayesian Optimization (MOBO).
arXiv Detail & Related papers (2025-07-14T02:31:52Z) - Ring-lite: Scalable Reasoning via C3PO-Stabilized Reinforcement Learning for LLMs [51.21041884010009]
Ring-lite is a Mixture-of-Experts (MoE)-based large language model optimized via reinforcement learning (RL). Our approach matches the performance of state-of-the-art (SOTA) small-scale reasoning models on challenging benchmarks.
arXiv Detail & Related papers (2025-06-17T17:12:34Z) - Accelerating RL for LLM Reasoning with Optimal Advantage Regression [52.0792918455501]
We propose a novel two-stage policy optimization framework that directly approximates the optimal advantage function. $A^*$-PO achieves competitive performance across a wide range of mathematical reasoning benchmarks. It reduces training time by up to 2$\times$ and peak memory usage by over 30% compared to PPO, GRPO, and REBEL.
arXiv Detail & Related papers (2025-05-27T03:58:50Z) - A Multi-Scale Quantum Framework for Evaluating Metal-Organic Frameworks in Carbon Capture [0.0]
Metal-Organic Frameworks (MOFs) are promising materials to help mitigate the effects of global warming by selectively absorbing $CO_2$ for direct capture. Accurate quantum chemistry simulations are a useful tool to help select and design optimal MOF structures. Applying simulations over large datasets requires efficient simulation methods.
arXiv Detail & Related papers (2025-05-07T16:00:07Z) - Hardware Co-Designed Optimal Control for Programmable Atomic Quantum Processors via Reinforcement Learning [0.18416014644193068]
We introduce a hardware co-designed quantum control framework to address inherent imperfections in classical control hardware.
We demonstrate that the proposed framework enables robust, high-fidelity parallel single-qubit gate operations.
We find that while PPO performance degrades as system complexity increases, the end-to-end differentiable RL consistently achieves gate fidelities above 99.9%.
arXiv Detail & Related papers (2025-04-16T03:30:40Z) - Supervised Optimism Correction: Be Confident When LLMs Are Sure [91.7459076316849]
We establish a novel theoretical connection between supervised fine-tuning and offline reinforcement learning.
We show that the widely used beam search method suffers from unacceptable over-optimism.
We propose Supervised Optimism Correction, which introduces a simple yet effective auxiliary loss for token-level $Q$-value estimations.
arXiv Detail & Related papers (2025-04-10T07:50:03Z) - Fourier Neural Operator based surrogates for $CO_2$ storage in realistic geologies [57.23978190717341]
We develop a Fourier Neural Operator (FNO)-based model for real-time, high-resolution simulation of $CO_2$ plume migration.
The model is trained on a comprehensive dataset generated from realistic subsurface parameters.
We present various strategies for improving the reliability of predictions from the model, which is crucial while assessing actual geological sites.
arXiv Detail & Related papers (2025-03-14T02:58:24Z) - RoSTE: An Efficient Quantization-Aware Supervised Fine-Tuning Approach for Large Language Models [53.571195477043496]
We propose an algorithm named Rotated Straight-Through-Estimator (RoSTE).
RoSTE combines quantization-aware supervised fine-tuning (QA-SFT) with an adaptive rotation strategy to reduce activation outliers.
Our findings reveal that the prediction error is directly proportional to the quantization error of the converged weights, which can be effectively managed through an optimized rotation configuration.
arXiv Detail & Related papers (2025-02-13T06:44:33Z) - Reward-Guided Speculative Decoding for Efficient LLM Reasoning [80.55186052123196]
We introduce Reward-Guided Speculative Decoding (RSD), a novel framework aimed at improving the efficiency of inference in large language models (LLMs)
RSD incorporates a controlled bias to prioritize high-reward outputs, in contrast to existing speculative decoding methods that enforce strict unbiasedness.
RSD delivers significant efficiency gains over decoding with the target model alone, while achieving significantly better accuracy than parallel decoding methods on average.
arXiv Detail & Related papers (2025-01-31T17:19:57Z) - Synergistic Development of Perovskite Memristors and Algorithms for Robust Analog Computing [53.77822620185878]
We propose a synergistic methodology to concurrently optimize perovskite memristor fabrication and develop robust analog DNNs.
We develop "BayesMulti", a training strategy utilizing BO-guided noise injection to improve the resistance of analog DNNs to memristor imperfections.
Our integrated approach enables use of analog computing in much deeper and wider networks, achieving up to 100-fold improvements.
arXiv Detail & Related papers (2024-12-03T19:20:08Z) - VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment [66.80143024475635]
We propose VinePPO, a straightforward approach to compute unbiased Monte Carlo-based estimates.
We show that VinePPO consistently outperforms PPO and other RL-free baselines across MATH and GSM8K datasets.
arXiv Detail & Related papers (2024-10-02T15:49:30Z) - Reinforcement learning for anisotropic p-adaptation and error estimation in high-order solvers [0.37109226820205005]
We present a novel approach to automate and optimize anisotropic p-adaptation in high-order h/p solvers using Reinforcement Learning (RL).
We develop an offline training approach, decoupled from the main solver, which incurs minimal overhead when performing simulations.
We derive an inexpensive RL-based error estimation approach that enables the quantification of local discretization errors.
arXiv Detail & Related papers (2024-07-26T17:55:23Z) - Improved Optimization for the Neural-network Quantum States and Tests on the Chromium Dimer [11.985673663540688]
Neural-network Quantum States (NQS) have significantly advanced wave function ansatz research.
This work introduces three algorithmic enhancements to reduce computational demands of VMC optimization using NQS.
arXiv Detail & Related papers (2024-04-14T15:07:57Z) - Large Language Models to Enhance Bayesian Optimization [57.474613739645605]
We present LLAMBO, a novel approach that integrates the capabilities of Large Language Models (LLM) within Bayesian optimization.
At a high level, we frame the BO problem in natural language, enabling LLMs to iteratively propose and evaluate promising solutions conditioned on historical evaluations.
Our findings illustrate that LLAMBO is effective at zero-shot warmstarting, and enhances surrogate modeling and candidate sampling, especially in the early stages of search when observations are sparse.
arXiv Detail & Related papers (2024-02-06T11:44:06Z) - Landscape-Sketch-Step: An AI/ML-Based Metaheuristic for Surrogate
Optimization Problems [0.0]
We introduce a new metaheuristic for global optimization in scenarios where extensive evaluations of the cost function are expensive, inaccessible, or even prohibitive.
The method, which we call Landscape-Sketch-and-Step (LSS), combines Machine Learning, Replica Optimization, and Reinforcement Learning techniques.
arXiv Detail & Related papers (2023-09-14T01:53:45Z) - Reduced Order Modeling of a MOOSE-based Advanced Manufacturing Model
with Operator Learning [2.517043342442487]
Advanced Manufacturing (AM) has gained significant interest in the nuclear community for its potential application on nuclear materials.
One challenge is to obtain desired material properties via controlling the manufacturing process during runtime.
Intelligent AM based on deep reinforcement learning (DRL) relies on an automated process-level control mechanism to generate optimal design variables.
arXiv Detail & Related papers (2023-08-18T17:38:00Z) - End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes [52.818579746354665]
This paper proposes the first end-to-end differentiable meta-BO framework that generalises neural processes to learn acquisition functions via transformer architectures.
We enable this end-to-end framework with reinforcement learning (RL) to tackle the lack of labelled acquisition data.
arXiv Detail & Related papers (2023-05-25T10:58:46Z) - TempoRL: laser pulse temporal shape optimization with Deep Reinforcement
Learning [0.577478614918139]
The optimal performance of High Power Lasers (HPLs) is essential for the success of a wide variety of experimental tasks related to light-matter interactions.
Traditionally, HPL parameters are optimised in an automated fashion relying on black-box numerical methods.
Model-free Deep Reinforcement Learning (DRL) offers a promising alternative framework for optimising HPL performance.
arXiv Detail & Related papers (2023-04-20T22:15:27Z) - An Experimental Design Perspective on Model-Based Reinforcement Learning [73.37942845983417]
In practical applications of RL, it is expensive to observe state transitions from the environment.
We propose an acquisition function that quantifies how much information a state-action pair would provide about the optimal solution to a Markov decision process.
arXiv Detail & Related papers (2021-12-09T23:13:57Z) - Energy-Efficient and Federated Meta-Learning via Projected Stochastic
Gradient Ascent [79.58680275615752]
We propose an energy-efficient federated meta-learning framework.
We assume each task is owned by a separate agent, so a limited number of tasks is used to train a meta-model.
arXiv Detail & Related papers (2021-05-31T08:15:44Z) - Meta-Learning with Neural Tangent Kernels [58.06951624702086]
We propose the first meta-learning paradigm in the Reproducing Kernel Hilbert Space (RKHS) induced by the meta-model's Neural Tangent Kernel (NTK).
Within this paradigm, we introduce two meta-learning algorithms, which no longer need a sub-optimal iterative inner-loop adaptation as in the MAML framework.
We achieve this goal by 1) replacing the adaptation with a fast-adaptive regularizer in the RKHS; and 2) solving the adaptation analytically based on the NTK theory.
arXiv Detail & Related papers (2021-02-07T20:53:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.