Related papers: A Scalable and Reproducible System-on-Chip Simulation for Reinforcement Learning

Related papers

Simulation-Driven Reinforcement Learning in Queuing Network Routing Optimization [0.0]
This study focuses on the development of a simulation-driven reinforcement learning (RL) framework for optimizing routing decisions in complex queueing network systems.<n>We propose a robust RL approach leveraging Deep Deterministic Policy Gradient (DDPG) combined with Dyna-style planning (Dyna-DDPG)<n> Comprehensive experiments and rigorous evaluations demonstrate the framework's capability to rapidly learn effective routing policies.
arXiv Detail & Related papers (2025-07-24T20:32:47Z)
Ring-lite: Scalable Reasoning via C3PO-Stabilized Reinforcement Learning for LLMs [51.21041884010009]
Ring-lite is a Mixture-of-Experts (MoE)-based large language model optimized via reinforcement learning (RL)<n>Our approach matches the performance of state-of-the-art (SOTA) small-scale reasoning models on challenging benchmarks.
arXiv Detail & Related papers (2025-06-17T17:12:34Z)
RoboCerebra: A Large-scale Benchmark for Long-horizon Robotic Manipulation Evaluation [80.20970723577818]
We introduce RoboCerebra, a benchmark for evaluating high-level reasoning in long-horizon robotic manipulation.<n>The dataset is constructed via a top-down pipeline, where GPT generates task instructions and decomposes them into subtask sequences.<n>Compared to prior benchmarks, RoboCerebra features significantly longer action sequences and denser annotations.
arXiv Detail & Related papers (2025-06-07T06:15:49Z)
DOPPLER: Dual-Policy Learning for Device Assignment in Asynchronous Dataflow Graphs [11.966335602618933]
We study the problem of assigning operations in a dataflow graph to devices to minimize execution time in a work-conserving system.<n>Our experiments show that textscDoppler outperforms all baseline methods across tasks.
arXiv Detail & Related papers (2025-05-29T06:04:32Z)
Dynamic Manipulation of Deformable Objects in 3D: Simulation, Benchmark and Learning Strategy [88.8665000676562]
Prior methods often simplify the problem to low-speed or 2D settings, limiting their applicability to real-world 3D tasks.<n>To mitigate data scarcity, we introduce a novel simulation framework and benchmark grounded in reduced-order dynamics.<n>We propose Dynamics Informed Diffusion Policy (DIDP), a framework that integrates imitation pretraining with physics-informed test-time adaptation.
arXiv Detail & Related papers (2025-05-23T03:28:25Z)
LAPSO: A Unified Optimization View for Learning-Augmented Power System Operations [3.754570687412345]
This paper proposes a holistic framework of Learning-Augmented Power System Operations (LAPSO)<n>LAPSO is centered on the operation stage and aims to break the boundary between temporally siloed power system tasks.<n>A dedicated Python package-lapso is introduced to automatically augment existing power system optimization models with learnable components.
arXiv Detail & Related papers (2025-05-08T13:00:24Z)
UniSTD: Towards Unified Spatio-Temporal Learning across Diverse Disciplines [64.84631333071728]
We introduce bfUnistage, a unified Transformer-based framework fortemporal modeling. Our work demonstrates that a task-specific vision-text can build a generalizable model fortemporal learning. We also introduce a temporal module to incorporate temporal dynamics explicitly.
arXiv Detail & Related papers (2025-03-26T17:33:23Z)
DriveTransformer: Unified Transformer for Scalable End-to-End Autonomous Driving [62.62464518137153]
DriveTransformer is a simplified E2E-AD framework for the ease of scaling up. It is composed of three unified operations: task self-attention, sensor cross-attention, temporal cross-attention. It achieves state-of-the-art performance in both simulated closed-loop benchmark Bench2Drive and real world open-loop benchmark nuScenes with high FPS.
arXiv Detail & Related papers (2025-03-07T11:41:18Z)
Digital Twin-Enabled Real-Time Control in Robotic Additive Manufacturing via Soft Actor-Critic Reinforcement Learning [2.5709786140685633]
This research presents a novel approach integrating Soft Actor-Critic (SAC) reinforcement learning with digital twin technology. We demonstrate our methodology using a Viper X300s robot arm, implementing two distinct control scenarios. Results show rapid policy convergence and robust task execution in both simulated and physical environments.
arXiv Detail & Related papers (2025-01-29T22:06:53Z)
Entropy-Regularized Token-Level Policy Optimization for Language Agent Reinforcement [67.1393112206885]
Large Language Models (LLMs) have shown promise as intelligent agents in interactive decision-making tasks. We introduce Entropy-Regularized Token-level Policy Optimization (ETPO), an entropy-augmented RL method tailored for optimizing LLMs at the token level. We assess the effectiveness of ETPO within a simulated environment that models data science code generation as a series of multi-step interactive tasks.
arXiv Detail & Related papers (2024-02-09T07:45:26Z)
Robust-MBDL: A Robust Multi-branch Deep Learning Based Model for Remaining Useful Life Prediction and Operational Condition Identification of Rotating Machines [1.2593669712329136]
The proposed system comprises main components: (1) an LSTM-Autoencoder to denoise the vibration data; (2) a feature extraction to generate time-domain, frequency-domain, and time-frequency based features from the denoised data; and (3) a novel and robust multi-branch deep learning network architecture to exploit the multiple features. The performance of our proposed system was evaluated and compared to the state-of-the-art systems on two benchmark datasets of XJTU-SY and PRONOSTIA.
arXiv Detail & Related papers (2023-09-12T11:58:53Z)
CONSTRUCT: A Program Synthesis Approach for Reconstructing Control Algorithms from Embedded System Binaries in Cyber-Physical Systems [39.78288224911617]
We introduce a novel approach to automatically synthesize a mathematical representation of the control algorithms implemented in industrial cyber-physical systems. The output model can be used by subject matter experts to assess the system's compliance with the expected behavior.
arXiv Detail & Related papers (2023-08-01T03:10:55Z)
Optimising Highly-Parallel Simulation-Based Verification of Cyber-Physical Systems [0.0]
Cyber-Physical Systems (CPSs) arise in many industry-relevant domains and are often mission- or safety-critical. System-Level Verification (SLV) of CPSs aims at certifying that given (e.g. safety or liveness) specifications are met or at estimating the value of some.
arXiv Detail & Related papers (2023-07-28T08:08:27Z)
ETLP: Event-based Three-factor Local Plasticity for online learning with neuromorphic hardware [105.54048699217668]
We show a competitive performance in accuracy with a clear advantage in the computational complexity for Event-Based Three-factor Local Plasticity (ETLP) We also show that when using local plasticity, threshold adaptation in spiking neurons and a recurrent topology are necessary to learntemporal patterns with a rich temporal structure.
arXiv Detail & Related papers (2023-01-19T19:45:42Z)
SimVPv2: Towards Simple yet Powerful Spatiotemporal Predictive Learning [61.419914155985886]
We propose SimVPv2, a streamlined model that eliminates the need for Unet architectures for spatial and temporal modeling. SimVPv2 not only simplifies the model architecture but also improves both performance and computational efficiency. On the standard Moving MNIST benchmark, SimVPv2 achieves superior performance compared to SimVP, with fewer FLOPs, about half the training time and 60% faster inference efficiency.
arXiv Detail & Related papers (2022-11-22T08:01:33Z)
Parallel Successive Learning for Dynamic Distributed Model Training over Heterogeneous Wireless Networks [50.68446003616802]
Federated learning (FedL) has emerged as a popular technique for distributing model training over a set of wireless devices. We develop parallel successive learning (PSL), which expands the FedL architecture along three dimensions. Our analysis sheds light on the notion of cold vs. warmed up models, and model inertia in distributed machine learning.
arXiv Detail & Related papers (2022-02-07T05:11:01Z)
Deep Bayesian Active Learning for Accelerating Stochastic Simulation [74.58219903138301]
Interactive Neural Process (INP) is a deep active learning framework for simulations and with active learning approaches. For active learning, we propose a novel acquisition function, Latent Information Gain (LIG), calculated in the latent space of NP based models. The results demonstrate STNP outperforms the baselines in the learning setting and LIG achieves the state-of-the-art for active learning.
arXiv Detail & Related papers (2021-06-05T01:31:51Z)
Computational framework for real-time diagnostics and prognostics of aircraft actuation systems [0.0]
This work addresses the three phases of the prognostic flow: signal acquisition, Fault Detection and Identification, and Remaining Useful Life estimation. To achieve this goal, we propose to combine information from physical models of different fidelity with machine learning techniques. The methodology is assessed for the FDI and RUL estimation of an aircraft electromechanical actuator for secondary flight controls.
arXiv Detail & Related papers (2020-09-30T12:53:07Z)
DISPATCH: Design Space Exploration of Cyber-Physical Systems [5.273291582861981]
Design of cyber-physical systems (CPSs) is a challenging task that involves searching over a large search space of various CPS configurations. We propose DIS, a two-step methodology for sample-efficient search over the design space.
arXiv Detail & Related papers (2020-09-21T23:14:51Z)
Combining Machine Learning with Knowledge-Based Modeling for Scalable Forecasting and Subgrid-Scale Closure of Large, Complex, Spatiotemporal Systems [48.7576911714538]
We attempt to utilize machine learning as the essential tool for integrating pasttemporal data into predictions. We propose combining two approaches: (i) a parallel machine learning prediction scheme; and (ii) a hybrid technique, for a composite prediction system composed of a knowledge-based component and a machine-learning-based component. We demonstrate that not only can this method combining (i) and (ii) be scaled to give excellent performance for very large systems, but also that the length of time series data needed to train our multiple, parallel machine learning components is dramatically less than that necessary without parallelization.
arXiv Detail & Related papers (2020-02-10T23:21:50Z)
Information Theoretic Model Predictive Q-Learning [64.74041985237105]
We present a novel theoretical connection between information theoretic MPC and entropy regularized RL. We develop a Q-learning algorithm that can leverage biased models.
arXiv Detail & Related papers (2019-12-31T00:29:22Z)

This list is automatically generated from the titles and abstracts of the papers in this site.