Related papers: A Negotiation-Based Multi-Agent Reinforcement Learning Approach for Dynamic Scheduling of Reconfigurable Manufacturing Systems

URL: http://arxiv.org/abs/2511.07707v1
Date: Wed, 12 Nov 2025 01:12:21 GMT
Title: A Negotiation-Based Multi-Agent Reinforcement Learning Approach for Dynamic Scheduling of Reconfigurable Manufacturing Systems
Authors: Manonmani Sekar, Nasim Nezamoddini
Abstract summary: This study explores the application of multi-agent reinforcement learning (MARL) for dynamic scheduling in the soft planning of RMS settings. In the proposed framework, deep Q-network (DQN) agents trained with centralized training learn optimal job assignments in real time while adapting to events such as machine breakdowns and reconfiguration delays.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Reconfigurable manufacturing systems (RMS) are critical for future market adjustment given their rapid adaptation to fluctuations in consumer demands, the introduction of new technological advances, and disruptions in linked supply chain sections. The adjustable hard settings of such systems require a flexible soft planning mechanism that enables real-time production planning and scheduling amid the existing complexity and variability in their configuration settings. This study explores the application of multi-agent reinforcement learning (MARL) for dynamic scheduling in soft planning of the RMS settings. In the proposed framework, deep Q-network (DQN) agents trained in centralized training learn optimal job-machine assignments in real time while adapting to stochastic events such as machine breakdowns and reconfiguration delays. The model also incorporates a negotiation with an attention mechanism to enhance state representation and improve decision focus on critical system features. Key DQN enhancements, including prioritized experience replay, n-step returns, double DQN, and soft target updates, are used to stabilize and accelerate learning. Experiments conducted in a simulated RMS environment demonstrate that the proposed approach outperforms baseline heuristics in reducing makespan and tardiness while improving machine utilization. The reconfigurable manufacturing environment was extended to simulate realistic challenges, including machine failures and reconfiguration times. Experimental results show that while the enhanced DQN agent is effective in adapting to dynamic conditions, machine breakdowns increase variability in key performance metrics such as makespan, throughput, and total tardiness. The results confirm the advantages of applying the MARL mechanism for intelligent and adaptive scheduling in dynamic reconfigurable manufacturing environments.
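The DQN enhancements named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it only shows the textbook form of each technique in plain Python, and every function name, argument, and default value (for example `alpha=0.6` and the `eps` offset) is a hypothetical choice made for illustration.

```python
from typing import List, Sequence


def n_step_return(rewards: Sequence[float], gamma: float) -> float:
    """Discounted sum of n consecutive rewards: sum_k gamma^k * r_k."""
    return sum((gamma ** k) * r for k, r in enumerate(rewards))


def double_dqn_target(
    rewards: Sequence[float],
    gamma: float,
    q_online_next: Sequence[float],
    q_target_next: Sequence[float],
    done: bool,
) -> float:
    """n-step double-DQN target for one transition.

    The online network selects the greedy action in the state reached
    after n steps; the target network supplies that action's value.
    """
    g = n_step_return(rewards, gamma)
    if done:
        # Terminal state: no bootstrapped value is added.
        return g
    best = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return g + (gamma ** len(rewards)) * q_target_next[best]


def soft_update(
    target: Sequence[float], online: Sequence[float], tau: float
) -> List[float]:
    """Soft (Polyak) target update: target <- tau * online + (1 - tau) * target."""
    return [tau * w + (1.0 - tau) * t for t, w in zip(target, online)]


def per_probabilities(
    td_errors: Sequence[float], alpha: float = 0.6, eps: float = 1e-6
) -> List[float]:
    """Proportional prioritized-replay sampling probabilities.

    Priority is (|TD error| + eps)^alpha, normalized over the buffer.
    alpha and eps are assumed values, not taken from the paper.
    """
    priorities = [(abs(e) + eps) ** alpha for e in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]
```

In a full agent the Q-values would come from neural networks and the parameter lists would be network weights; flat lists of floats are used here only to keep the sketch self-contained.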

Related papers

Deep Q-Learning-Based Intelligent Scheduling for ETL Optimization in Heterogeneous Data Environments [10.31577390735368]
This paper proposes an intelligent scheduling optimization framework based on deep Q-learning. The framework formalizes the scheduling process as a Markov Decision Process. It enables adaptive decision-making by a reinforcement learning agent in high-dimensional state spaces.
arXiv Detail & Related papers (2025-12-15T07:38:47Z)
Optimizing Predictive Maintenance in Intelligent Manufacturing: An Integrated FNO-DAE-GNN-PPO MDP Framework [1.6921396880325779]
We propose a novel Markov Decision Process (MDP) framework that integrates advanced soft computing techniques. We show that the framework significantly outperforms multiple deep learning baseline models with up to 13% cost reduction. The framework has considerable industrial potential to effectively reduce downtime and operating expenses through data-driven strategies.
arXiv Detail & Related papers (2025-11-05T13:21:29Z)
Flexible Locomotion Learning with Diffusion Model Predictive Control [46.432397190673505]
We present Diffusion-MPC, which leverages a learned generative diffusion model as an approximate dynamics prior for planning. Our design enables strong test-time adaptability, allowing the planner to adjust to new reward specifications without retraining. We validate Diffusion-MPC in the real world, demonstrating strong locomotion and flexible adaptation.
arXiv Detail & Related papers (2025-10-05T14:51:13Z)
Adaptive Approach to Enhance Machine Learning Scheduling Algorithms During Runtime Using Reinforcement Learning in Metascheduling Applications [0.0]
We propose an adaptive online learning unit integrated within the metascheduler to enhance performance in real time. In the online mode, Reinforcement Learning plays a pivotal role by continuously exploring and discovering new scheduling solutions. Several RL models were implemented within the online learning unit, each designed to address specific challenges in scheduling.
arXiv Detail & Related papers (2025-09-24T19:46:22Z)
Simulation-Driven Reinforcement Learning in Queuing Network Routing Optimization [0.0]
This study focuses on the development of a simulation-driven reinforcement learning (RL) framework for optimizing routing decisions in complex queueing network systems. We propose a robust RL approach leveraging Deep Deterministic Policy Gradient (DDPG) combined with Dyna-style planning (Dyna-DDPG). Comprehensive experiments and rigorous evaluations demonstrate the framework's capability to rapidly learn effective routing policies.
arXiv Detail & Related papers (2025-07-24T20:32:47Z)
Efficient Transformed Gaussian Process State-Space Models for Non-Stationary High-Dimensional Dynamical Systems [49.819436680336786]
We propose an efficient transformed Gaussian process state-space model (ETGPSSM) for scalable and flexible modeling of high-dimensional, non-stationary dynamical systems. Specifically, our ETGPSSM integrates a single shared GP with input-dependent normalizing flows, yielding an expressive implicit process prior that captures complex, non-stationary transition dynamics. Our ETGPSSM outperforms existing GPSSMs and neural network-based SSMs in terms of computational efficiency and accuracy.
arXiv Detail & Related papers (2025-03-24T03:19:45Z)
LADs: Leveraging LLMs for AI-Driven DevOps [3.240228178267042]
LADs is a principled approach to configuration optimization through in-depth analysis of what optimization works under which conditions. By leveraging Retrieval-Augmented Generation, Few-Shot Learning, Chain-of-Thought, and Feedback-Based Prompt Chaining, LADs generates accurate configurations and learns from deployment failures to iteratively refine system settings. Our findings reveal key insights into the trade-offs between performance, cost, and scalability, helping practitioners determine the right strategies for different deployment scenarios.
arXiv Detail & Related papers (2025-02-28T08:12:08Z)
The Efficiency vs. Accuracy Trade-off: Optimizing RAG-Enhanced LLM Recommender Systems Using Multi-Head Early Exit [46.37267466656765]
This paper presents an optimization framework that combines Retrieval-Augmented Generation (RAG) with an innovative multi-head early exit architecture. Our experiments demonstrate how this architecture effectively decreases time without sacrificing the accuracy needed for reliable recommendation delivery.
arXiv Detail & Related papers (2025-01-04T03:26:46Z)
Skills Composition Framework for Reconfigurable Cyber-Physical Production Modules [44.99833362998488]
This paper proposes a framework for skills' composition and execution in skill-based reconfigurable cyber-physical production modules. It is based on distributed Behavior trees (BTs) and provides good integration between low-level devices' specific code and AI-based task-oriented frameworks.
arXiv Detail & Related papers (2024-05-22T12:56:05Z)
ASR: Attention-alike Structural Re-parameterization [53.019657810468026]
We propose a simple-yet-effective attention-alike structural re-parameterization (ASR) that allows us to achieve SRP for a given network while enjoying the effectiveness of the attention mechanism. In this paper, we conduct extensive experiments from a statistical perspective and discover an interesting phenomenon, Stripe Observation, which reveals that channel attention values quickly approach some constant vectors during training.
arXiv Detail & Related papers (2023-04-13T08:52:34Z)
Collaborative Intelligent Reflecting Surface Networks with Multi-Agent Reinforcement Learning [63.83425382922157]
Intelligent reflecting surface (IRS) is envisioned to be widely applied in future wireless networks. In this paper, we investigate a multi-user communication system assisted by cooperative IRS devices with the capability of energy harvesting.
arXiv Detail & Related papers (2022-03-26T20:37:14Z)
Information Theoretic Model Predictive Q-Learning [64.74041985237105]
We present a novel theoretical connection between information theoretic MPC and entropy regularized RL. We develop a Q-learning algorithm that can leverage biased models.
arXiv Detail & Related papers (2019-12-31T00:29:22Z)

This list is automatically generated from the titles and abstracts of the papers on this site.