Related papers: SortingEnv: An Extendable RL-Environment for an Industrial Sorting Process

SortingEnv: An Extendable RL-Environment for an Industrial Sorting Process

URL: http://arxiv.org/abs/2503.10466v1
Date: Thu, 13 Mar 2025 15:38:25 GMT
Title: SortingEnv: An Extendable RL-Environment for an Industrial Sorting Process
Authors: Tom Maus, Nico Zengeler, Tobias Glasmachers,
Abstract summary: We present a novel reinforcement learning (RL) environment designed to both optimize industrial sorting systems and study agent behavior in evolving spaces.<n>In simulating material flow within a sorting process our environment follows the idea of a digital twin, with operational parameters like belt speed and occupancy level.<n>It thus includes two variants: a basic version focusing on discrete belt speed adjustments and an advanced version introducing multiple sorting modes and enhanced material composition observations.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: We present a novel reinforcement learning (RL) environment designed to both optimize industrial sorting systems and study agent behavior in evolving spaces. In simulating material flow within a sorting process our environment follows the idea of a digital twin, with operational parameters like belt speed and occupancy level. To reflect real-world challenges, we integrate common upgrades to industrial setups, like new sensors or advanced machinery. It thus includes two variants: a basic version focusing on discrete belt speed adjustments and an advanced version introducing multiple sorting modes and enhanced material composition observations. We detail the observation spaces, state update mechanisms, and reward functions for both environments. We further evaluate the efficiency of common RL algorithms like Proximal Policy Optimization (PPO), Deep-Q-Networks (DQN), and Advantage Actor Critic (A2C) in comparison to a classical rule-based agent (RBA). This framework not only aids in optimizing industrial processes but also provides a foundation for studying agent behavior and transferability in evolving environments, offering insights into model performance and practical implications for real-world RL applications.

Related papers

A Survey of Direct Preference Optimization [103.59317151002693]
Large Language Models (LLMs) have demonstrated unprecedented generative capabilities. Their alignment with human values remains critical for ensuring helpful and harmless deployments. Direct Preference Optimization (DPO) has recently gained prominence as a streamlined alternative.
arXiv Detail & Related papers (2025-03-12T08:45:15Z)
ODRL: A Benchmark for Off-Dynamics Reinforcement Learning [59.72217833812439]
We introduce ODRL, the first benchmark tailored for evaluating off-dynamics RL methods. ODRL contains four experimental settings where the source and target domains can be either online or offline. We conduct extensive benchmarking experiments, which show that no method has universal advantages across varied dynamics shifts.
arXiv Detail & Related papers (2024-10-28T05:29:38Z)
An advantage based policy transfer algorithm for reinforcement learning with measures of transferability [5.926203312586109]
Reinforcement learning (RL) enables sequential decision-making in complex and high-dimensional environments.<n>This paper proposes an off-policy Advantage-based Policy Transfer algorithm, APT-RL, for fixed domain environments.
arXiv Detail & Related papers (2023-11-12T04:25:53Z)
RL4CO: an Extensive Reinforcement Learning for Combinatorial Optimization Benchmark [69.19502244910632]
Deep reinforcement learning (RL) has shown significant benefits in solving optimization (CO) problems. We introduce RL4CO, a unified benchmark with in-depth library coverage of 23 state-of-the-art methods and more than 20 CO problems. Built on efficient software libraries and best practices in implementation, RL4CO features modularized implementation and flexible configuration of diverse RL algorithms, neural network architectures, inference techniques, and environments.
arXiv Detail & Related papers (2023-06-29T16:57:22Z)
Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration [87.53543137162488]
We propose an easy-to-implement online reinforcement learning (online RL) framework called textttMEX. textttMEX integrates estimation and planning components while balancing exploration exploitation automatically. It can outperform baselines by a stable margin in various MuJoCo environments with sparse rewards.
arXiv Detail & Related papers (2023-05-29T17:25:26Z)
Karolos: An Open-Source Reinforcement Learning Framework for Robot-Task Environments [0.3867363075280544]
In reinforcement learning (RL) research, simulations enable benchmarks between algorithms. In this paper, we introduce Karolos, a framework developed for robotic applications. The code is open source and published on GitHub with the aim of promoting research of RL applications in robotics.
arXiv Detail & Related papers (2022-12-01T23:14:02Z)
FORLORN: A Framework for Comparing Offline Methods and Reinforcement Learning for Optimization of RAN Parameters [0.0]
This paper introduces a new framework for benchmarking the performance of an RL agent in network environments simulated with ns-3. Within this framework, we demonstrate that an RL agent without domain-specific knowledge can learn how to efficiently adjust Radio Access Network (RAN) parameters to match offline optimization in static scenarios.
arXiv Detail & Related papers (2022-09-08T12:58:09Z)
Machine Learning Framework for Quantum Sampling of Highly-Constrained, Continuous Optimization Problems [101.18253437732933]
We develop a generic, machine learning-based framework for mapping continuous-space inverse design problems into surrogate unconstrained binary optimization problems. We showcase the framework's performance on two inverse design problems by optimizing thermal emitter topologies for thermophotovoltaic applications and (ii) diffractive meta-gratings for highly efficient beam steering.
arXiv Detail & Related papers (2021-05-06T02:22:23Z)
Learning to Continuously Optimize Wireless Resource in a Dynamic Environment: A Bilevel Optimization Perspective [52.497514255040514]
This work develops a new approach that enables data-driven methods to continuously learn and optimize resource allocation strategies in a dynamic environment. We propose to build the notion of continual learning into wireless system design, so that the learning model can incrementally adapt to the new episodes. Our design is based on a novel bilevel optimization formulation which ensures certain fairness" across different data samples.
arXiv Detail & Related papers (2021-05-03T07:23:39Z)
Towards Standardizing Reinforcement Learning Approaches for Stochastic Production Scheduling [77.34726150561087]
reinforcement learning can be used to solve scheduling problems. Existing studies rely on (sometimes) complex simulations for which the code is unavailable. There is a vast array of RL designs to choose from. standardization of model descriptions - both production setup and RL design - and validation scheme are a prerequisite.
arXiv Detail & Related papers (2021-04-16T16:07:10Z)
Sample-Efficient Automated Deep Reinforcement Learning [33.53903358611521]
We propose a population-based automated RL framework to meta-optimize arbitrary off-policy RL algorithms. By sharing the collected experience across the population, we substantially increase the sample efficiency of the meta-optimization. We demonstrate the capabilities of our sample-efficient AutoRL approach in a case study with the popular TD3 algorithm in the MuJoCo benchmark suite.
arXiv Detail & Related papers (2020-09-03T10:04:06Z)
Concept and the implementation of a tool to convert industry 4.0 environments modeled as FSM to an OpenAI Gym wrapper [2.594420805049218]
This work presents the concept and the implementation of a tool that allows us to convert any dynamic system modeled as an FSM to the open-source Gym wrapper. In the first tests of the proposed tool, we show traditional Q-learning and Deep Q-learning methods running over two simple environments.
arXiv Detail & Related papers (2020-06-29T13:28:41Z)
A Survey of Reinforcement Learning Algorithms for Dynamically Varying Environments [1.713291434132985]
Reinforcement learning (RL) algorithms find applications in inventory control, recommender systems, vehicular traffic management, cloud computing and robotics. Real-world complications of many tasks arising in these domains makes them difficult to solve with the basic assumptions underlying classical RL algorithms. This paper provides a survey of RL methods developed for handling dynamically varying environment models. A representative collection of these algorithms is discussed in detail in this work along with their categorization and their relative merits and demerits.
arXiv Detail & Related papers (2020-05-19T09:42:42Z)

This list is automatically generated from the titles and abstracts of the papers in this site.