Optimizing 2D+1 Packing in Constrained Environments Using Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2503.17573v1
- Date: Fri, 21 Mar 2025 23:06:16 GMT
- Title: Optimizing 2D+1 Packing in Constrained Environments Using Deep Reinforcement Learning
- Authors: Victor Ulisses Pugliese, Oséias F. de A. Ferreira, Fabio A. Faria
- Abstract summary: This paper proposes a novel approach based on deep reinforcement learning (DRL) for the 2D+1 packing problem with spatial constraints. A simulator using the OpenAI Gym framework has been developed to efficiently simulate the packing of rectangular pieces onto two boards with height constraints.
- Score: 0.6827423171182154
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper proposes a novel approach based on deep reinforcement learning (DRL) for the 2D+1 packing problem with spatial constraints. This problem is an extension of the traditional 2D packing problem, incorporating an additional constraint on the height dimension. Therefore, a simulator using the OpenAI Gym framework has been developed to efficiently simulate the packing of rectangular pieces onto two boards with height constraints. Furthermore, the simulator supports multidiscrete actions, enabling the selection of a position on either board and the type of piece to place. Finally, two DRL-based methods (Proximal Policy Optimization (PPO) and Advantage Actor-Critic (A2C)) have been employed to learn a packing strategy and demonstrate its performance compared to a well-known heuristic baseline (MaxRect-BL). In the experiments carried out, the PPO-based approach proved to be a good solution for solving complex packing problems and highlighted its potential to optimize resource utilization in various industrial applications, such as the manufacturing of aerospace composites.
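As a concrete illustration of the setup described in the abstract (two boards, a height cap, and a multidiscrete action choosing a board, a position, and a piece type), here is a minimal environment sketch in the classic OpenAI Gym API. The board sizes, piece set, and reward shaping are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a 2D+1 packing environment: rectangular pieces, each with
# a height, are placed on one of two boards subject to a height cap.
# All dimensions, the piece set, and the reward are illustrative assumptions.
import gym
import numpy as np
from gym import spaces

class TwoBoardPackingEnv(gym.Env):
    def __init__(self, board_w=10, board_h=10, max_height=5,
                 pieces=((2, 3, 1), (4, 2, 5), (3, 3, 2))):  # (width, length, height)
        super().__init__()
        self.board_w, self.board_h, self.max_height = board_w, board_h, max_height
        self.pieces = pieces
        # Multidiscrete action: (board index, x position, y position, piece type).
        self.action_space = spaces.MultiDiscrete([2, board_w, board_h, len(pieces)])
        # Observation: occupancy grids of both boards.
        self.observation_space = spaces.Box(0, 1, shape=(2, board_h, board_w), dtype=np.int8)

    def reset(self):
        self.grids = np.zeros((2, self.board_h, self.board_w), dtype=np.int8)
        self.remaining = len(self.pieces)  # episode ends after this many successful placements
        return self.grids.copy()

    def step(self, action):
        board, x, y, piece = action
        w, l, height = self.pieces[piece]
        fits = (x + w <= self.board_w and y + l <= self.board_h
                and height <= self.max_height                       # the "+1" height constraint
                and not self.grids[board, y:y + l, x:x + w].any())  # no overlap
        if fits:
            self.grids[board, y:y + l, x:x + w] = 1
            self.remaining -= 1
            reward = float(w * l)   # reward covered area; the paper's shaping may differ
        else:
            reward = -1.0           # penalize infeasible placements
        done = self.remaining == 0
        return self.grids.copy(), reward, done, {}
```

A policy over this MultiDiscrete space can then be trained with off-the-shelf PPO or A2C implementations, e.g. Stable-Baselines3, which supports MultiDiscrete actions directly.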
Related papers
- Deliberate Planning of 3D Bin Packing on Packing Configuration Trees [40.46267029657914]
Online 3D Bin Packing Problem (3D-BPP) has widespread applications in industrial automation.
We propose to enhance the practical applicability of online 3D-BPP via learning on a novel hierarchical representation, the packing configuration tree (PCT).
PCT is a full-fledged description of the state and action space of bin packing which can support packing policy learning based on deep reinforcement learning (DRL).
arXiv Detail & Related papers (2025-04-06T09:07:10Z)
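Conceptually, a PCT keeps items already packed as internal nodes and candidate placements for the incoming item as leaves, which the learned policy scores. The sketch below renders that structure with illustrative field names; it is not the authors' implementation.

```python
# Schematic sketch of a packing configuration tree (PCT): internal nodes are
# placed items, leaves are candidate placements for the incoming item.
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class PCTNode:
    x: int                  # placement position in the bin
    y: int
    z: int
    w: int                  # item dimensions
    l: int
    h: int
    children: List["PCTNode"] = field(default_factory=list)

    @property
    def is_leaf(self) -> bool:
        return not self.children    # leaves are not-yet-committed placements

def expand(leaf: PCTNode, item: Tuple[int, int, int],
           candidates: List[Tuple[int, int, int]]) -> None:
    """Commit the chosen leaf and attach one new leaf per feasible
    placement (x, y, z) of the next incoming item."""
    for x, y, z in candidates:
        leaf.children.append(PCTNode(x, y, z, *item))
```

A DRL policy then only has to score the current leaves, which keeps the action space compact as the packing grows.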
- Mitigating Dimensionality in 2D Rectangle Packing Problem under Reinforcement Learning Schema [0.0]
This paper explores the application of Reinforcement Learning (RL) to the two-dimensional rectangular packing problem.
We propose a reduced representation of the state and action spaces that allows for high granularity.
arXiv Detail & Related papers (2024-09-15T09:58:48Z)
- One-Shot Safety Alignment for Large Language Models via Optimal Dualization [64.52223677468861]
This paper presents a perspective of dualization that reduces constrained alignment to an equivalent unconstrained alignment problem.
We do so by pre-optimizing a smooth and convex dual function that has a closed form.
Our strategy leads to two practical algorithms in model-based and preference-based settings.
arXiv Detail & Related papers (2024-05-29T22:12:52Z)
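Read as generic Lagrangian dualization (a sketch with placeholder symbols, not the paper's exact objectives), the reduction described above has the following shape:

```latex
% J_r: reward objective, J_g: safety objective with threshold b (placeholders).
\begin{aligned}
\text{primal:}\quad    & \max_{\pi}\; J_r(\pi) \quad \text{s.t.}\quad J_g(\pi) \ge b \\
\text{dual:}\quad      & D(\lambda) = \max_{\pi}\; J_r(\pi) + \lambda\,\bigl(J_g(\pi) - b\bigr), \qquad \lambda \ge 0 \\
\text{reduction:}\quad & \lambda^\star = \arg\min_{\lambda \ge 0} D(\lambda), \qquad
                         \pi^\star = \arg\max_{\pi}\; J_r(\pi) + \lambda^\star J_g(\pi)
\end{aligned}
```

When the dual function D is smooth and convex with a closed form, as the summary states, the multiplier can be pre-optimized once, leaving a single unconstrained alignment problem.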
- Efficiently Training Deep-Learning Parametric Policies using Lagrangian Duality [55.06411438416805]
Constrained Markov Decision Processes (CMDPs) are critical in many high-stakes applications.
This paper introduces a novel approach, Two-Stage Deep Decision Rules (TS-DDR), to efficiently train parametric actor policies.
It is shown to enhance solution quality and to reduce computation times by several orders of magnitude when compared to current state-of-the-art methods.
arXiv Detail & Related papers (2024-05-23T18:19:47Z) - Leveraging Constraint Programming in a Deep Learning Approach for Dynamically Solving the Flexible Job-Shop Scheduling Problem [1.3927943269211593]
This paper aims to integrate constraint programming (CP) within a deep learning (DL) based methodology, leveraging the benefits of both.
We introduce a method that involves training a DL model using optimal solutions generated by CP, ensuring the model learns from high-quality data.
Our hybrid approach has been extensively tested on three public FJSSP benchmarks, demonstrating superior performance over five state-of-the-art DRL approaches.
arXiv Detail & Related papers (2024-03-14T10:16:57Z) - Learning Constrained Optimization with Deep Augmented Lagrangian Methods [54.22290715244502]
- Learning Constrained Optimization with Deep Augmented Lagrangian Methods [54.22290715244502]
A machine learning (ML) model is trained to emulate a constrained optimization solver.
This paper proposes an alternative approach, in which the ML model is trained to predict dual solution estimates directly.
It enables an end-to-end training scheme in which the dual objective serves as the loss function, pushing solution estimates toward primal feasibility and emulating a Dual Ascent method.
arXiv Detail & Related papers (2024-03-06T04:43:22Z)
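The Dual Ascent behavior mentioned above is easiest to see on a toy problem whose inner minimization has a closed form; in the paper's setting a network is trained to predict the multipliers that this loop iterates, with the dual objective acting as the loss. A sketch with illustrative numbers:

```python
# Toy dual ascent on: min ||x||^2  s.t.  a @ x >= b.
import numpy as np

a, b = np.array([1.0, 2.0]), 3.0
lam, step = 0.0, 0.2
for _ in range(100):
    x = 0.5 * lam * a                         # argmin_x ||x||^2 + lam * (b - a @ x)
    lam = max(0.0, lam + step * (b - a @ x))  # gradient ascent on the dual objective
# converges to lam = 1.2, x = (0.6, 1.2), which satisfies a @ x = b exactly
```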
- Double Duality: Variational Primal-Dual Policy Optimization for Constrained Reinforcement Learning [132.7040981721302]
We study the constrained convex Markov Decision Process (MDP), where the goal is to minimize a convex functional of the visitation measure.
Designing algorithms for a constrained convex MDP faces several challenges, including handling the large state space.
arXiv Detail & Related papers (2024-02-16T16:35:18Z)
- Adjustable Robust Reinforcement Learning for Online 3D Bin Packing [11.157035538606968]
Current deep reinforcement learning methods for online 3D-BPP fail in real-world settings where some worst-case scenarios can materialize.
We propose an adjustable robust reinforcement learning (AR2L) framework that allows efficient adjustment of robustness weights.
Experiments demonstrate that AR2L is versatile in the sense that it improves policy robustness while maintaining an acceptable level of performance for the nominal case.
arXiv Detail & Related papers (2023-10-06T15:34:21Z)
- Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration [87.53543137162488]
We propose an easy-to-implement online reinforcement learning (online RL) framework called MEX.
MEX integrates estimation and planning components while balancing exploration and exploitation automatically.
It can outperform baselines by a stable margin in various MuJoCo environments with sparse rewards.
arXiv Detail & Related papers (2023-05-29T17:25:26Z)
- Online 3D Bin Packing Reinforcement Learning Solution with Buffer [1.8060107352742993]
We present a new reinforcement learning framework for solving the 3D-BPP with improved performance.
We implement a model-based RL method adapted from the popular algorithm AlphaGo.
Our adaptation is capable of working in single-player and score-based environments.
arXiv Detail & Related papers (2022-08-15T11:28:20Z)
- Learning Practically Feasible Policies for Online 3D Bin Packing [36.33774915391967]
We tackle the Online 3D Bin Packing Problem, a challenging yet practically useful variant of the classical Bin Packing Problem.
Online 3D-BPP can be naturally formulated as a Markov Decision Process (MDP).
We adopt deep reinforcement learning, in particular the on-policy actor-critic framework, to solve this MDP with a constrained action space.
arXiv Detail & Related papers (2021-08-31T08:37:58Z)
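One standard way to realize the constrained action space mentioned above is to mask infeasible placements out of the policy's distribution; a minimal sketch, with the feasibility test left to the environment. Whether the original paper uses exactly this mechanism is an assumption.

```python
# Action masking for an actor-critic packer: infeasible placements receive
# -inf logits, so the softmax assigns them zero probability.
import torch

def masked_policy(logits: torch.Tensor, feasible: torch.Tensor) -> torch.Tensor:
    """logits: (num_actions,) raw scores; feasible: bool mask of the same shape."""
    masked = logits.masked_fill(~feasible, float("-inf"))
    return torch.softmax(masked, dim=-1)
```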
- Learning Space Partitions for Path Planning [54.475949279050596]
PlaLaM outperforms existing path planning methods in 2D navigation tasks, especially in the presence of difficult-to-escape local optima.
These gains transfer to highly multimodal real-world tasks, where we outperform strong baselines in compiler phase ordering by up to 245% and in molecular design by up to 0.4 on properties measured on a 0-1 scale.
arXiv Detail & Related papers (2021-06-19T18:06:11Z)