Related papers: RL4CO: an Extensive Reinforcement Learning for Combinatorial Optimization Benchmark

RL4CO: an Extensive Reinforcement Learning for Combinatorial Optimization Benchmark

URL: http://arxiv.org/abs/2306.17100v6
Date: Mon, 21 Jul 2025 08:23:56 GMT
Title: RL4CO: an Extensive Reinforcement Learning for Combinatorial Optimization Benchmark
Authors: Federico Berto, Chuanbo Hua, Junyoung Park, Laurin Luttmann, Yining Ma, Fanchen Bu, Jiarui Wang, Haoran Ye, Minsu Kim, Sanghyeok Choi, Nayeli Gast Zepeda, André Hottung, Jianan Zhou, Jieyi Bi, Yu Hu, Fei Liu, Hyeonah Kim, Jiwoo Son, Haeyeon Kim, Davide Angioni, Wouter Kool, Zhiguang Cao, Qingfu Zhang, Joungho Kim, Jie Zhang, Kijung Shin, Cathy Wu, Sungsoo Ahn, Guojie Song, Changhyun Kwon, Kevin Tierney, Lin Xie, Jinkyoo Park,
Abstract summary: Combinatorial optimization (CO) is fundamental to several real-world applications, from logistics and scheduling to hardware design and resource allocation.<n>Deep reinforcement learning has recently shown significant benefits in solving CO problems, reducing reliance on domain expertise and improving computational efficiency.<n>We introduce RL4CO, a unified benchmark with in-depth library coverage of 27 CO problem environments and 23 state-of-the-art baselines.
Score: 69.19502244910632
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Combinatorial optimization (CO) is fundamental to several real-world applications, from logistics and scheduling to hardware design and resource allocation. Deep reinforcement learning (RL) has recently shown significant benefits in solving CO problems, reducing reliance on domain expertise and improving computational efficiency. However, the absence of a unified benchmarking framework leads to inconsistent evaluations, limits reproducibility, and increases engineering overhead, raising barriers to adoption for new researchers. To address these challenges, we introduce RL4CO, a unified and extensive benchmark with in-depth library coverage of 27 CO problem environments and 23 state-of-the-art baselines. Built on efficient software libraries and best practices in implementation, RL4CO features modularized implementation and flexible configurations of diverse environments, policy architectures, RL algorithms, and utilities with extensive documentation. RL4CO helps researchers build on existing successes while exploring and developing their own designs, facilitating the entire research process by decoupling science from heavy engineering. We finally provide extensive benchmark studies to inspire new insights and future work. RL4CO has already attracted numerous researchers in the community and is open-sourced at https://github.com/ai4co/rl4co.

Related papers

SimpleDeepSearcher: Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis [89.99161034065614]
Retrieval-augmented generation (RAG) systems have advanced large language models (LLMs) in complex deep search scenarios.<n>Existing approaches face critical limitations that lack high-quality training trajectories and suffer from distributional mismatches.<n>This paper introduces SimpleDeepSearcher, a framework that bridges the gap through strategic data engineering rather than complex training paradigms.
arXiv Detail & Related papers (2025-05-22T16:05:02Z)
SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks [110.20297293596005]
Large language model (LLM) agents need to perform multi-turn interactions in real-world tasks. Existing multi-turn RL algorithms for optimizing LLM agents fail to perform effective credit assignment over multiple turns while leveraging the generalization capabilities of LLMs. We propose a novel RL algorithm, SWEET-RL, that uses a carefully designed optimization objective to train a critic model with access to additional training-time information. Our experiments demonstrate that SWEET-RL achieves a 6% absolute improvement in success and win rates on ColBench compared to other state-of-the-art multi-turn RL algorithms.
arXiv Detail & Related papers (2025-03-19T17:55:08Z)
Multi-Agent Environments for Vehicle Routing Problems [1.0179489519625304]
We propose a library composed of multi-agent environments that simulates classic vehicle routing problems. The library, built on PyTorch, provides a flexible modular architecture design that allows easy customization and incorporation of new routing problems.
arXiv Detail & Related papers (2024-11-21T18:46:23Z)
Generative AI for Deep Reinforcement Learning: Framework, Analysis, and Use Cases [60.30995339585003]
Deep reinforcement learning (DRL) has been widely applied across various fields and has achieved remarkable accomplishments. DRL faces certain limitations, including low sample efficiency and poor generalization. We present how to leverage generative AI (GAI) to address these issues and enhance the performance of DRL algorithms.
arXiv Detail & Related papers (2024-05-31T01:25:40Z)
Towards Coarse-to-Fine Evaluation of Inference Efficiency for Large Language Models [95.96734086126469]
Large language models (LLMs) can serve as the assistant to help users accomplish their jobs, and also support the development of advanced applications. For the wide application of LLMs, the inference efficiency is an essential concern, which has been widely studied in existing work. We perform a detailed coarse-to-fine analysis of the inference performance of various code libraries.
arXiv Detail & Related papers (2024-04-17T15:57:50Z)
Transform then Explore: a Simple and Effective Technique for Exploratory Combinatorial Optimization with Reinforcement Learning [11.531786269804707]
We propose a gauge transformation (GT) technique to solve optimization problems (COPs) over graphs. GT is very simple, which can be implemented with less than 10 lines of Python codes, and can be applied to a vast majority of reinforcement learning models. We show that traditional RL models with GT technique produce the state-of-the-art performances on the MaxCut problem.
arXiv Detail & Related papers (2024-04-06T15:31:17Z)
EasyRL4Rec: An Easy-to-use Library for Reinforcement Learning Based Recommender Systems [18.22130279210423]
We introduce EasyRL4Rec, an easy-to-use code library designed specifically for RL-based RSs. This library provides lightweight and diverse RL environments based on five public datasets. EasyRL4Rec seeks to facilitate the model development and experimental process in the domain of RL-based RSs.
arXiv Detail & Related papers (2024-02-23T07:54:26Z)
Reinforcement Learning-assisted Evolutionary Algorithm: A Survey and Research Opportunities [63.258517066104446]
Reinforcement learning integrated as a component in the evolutionary algorithm has demonstrated superior performance in recent years. We discuss the RL-EA integration method, the RL-assisted strategy adopted by RL-EA, and its applications according to the existing literature. In the applications of RL-EA section, we also demonstrate the excellent performance of RL-EA on several benchmarks and a range of public datasets.
arXiv Detail & Related papers (2023-08-25T15:06:05Z)
SRL: Scaling Distributed Reinforcement Learning to Over Ten Thousand Cores [13.948640763797776]
We present a novel abstraction on the dataflows of RL training, which unifies diverse RL training applications into a general framework. We develop a scalable, efficient, and distributed RL system called ReaLly scalableRL, which allows efficient and massively parallelized training. SRL is the first in the academic community to perform RL experiments at a large scale with over 15k CPU cores.
arXiv Detail & Related papers (2023-06-29T05:16:25Z)
MARLIN: Soft Actor-Critic based Reinforcement Learning for Congestion Control in Real Networks [63.24965775030673]
We propose a novel Reinforcement Learning (RL) approach to design generic Congestion Control (CC) algorithms. Our solution, MARLIN, uses the Soft Actor-Critic algorithm to maximize both entropy and return. We trained MARLIN on a real network with varying background traffic patterns to overcome the sim-to-real mismatch.
arXiv Detail & Related papers (2023-02-02T18:27:20Z)
Ensemble Reinforcement Learning in Continuous Spaces -- A Hierarchical Multi-Step Approach for Policy Training [4.982806898121435]
We propose a new technique to train an ensemble of base learners based on an innovative multi-step integration method. This training technique enables us to develop a new hierarchical learning algorithm for ensemble DRL that effectively promotes inter-learner collaboration. The algorithm is also shown empirically to outperform several state-of-the-art DRL algorithms on multiple benchmark RL problems.
arXiv Detail & Related papers (2022-09-29T00:42:44Z)
Sym-NCO: Leveraging Symmetricity for Neural Combinatorial Optimization [16.127824824652077]
Deep reinforcement learning (DRL)-based optimization (CO) methods have shown significant merit over the conventional CO solvers. This paper presents a novel training scheme, Sym-NCO, that achieves significant performance increments to existing DRL-NCO methods.
arXiv Detail & Related papers (2022-05-26T07:55:43Z)
OR-Gym: A Reinforcement Learning Library for Operations Research Problems [0.0]
We introduce OR-Gym, an open-source library for developing reinforcement learning algorithms to address operations research problems. In this paper, we apply reinforcement learning to the knapsack, multi-dimensional bin packing, multi-echelon supply chain, and multi-period asset allocation model problems.
arXiv Detail & Related papers (2020-08-14T12:21:22Z)
CATCH: Context-based Meta Reinforcement Learning for Transferrable Architecture Search [102.67142711824748]
CATCH is a novel Context-bAsed meTa reinforcement learning algorithm for transferrable arChitecture searcH. The combination of meta-learning and RL allows CATCH to efficiently adapt to new tasks while being agnostic to search spaces. It is also capable of handling cross-domain architecture search as competitive networks on ImageNet, COCO, and Cityscapes are identified.
arXiv Detail & Related papers (2020-07-18T09:35:53Z)
Off-Policy Reinforcement Learning for Efficient and Effective GAN Architecture Search [50.40004966087121]
We introduce a new reinforcement learning based neural architecture search (NAS) methodology for generative adversarial network (GAN) architecture search. The key idea is to formulate the GAN architecture search problem as a Markov decision process (MDP) for smoother architecture sampling. We exploit an off-policy GAN architecture search algorithm that makes efficient use of the samples generated by previous policies.
arXiv Detail & Related papers (2020-07-17T18:29:17Z)
Review, Analysis and Design of a Comprehensive Deep Reinforcement Learning Framework [6.527722484694189]
This paper proposes a comprehensive software framework that plays a vital role in designing a connect-the-dots deep RL architecture. We have designed and developed a deep RL-based software framework that strictly ensures flexibility, robustness, and scalability.
arXiv Detail & Related papers (2020-02-27T02:38:47Z)

This list is automatically generated from the titles and abstracts of the papers in this site.