Learning to Schedule Heuristics for the Simultaneous Stochastic
Optimization of Mining Complexes
- URL: http://arxiv.org/abs/2202.12866v1
- Date: Fri, 25 Feb 2022 18:20:14 GMT
- Title: Learning to Schedule Heuristics for the Simultaneous Stochastic
Optimization of Mining Complexes
- Authors: Yassine Yaakoubi, Roussos Dimitrakopoulos
- Abstract summary: The proposed learn-to-perturb (L2P) hyper-heuristic is a multi-neighborhood simulated annealing algorithm.
L2P is tested on several real-world mining complexes, with an emphasis on efficiency, robustness, and generalization capacity.
Results show a reduction in the number of iterations by 30-50% and in the computational time by 30-45%.
- Score: 2.538209532048867
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The simultaneous stochastic optimization of mining complexes (SSOMC) is a
large-scale stochastic combinatorial optimization problem that simultaneously
manages the extraction of materials from multiple mines and their processing
using interconnected facilities to generate a set of final products, while
taking into account material supply (geological) uncertainty to manage the
associated risk. Although simulated annealing has been shown to outperform
comparable methods for solving the SSOMC, the choice of which perturbation to
apply is typically driven by an aggregate of each heuristic's past performance,
so early performance can dominate recent performance. This work proposes a data-driven
framework for heuristic scheduling in a fully self-managed hyper-heuristic to
solve the SSOMC. The proposed learn-to-perturb (L2P) hyper-heuristic is a
multi-neighborhood simulated annealing algorithm. The L2P selects the heuristic
(perturbation) to be applied in a self-adaptive manner using reinforcement
learning to efficiently explore which local search is best suited for a
particular search point. Several state-of-the-art agents have been incorporated
into L2P to better adapt the search and guide it towards better solutions. By
learning from data describing the performance of the heuristics, a
problem-specific ordering of heuristics that collectively finds better
solutions faster is obtained. L2P is tested on several real-world mining
complexes, with an emphasis on efficiency, robustness, and generalization
capacity. Results show a reduction in the number of iterations by 30-50% and in
the computational time by 30-45%.
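To make the mechanism concrete, the sketch below (Python, illustrative only and not the authors' implementation) shows how a multi-neighborhood simulated annealing loop can delegate the choice of perturbation heuristic to a simple learned selector: a softmax bandit with an exponential moving average stands in here for the paper's reinforcement learning agents, and all function and parameter names are hypothetical.

```python
import math
import random

def l2p_style_search(initial_solution, objective, heuristics, n_iters=10000,
                     temp=1.0, cooling=0.9995, lr=0.1, tau=0.5):
    """Minimize `objective`; `heuristics` is a list of perturbation functions
    that each take a solution and return a new candidate solution."""
    current = best = initial_solution
    f_cur = f_best = objective(current)
    scores = [0.0] * len(heuristics)  # learned value of each perturbation heuristic

    for _ in range(n_iters):
        # Softmax selection over learned scores: heuristics that recently helped
        # are chosen more often, but every heuristic keeps a nonzero probability.
        weights = [math.exp(s / tau) for s in scores]
        idx = random.choices(range(len(heuristics)), weights=weights, k=1)[0]

        candidate = heuristics[idx](current)   # apply the chosen perturbation
        f_cand = objective(candidate)
        delta = f_cand - f_cur

        # Standard simulated annealing acceptance test.
        if delta < 0 or random.random() < math.exp(-delta / max(temp, 1e-12)):
            current, f_cur = candidate, f_cand
            if f_cur < f_best:
                best, f_best = current, f_cur

        # Credit the chosen heuristic with the improvement it produced; the
        # exponential moving average weights recent performance so that early
        # performance does not dominate the selection.
        reward = max(0.0, -delta)
        scores[idx] = (1.0 - lr) * scores[idx] + lr * reward

        temp *= cooling  # geometric cooling schedule

    return best, f_best
```

The moving-average update is the part that addresses the issue raised above: recent heuristic performance is weighted explicitly rather than folded into a static aggregate of all past performance.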
Related papers
- A Learned Proximal Alternating Minimization Algorithm and Its Induced Network for a Class of Two-block Nonconvex and Nonsmooth Optimization [4.975853671529418]
This work proposes a general learned alternating minimization algorithm, LPAM, for solving learnable two-block nonsmooth problems.
The proposed LPAM-net is parameter-efficient and has favourable performance in comparison with some state-of-the-art methods.
arXiv Detail & Related papers (2024-11-10T02:02:32Z)
- LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning [56.273799410256075]
The framework combines Monte Carlo Tree Search (MCTS) with iterative Self-Refine to optimize the reasoning path.
The framework has been tested on general and advanced benchmarks, showing superior performance in terms of search efficiency and problem-solving capability.
arXiv Detail & Related papers (2024-10-03T18:12:29Z)
- Sample-Efficient Multi-Agent RL: An Optimization Perspective [103.35353196535544]
We study multi-agent reinforcement learning (MARL) for general-sum Markov Games (MGs) under general function approximation.
We introduce a novel complexity measure called the Multi-Agent Decoupling Coefficient (MADC) for general-sum MGs.
We show that our algorithm achieves sublinear regret comparable to existing works.
arXiv Detail & Related papers (2023-10-10T01:39:04Z)
- Federated Conditional Stochastic Optimization [110.513884892319]
Conditional stochastic optimization has found applications in a wide range of machine learning tasks, such as invariant learning, AUPRC maximization, and AML.
This paper proposes algorithms for conditional stochastic optimization in the federated learning setting.
arXiv Detail & Related papers (2023-10-04T01:47:37Z)
- Maximize to Explore: One Objective Function Fusing Estimation, Planning,
and Exploration [87.53543137162488]
We propose an easy-to-implement online reinforcement learning (online RL) framework called MEX.
MEX integrates estimation and planning components while automatically balancing exploration and exploitation.
It can outperform baselines by a stable margin in various MuJoCo environments with sparse rewards.
arXiv Detail & Related papers (2023-05-29T17:25:26Z)
- Optimal and Efficient Algorithms for General Mixable Losses against
Switching Oracles [0.0]
We study the online optimization of mixable loss functions in a dynamic environment.
Our results are guaranteed to hold in a strong deterministic sense in an individual sequence manner.
arXiv Detail & Related papers (2021-08-13T21:48:55Z)
- A Two-stage Framework and Reinforcement Learning-based Optimization
Algorithms for Complex Scheduling Problems [54.61091936472494]
We develop a two-stage framework in which reinforcement learning (RL) and traditional operations research (OR) algorithms are combined.
The scheduling problem is solved in two stages: a finite Markov decision process (MDP) followed by a mixed-integer programming process.
Results show that the proposed algorithms could stably and efficiently obtain satisfactory scheduling schemes for agile Earth observation satellite scheduling problems.
arXiv Detail & Related papers (2021-03-10T03:16:12Z)
- Combining Deep Learning and Optimization for Security-Constrained
Optimal Power Flow [94.24763814458686]
Security-constrained optimal power flow (SCOPF) is fundamental in power systems.
Modeling of APR within the SCOPF problem results in complex large-scale mixed-integer programs.
This paper proposes a novel approach that combines deep learning and robust optimization techniques.
arXiv Detail & Related papers (2020-07-14T12:38:21Z)
- Anytime Behavior of Inexact TSP Solvers and Perspectives for Automated
Algorithm Selection [0.0]
The Traveling Salesperson Problem (TSP) is arguably one of the best-known NP-hard optimization problems.
We extend existing benchmarking studies by addressing anytime behaviour of inexact TSP solvers.
It turns out that performance ranking of solvers is highly dependent on the focused approximation quality.
arXiv Detail & Related papers (2020-05-27T11:36:53Z)