Adaptive Operator Selection Based on Dynamic Thompson Sampling for MOEA/D
- URL: http://arxiv.org/abs/2004.10874v1
- Date: Wed, 22 Apr 2020 21:41:57 GMT
- Title: Adaptive Operator Selection Based on Dynamic Thompson Sampling for MOEA/D
- Authors: Lei Sun and Ke Li
- Abstract summary: This paper proposes a new AOS mechanism for the multi-objective evolutionary algorithm based on decomposition (MOEA/D).
AOS is formulated as a multi-armed bandit problem where dynamic Thompson sampling (DYTS) is applied to adapt the bandit learning model.
Results fully demonstrate the effectiveness and competitiveness of our proposed AOS mechanism compared with four other state-of-the-art MOEA/D variants.
- Score: 11.034230601053116
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In evolutionary computation, different reproduction operators have various search dynamics. To strike a good balance between exploration and exploitation, it is attractive to have an adaptive operator selection (AOS) mechanism that automatically chooses the most appropriate operator on the fly according to the current search status. This paper proposes a new AOS mechanism for the multi-objective evolutionary algorithm based on decomposition (MOEA/D). More specifically, AOS is formulated as a multi-armed bandit problem in which dynamic Thompson sampling (DYTS) is applied to adapt the bandit learning model, originally proposed under the assumption of a fixed reward distribution, to a non-stationary setup. In particular, each arm of our bandit learning model represents a reproduction operator and is assigned a prior reward distribution. The parameters of these reward distributions are progressively updated according to the performance of the corresponding operator, as collected from the evolutionary process. When generating an offspring, an operator is chosen by sampling from these reward distributions according to DYTS. Experimental results fully demonstrate the effectiveness and competitiveness of our proposed AOS mechanism compared with four other state-of-the-art MOEA/D variants.
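The abstract describes the DYTS update only at a high level. Below is a minimal Python sketch of dynamic Thompson sampling driving operator selection, following the standard DYTS formulation (a Beta posterior per arm whose total pseudo-counts are capped at a threshold C so that old evidence decays); the binary improvement reward, the cap value, and all identifiers are illustrative assumptions rather than details taken from the paper.

```python
import random

class DynamicThompsonSampling:
    """Sketch of dynamic Thompson sampling (DYTS) for adaptive operator
    selection: each arm (reproduction operator) keeps a Beta posterior
    over its success probability, and capping alpha + beta at C makes
    old rewards decay, handling the non-stationary evolutionary setup."""

    def __init__(self, n_operators, cap=100.0):
        self.alpha = [1.0] * n_operators  # Beta prior parameters
        self.beta = [1.0] * n_operators
        self.C = cap                      # cap on alpha + beta

    def select(self):
        # Sample a success rate from each posterior; pick the best draw.
        draws = [random.betavariate(a, b)
                 for a, b in zip(self.alpha, self.beta)]
        return max(range(len(draws)), key=draws.__getitem__)

    def update(self, arm, reward):
        # reward in [0, 1], e.g. 1 if the offspring improved its subproblem.
        if self.alpha[arm] + self.beta[arm] < self.C:
            # Standard Bayesian update while evidence is still scarce.
            self.alpha[arm] += reward
            self.beta[arm] += 1.0 - reward
        else:
            # Renormalise so alpha + beta stays at C: old evidence fades.
            scale = self.C / (self.C + 1.0)
            self.alpha[arm] = (self.alpha[arm] + reward) * scale
            self.beta[arm] = (self.beta[arm] + 1.0 - reward) * scale
```

In an MOEA/D-style loop, one would call select() before generating each offspring with the chosen reproduction operator, then call update() with a reward reflecting whether the offspring improved its subproblem.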
Related papers
- Improving Portfolio Optimization Results with Bandit Networks [0.0]
We introduce and evaluate novel Bandit algorithms designed for non-stationary environments.
First, we present the Adaptive Discounted Thompson Sampling (ADTS) algorithm.
We then extend this approach to the Portfolio Optimization problem by introducing the Combinatorial Adaptive Discounted Thompson Sampling (CADTS) algorithm.
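The summary above does not spell out the ADTS update rule. For orientation, here is a sketch of the plain discounted Thompson sampling that such non-stationary bandit variants build on, where a discount factor gamma fades old evidence each round; the decay-toward-the-prior form and the value gamma = 0.95 are assumptions for illustration, and ADTS's adaptive discounting is not reproduced.

```python
import random

def select_arm(alpha, beta):
    # Sample from each arm's Beta posterior; pick the largest draw.
    draws = [random.betavariate(a, b) for a, b in zip(alpha, beta)]
    return max(range(len(draws)), key=draws.__getitem__)

def discounted_ts_update(alpha, beta, arm, reward, gamma=0.95):
    """Discounted Thompson sampling update for non-stationary bandits:
    every arm's Beta parameters decay toward the Beta(1, 1) prior each
    round, so old evidence fades, and the pulled arm then absorbs the
    new binary reward."""
    for i in range(len(alpha)):
        alpha[i] = gamma * alpha[i] + (1.0 - gamma)
        beta[i] = gamma * beta[i] + (1.0 - gamma)
    alpha[arm] += reward
    beta[arm] += 1.0 - reward
```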
arXiv Detail & Related papers (2024-10-05T16:17:31Z)
- Diversified Batch Selection for Training Acceleration [68.67164304377732]
A prevalent research line, known as online batch selection, explores selecting informative subsets during the training process.
Vanilla reference-model-free methods score and select data independently, in a sample-wise manner.
We propose Diversified Batch Selection (DivBS), which is reference-model-free and can efficiently select diverse and representative samples.
arXiv Detail & Related papers (2024-06-07T12:12:20Z)
- Learning Efficient and Fair Policies for Uncertainty-Aware Collaborative Human-Robot Order Picking [11.997524293204368]
In collaborative human-robot order picking systems, human pickers and Autonomous Mobile Robots (AMRs) travel independently through a warehouse and meet at pick locations where pickers load items onto AMRs.
We propose a novel multi-objective Deep Reinforcement Learning (DRL) approach to learn effective allocation policies that maximize pick efficiency while also aiming to improve workload fairness amongst human pickers.
arXiv Detail & Related papers (2024-04-09T11:45:16Z)
- A Bandit Approach with Evolutionary Operators for Model Selection [0.4604003661048266]
This work formulates model selection as an infinite-armed bandit problem, namely, a problem in which a decision maker iteratively selects one of an infinite number of fixed choices (i.e., arms).
The arms are machine learning models to train, and selecting an arm corresponds to a partial training of the model (resource allocation).
We propose the algorithm Mutant-UCB, which incorporates operators from evolutionary algorithms into the UCB-E bandit algorithm introduced by Audibert et al.
Tests carried out on three open source image classification data sets attest to the relevance of this novel combining approach, which outperforms the state-of-the-art.
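The summary does not specify Mutant-UCB's exact rules, so the following is a hypothetical sketch of how evolutionary operators might be combined with UCB-E-style selection for model selection; the mutation trigger, the sqrt(a / T_i) exploration bonus, and the helper callables evaluate and mutate are all illustrative assumptions.

```python
import math
import random

def mutant_ucb(evaluate, mutate, init_arms, budget, explore=1.0, p_mutate=0.1):
    """Hypothetical sketch mixing UCB-E-style selection with an
    evolutionary mutation step: pulling an arm means partially training
    a model, and a pulled arm occasionally spawns a mutated copy of
    itself as a brand-new arm."""
    arms = list(init_arms)
    counts = [0] * len(arms)
    means = [0.0] * len(arms)

    # Pull every initial arm once so the UCB index is defined everywhere.
    for i, arm in enumerate(arms):
        means[i] = evaluate(arm)
        counts[i] = 1

    used = len(arms)
    while used < budget:
        # UCB-E index: empirical mean plus a sqrt(a / T_i) exploration bonus.
        best = max(range(len(arms)),
                   key=lambda i: means[i] + math.sqrt(explore / counts[i]))
        if random.random() < p_mutate:
            # Spawn a mutant of the selected arm as a fresh arm.
            arms.append(mutate(arms[best]))
            means.append(evaluate(arms[-1]))
            counts.append(1)
        else:
            # Allocate one more unit of training to the selected arm.
            reward = evaluate(arms[best])
            counts[best] += 1
            means[best] += (reward - means[best]) / counts[best]
        used += 1

    return arms[max(range(len(arms)), key=means.__getitem__)]
```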
arXiv Detail & Related papers (2024-02-07T08:01:45Z)
- Mimicking Better by Matching the Approximate Action Distribution [48.95048003354255]
We introduce MAAD, a novel, sample-efficient on-policy algorithm for Imitation Learning from Observations.
We show that it requires considerably fewer interactions to achieve expert performance, outperforming current state-of-the-art on-policy methods.
arXiv Detail & Related papers (2023-06-16T12:43:47Z)
- SGEM: Test-Time Adaptation for Automatic Speech Recognition via Sequential-Level Generalized Entropy Minimization [30.61075178799518]
A test-time adaptation (TTA) method has recently been proposed to adapt a pre-trained ASR model on unlabeled test instances without source data.
We propose a novel TTA framework, dubbed SGEM, for general ASR models.
SGEM achieves state-of-the-art performance for three mainstream ASR models under various domain shifts.
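SGEM's sequence-level generalized entropy objective over beam hypotheses is not detailed in this summary; the sketch below shows only the basic entropy-minimization test-time adaptation step that such methods refine, written for a generic PyTorch classifier. The function and variable names are assumptions, not SGEM's API.

```python
import torch

def entropy_minimization_step(model, inputs, optimizer):
    """One generic test-time adaptation step by entropy minimization:
    adapt the model on an unlabeled test batch by making its own
    predictions more confident (lower output entropy)."""
    model.train()
    logits = model(inputs)                        # (batch, num_classes)
    probs = torch.softmax(logits, dim=-1)
    entropy = -(probs * torch.log(probs + 1e-8)).sum(dim=-1).mean()
    optimizer.zero_grad()
    entropy.backward()
    optimizer.step()
    return float(entropy.detach())
```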
arXiv Detail & Related papers (2023-06-03T02:27:08Z)
- Deep Variational Models for Collaborative Filtering-based Recommender Systems [63.995130144110156]
Deep learning provides accurate collaborative filtering models to improve recommender system results.
Our proposed models apply the variational concept to inject randomness into the latent space of the deep architecture.
Results show the superiority of the proposed approach in scenarios where the variational enrichment exceeds the injected noise effect.
arXiv Detail & Related papers (2021-07-27T08:59:39Z)
- Model Selection for Bayesian Autoencoders [25.619565817793422]
We propose to optimize the distributional sliced-Wasserstein distance between the output of the autoencoder and the empirical data distribution.
We turn our BAE into a generative model by fitting a flexible Dirichlet mixture model in the latent space.
We evaluate our approach qualitatively and quantitatively using a vast experimental campaign on a number of unsupervised learning tasks and show that, in small-data regimes where priors matter, our approach provides state-of-the-art results.
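For reference, here is a minimal NumPy sketch of the plain sliced 1-Wasserstein distance between two equally sized samples; the paper's distributional variant, which also optimizes over the projection distribution, is more involved and is not reproduced here.

```python
import numpy as np

def sliced_wasserstein(x, y, n_projections=50, rng=None):
    """Monte Carlo estimate of the sliced 1-Wasserstein distance between
    two equally sized point clouds x and y of shape (n, d): project both
    onto random unit directions and average the resulting 1-D Wasserstein
    distances, which for equal-size empirical samples reduce to mean
    absolute differences of the sorted projections."""
    rng = rng if rng is not None else np.random.default_rng(0)
    total = 0.0
    for _ in range(n_projections):
        theta = rng.normal(size=x.shape[1])
        theta /= np.linalg.norm(theta)            # random unit direction
        px, py = np.sort(x @ theta), np.sort(y @ theta)
        total += np.mean(np.abs(px - py))
    return total / n_projections
```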
arXiv Detail & Related papers (2021-06-11T08:55:00Z)
- Model-based Multi-agent Policy Optimization with Adaptive Opponent-wise Rollouts [52.844741540236285]
This paper investigates model-based methods in multi-agent reinforcement learning (MARL).
We propose a novel decentralized model-based MARL method, named Adaptive Opponent-wise Rollout Policy Optimization (AORPO).
arXiv Detail & Related papers (2021-05-07T16:20:22Z)
- Efficient UAV Trajectory-Planning using Economic Reinforcement Learning [65.91405908268662]
We introduce REPlanner, a novel reinforcement learning algorithm inspired by economic transactions to distribute tasks between UAVs.
We formulate the path planning problem as a multi-agent economic game, where agents can cooperate and compete for resources.
As the system computes task distributions via UAV cooperation, it is highly resilient to any change in the swarm size.
arXiv Detail & Related papers (2021-03-03T20:54:19Z)
- Decision-Making with Auto-Encoding Variational Bayes [71.44735417472043]
We show that a posterior approximation distinct from the variational distribution should be used for making decisions.
Motivated by these theoretical results, we propose learning several approximate proposals for the best model.
In addition to toy examples, we present a full-fledged case study of single-cell RNA sequencing.
arXiv Detail & Related papers (2020-02-17T19:23:36Z)