Adaptive Operator Selection Based on Dynamic Thompson Sampling for MOEA/D
- URL: http://arxiv.org/abs/2004.10874v1
- Date: Wed, 22 Apr 2020 21:41:57 GMT
- Title: Adaptive Operator Selection Based on Dynamic Thompson Sampling for MOEA/D
- Authors: Lei Sun and Ke Li
- Abstract summary: This paper proposes a new AOS mechanism for the multi-objective evolutionary algorithm based on decomposition (MOEA/D).
AOS is formulated as a multi-armed bandit problem where dynamic Thompson sampling (DYTS) is applied to adapt the bandit learning model.
Results fully demonstrate the effectiveness and competitiveness of our proposed AOS mechanism compared with four other state-of-the-art MOEA/D variants.
- Score: 11.034230601053116
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In evolutionary computation, different reproduction operators have various search dynamics. To strike a good balance between exploration and exploitation, it is attractive to have an adaptive operator selection (AOS) mechanism that automatically chooses the most appropriate operator on the fly according to the current search status. This paper proposes a new AOS mechanism for the multi-objective evolutionary algorithm based on decomposition (MOEA/D). More specifically, AOS is formulated as a multi-armed bandit problem in which dynamic Thompson sampling (DYTS) is applied to adapt the bandit learning model, originally proposed under the assumption of a fixed reward distribution, to a non-stationary setup. In particular, each arm of our bandit learning model represents a reproduction operator and is assigned a prior reward distribution. The parameters of these reward distributions are progressively updated according to the performance of the corresponding operator, as collected from the evolutionary process. When generating an offspring, an operator is chosen by sampling from these reward distributions according to DYTS. Experimental results fully demonstrate the effectiveness and competitiveness of our proposed AOS mechanism compared with four other state-of-the-art MOEA/D variants.
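The abstract describes the DYTS update only at a high level. Below is a minimal Python sketch of dynamic Thompson sampling driving operator selection, following the standard DYTS formulation (a Beta posterior per arm whose total pseudo-counts are capped at a threshold C so that old evidence decays); the binary improvement reward, the cap value, and all identifiers are illustrative assumptions rather than details taken from the paper.

```python
import random

class DynamicThompsonSampling:
    """Sketch of dynamic Thompson sampling (DYTS) for adaptive operator
    selection: each arm (reproduction operator) keeps a Beta posterior
    over its success probability, and capping alpha + beta at C makes
    old rewards decay, handling the non-stationary evolutionary setup."""

    def __init__(self, n_operators, cap=100.0):
        self.alpha = [1.0] * n_operators  # Beta prior parameters
        self.beta = [1.0] * n_operators
        self.C = cap                      # cap on alpha + beta

    def select(self):
        # Sample a success rate from each posterior; pick the best draw.
        draws = [random.betavariate(a, b)
                 for a, b in zip(self.alpha, self.beta)]
        return max(range(len(draws)), key=draws.__getitem__)

    def update(self, arm, reward):
        # reward in [0, 1], e.g. 1 if the offspring improved its subproblem.
        if self.alpha[arm] + self.beta[arm] < self.C:
            # Standard Bayesian update while evidence is still scarce.
            self.alpha[arm] += reward
            self.beta[arm] += 1.0 - reward
        else:
            # Renormalise so alpha + beta stays at C: old evidence fades.
            scale = self.C / (self.C + 1.0)
            self.alpha[arm] = (self.alpha[arm] + reward) * scale
            self.beta[arm] = (self.beta[arm] + 1.0 - reward) * scale
```

In an MOEA/D-style loop, one would call select() before generating each offspring with the chosen reproduction operator, then call update() with a reward reflecting whether the offspring improved its subproblem.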
Related papers
- Improving Portfolio Optimization Results with Bandit Networks [0.0]
We introduce and evaluate novel Bandit algorithms designed for non-stationary environments.
First, we present the Adaptive Discounted Thompson Sampling (ADTS) algorithm.
We then extend this approach to the Portfolio Optimization problem by introducing the Combinatorial Adaptive Discounted Thompson Sampling (CADTS) algorithm.
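The summary above does not spell out the ADTS update rule. For orientation, here is a sketch of the plain discounted Thompson sampling that such non-stationary bandit variants build on, where a discount factor gamma fades old evidence each round; the decay-toward-the-prior form and the value gamma = 0.95 are assumptions for illustration, and ADTS's adaptive discounting is not reproduced.

```python
import random

def select_arm(alpha, beta):
    # Sample from each arm's Beta posterior; pick the largest draw.
    draws = [random.betavariate(a, b) for a, b in zip(alpha, beta)]
    return max(range(len(draws)), key=draws.__getitem__)

def discounted_ts_update(alpha, beta, arm, reward, gamma=0.95):
    """Discounted Thompson sampling update for non-stationary bandits:
    every arm's Beta parameters decay toward the Beta(1, 1) prior each
    round, so old evidence fades, and the pulled arm then absorbs the
    new binary reward."""
    for i in range(len(alpha)):
        alpha[i] = gamma * alpha[i] + (1.0 - gamma)
        beta[i] = gamma * beta[i] + (1.0 - gamma)
    alpha[arm] += reward
    beta[arm] += 1.0 - reward
```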
arXiv Detail & Related papers (2024-10-05T16:17:31Z)
- Diversified Batch Selection for Training Acceleration [68.67164304377732]
A prevalent research line, known as online batch selection, explores selecting informative subsets during the training process.
Vanilla reference-model-free methods score and select data independently, in a sample-wise manner.
We propose Diversified Batch Selection (DivBS), which is reference-model-free and can efficiently select diverse and representative samples.
arXiv Detail & Related papers (2024-06-07T12:12:20Z)
- Learning Efficient and Fair Policies for Uncertainty-Aware Collaborative Human-Robot Order Picking [11.997524293204368]
In collaborative human-robot order picking systems, human pickers and Autonomous Mobile Robots (AMRs) travel independently through a warehouse and meet at pick locations where pickers load items onto AMRs.
We propose a novel multi-objective Deep Reinforcement Learning (DRL) approach to learn effective allocation policies that maximize pick efficiency while also aiming to improve workload fairness amongst human pickers.
arXiv Detail & Related papers (2024-04-09T11:45:16Z)
- A Bandit Approach with Evolutionary Operators for Model Selection [0.4604003661048266]
This work formulates model selection as an infinite-armed bandit problem, namely, a problem in which a decision maker iteratively selects one of an infinite number of fixed choices (i.e., arms).
The arms are machine learning models to train, and selecting an arm corresponds to a partial training of the model (resource allocation).
We propose the algorithm Mutant-UCB, which incorporates operators from evolutionary algorithms into the UCB-E bandit algorithm introduced by Audibert et al.
Tests carried out on three open source image classification data sets attest to the relevance of this novel combining approach, which outperforms the state-of-the-art.
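The summary does not specify Mutant-UCB's exact rules, so the following is a hypothetical sketch of how evolutionary operators might be combined with UCB-E-style selection for model selection; the mutation trigger, the sqrt(a / T_i) exploration bonus, and the helper callables evaluate and mutate are all illustrative assumptions.

```python
import math
import random

def mutant_ucb(evaluate, mutate, init_arms, budget, explore=1.0, p_mutate=0.1):
    """Hypothetical sketch mixing UCB-E-style selection with an
    evolutionary mutation step: pulling an arm means partially training
    a model, and a pulled arm occasionally spawns a mutated copy of
    itself as a brand-new arm."""
    arms = list(init_arms)
    counts = [0] * len(arms)
    means = [0.0] * len(arms)

    # Pull every initial arm once so the UCB index is defined everywhere.
    for i, arm in enumerate(arms):
        means[i] = evaluate(arm)
        counts[i] = 1

    used = len(arms)
    while used < budget:
        # UCB-E index: empirical mean plus a sqrt(a / T_i) exploration bonus.
        best = max(range(len(arms)),
                   key=lambda i: means[i] + math.sqrt(explore / counts[i]))
        if random.random() < p_mutate:
            # Spawn a mutant of the selected arm as a fresh arm.
            arms.append(mutate(arms[best]))
            means.append(evaluate(arms[-1]))
            counts.append(1)
        else:
            # Allocate one more unit of training to the selected arm.
            reward = evaluate(arms[best])
            counts[best] += 1
            means[best] += (reward - means[best]) / counts[best]
        used += 1

    return arms[max(range(len(arms)), key=means.__getitem__)]
```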
arXiv Detail & Related papers (2024-02-07T08:01:45Z)
- Mimicking Better by Matching the Approximate Action Distribution [48.95048003354255]
We introduce MAAD, a novel, sample-efficient on-policy algorithm for Imitation Learning from Observations.
We show that it requires considerably fewer interactions to achieve expert performance, outperforming current state-of-the-art on-policy methods.
arXiv Detail & Related papers (2023-06-16T12:43:47Z)
- SGEM: Test-Time Adaptation for Automatic Speech Recognition via Sequential-Level Generalized Entropy Minimization [30.61075178799518]
A test-time adaptation (TTA) method has recently been proposed to adapt a pre-trained ASR model on unlabeled test instances without source data.
We propose a novel TTA framework, dubbed SGEM, for general ASR models.
SGEM achieves state-of-the-art performance for three mainstream ASR models under various domain shifts.
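SGEM's sequence-level generalized entropy objective over beam hypotheses is not detailed in this summary; the sketch below shows only the basic entropy-minimization test-time adaptation step that such methods refine, written for a generic PyTorch classifier. The function and variable names are assumptions, not SGEM's API.

```python
import torch

def entropy_minimization_step(model, inputs, optimizer):
    """One generic test-time adaptation step by entropy minimization:
    adapt the model on an unlabeled test batch by making its own
    predictions more confident (lower output entropy)."""
    model.train()
    logits = model(inputs)                        # (batch, num_classes)
    probs = torch.softmax(logits, dim=-1)
    entropy = -(probs * torch.log(probs + 1e-8)).sum(dim=-1).mean()
    optimizer.zero_grad()
    entropy.backward()
    optimizer.step()
    return float(entropy.detach())
```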
arXiv Detail & Related papers (2023-06-03T02:27:08Z)
- Deep Variational Models for Collaborative Filtering-based Recommender Systems [63.995130144110156]
Deep learning provides accurate collaborative filtering models to improve recommender system results.
Our proposed models apply the variational concept to inject randomness into the latent space of the deep architecture.
Results show the superiority of the proposed approach in scenarios where the variational enrichment exceeds the injected noise effect.
arXiv Detail & Related papers (2021-07-27T08:59:39Z)
- Model Selection for Bayesian Autoencoders [25.619565817793422]
We propose to optimize the distributional sliced-Wasserstein distance between the output of the autoencoder and the empirical data distribution.
We turn our BAE into a generative model by fitting a flexible Dirichlet mixture model in the latent space.
We evaluate our approach qualitatively and quantitatively using a vast experimental campaign on a number of unsupervised learning tasks and show that, in small-data regimes where priors matter, our approach provides state-of-the-art results.
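For reference, here is a minimal NumPy sketch of the plain sliced 1-Wasserstein distance between two equally sized samples; the paper's distributional variant, which also optimizes over the projection distribution, is more involved and is not reproduced here.

```python
import numpy as np

def sliced_wasserstein(x, y, n_projections=50, rng=None):
    """Monte Carlo estimate of the sliced 1-Wasserstein distance between
    two equally sized point clouds x and y of shape (n, d): project both
    onto random unit directions and average the resulting 1-D Wasserstein
    distances, which for equal-size empirical samples reduce to mean
    absolute differences of the sorted projections."""
    rng = rng if rng is not None else np.random.default_rng(0)
    total = 0.0
    for _ in range(n_projections):
        theta = rng.normal(size=x.shape[1])
        theta /= np.linalg.norm(theta)            # random unit direction
        px, py = np.sort(x @ theta), np.sort(y @ theta)
        total += np.mean(np.abs(px - py))
    return total / n_projections
```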
arXiv Detail & Related papers (2021-06-11T08:55:00Z)
- Model-based Multi-agent Policy Optimization with Adaptive Opponent-wise Rollouts [52.844741540236285]
This paper investigates model-based methods in multi-agent reinforcement learning (MARL).
We propose a novel decentralized model-based MARL method, named Adaptive Opponent-wise Rollout Policy Optimization (AORPO).
arXiv Detail & Related papers (2021-05-07T16:20:22Z)
- Efficient UAV Trajectory-Planning using Economic Reinforcement Learning [65.91405908268662]
We introduce REPlanner, a novel reinforcement learning algorithm inspired by economic transactions to distribute tasks between UAVs.
We formulate the path planning problem as a multi-agent economic game, where agents can cooperate and compete for resources.
As the system computes task distributions via UAV cooperation, it is highly resilient to any change in the swarm size.
arXiv Detail & Related papers (2021-03-03T20:54:19Z)
- Decision-Making with Auto-Encoding Variational Bayes [71.44735417472043]
We show that a posterior approximation distinct from the variational distribution should be used for making decisions.
Motivated by these theoretical results, we propose learning several approximate proposals for the best model.
In addition to toy examples, we present a full-fledged case study of single-cell RNA sequencing.
arXiv Detail & Related papers (2020-02-17T19:23:36Z)