Related papers: Sequential Learning-based IaaS Composition

URL: http://arxiv.org/abs/2102.12598v1
Date: Wed, 24 Feb 2021 23:16:01 GMT
Title: Sequential Learning-based IaaS Composition
Authors: Sajib Mistry, Sheik Mohammad Mostakim Fattah, and Athman Bouguettaya
Abstract summary: Decision variables are included in the temporal conditional preference networks (TempCP-net). The global preference ranking of a set of requests is computed using a k-d tree indexing based temporal similarity measure approach. We design the on-policy based sequential selection learning approach that applies the length of request to accept or reject requests in a composition.
Score: 0.11470070927586014
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We propose a novel IaaS composition framework that selects an optimal set of consumer requests according to the provider's qualitative preferences on long-term service provisions. Decision variables are included in the temporal conditional preference networks (TempCP-net) to represent qualitative preferences for both short-term and long-term consumers. The global preference ranking of a set of requests is computed using a k-d tree indexing based temporal similarity measure approach. We propose an extended three-dimensional Q-learning approach to maximize the global preference ranking. We design the on-policy based sequential selection learning approach that applies the length of request to accept or reject requests in a composition. The proposed on-policy based learning method reuses historical experiences or policies of sequential optimization using an agglomerative clustering approach. Experimental results prove the feasibility of the proposed framework.

Related papers

Integrating Response Time and Attention Duration in Bayesian Preference Learning for Multiple Criteria Decision Aiding [2.9457161327910693]
We introduce a multiple criteria Bayesian preference learning framework incorporating behavioral cues for decision aiding. The framework integrates pairwise comparisons, response time, and attention duration to deepen insights into decision-making processes.
arXiv Detail & Related papers (2025-04-21T08:01:44Z)
Hierarchical Reinforcement Learning for Temporal Abstraction of Listwise Recommendation [51.06031200728449]
We propose a novel framework called mccHRL to provide different levels of temporal abstraction on listwise recommendation. Within the hierarchical framework, the high-level agent studies the evolution of user perception, while the low-level agent produces the item selection policy. Results show significant performance improvement by our method compared with several well-known baselines.
arXiv Detail & Related papers (2024-09-11T17:01:06Z)
An incremental preference elicitation-based approach to learning potentially non-monotonic preferences in multi-criteria sorting [53.36437745983783]
We first construct a max-margin optimization-based model to model potentially non-monotonic preferences. We devise information amount measurement methods and question selection strategies to pinpoint the most informative alternative in each iteration. Two incremental preference elicitation-based algorithms are developed to learn potentially non-monotonic preferences.
arXiv Detail & Related papers (2024-09-04T14:36:20Z)
Comparing Bad Apples to Good Oranges: Aligning Large Language Models via Joint Preference Optimization [105.3612692153615]
We propose a new axis based on eliciting preferences jointly over instruction-response pairs. Joint preferences over instruction and response pairs can significantly enhance the alignment of large language models.
arXiv Detail & Related papers (2024-03-31T02:05:40Z)
SA-LSPL: Sequence-Aware Long- and Short-Term Preference Learning for next POI recommendation [19.40796508546581]
Point of Interest (POI) recommendation aims to recommend the POI for users at a specific time. We propose a novel approach called Sequence-Aware Long- and Short-Term Preference Learning (SA-LSPL) for next-POI recommendation.
arXiv Detail & Related papers (2024-03-30T13:40:25Z)
Sample Complexity of Preference-Based Nonparametric Off-Policy Evaluation with Deep Networks [58.469818546042696]
We study the sample efficiency of OPE with human preference and establish a statistical guarantee for it. By appropriately selecting the size of a ReLU network, we show that one can leverage any low-dimensional manifold structure in the Markov decision process.
arXiv Detail & Related papers (2023-10-16T16:27:06Z)
A Parametric Class of Approximate Gradient Updates for Policy Optimization [47.69337420768319]
We develop a unified perspective that re-expresses the underlying updates in terms of a limited choice of gradient form and scaling function. We obtain novel yet well motivated updates that generalize existing algorithms in a way that can deliver benefits both in terms of convergence speed and final result quality.
arXiv Detail & Related papers (2022-06-17T01:28:38Z)
Local policy search with Bayesian optimization [73.0364959221845]
Reinforcement learning aims to find an optimal policy by interaction with an environment. Policy gradients for local search are often obtained from random perturbations. We develop an algorithm utilizing a probabilistic model of the objective function and its gradient.
arXiv Detail & Related papers (2021-06-22T16:07:02Z)
Probabilistic Planning with Preferences over Temporal Goals [21.35365462532568]
We present a formal language for specifying qualitative preferences over temporal goals and a preference-based planning method in systems. Using automata-theoretic modeling, the proposed specification allows us to express preferences over different sets of outcomes, where each outcome describes a set of temporal sequences of subgoals. We define the value of preference satisfaction given a process over possible outcomes and develop an algorithm for time-constrained probabilistic planning in labeled Markov decision processes.
arXiv Detail & Related papers (2021-03-26T14:26:40Z)
A study of the Multicriteria decision analysis based on the time-series features and a TOPSIS method proposal for a tensorial approach [1.3750624267664155]
We propose a new approach to rank the alternatives based on the criteria time-series features (tendency, variance, etc.). In this novel approach, the data is structured in three dimensions, which requires a more complex data structure, such as tensors. Computational results reveal that it is possible to rank the alternatives from a new perspective by considering meaningful decision-making information.
arXiv Detail & Related papers (2020-10-21T14:37:02Z)
Stochastic batch size for adaptive regularization in deep network optimization [63.68104397173262]
We propose a first-order optimization algorithm incorporating adaptive regularization applicable to machine learning problems in deep learning framework. We empirically demonstrate the effectiveness of our algorithm using an image classification task based on conventional network models applied to commonly used benchmark datasets.
arXiv Detail & Related papers (2020-04-14T07:54:53Z)

This list is automatically generated from the titles and abstracts of the papers on this site.