Visualizing the Loss Landscape of Actor Critic Methods with Applications in Inventory Optimization
- URL: http://arxiv.org/abs/2009.02391v1
- Date: Fri, 4 Sep 2020 20:52:05 GMT
- Title: Visualizing the Loss Landscape of Actor Critic Methods with Applications in Inventory Optimization
- Authors: Recep Yusuf Bekci, Mehmet Gümüş
- Abstract summary: We show the characteristics of the actor loss function, which is the essential part of the optimization.
We apply our approach to multi-store dynamic inventory control, a notoriously difficult problem in supply chain operations, and explore the shape of the loss function associated with the optimal policy.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Continuous control is a widely applicable area of reinforcement learning. Its main methods are actor-critic algorithms, which in common practice optimize neural policy approximators via policy gradients. The focus of our study is to show the characteristics of the actor loss function, which is the essential part of the optimization. We exploit low-dimensional visualizations of the loss function and compare the loss landscapes of various algorithms. Furthermore, we apply our approach to multi-store dynamic inventory control, a notoriously difficult problem in supply chain operations, and explore the shape of the loss function associated with the optimal policy. We model and solve the problem with reinforcement learning and obtain a loss landscape that favors optimality.
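To make the visualization technique concrete, here is a minimal sketch of the low-dimensional slicing idea: the trained actor parameters are perturbed along two random, layer-normalized directions, and the actor loss is evaluated on the resulting 2-D grid. The `actor` module, `actor_loss` callable, and grid settings are illustrative assumptions, not the authors' exact implementation.

```python
import torch

def random_direction(params):
    # A random Gaussian direction, rescaled layer-wise so each piece has
    # the same norm as the corresponding parameter tensor (the usual
    # normalization trick in loss-landscape visualizations).
    dirs = []
    for p in params:
        d = torch.randn_like(p)
        dirs.append(d * (p.norm() / (d.norm() + 1e-10)))
    return dirs

@torch.no_grad()
def loss_surface(actor, actor_loss, steps=25, radius=1.0):
    """Evaluate actor_loss on a 2-D slice around the trained parameters."""
    center = [p.detach().clone() for p in actor.parameters()]
    d1, d2 = random_direction(center), random_direction(center)
    alphas = torch.linspace(-radius, radius, steps)
    surface = torch.zeros(steps, steps)
    for i, a in enumerate(alphas):
        for j, b in enumerate(alphas):
            # Move the actor to theta* + a*d1 + b*d2 and record the loss.
            for p, c, u, v in zip(actor.parameters(), center, d1, d2):
                p.copy_(c + a * u + b * v)
            surface[i, j] = actor_loss(actor)
    # Restore the trained parameters.
    for p, c in zip(actor.parameters(), center):
        p.copy_(c)
    return surface
```

Plotting `surface` as a contour map gives the kind of landscape comparison described above; for an actor-critic agent, `actor_loss` would typically be the negative policy objective estimated on a fixed batch of states.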
Related papers
- The Central Role of the Loss Function in Reinforcement Learning [46.72524235085568]
We demonstrate how different regression loss functions affect the sample efficiency and adaptivity of value-based decision making algorithms.
Across multiple settings, we prove that algorithms using the binary cross-entropy loss achieve first-order bounds scaling with the optimal policy's cost.
We hope that this paper serves as a guide for analyzing decision-making algorithms with varying loss functions, and that it inspires the reader to seek out better loss functions to improve any decision-making algorithm.
arXiv Detail & Related papers (2024-09-19T14:10:38Z)
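As a concrete illustration of the loss choice studied in the entry above, the sketch below regresses value estimates with binary cross-entropy on returns rescaled to [0, 1], rather than with squared error. The return bounds and tensor shapes are hypothetical stand-ins, not the paper's setup.

```python
import torch
import torch.nn.functional as F

def bce_value_loss(pred_logits, returns, r_min=0.0, r_max=100.0):
    # Rescale returns into [0, 1] so they can serve as soft targets;
    # r_min and r_max are assumed known bounds on the return.
    targets = ((returns - r_min) / (r_max - r_min)).clamp(0.0, 1.0)
    # Binary cross-entropy against the rescaled return, in place of
    # the usual mean-squared error on raw values.
    return F.binary_cross_entropy_with_logits(pred_logits, targets)
```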
- Can No-Reference Quality-Assessment Methods Serve as Perceptual Losses for Super-Resolution? [0.0]
Perceptual losses play an important role in constructing deep-neural-network-based methods.
This paper investigates direct optimization of several video-super-resolution models using no-reference image-quality-assessment methods as perceptual losses.
arXiv Detail & Related papers (2024-05-30T18:04:58Z)
- Gradient constrained sharpness-aware prompt learning for vision-language models [99.74832984957025]
This paper targets a novel trade-off problem in generalizable prompt learning for vision-language models (VLMs).
By analyzing the loss landscapes of the state-of-the-art method and a vanilla Sharpness-aware Minimization (SAM) based method, we conclude that the trade-off performance correlates with both loss value and loss sharpness.
We propose a novel SAM-based method for prompt learning, denoted Gradient Constrained Sharpness-aware Context Optimization (GCSCoOp).
arXiv Detail & Related papers (2023-09-14T17:13:54Z)
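For reference, the vanilla SAM update that the entry above builds on is roughly the two-step procedure sketched below; the model, loss function, and rho value are illustrative, and this is plain SAM rather than the proposed GCSCoOp variant.

```python
import torch

def sam_step(model, loss_fn, batch, optimizer, rho=0.05):
    # Step 1: gradient at the current weights w.
    loss_fn(model, batch).backward()
    params = [p for p in model.parameters() if p.grad is not None]
    grad_norm = torch.sqrt(sum(p.grad.pow(2).sum() for p in params)) + 1e-12
    eps = [rho * p.grad / grad_norm for p in params]  # ascent direction
    with torch.no_grad():
        for p, e in zip(params, eps):
            p.add_(e)  # move to the locally sharpest nearby point w + eps
    model.zero_grad()
    # Step 2: the gradient at the perturbed weights drives the actual update.
    loss_fn(model, batch).backward()
    with torch.no_grad():
        for p, e in zip(params, eps):
            p.sub_(e)  # return to w before the optimizer step
    optimizer.step()
    optimizer.zero_grad()
```

Minimizing with this two-step gradient steers training toward flat minima, which is the sharpness term in the trade-off described above.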
- A survey and taxonomy of loss functions in machine learning [51.35995529962554]
We present a comprehensive overview of the most widely used loss functions across key applications, including regression, classification, generative modeling, ranking, and energy-based modeling.
We introduce 43 distinct loss functions, structured within an intuitive taxonomy that clarifies their theoretical foundations, properties, and optimal application contexts.
arXiv Detail & Related papers (2023-01-13T14:38:24Z)
- Lexicographic Multi-Objective Reinforcement Learning [65.90380946224869]
We present a family of both action-value and policy gradient algorithms that can be used to solve such problems.
We show how our algorithms can be used to impose safety constraints on the behaviour of an agent, and compare their performance in this context with that of other constrained reinforcement learning algorithms.
arXiv Detail & Related papers (2022-12-28T10:22:36Z)
- Low-Dimensional State and Action Representation Learning with MDP Homomorphism Metrics [1.5293427903448022]
Deep Reinforcement Learning has shown its ability to solve complicated problems directly from high-dimensional observations.
In end-to-end settings, Reinforcement Learning algorithms are not sample-efficient and require long training times and large quantities of data.
We propose a framework for sample-efficient Reinforcement Learning that takes advantage of state and action representations to transform a high-dimensional problem into a low-dimensional one.
arXiv Detail & Related papers (2021-07-04T16:26:04Z)
- Logistic Q-Learning [87.00813469969167]
We propose a new reinforcement learning algorithm derived from a regularized linear-programming formulation of optimal control in MDPs.
The main feature of our algorithm is a convex loss function for policy evaluation that serves as a theoretically sound alternative to the widely used squared Bellman error.
arXiv Detail & Related papers (2020-10-21T17:14:31Z)
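To make the contrast with the squared Bellman error concrete, here is a hedged sketch of a convex, logistic-style surrogate for policy evaluation: a soft maximum (log-sum-exp) of the Bellman residuals. It is an illustrative reconstruction under simplified assumptions, not the exact loss derived in the paper.

```python
import math
import torch
import torch.nn.functional as F

def squared_bellman_error(q, r, q_next, gamma=0.99):
    # The widely used baseline: mean of (Q(s,a) - (r + gamma * max_a' Q(s',a')))^2.
    target = (r + gamma * q_next.max(dim=-1).values).detach()
    return F.mse_loss(q, target)

def logistic_bellman_surrogate(q, r, q_next, gamma=0.99, eta=1.0):
    # A convex alternative: (1/eta) * log of the batch average of
    # exp(eta * delta). Since delta is affine in q and log-sum-exp is
    # convex, this loss is convex in the Q-values.
    delta = (r + gamma * q_next.max(dim=-1).values).detach() - q
    return (torch.logsumexp(eta * delta, dim=0) - math.log(delta.numel())) / eta
```

Unlike the squared error, the soft-maximum penalty grows only linearly in the largest residuals, which is one intuition for why such convex losses can be better behaved for policy evaluation.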
- Online Convex Optimization Perspective for Learning from Dynamically Revealed Preferences [0.0]
We study the problem of online learning from revealed preferences.
A learner wishes to learn a non-strategic agent's private utility function by observing the agent's utility-maximizing actions in a changing environment.
We adopt an online inverse optimization setup, where the learner observes a stream of the agent's actions in an online fashion and learning performance is measured by the regret associated with a loss function.
arXiv Detail & Related papers (2020-08-24T14:05:13Z)
- On the Loss Landscape of Adversarial Training: Identifying Challenges and How to Overcome Them [57.957466608543676]
We analyze the influence of adversarial training on the loss landscape of machine learning models.
We show that the adversarial loss landscape is less favorable to optimization, due to increased curvature and more scattered gradients.
arXiv Detail & Related papers (2020-06-15T13:50:23Z)
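The curvature claim in the last entry can be probed numerically. Below is a minimal sketch that estimates the directional second derivative of a loss along a random unit direction via central finite differences; the model, loss function, and step size are illustrative assumptions.

```python
import torch

@torch.no_grad()
def directional_curvature(model, loss_fn, batch, h=1e-3):
    # Estimate d^2 L / dt^2 along a random unit direction d using
    # (L(w + h*d) - 2*L(w) + L(w - h*d)) / h^2.
    params = list(model.parameters())
    d = [torch.randn_like(p) for p in params]
    norm = torch.sqrt(sum(t.pow(2).sum() for t in d))
    d = [t / norm for t in d]  # unit direction in parameter space

    def shifted_loss(scale):
        for p, t in zip(params, d):
            p.add_(scale * h * t)
        loss = loss_fn(model, batch)
        for p, t in zip(params, d):
            p.sub_(scale * h * t)  # restore the original weights
        return loss

    base = loss_fn(model, batch)
    return (shifted_loss(+1.0) - 2.0 * base + shifted_loss(-1.0)) / (h * h)
```

Averaging this estimate over many random directions, on clean versus adversarially perturbed batches, gives a cheap way to compare the sharpness difference that the entry describes.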