Visualizing the Loss Landscape of Actor Critic Methods with Applications in Inventory Optimization
- URL: http://arxiv.org/abs/2009.02391v1
- Date: Fri, 4 Sep 2020 20:52:05 GMT
- Title: Visualizing the Loss Landscape of Actor Critic Methods with Applications in Inventory Optimization
- Authors: Recep Yusuf Bekci, Mehmet Gümüş
- Abstract summary: We show the characteristics of the actor loss function, which is the essential part of the optimization.
We apply our approach to multi-store dynamic inventory control, a notoriously difficult problem in supply chain operations, and explore the shape of the loss function associated with the optimal policy.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Continuous control is a widely applicable area of reinforcement learning. Its main methods are actor-critic algorithms, which in common practice optimize neural policy approximators via policy gradients. The focus of our study is to show the characteristics of the actor loss function, which is the essential part of the optimization. We exploit low-dimensional visualizations of the loss function and compare the loss landscapes of various algorithms. Furthermore, we apply our approach to multi-store dynamic inventory control, a notoriously difficult problem in supply chain operations, and explore the shape of the loss function associated with the optimal policy. We model and solve the problem with reinforcement learning and obtain a loss landscape that favors optimality.
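To make the visualization technique concrete, here is a minimal sketch of the low-dimensional slicing idea: the trained actor parameters are perturbed along two random, layer-normalized directions, and the actor loss is evaluated on the resulting 2-D grid. The `actor` module, `actor_loss` callable, and grid settings are illustrative assumptions, not the authors' exact implementation.

```python
import torch

def random_direction(params):
    # A random Gaussian direction, rescaled layer-wise so each piece has
    # the same norm as the corresponding parameter tensor (the usual
    # normalization trick in loss-landscape visualizations).
    dirs = []
    for p in params:
        d = torch.randn_like(p)
        dirs.append(d * (p.norm() / (d.norm() + 1e-10)))
    return dirs

@torch.no_grad()
def loss_surface(actor, actor_loss, steps=25, radius=1.0):
    """Evaluate actor_loss on a 2-D slice around the trained parameters."""
    center = [p.detach().clone() for p in actor.parameters()]
    d1, d2 = random_direction(center), random_direction(center)
    alphas = torch.linspace(-radius, radius, steps)
    surface = torch.zeros(steps, steps)
    for i, a in enumerate(alphas):
        for j, b in enumerate(alphas):
            # Move the actor to theta* + a*d1 + b*d2 and record the loss.
            for p, c, u, v in zip(actor.parameters(), center, d1, d2):
                p.copy_(c + a * u + b * v)
            surface[i, j] = actor_loss(actor)
    # Restore the trained parameters.
    for p, c in zip(actor.parameters(), center):
        p.copy_(c)
    return surface
```

Plotting `surface` as a contour map gives the kind of landscape comparison described above; for an actor-critic agent, `actor_loss` would typically be the negative policy objective estimated on a fixed batch of states.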
Related papers
- The Central Role of the Loss Function in Reinforcement Learning [46.72524235085568]
We demonstrate how different regression loss functions affect the sample efficiency and adaptivity of value-based decision making algorithms.
Across multiple settings, we prove that algorithms using the binary cross-entropy loss achieve first-order bounds scaling with the optimal policy's cost.
We hope that this paper serves as a guide for analyzing decision-making algorithms with varying loss functions, and that it inspires the reader to seek out better loss functions to improve any decision-making algorithm.
arXiv Detail & Related papers (2024-09-19T14:10:38Z)
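As a concrete illustration of the loss choice studied in the entry above, the sketch below regresses value estimates with binary cross-entropy on returns rescaled to [0, 1], rather than with squared error. The return bounds and tensor shapes are hypothetical stand-ins, not the paper's setup.

```python
import torch
import torch.nn.functional as F

def bce_value_loss(pred_logits, returns, r_min=0.0, r_max=100.0):
    # Rescale returns into [0, 1] so they can serve as soft targets;
    # r_min and r_max are assumed known bounds on the return.
    targets = ((returns - r_min) / (r_max - r_min)).clamp(0.0, 1.0)
    # Binary cross-entropy against the rescaled return, in place of
    # the usual mean-squared error on raw values.
    return F.binary_cross_entropy_with_logits(pred_logits, targets)
```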
- Can No-Reference Quality-Assessment Methods Serve as Perceptual Losses for Super-Resolution? [0.0]
Perceptual losses play an important role in constructing deep-neural-network-based methods.
This paper investigates direct optimization of several video-super-resolution models using no-reference image-quality-assessment methods as perceptual losses.
arXiv Detail & Related papers (2024-05-30T18:04:58Z)
- Gradient constrained sharpness-aware prompt learning for vision-language models [99.74832984957025]
This paper targets a novel trade-off problem in generalizable prompt learning for vision-language models (VLMs).
By analyzing the loss landscapes of the state-of-the-art method and a vanilla Sharpness-aware Minimization (SAM) based method, we conclude that the trade-off performance correlates with both loss value and loss sharpness.
We propose a novel SAM-based method for prompt learning, denoted Gradient Constrained Sharpness-aware Context Optimization (GCSCoOp).
arXiv Detail & Related papers (2023-09-14T17:13:54Z)
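For reference, the vanilla SAM update that the entry above builds on is roughly the two-step procedure sketched below; the model, loss function, and rho value are illustrative, and this is plain SAM rather than the proposed GCSCoOp variant.

```python
import torch

def sam_step(model, loss_fn, batch, optimizer, rho=0.05):
    # Step 1: gradient at the current weights w.
    loss_fn(model, batch).backward()
    params = [p for p in model.parameters() if p.grad is not None]
    grad_norm = torch.sqrt(sum(p.grad.pow(2).sum() for p in params)) + 1e-12
    eps = [rho * p.grad / grad_norm for p in params]  # ascent direction
    with torch.no_grad():
        for p, e in zip(params, eps):
            p.add_(e)  # move to the locally sharpest nearby point w + eps
    model.zero_grad()
    # Step 2: the gradient at the perturbed weights drives the actual update.
    loss_fn(model, batch).backward()
    with torch.no_grad():
        for p, e in zip(params, eps):
            p.sub_(e)  # return to w before the optimizer step
    optimizer.step()
    optimizer.zero_grad()
```

Minimizing with this two-step gradient steers training toward flat minima, which is the sharpness term in the trade-off described above.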
- A survey and taxonomy of loss functions in machine learning [51.35995529962554]
We present a comprehensive overview of the most widely used loss functions across key applications, including regression, classification, generative modeling, ranking, and energy-based modeling.
We introduce 43 distinct loss functions, structured within an intuitive taxonomy that clarifies their theoretical foundations, properties, and optimal application contexts.
arXiv Detail & Related papers (2023-01-13T14:38:24Z)
- Lexicographic Multi-Objective Reinforcement Learning [65.90380946224869]
We present a family of both action-value and policy gradient algorithms that can be used to solve such problems.
We show how our algorithms can be used to impose safety constraints on the behaviour of an agent, and compare their performance in this context with that of other constrained reinforcement learning algorithms.
arXiv Detail & Related papers (2022-12-28T10:22:36Z)
- Low-Dimensional State and Action Representation Learning with MDP Homomorphism Metrics [1.5293427903448022]
Deep Reinforcement Learning has shown its ability to solve complicated problems directly from high-dimensional observations.
In end-to-end settings, Reinforcement Learning algorithms are not sample-efficient and require long training times and large quantities of data.
We propose a framework for sample-efficient Reinforcement Learning that takes advantage of state and action representations to transform a high-dimensional problem into a low-dimensional one.
arXiv Detail & Related papers (2021-07-04T16:26:04Z)
- Logistic Q-Learning [87.00813469969167]
We propose a new reinforcement learning algorithm derived from a regularized linear-programming formulation of optimal control in MDPs.
The main feature of our algorithm is a convex loss function for policy evaluation that serves as a theoretically sound alternative to the widely used squared Bellman error.
arXiv Detail & Related papers (2020-10-21T17:14:31Z)
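To make the contrast with the squared Bellman error concrete, here is a hedged sketch of a convex, logistic-style surrogate for policy evaluation: a soft maximum (log-sum-exp) of the Bellman residuals. It is an illustrative reconstruction under simplified assumptions, not the exact loss derived in the paper.

```python
import math
import torch
import torch.nn.functional as F

def squared_bellman_error(q, r, q_next, gamma=0.99):
    # The widely used baseline: mean of (Q(s,a) - (r + gamma * max_a' Q(s',a')))^2.
    target = (r + gamma * q_next.max(dim=-1).values).detach()
    return F.mse_loss(q, target)

def logistic_bellman_surrogate(q, r, q_next, gamma=0.99, eta=1.0):
    # A convex alternative: (1/eta) * log of the batch average of
    # exp(eta * delta). Since delta is affine in q and log-sum-exp is
    # convex, this loss is convex in the Q-values.
    delta = (r + gamma * q_next.max(dim=-1).values).detach() - q
    return (torch.logsumexp(eta * delta, dim=0) - math.log(delta.numel())) / eta
```

Unlike the squared error, the soft-maximum penalty grows only linearly in the largest residuals, which is one intuition for why such convex losses can be better behaved for policy evaluation.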
- Online Convex Optimization Perspective for Learning from Dynamically Revealed Preferences [0.0]
We study the problem of online learning from revealed preferences.
A learner wishes to learn a non-strategic agent's private utility function by observing the agent's utility-maximizing actions in a changing environment.
We adopt an online inverse optimization setup, where the learner observes a stream of the agent's actions in an online fashion and learning performance is measured by the regret associated with a loss function.
arXiv Detail & Related papers (2020-08-24T14:05:13Z)
- On the Loss Landscape of Adversarial Training: Identifying Challenges and How to Overcome Them [57.957466608543676]
We analyze the influence of adversarial training on the loss landscape of machine learning models.
We show that the adversarial loss landscape is less favorable to optimization, due to increased curvature and more scattered gradients.
arXiv Detail & Related papers (2020-06-15T13:50:23Z)
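The curvature claim in the last entry can be probed numerically. Below is a minimal sketch that estimates the directional second derivative of a loss along a random unit direction via central finite differences; the model, loss function, and step size are illustrative assumptions.

```python
import torch

@torch.no_grad()
def directional_curvature(model, loss_fn, batch, h=1e-3):
    # Estimate d^2 L / dt^2 along a random unit direction d using
    # (L(w + h*d) - 2*L(w) + L(w - h*d)) / h^2.
    params = list(model.parameters())
    d = [torch.randn_like(p) for p in params]
    norm = torch.sqrt(sum(t.pow(2).sum() for t in d))
    d = [t / norm for t in d]  # unit direction in parameter space

    def shifted_loss(scale):
        for p, t in zip(params, d):
            p.add_(scale * h * t)
        loss = loss_fn(model, batch)
        for p, t in zip(params, d):
            p.sub_(scale * h * t)  # restore the original weights
        return loss

    base = loss_fn(model, batch)
    return (shifted_loss(+1.0) - 2.0 * base + shifted_loss(-1.0)) / (h * h)
```

Averaging this estimate over many random directions, on clean versus adversarially perturbed batches, gives a cheap way to compare the sharpness difference that the entry describes.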