Going faster to see further: GPU-accelerated value iteration and
simulation for perishable inventory control using JAX
- URL: http://arxiv.org/abs/2303.10672v1
- Date: Sun, 19 Mar 2023 14:20:44 GMT
- Title: Going faster to see further: GPU-accelerated value iteration and
simulation for perishable inventory control using JAX
- Authors: Joseph Farrington, Kezhi Li, Wai Keong Wong, Martin Utley
- Abstract summary: We use the Python library JAX to implement value iteration and simulators of the underlying Markov decision processes in a high-level API.
Our method can extend use of value iteration to settings that were previously considered infeasible or impractical.
We compare the performance of the optimal replenishment policies to heuristic policies fitted using simulation optimization in JAX, which allowed the parallel evaluation of multiple candidate policy parameters.
- Score: 5.856836693166898
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Value iteration can find the optimal replenishment policy for a perishable
inventory problem, but is computationally demanding due to the large state
spaces that are required to represent the age profile of stock. The parallel
processing capabilities of modern GPUs can reduce the wall time required to run
value iteration by updating many states simultaneously. The adoption of
GPU-accelerated approaches has been limited in operational research relative to
other fields like machine learning, in which new software frameworks have made
GPU programming widely accessible. We used the Python library JAX to implement
value iteration and simulators of the underlying Markov decision processes in a
high-level API, and relied on this library's function transformations and
compiler to efficiently utilize GPU hardware. Our method can extend use of
value iteration to settings that were previously considered infeasible or
impractical. We demonstrate this on example scenarios from three recent studies
which include problems with over 16 million states and additional problem
features, such as substitution between products, that increase computational
complexity. We compare the performance of the optimal replenishment policies to
heuristic policies, fitted using simulation optimization in JAX which allowed
the parallel evaluation of multiple candidate policy parameters on thousands of
simulated years. The heuristic policies gave a maximum optimality gap of 2.49%.
Our general approach may be applicable to a wide range of problems in
operational research that would benefit from large-scale parallel computation
on consumer-grade GPU hardware.
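As a rough illustration of the approach described in the abstract, the sketch below runs value iteration for a toy inventory problem in JAX. It is not the authors' implementation: the single-product, non-perishable setting, the cost parameters, the truncated Poisson demand and all function names are assumptions chosen for brevity. It shows how the Bellman update can be written for a single state-action pair and then mapped over all states and actions with `jax.vmap`, with `jax.jit` compiling the update for whichever device (CPU or GPU) is available.

```python
import jax
import jax.numpy as jnp
from jax.scipy.stats import poisson

# Hypothetical toy problem (assumed for illustration, not taken from the paper).
MAX_STOCK = 20          # largest stock level that can be held
MAX_ORDER = 10          # largest order quantity
GAMMA = 0.95            # discount factor
ORDER_COST, HOLD_COST, SHORTAGE_COST = 1.0, 0.5, 5.0

states = jnp.arange(MAX_STOCK + 1)
actions = jnp.arange(MAX_ORDER + 1)
demands = jnp.arange(11)
# Poisson(4) demand truncated to 0..10 and renormalized.
demand_probs = poisson.pmf(demands, 4.0)
demand_probs = demand_probs / demand_probs.sum()


def outcome(state, action, demand, values):
    """Reward plus discounted next-state value for one (state, action, demand)."""
    available = jnp.minimum(state + action, MAX_STOCK)
    sold = jnp.minimum(available, demand)
    next_state = available - sold
    reward = -(ORDER_COST * action
               + HOLD_COST * next_state
               + SHORTAGE_COST * (demand - sold))
    return reward + GAMMA * values[next_state]


def q_value(state, action, values):
    """Expected value of taking `action` in `state`, averaged over demand."""
    per_demand = jax.vmap(outcome, in_axes=(None, None, 0, None))(
        state, action, demands, values)
    return jnp.dot(demand_probs, per_demand)


# Map the single-pair function over all actions (inner) and all states (outer).
q_all = jax.vmap(jax.vmap(q_value, in_axes=(None, 0, None)),
                 in_axes=(0, None, None))


@jax.jit
def bellman_update(values):
    """One sweep of value iteration, updating every state at once."""
    q = q_all(states, actions, values)      # shape (n_states, n_actions)
    return q.max(axis=1), q.argmax(axis=1)


def value_iteration(tol=1e-3, max_iter=2000):
    values = jnp.zeros(states.shape[0])
    for _ in range(max_iter):
        new_values, policy = bellman_update(values)
        if jnp.max(jnp.abs(new_values - values)) < tol:
            break
        values = new_values
    return new_values, policy


values, policy = value_iteration()
print(policy)  # order quantity chosen for each stock level
```

Because every state is updated by the same compiled function, the work per sweep is a single batched computation, which is the property that lets a GPU update many states simultaneously. A perishable problem would replace the scalar stock level with an age profile, enlarging the state space but leaving this structure unchanged.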
Related papers
- Deep Optimizer States: Towards Scalable Training of Transformer Models Using Interleaved Offloading [2.8231000588510757]
Transformers and large language models (LLMs) have seen rapid adoption in all domains.
Training of transformers is very expensive and often hits a "memory wall".
We propose a novel technique to split the LLM into subgroups, whose update phase is scheduled on either the CPU or the GPU.
arXiv Detail & Related papers (2024-10-26T00:43:59Z) - Implementation and Analysis of GPU Algorithms for Vecchia Approximation [0.8057006406834466]
Vecchia Approximation is widely used to reduce the computational complexity and can be calculated with embarrassingly parallel algorithms.
While multi-core software has been developed for Vecchia Approximation, software designed to run on graphics processing units (GPU) is lacking.
We show that our new method outperforms the other two and then present it in the GpGpU R package.
arXiv Detail & Related papers (2024-07-03T01:24:44Z) - JaxMARL: Multi-Agent RL Environments and Algorithms in JAX [105.343918678781]
We present JaxMARL, the first open-source, Python-based library that combines GPU-enabled efficiency with support for a large number of commonly used MARL environments.
Our experiments show that, in terms of wall clock time, our JAX-based training pipeline is around 14 times faster than existing approaches.
We also introduce and benchmark SMAX, a JAX-based approximate reimplementation of the popular StarCraft Multi-Agent Challenge.
arXiv Detail & Related papers (2023-11-16T18:58:43Z) - High Performance Computing Applied to Logistic Regression: A CPU and GPU
Implementation Comparison [0.0]
We present a versatile GPU-based parallel version of Logistic Regression (LR).
Our implementation is a direct translation of the parallel Gradient Descent Logistic Regression algorithm proposed by X. Zou et al.
Our method is particularly advantageous for real-time prediction applications like image recognition, spam detection, and fraud detection.
arXiv Detail & Related papers (2023-08-19T14:49:37Z) - Local object crop collision network for efficient simulation of
non-convex objects in GPU-based simulators [6.33790920152602]
Our goal is to develop an efficient contact detection algorithm for large-scale simulation of non-convex objects.
We propose a data-driven approach for CD, whose accuracy depends only on the quality and quantity of supplementary materials.
arXiv Detail & Related papers (2023-04-19T06:09:12Z) - Energy-efficient Task Adaptation for NLP Edge Inference Leveraging
Heterogeneous Memory Architectures [68.91874045918112]
adapter-ALBERT is an efficient model optimization for maximal data reuse across different tasks.
We demonstrate the advantage of mapping the model to a heterogeneous on-chip memory architecture by performing simulations on a validated NLP edge accelerator.
arXiv Detail & Related papers (2023-03-25T14:40:59Z) - Towards making the most of NLP-based device mapping optimization for
OpenCL kernels [5.6596607119831575]
We extend the work of Cummins et al., namely Deeptune, that tackles the problem of optimal device selection (CPU or GPU) for accelerated OpenCL kernels.
We propose four different models that provide enhanced contextual information of source codes.
Experimental results show that our proposed methodology surpasses that of Cummins et al., providing up to 4% improvement in prediction accuracy.
arXiv Detail & Related papers (2022-08-30T10:20:55Z) - Adaptive Elastic Training for Sparse Deep Learning on Heterogeneous
Multi-GPU Servers [65.60007071024629]
We show experimentally that Adaptive SGD outperforms four state-of-the-art solutions in time-to-accuracy.
arXiv Detail & Related papers (2021-10-13T20:58:15Z) - Providing Meaningful Data Summarizations Using Examplar-based Clustering
in Industry 4.0 [67.80123919697971]
We show that our GPU implementation provides speedups of up to 72x using single-precision and up to 452x using half-precision compared to conventional CPU algorithms.
We apply our algorithm to real-world data from injection molding manufacturing processes and discuss how found summaries help with steering this specific process to cut costs and reduce the manufacturing of bad parts.
arXiv Detail & Related papers (2021-05-25T15:55:14Z) - Kernel methods through the roof: handling billions of points efficiently [94.31450736250918]
Kernel methods provide an elegant and principled approach to nonparametric learning, but so far could hardly be used in large scale problems.
Recent advances have shown the benefits of a number of algorithmic ideas, for example combining optimization, numerical linear algebra and random projections.
Here, we push these efforts further to develop and test a solver that takes full advantage of GPU hardware.
arXiv Detail & Related papers (2020-06-18T08:16:25Z) - MPLP++: Fast, Parallel Dual Block-Coordinate Ascent for Dense Graphical
Models [96.1052289276254]
This work introduces a new MAP-solver, based on the popular Dual Block-Coordinate Ascent principle.
Surprisingly, by making a small change to the low-performing solver, we derive the new solver MPLP++, which outperforms all existing solvers by a large margin.
arXiv Detail & Related papers (2020-04-16T16:20:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.