Successive Refinement in Large-Scale Computation: Advancing Model
Inference Applications
- URL: http://arxiv.org/abs/2402.07229v1
- Date: Sun, 11 Feb 2024 15:36:33 GMT
- Title: Successive Refinement in Large-Scale Computation: Advancing Model
Inference Applications
- Authors: Homa Esfahanizadeh, Alejandro Cohen, Shlomo Shamai (Shitz), Muriel
Medard
- Abstract summary: We introduce solutions for layered-resolution computation.
These solutions allow lower-resolution results to be obtained at an earlier stage than the final result.
- Score: 67.76749044675721
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Modern computationally-intensive applications often operate under time
constraints, necessitating acceleration methods and distribution of
computational workloads across multiple entities. However, the outcome is
either achieved within the desired timeline or not, and in the latter case,
valuable resources are wasted. In this paper, we introduce solutions for
layered-resolution computation. These solutions allow lower-resolution results
to be obtained at an earlier stage than the final result. This innovation
notably enhances deadline-based systems: if a computational job is terminated
due to time constraints, an approximate version of the final result
can still be generated. Moreover, in certain operational regimes, a
high-resolution result might be unnecessary, because the low-resolution result
may already deviate significantly from the decision threshold, for example in
AI-based decision-making systems. Therefore, operators can decide whether
higher resolution is needed or not based on intermediate results, enabling
computations with adaptive resolution. We present our framework for two
critical and computationally demanding jobs: distributed matrix multiplication
(linear) and model inference in machine learning (nonlinear). Our theoretical
and empirical results demonstrate that the execution delay for the first
resolution is significantly shorter than that for the final resolution, while
maintaining overall complexity comparable to the conventional one-shot
approach. Our experiments further illustrate how the layering feature increases
the likelihood of meeting deadlines and enables adaptability and transparency
in massive, large-scale computations.
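To make the layered-resolution idea concrete for the linear job (distributed matrix multiplication), the following minimal sketch realizes successive refinement of a matrix product through a digit decomposition of one operand. This is an illustrative reading of the abstract, not the paper's exact construction; the function name `layered_matmul`, the base-16 decomposition, and the nonnegative-integer operands are all assumptions.

```python
import numpy as np

def layered_matmul(A, B, num_layers=3, base=16):
    """Yield successively refined approximations of A @ B.

    Hypothetical sketch: the (nonnegative integer) entries of A are split
    into num_layers base-`base` digits, most significant first. Each layer
    adds one digit's contribution, so an early partial sum already
    approximates the exact product.
    """
    A = A.astype(np.int64)
    partial = np.zeros((A.shape[0], B.shape[1]), dtype=np.int64)
    residual = A.copy()
    for layer, k in enumerate(reversed(range(num_layers)), start=1):
        scale = base ** k
        digit = residual // scale            # most significant remaining digit
        residual -= digit * scale
        partial = partial + (digit @ B) * scale
        yield layer, partial.copy()          # intermediate-resolution result

# Usage: stop consuming the generator when a deadline fires or when the
# low-resolution estimate is already far from the decision threshold.
rng = np.random.default_rng(0)
A = rng.integers(0, 16**3, size=(4, 5))     # entries fit in 3 base-16 digits
B = rng.integers(-3, 4, size=(5, 2))
exact = A @ B
for layer, approx in layered_matmul(A, B):
    print(f"layer {layer}: max abs error = {np.abs(approx - exact).max()}")
```

Stopping after the first layer yields the low-resolution result the abstract describes; an adaptive-resolution operator would compare that early estimate against its decision threshold and request further layers only when the margin is too small to act on. In this sketch each layer costs one narrow-digit product of the same shape, so it trades a modest constant-factor overhead for early intermediate results.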
Related papers
- Solving Sparse & High-Dimensional-Output Regression via Compression [2.7596444457918263]
Multi-Output Regression (MOR) has been widely used in scientific data analysis for decision-making.
The increasing dimensionality of the outputs poses significant challenges regarding interpretability and computational scalability for modern MOR applications.
This paper proposes a Sparse & High-dimensional-Output REgression model that incorporates additional sparsity requirements to address output interpretability.
We show that the proposed framework is computationally scalable while maintaining the same order of training loss and prediction loss before and after compression, under arbitrary or relatively weak sample set conditions.
arXiv Detail & Related papers (2024-10-21T08:21:25Z)
- Predicting Probabilities of Error to Combine Quantization and Early Exiting: QuEE [68.6018458996143]
We propose QuEE, a more general dynamic network that combines both quantization and early exiting.
Our algorithm can be seen as a form of soft early exiting or input-dependent compression.
The crucial factor of our approach is accurate prediction of the potential accuracy improvement achievable through further computation.
arXiv Detail & Related papers (2024-06-20T15:25:13Z)
- An Operator Learning Framework for Spatiotemporal Super-resolution of Scientific Simulations [3.921076451326108]
The Super Resolution Operator Network (SRNet) frames super-resolution as an operator learning problem.
It draws inspiration from existing operator learning problems to learn continuous representations of parametric differential equations from low-resolution approximations.
No restrictions are imposed on the locations of sensors at which the low-resolution approximations are provided.
arXiv Detail & Related papers (2023-11-04T05:33:23Z)
- Taking the human out of decomposition-based optimization via artificial intelligence: Part II. Learning to initialize [0.0]
Active learning and supervised learning are used to learn a surrogate model that predicts computational performance.
The results show that the proposed approach can lead to a significant reduction in solution time.
arXiv Detail & Related papers (2023-10-10T23:49:26Z)
- The Statistical Complexity of Interactive Decision Making [126.04974881555094]
We provide a complexity measure, the Decision-Estimation Coefficient, that is proven to be both necessary and sufficient for sample-efficient interactive learning.
A unified algorithm design principle, Estimation-to-Decisions (E2D), transforms any algorithm for supervised estimation into an online algorithm for decision making.
arXiv Detail & Related papers (2021-12-27T02:53:44Z)
- Combining Deep Learning and Optimization for Security-Constrained Optimal Power Flow [94.24763814458686]
Security-constrained optimal power flow (SCOPF) is fundamental in power systems.
Modeling of APR within the SCOPF problem results in complex large-scale mixed-integer programs.
This paper proposes a novel approach that combines deep learning and robust optimization techniques.
arXiv Detail & Related papers (2020-07-14T12:38:21Z)
- Coded Distributed Computing with Partial Recovery [56.08535873173518]
We introduce a novel coded matrix-vector multiplication scheme, called coded computation with partial recovery (CCPR).
CCPR reduces both the computation time and the decoding complexity by allowing a trade-off between the accuracy and the speed of computation.
We then extend this approach to distributed implementation of more general computation tasks by proposing a coded communication scheme with partial recovery (see the toy sketch after this list).
arXiv Detail & Related papers (2020-07-04T21:34:49Z)
- Polynomial-Time Exact MAP Inference on Discrete Models with Global Dependencies [83.05591911173332]
The junction tree algorithm is the most general solution for exact MAP inference with run-time guarantees.
We propose a new graph transformation technique via node cloning that ensures a polynomial run-time for solving our target problem, independent of the form of the corresponding clique tree.
arXiv Detail & Related papers (2019-12-27T13:30:29Z)
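As a companion to the Coded Distributed Computing with Partial Recovery entry above, here is a toy sketch of the partial-recovery idea only: a simple systematic parity code over row blocks, not the CCPR construction itself. Any two completed tasks decode y = A @ x exactly, while a single completed systematic task still returns a usable part of y. The helper names `encode_tasks` and `decode` are hypothetical.

```python
import numpy as np

def encode_tasks(A):
    """Split A into two row blocks plus one parity block (A1 + A2).

    Toy (2,3)-style code: any two completed tasks recover y = A @ x
    exactly; one completed systematic task still yields half of y.
    """
    A1, A2 = np.vsplit(A, 2)           # requires an even number of rows
    return {"A1": A1, "A2": A2, "P": A1 + A2}

def decode(done, x):
    """Recover as much of y = A @ x as the completed tasks allow."""
    y = {name: block @ x for name, block in done.items()}
    if "A1" in y and "A2" in y:
        return np.concatenate([y["A1"], y["A2"]]), "exact"
    if "P" in y and "A1" in y:
        return np.concatenate([y["A1"], y["P"] - y["A1"]]), "exact"
    if "P" in y and "A2" in y:
        return np.concatenate([y["P"] - y["A2"], y["A2"]]), "exact"
    name, val = next(iter(y.items()))
    # A lone systematic block gives half of y; a lone parity gives only a sum.
    return val, ("partial" if name != "P" else "parity-only")

rng = np.random.default_rng(1)
A, x = rng.standard_normal((6, 4)), rng.standard_normal(4)
tasks = encode_tasks(A)
y_hat, status = decode({"A2": tasks["A2"]}, x)   # only one worker met the deadline
print(status, y_hat.shape)                       # -> partial (3,)
```

The design choice mirrors the trade-off the CCPR summary names: stragglers cost accuracy (fewer recovered rows) rather than forcing the job to wait for every worker.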