A Partitioned Sparse Variational Gaussian Process for Fast, Distributed Spatial Modeling
- URL: http://arxiv.org/abs/2507.16771v1
- Date: Tue, 22 Jul 2025 17:20:07 GMT
- Title: A Partitioned Sparse Variational Gaussian Process for Fast, Distributed Spatial Modeling
- Authors: Michael Grosskopf, Kellin Rumsey, Ayan Biswas, Earl Lawrence
- Abstract summary: The next generation of Department of Energy supercomputers will be capable of exascale computation. For these machines, far more computation will be possible than can be saved to disk. There will be an urgent need for machine learning algorithms which can be trained in situ.
- Score: 1.4549461207028445
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The next generation of Department of Energy supercomputers will be capable of exascale computation. For these machines, far more computation will be possible than that which can be saved to disk. As a result, users will be unable to rely on post-hoc access to data for uncertainty quantification and other statistical analyses and there will be an urgent need for sophisticated machine learning algorithms which can be trained in situ. Algorithms deployed in this setting must be highly scalable, memory efficient and capable of handling data which is distributed across nodes as spatially contiguous partitions. One suitable approach involves fitting a sparse variational Gaussian process (SVGP) model independently and in parallel to each spatial partition. The resulting model is scalable, efficient and generally accurate, but produces the undesirable effect of constructing discontinuous response surfaces due to the disagreement between neighboring models at their shared boundary. In this paper, we extend this idea by allowing for a small amount of communication between neighboring spatial partitions which encourages better alignment of the local models, leading to smoother spatial predictions and a better fit in general. Due to our decentralized communication scheme, the proposed extension remains highly scalable and adds very little overhead in terms of computation (and none, in terms of memory). We demonstrate this Partitioned SVGP (PSVGP) approach for the Energy Exascale Earth System Model (E3SM) and compare the results to the independent SVGP case.
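To make the idea concrete, below is a minimal, hypothetical sketch of how a single partition's local SVGP could be fit with a boundary-alignment term, written with GPyTorch. The quadratic penalty on neighbor-supplied boundary predictions is an illustrative stand-in for the paper's actual decentralized communication scheme; all function and parameter names here are assumptions, not the authors' implementation.

```python
# Hypothetical sketch of one partition's training step in a PSVGP-style scheme.
# Assumptions: GPyTorch provides the local SVGP; the boundary penalty below is
# an illustrative surrogate for the paper's decentralized communication scheme.
import torch
import gpytorch


class LocalSVGP(gpytorch.models.ApproximateGP):
    """Sparse variational GP fit to one spatial partition."""
    def __init__(self, inducing_points):
        var_dist = gpytorch.variational.CholeskyVariationalDistribution(inducing_points.size(0))
        strategy = gpytorch.variational.VariationalStrategy(
            self, inducing_points, var_dist, learn_inducing_locations=True)
        super().__init__(strategy)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(
            gpytorch.kernels.MaternKernel(nu=2.5, ard_num_dims=2))

    def forward(self, x):
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(x), self.covar_module(x))


def train_partition(x, y, boundary_x, neighbor_mean, num_inducing=128,
                    epochs=200, align_weight=1.0):
    """Fit a local SVGP; penalize disagreement with a neighbor's cached
    predictions at shared boundary points (a simple surrogate for the
    communication described in the abstract)."""
    model = LocalSVGP(x[torch.randperm(x.size(0))[:num_inducing]].clone())
    likelihood = gpytorch.likelihoods.GaussianLikelihood()
    mll = gpytorch.mlls.VariationalELBO(likelihood, model, num_data=y.size(0))
    opt = torch.optim.Adam(list(model.parameters()) + list(likelihood.parameters()), lr=0.01)

    model.train()
    likelihood.train()
    for _ in range(epochs):
        opt.zero_grad()
        loss = -mll(model(x), y)                    # standard SVGP ELBO for this partition
        boundary_pred = model(boundary_x).mean      # local prediction at the shared boundary
        loss = loss + align_weight * torch.mean((boundary_pred - neighbor_mean) ** 2)
        loss.backward()
        opt.step()
    return model, likelihood
```

In a distributed run, each node would periodically exchange its boundary predictions (the `neighbor_mean` tensor here) with adjacent partitions; that exchange is what encourages neighboring local models to agree at shared boundaries while keeping all other computation local to each node.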
Related papers
- Computation-Aware Gaussian Processes: Model Selection And Linear-Time Inference [55.150117654242706]
We show that model selection for computation-aware GPs trained on 1.8 million data points can be done within a few hours on a single GPU. As a result of this work, Gaussian processes can be trained on large-scale datasets without significantly compromising their ability to quantify uncertainty.
arXiv Detail & Related papers (2024-11-01T21:11:48Z) - Scalable Data Assimilation with Message Passing [9.55393191483615]
We exploit the formulation of data assimilation as a Bayesian inference problem and apply a message-passing algorithm to solve the spatial inference problem.
We can scale the algorithm to very large grid sizes while retaining good accuracy and compute and memory requirements.
arXiv Detail & Related papers (2024-04-19T15:54:15Z) - Gaussian Ensemble Belief Propagation for Efficient Inference in High-Dimensional Systems [3.6773638205393198]
Efficient inference in high-dimensional models is a central challenge in machine learning. We introduce the Gaussian Ensemble Belief Propagation (GEnBP) algorithm. We show that GEnBP outperforms existing belief propagation methods in terms of accuracy and computational efficiency.
arXiv Detail & Related papers (2024-02-13T03:31:36Z) - Just One Byte (per gradient): A Note on Low-Bandwidth Decentralized Language Model Finetuning Using Shared Randomness [86.61582747039053]
Language model training in distributed settings is limited by the communication cost of gradient exchanges.
We extend recent work using shared randomness to perform distributed fine-tuning with low bandwidth.
arXiv Detail & Related papers (2023-06-16T17:59:51Z) - Flag Aggregator: Scalable Distributed Training under Failures and Augmented Losses using Convex Optimization [14.732408788010313]
ML applications increasingly rely on complex deep learning models and large datasets.
To scale computation and data, these models are inevitably trained in a distributed manner in clusters of nodes, and their updates are aggregated before being applied to the model.
With data augmentation added to these settings, there is a critical need for robust and efficient aggregation systems.
We show that our approach significantly enhances the robustness of state-of-the-art Byzantine resilient aggregators.
arXiv Detail & Related papers (2023-02-12T06:38:30Z) - CloudAttention: Efficient Multi-Scale Attention Scheme For 3D Point Cloud Learning [81.85951026033787]
We adopt transformers in this work and incorporate them into a hierarchical framework for shape classification and for part and scene segmentation.
We also compute efficient, dynamic global cross-attentions by leveraging sampling and grouping at each iteration.
The proposed hierarchical model achieves state-of-the-art shape classification in mean accuracy and yields results on par with previous segmentation methods.
arXiv Detail & Related papers (2022-07-31T21:39:15Z) - Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation so can be integrated with neural networks seamlessly.
arXiv Detail & Related papers (2021-12-07T11:26:41Z) - Probabilistic partition of unity networks: clustering based deep approximation [0.0]
Partition of unity networks (POU-Nets) have been shown capable of realizing algebraic convergence rates for regression and solution of PDEs.
We enrich POU-Nets with a Gaussian noise model to obtain a probabilistic generalization amenable to gradient-based minimization of a maximum likelihood loss.
We provide benchmarks quantifying performance in high/low-dimensions, demonstrating that convergence rates depend only on the latent dimension of data within high-dimensional space.
arXiv Detail & Related papers (2021-07-07T08:02:00Z) - Real-Time Regression with Dividing Local Gaussian Processes [62.01822866877782]
Local Gaussian processes are a novel, computationally efficient modeling approach based on Gaussian process regression.
Due to an iterative, data-driven division of the input space, they achieve a sublinear computational complexity in the total number of training points in practice.
A numerical evaluation on real-world data sets shows their advantages over other state-of-the-art methods in terms of accuracy as well as prediction and update speed.
arXiv Detail & Related papers (2020-06-16T18:43:31Z) - FedPD: A Federated Learning Framework with Optimal Rates and Adaptivity to Non-IID Data [59.50904660420082]
Federated Learning (FL) has become a popular paradigm for learning from distributed data.
To effectively utilize data at different devices without moving them to the cloud, algorithms such as Federated Averaging (FedAvg) have adopted a "computation then aggregation" (CTA) model; a minimal sketch of this pattern follows the list below.
arXiv Detail & Related papers (2020-05-22T23:07:42Z)
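For the FedPD/FedAvg entry above, the following is a minimal, hypothetical NumPy sketch of the "computation then aggregation" (CTA) pattern: each client runs a few local gradient steps, then the resulting models are averaged with weights proportional to local data size. It illustrates generic FedAvg-style aggregation, not the FedPD algorithm itself, and all names are illustrative.

```python
# Hypothetical CTA sketch: local computation on each client, then aggregation.
# Names and the toy least-squares objective are illustrative, not from FedPD.
import numpy as np

def local_update(w, X, y, lr=0.1, steps=10):
    """Run a few local gradient steps of linear least-squares on one client."""
    w = w.copy()
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)   # computation happens locally
        w -= lr * grad
    return w

def fedavg_round(w_global, clients):
    """Aggregate: average client models weighted by their local data size."""
    sizes = np.array([len(y) for _, y in clients], dtype=float)
    local = [local_update(w_global, X, y) for X, y in clients]
    weights = sizes / sizes.sum()
    return sum(wi * li for wi, li in zip(weights, local))

# Example: two clients holding different (non-IID) local datasets
rng = np.random.default_rng(0)
w = np.zeros(3)
clients = [(rng.normal(size=(50, 3)), rng.normal(size=50)) for _ in range(2)]
for _ in range(20):
    w = fedavg_round(w, clients)
```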
This list is automatically generated from the titles and abstracts of the papers on this site.