Related papers: Probabilistic partition of unity networks for high-dimensional regression problems

Probabilistic partition of unity networks for high-dimensional regression problems

URL: http://arxiv.org/abs/2210.02694v2
Date: Sun, 11 Jun 2023 06:16:34 GMT
Title: Probabilistic partition of unity networks for high-dimensional regression problems
Authors: Tiffany Fan, Nathaniel Trask, Marta D'Elia, Eric Darve
Abstract summary: We explore the partition of unity network (PPOU-Net) model in the context of high-dimensional regression problems. We propose a general framework focusing on adaptive dimensionality reduction. The PPOU-Nets consistently outperform the baseline fully-connected neural networks of comparable sizes in numerical experiments.
Score: 1.0227479910430863
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We explore the probabilistic partition of unity network (PPOU-Net) model in the context of high-dimensional regression problems and propose a general framework focusing on adaptive dimensionality reduction. With the proposed framework, the target function is approximated by a mixture of experts model on a low-dimensional manifold, where each cluster is associated with a local fixed-degree polynomial. We present a training strategy that leverages the expectation maximization (EM) algorithm. During the training, we alternate between (i) applying gradient descent to update the DNN coefficients; and (ii) using closed-form formulae derived from the EM algorithm to update the mixture of experts model parameters. Under the probabilistic formulation, step (ii) admits the form of embarrassingly parallelizable weighted least-squares solves. The PPOU-Nets consistently outperform the baseline fully-connected neural networks of comparable sizes in numerical experiments of various data dimensions. We also explore the proposed model in applications of quantum computing, where the PPOU-Nets act as surrogate models for cost landscapes associated with variational quantum circuits.

Related papers

Self-Supervised Coarsening of Unstructured Grid with Automatic Differentiation [55.88862563823878]
In this work, we present an original algorithm to coarsen an unstructured grid based on the concepts of differentiable physics.<n>We demonstrate performance of the algorithm on two PDEs: a linear equation which governs slightly compressible fluid flow in porous media and the wave equation.<n>Our results show that in the considered scenarios, we reduced the number of grid points up to 10 times while preserving the modeled variable dynamics in the points of interest.
arXiv Detail & Related papers (2025-07-24T11:02:13Z)
Generalized Factor Neural Network Model for High-dimensional Regression [50.554377879576066]
We tackle the challenges of modeling high-dimensional data sets with latent low-dimensional structures hidden within complex, non-linear, and noisy relationships. Our approach enables a seamless integration of concepts from non-parametric regression, factor models, and neural networks for high-dimensional regression.
arXiv Detail & Related papers (2025-02-16T23:13:55Z)
Preconditioned Inexact Stochastic ADMM for Deep Model [35.37705488695026]
This paper develops an algorithm, PISA, which enables scalable parallel computing and supports various second-moment schemes. Grounded in rigorous theoretical guarantees, the algorithm converges under the sole assumption of Lipschitz of the gradient. Comprehensive experimental evaluations for or fine-tuning diverse FMs, including vision models, large language models, reinforcement learning models, generative adversarial networks, and recurrent neural networks, demonstrate its superior numerical performance compared to various state-of-the-art Directions.
arXiv Detail & Related papers (2025-02-15T12:28:51Z)
The Convex Landscape of Neural Networks: Characterizing Global Optima and Stationary Points via Lasso Models [75.33431791218302]
Deep Neural Network Network (DNN) models are used for programming purposes. In this paper we examine the use of convex neural recovery models. We show that all the stationary non-dimensional objective objective can be characterized as the standard a global subsampled convex solvers program. We also show that all the stationary non-dimensional objective objective can be characterized as the standard a global subsampled convex solvers program.
arXiv Detail & Related papers (2023-12-19T23:04:56Z)
Deep Learning-based surrogate models for parametrized PDEs: handling geometric variability through graph neural networks [0.0]
This work explores the potential usage of graph neural networks (GNNs) for the simulation of time-dependent PDEs. We propose a systematic strategy to build surrogate models based on a data-driven time-stepping scheme. We show that GNNs can provide a valid alternative to traditional surrogate models in terms of computational efficiency and generalization to new scenarios.
arXiv Detail & Related papers (2023-08-03T08:14:28Z)
Distributed Bayesian Learning of Dynamic States [65.7870637855531]
The proposed algorithm is a distributed Bayesian filtering task for finite-state hidden Markov models. It can be used for sequential state estimation, as well as for modeling opinion formation over social networks under dynamic environments.
arXiv Detail & Related papers (2022-12-05T19:40:17Z)
Probabilistic partition of unity networks: clustering based deep approximation [0.0]
Partition of unity networks (POU-Nets) have been shown capable of realizing algebraic convergence rates for regression and solution of PDEs. We enrich POU-Nets with a Gaussian noise model to obtain a probabilistic generalization amenable to gradient-based generalizations of a maximum likelihood loss. We provide benchmarks quantifying performance in high/low-dimensions, demonstrating that convergence rates depend only on the latent dimension of data within high-dimensional space.
arXiv Detail & Related papers (2021-07-07T08:02:00Z)
Post-mortem on a deep learning contest: a Simpson's paradox and the complementary roles of scale metrics versus shape metrics [61.49826776409194]
We analyze a corpus of models made publicly-available for a contest to predict the generalization accuracy of neural network (NN) models. We identify what amounts to a Simpson's paradox: where "scale" metrics perform well overall but perform poorly on sub partitions of the data. We present two novel shape metrics, one data-independent, and the other data-dependent, which can predict trends in the test accuracy of a series of NNs.
arXiv Detail & Related papers (2021-06-01T19:19:49Z)
Optimal Transport Based Refinement of Physics-Informed Neural Networks [0.0]
We propose a refinement strategy to the well-known Physics-Informed Neural Networks (PINNs) for solving partial differential equations (PDEs) based on the concept of Optimal Transport (OT) PINNs solvers have been found to suffer from a host of issues: spectral bias in fully-connected pathologies, unstable gradient, and difficulties with convergence and accuracy. We present a novel training strategy for solving the Fokker-Planck-Kolmogorov Equation (FPKE) using OT-based sampling to supplement the existing PINNs framework.
arXiv Detail & Related papers (2021-05-26T02:51:20Z)
Neural Mixture Distributional Regression [0.9023847175654603]
We present a holistic framework to estimate finite mixtures of distributional regressions defined by flexible additive predictors. Our framework is able to handle a large number of mixtures of potentially different distributions in high-dimensional settings.
arXiv Detail & Related papers (2020-10-14T09:00:16Z)
Identification of Probability weighted ARX models with arbitrary domains [75.91002178647165]
PieceWise Affine models guarantees universal approximation, local linearity and equivalence to other classes of hybrid system. In this work, we focus on the identification of PieceWise Auto Regressive with eXogenous input models with arbitrary regions (NPWARX) The architecture is conceived following the Mixture of Expert concept, developed within the machine learning field.
arXiv Detail & Related papers (2020-09-29T12:50:33Z)
Model Fusion with Kullback--Leibler Divergence [58.20269014662046]
We propose a method to fuse posterior distributions learned from heterogeneous datasets. Our algorithm relies on a mean field assumption for both the fused model and the individual dataset posteriors.
arXiv Detail & Related papers (2020-07-13T03:27:45Z)
Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized Structural equation models (SEMs) We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using a gradient descent. For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z)
Model Reduction and Neural Networks for Parametric PDEs [9.405458160620533]
We develop a framework for data-driven approximation of input-output maps between infinite-dimensional spaces. The proposed approach is motivated by the recent successes of neural networks and deep learning. For a class of input-output maps, and suitably chosen probability measures on the inputs, we prove convergence of the proposed approximation methodology.
arXiv Detail & Related papers (2020-05-07T00:09:27Z)

This list is automatically generated from the titles and abstracts of the papers in this site.