Convolutional Neural Networks as Summary Statistics for Approximate
Bayesian Computation
- URL: http://arxiv.org/abs/2001.11760v5
- Date: Mon, 12 Apr 2021 10:23:42 GMT
- Title: Convolutional Neural Networks as Summary Statistics for Approximate
Bayesian Computation
- Authors: Mattias Åkesson, Prashant Singh, Fredrik Wrede, Andreas Hellander
- Abstract summary: This paper proposes a convolutional neural network architecture for automatically learning informative summary statistics of temporal responses.
We show that the proposed network can effectively circumvent the statistics selection problem of the preprocessing step for ABC inference.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Approximate Bayesian Computation is widely used in systems biology for
inferring parameters in stochastic gene regulatory network models. Its
performance hinges critically on the ability to summarize high-dimensional
system responses such as time series into a few informative, low-dimensional
summary statistics. The quality of those statistics acutely impacts the
accuracy of the inference task. Existing methods to select the best subset out
of a pool of candidate statistics do not scale well with large pools of several
tens to hundreds of candidate statistics. Since high quality statistics are
imperative for good performance, this becomes a serious bottleneck when
performing inference on complex and high-dimensional problems. This paper
proposes a convolutional neural network architecture for automatically learning
informative summary statistics of temporal responses. We show that the proposed
network can effectively circumvent the statistics selection problem of the
preprocessing step for ABC inference. The proposed approach is demonstrated on
two benchmark problems and one challenging inference problem: learning parameters
in a high-dimensional stochastic genetic oscillator. We also study the impact
of experimental design on network performance by comparing different data
richness and data acquisition strategies.
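The core idea can be sketched in a few lines (a minimal PyTorch sketch, not the authors' exact architecture: the layer widths and kernel sizes are illustrative, and `simulate(theta)` and `prior_sample()` are hypothetical user-supplied model hooks). A 1-D CNN is trained beforehand to regress the generating parameters from simulated time series, and its low-dimensional output then serves as the summary statistic inside plain ABC rejection sampling:

```python
import torch
import torch.nn as nn

class SummaryCNN(nn.Module):
    """1-D CNN mapping a simulated time series (species x time points) to a
    low-dimensional summary vector by regressing the generating parameters.
    Trained beforehand on simulated (series, theta) pairs; loop omitted."""
    def __init__(self, n_species: int, n_params: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(n_species, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),          # collapse the time axis
        )
        self.head = nn.Linear(64, n_params)   # summaries ~ predicted parameters

    def forward(self, x):                     # x: (batch, n_species, n_timepoints)
        return self.head(self.features(x).squeeze(-1))

def abc_rejection(net, y_obs, prior_sample, simulate, n_sims=10_000, eps=0.1):
    """Plain ABC rejection using the network output as summary statistic.
    `prior_sample()` and `simulate(theta)` are assumed model hooks."""
    net.eval()
    accepted = []
    with torch.no_grad():
        s_obs = net(y_obs.unsqueeze(0))
        for _ in range(n_sims):
            theta = prior_sample()
            s_sim = net(simulate(theta).unsqueeze(0))
            if torch.norm(s_sim - s_obs) < eps:
                accepted.append(theta)
    return accepted
```

Regressing the parameters is what makes the learned summaries informative: if the network predicts the parameters well, distances between its outputs behave like distances between posterior point estimates, which is exactly what ABC needs from a summary.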
Related papers
- Soft Random Sampling: A Theoretical and Empirical Analysis [59.719035355483875]
Soft random sampling (SRS) is a simple yet effective approach for efficient deep neural networks when dealing with massive data.
In each epoch, it selects a subset uniformly at random, with replacement, from the full data set.
It is shown to be a powerful strategy, delivering competitive performance at real-world industrial scale.
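A minimal NumPy sketch of one SRS epoch (the function and parameter names are mine, not from the paper):

```python
import numpy as np

def srs_epoch(n_examples: int, rate: float, rng):
    """One Soft Random Sampling epoch: draw rate * n indices uniformly at
    random WITH replacement, so some examples repeat and others are skipped."""
    return rng.integers(0, n_examples, size=int(rate * n_examples))

rng = np.random.default_rng(0)
for epoch in range(3):
    idx = srs_epoch(n_examples=50_000, rate=0.8, rng=rng)
    # train for one epoch on dataset[idx] instead of the full data set
    print(epoch, len(idx), len(np.unique(idx)))  # ~55% of examples appear
```

Sampling with replacement at rate 0.8 leaves about 1 - e^(-0.8) ≈ 55% of the data set represented in any one epoch, which is where the regularization-like effect comes from.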
arXiv Detail & Related papers (2023-11-21T17:03:21Z)
- Multiple Imputation via Generative Adversarial Network for High-dimensional Blockwise Missing Value Problems [6.123324869194195]
We propose Multiple Imputation via Generative Adversarial Network (MI-GAN), a deep learning-based (specifically, GAN-based) multiple imputation method.
MI-GAN shows strong performance matching existing state-of-the-art imputation methods on high-dimensional datasets.
In particular, MI-GAN significantly outperforms other imputation methods in the sense of statistical inference and computational speed.
arXiv Detail & Related papers (2021-12-21T20:19:37Z)
- Differential privacy and robust statistics in high dimensions [49.50869296871643]
High-dimensional Propose-Test-Release (HPTR) builds upon three crucial components: the exponential mechanism, robust statistics, and the Propose-Test-Release mechanism.
We show that HPTR nearly achieves the optimal sample complexity under several scenarios studied in the literature.
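The Propose-Test-Release pattern itself can be illustrated on a toy robust-mean example. This is a loose caricature of the pattern only, not HPTR: the statistic, thresholds, and noise scales below are invented for illustration and carry no actual privacy guarantee.

```python
import numpy as np

rng = np.random.default_rng(0)

def propose_test_release(x, epsilon, delta, beta=0.5):
    """Toy PTR pattern (illustrative only, NOT a vetted DP mechanism):
    propose a robust statistic, privately test that the data are well
    concentrated, and only then release a noisy answer."""
    # Propose: a robust statistic (interquartile mean) with low sensitivity
    lo, hi = np.quantile(x, [0.25, 0.75])
    proposal = x[(x >= lo) & (x <= hi)].mean()
    # Test: noisy check that the interquartile range is small
    iqr_noisy = (hi - lo) + rng.laplace(scale=1.0 / epsilon)
    if iqr_noisy > np.log(1.0 / delta) / epsilon + beta:
        return None                      # refuse to answer ("no release")
    # Release: add noise scaled to the (tested) low sensitivity
    return proposal + rng.laplace(scale=beta / (len(x) * epsilon))
```

The point of the pattern is that the release step may assume low sensitivity precisely because the (private) test step vetoes data sets where that assumption fails.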
arXiv Detail & Related papers (2021-11-12T06:36:40Z)
- Convolutional generative adversarial imputation networks for spatio-temporal missing data in storm surge simulations [86.5302150777089]
Generative Adversarial Imputation Nets (GAIN) and GAN-based techniques have attracted attention as unsupervised machine learning methods.
We name our proposed method Convolutional Generative Adversarial Imputation Nets (Conv-GAIN).
arXiv Detail & Related papers (2021-11-03T03:50:48Z)
- Selection of Summary Statistics for Network Model Choice with Approximate Bayesian Computation [1.8884278918443564]
We study the utility of cost-based filter selection methods to account for different summary costs during the selection process.
Our findings show that computationally inexpensive summary statistics can be efficiently selected with minimal impact on classification accuracy.
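A generic cost-aware filter can be sketched as a greedy relevance-per-cost ranking under a compute budget (a stand-in for the idea, not the paper's exact procedure; the relevance scores would come from some filter criterion such as mutual information with the model class):

```python
import numpy as np

def cost_based_filter(relevance, cost, budget):
    """Greedy cost-aware filter selection (illustrative sketch): keep the
    statistics with the best relevance-per-cost ratio until the budget
    for computing summaries is exhausted."""
    order = np.argsort(-relevance / cost)     # best ratio first
    chosen, spent = [], 0.0
    for j in order:
        if spent + cost[j] <= budget:
            chosen.append(int(j))
            spent += cost[j]
    return chosen

# e.g. 6 candidate summaries with filter-style relevance scores and costs
relevance = np.array([0.9, 0.8, 0.6, 0.55, 0.2, 0.1])
cost      = np.array([5.0, 1.0, 1.0, 0.5, 0.2, 0.1])
print(cost_based_filter(relevance, cost, budget=3.0))  # -> [3, 4, 5, 1, 2]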
arXiv Detail & Related papers (2021-01-19T18:21:06Z)
- Straggler-Resilient Federated Learning: Leveraging the Interplay Between Statistical Accuracy and System Heterogeneity [57.275753974812666]
Federated learning involves learning from data samples distributed across a network of clients while the data remains local.
In this paper, we propose a novel straggler-resilient federated learning method that incorporates statistical characteristics of the clients' data to adaptively select the clients in order to speed up the learning procedure.
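The adaptive-selection idea can be caricatured as growing the participating client set from fastest to slowest as training progresses (the rule below is an illustrative guess at the mechanism, not the paper's actual criterion):

```python
import numpy as np

def select_clients(speeds, round_idx, total_rounds):
    """Illustrative straggler-aware selection (not the paper's exact rule):
    start with the fastest clients and enlarge the set as training
    progresses, when higher statistical accuracy justifies the extra wait."""
    frac = (round_idx + 1) / total_rounds            # grow participation
    k = max(1, int(np.ceil(frac * len(speeds))))
    return np.argsort(-np.asarray(speeds))[:k]       # indices of k fastest

print(select_clients(speeds=[3.0, 9.0, 1.0, 5.0], round_idx=0, total_rounds=4))
```

Early rounds only need coarse statistical accuracy, so waiting for slow clients buys little; as the model converges, the larger (slower) client pool pays off.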
arXiv Detail & Related papers (2020-12-28T19:21:14Z)
- High Dimensional Level Set Estimation with Bayesian Neural Network [58.684954492439424]
This paper proposes novel methods to solve the high dimensional Level Set Estimation problems using Bayesian Neural Networks.
For each problem, we derive a corresponding theoretical information-based acquisition function to sample the data points.
Numerical experiments on both synthetic and real-world datasets show that our proposed method can achieve better results compared to existing state-of-the-art approaches.
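A classic stand-in for such an acquisition function is the straddle heuristic, which scores points by predictive uncertainty near the target level (shown here instead of the paper's information-based function; `mu` and `sigma` are assumed to come from a Bayesian NN's predictive posterior, e.g. MC-dropout or an ensemble):

```python
import numpy as np

def straddle_acquisition(mu, sigma, level, kappa=1.96):
    """Straddle heuristic for level-set estimation: favour points whose
    posterior is both uncertain and close to the target level."""
    return kappa * sigma - np.abs(mu - level)

# predictive mean/std at 4 candidate points (would come from a Bayesian NN)
mu = np.array([0.2, 0.9, 1.1, 2.0])
sigma = np.array([0.5, 0.3, 0.4, 0.1])
next_x = np.argmax(straddle_acquisition(mu, sigma, level=1.0))  # picks index 2
```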
arXiv Detail & Related papers (2020-12-17T23:21:53Z)
- Statistical model-based evaluation of neural networks [74.10854783437351]
We develop an experimental setup for the evaluation of neural networks (NNs).
The setup helps to benchmark a set of NNs vis-a-vis minimum-mean-square-error (MMSE) performance bounds.
This allows us to test the effects of training data size, data dimension, data geometry, noise, and mismatch between training and testing conditions.
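For a concrete instance of such a benchmark: in a linear-Gaussian model the MMSE estimator is known in closed form, so any NN's test MSE can be compared against an exact floor (the dimensions and noise level below are arbitrary choices, not the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(0)
d, m, sigma2 = 8, 12, 0.1
H = rng.normal(size=(m, d)) / np.sqrt(m)

# For y = Hx + n with Gaussian x ~ N(0, I) and n ~ N(0, sigma2 * I), the
# MMSE estimator is linear and known in closed form.
W = H.T @ np.linalg.inv(H @ H.T + sigma2 * np.eye(m))   # MMSE filter
mmse = np.trace(np.eye(d) - W @ H) / d                  # per-coordinate MMSE

x = rng.normal(size=(10_000, d))
y = x @ H.T + np.sqrt(sigma2) * rng.normal(size=(10_000, m))
empirical = np.mean((x - y @ W.T) ** 2)
print(mmse, empirical)  # a trained NN's test MSE is benchmarked against this
```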
arXiv Detail & Related papers (2020-11-18T00:33:24Z)
- Active Importance Sampling for Variational Objectives Dominated by Rare Events: Consequences for Optimization and Generalization [12.617078020344618]
We introduce an approach that combines rare-event sampling techniques with neural network training to optimize objective functions dominated by rare events.
We show that importance sampling reduces the variance of the solution to a learning problem, suggesting benefits for generalization.
Our numerical experiments demonstrate that we can successfully learn even with the compounding difficulties of high-dimensional and rare data.
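The variance-reduction effect is easy to see on a textbook rare-event example: estimating a Gaussian tail probability by shifting the sampling distribution into the rare region (a generic illustration of importance sampling, not the paper's neural-network setting; assumes SciPy is available):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
a, n = 4.0, 100_000            # estimate P(X > 4) for X ~ N(0, 1)

# Plain Monte Carlo: almost no samples land in the rare region.
x = rng.normal(size=n)
mc = np.mean(x > a)

# Importance sampling: draw from N(a, 1), reweight by the likelihood ratio.
z = rng.normal(loc=a, size=n)
w = norm.pdf(z) / norm.pdf(z, loc=a)
is_est = np.mean((z > a) * w)

print(mc, is_est, norm.sf(a))  # IS is close to the true tail prob ~3.17e-5
```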
arXiv Detail & Related papers (2020-08-11T23:38:09Z)
- Latent Network Structure Learning from High Dimensional Multivariate Point Processes [5.079425170410857]
We propose a new class of nonstationary Hawkes processes to characterize the complex processes underlying the observed data.
We estimate the latent network structure using an efficient sparse least squares estimation approach.
We demonstrate the efficacy of our proposed method through simulation studies and an application to a neuron spike train data set.
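The sparse least-squares idea can be sketched as an l1-penalized regression of each node's event counts on all nodes' lagged counts (a simplification using scikit-learn's Lasso, not the paper's nonstationary-Hawkes estimator):

```python
import numpy as np
from sklearn.linear_model import Lasso

def estimate_network(counts, alpha=0.05):
    """Sparse least-squares network recovery (illustrative sketch): regress
    each node's event counts on every node's counts in the previous time
    bin; nonzero coefficients indicate putative edges."""
    T, p = counts.shape
    X, Y = counts[:-1], counts[1:]
    A = np.zeros((p, p))
    for j in range(p):
        A[j] = Lasso(alpha=alpha, fit_intercept=True).fit(X, Y[:, j]).coef_
    return A  # A[j, k] != 0 suggests an edge k -> j

rng = np.random.default_rng(0)
counts = rng.poisson(1.0, size=(500, 5)).astype(float)  # binned spike counts
A_hat = estimate_network(counts)
```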
arXiv Detail & Related papers (2020-04-07T17:48:01Z)
- High Dimensional Data Enrichment: Interpretable, Fast, and Data-Efficient [38.40316295019222]
We introduce an estimator for the problem of multiple connected linear regressions known as Data Enrichment/Sharing.
We show that the recovery of the common parameter benefits from all of the pooled samples.
Overall, we present a first thorough statistical and computational analysis of inference in the data-sharing model.
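A rough sketch of the data-sharing model: each group k observes y_k = X_k (beta + gamma_k) + noise, with a common beta and per-group deviations gamma_k. The alternating least-squares fit below is a stand-in for the paper's estimator, shown only to make the common/individual split concrete:

```python
import numpy as np

def data_enrichment_fit(Xs, ys, lam=1.0, n_iter=50):
    """Alternating least squares for y_k = X_k (beta + gamma_k) + noise
    (illustrative, not the paper's estimator): the common beta is fit on
    all pooled samples; the gamma_k are ridge-shrunk per-group deviations."""
    d = Xs[0].shape[1]
    beta = np.zeros(d)
    gammas = [np.zeros(d) for _ in Xs]
    for _ in range(n_iter):
        Xp = np.vstack(Xs)                               # pool all groups
        rp = np.concatenate([y - X @ g for X, y, g in zip(Xs, ys, gammas)])
        beta = np.linalg.lstsq(Xp, rp, rcond=None)[0]
        for k, (X, y) in enumerate(zip(Xs, ys)):
            r = y - X @ beta
            gammas[k] = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ r)
    return beta, gammas

rng = np.random.default_rng(0)
beta_true = np.array([1.0, -2.0, 0.5])
Xs = [rng.normal(size=(200, 3)) for _ in range(4)]
ys = [X @ (beta_true + 0.3 * rng.normal(size=3)) + 0.1 * rng.normal(size=200)
      for X in Xs]
beta_hat, _ = data_enrichment_fit(Xs, ys)
```

Because the beta update uses every group's residuals, the common parameter is effectively estimated from all pooled samples, which is the "enrichment" benefit the abstract refers to.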
arXiv Detail & Related papers (2018-06-11T15:15:44Z)