Selection of Summary Statistics for Network Model Choice with
Approximate Bayesian Computation
- URL: http://arxiv.org/abs/2101.07766v1
- Date: Tue, 19 Jan 2021 18:21:06 GMT
- Title: Selection of Summary Statistics for Network Model Choice with
Approximate Bayesian Computation
- Authors: Louis Raynal and Jukka-Pekka Onnela
- Abstract summary: We study the utility of cost-based filter selection methods to account for different summary costs during the selection process.
Our findings show that computationally inexpensive summary statistics can be efficiently selected with minimal impact on classification accuracy.
- Score: 1.8884278918443564
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Approximate Bayesian Computation (ABC) now serves as one of the major
strategies to perform model choice and parameter inference on models with
intractable likelihoods. An essential component of ABC involves comparing a
large amount of simulated data with the observed data through summary
statistics. To avoid the curse of dimensionality, summary statistic selection
is of prime importance, and becomes even more critical when applying ABC to
mechanistic network models. Indeed, while many summary statistics can be used
to encode network structures, their computational complexity can be highly
variable. For large networks, computation of summary statistics can quickly
create a bottleneck, making the use of ABC difficult. To reduce this
computational burden and make the analysis of mechanistic network models more
practical, we investigated two questions in a model choice framework. First, we
studied the utility of cost-based filter selection methods to account for
different summary costs during the selection process. Second, we performed
selection using networks generated with a smaller number of nodes to reduce the
time required for the selection step. Our findings show that computationally
inexpensive summary statistics can be efficiently selected with minimal impact
on classification accuracy. Furthermore, we found that networks with a smaller
number of nodes can only be employed to eliminate a moderate number of
summaries. While this latter finding is network specific, the former is general
and can be adapted to any ABC application.
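To make the mechanics concrete, here is a minimal sketch of rejection-ABC model choice between two mechanistic network models, written in Python against the public numpy and networkx APIs. The model pair (Erdős-Rényi versus Barabási-Albert), the priors, the three summary statistics, and the 5% acceptance rule are illustrative assumptions, not the setup used in the paper.

```python
# Minimal rejection-ABC model choice sketch (illustrative assumptions, not
# the paper's pipeline): simulate networks from each candidate model, reduce
# each network to a few summary statistics, and keep the simulations whose
# summaries fall closest to the observed ones.
import numpy as np
import networkx as nx

N_NODES = 100           # network size, chosen for illustration
N_SIMS = 500            # simulations per candidate model
ACCEPT_FRACTION = 0.05  # keep the closest 5% of simulations

def summaries(g):
    # Deliberately cheap statistics; expensive ones (e.g., diameter) are
    # exactly what cost-aware selection tries to avoid computing.
    return np.array([
        nx.density(g),
        nx.average_clustering(g),
        nx.degree_assortativity_coefficient(g),
    ])

def simulate(model, rng):
    seed = int(rng.integers(10**9))
    if model == 0:   # Erdos-Renyi, edge probability p ~ Uniform(0.01, 0.1)
        return nx.gnp_random_graph(N_NODES, rng.uniform(0.01, 0.1), seed=seed)
    else:            # Barabasi-Albert, attachment m ~ Uniform{1, ..., 5}
        return nx.barabasi_albert_graph(N_NODES, int(rng.integers(1, 6)), seed=seed)

rng = np.random.default_rng(0)
observed = nx.barabasi_albert_graph(N_NODES, 3, seed=42)  # stand-in for real data
s_obs = summaries(observed)

labels, stats = [], []
for model in (0, 1):
    for _ in range(N_SIMS):
        labels.append(model)
        stats.append(summaries(simulate(model, rng)))
labels, stats = np.array(labels), np.array(stats)

# Standardize each statistic, then accept the simulations nearest the data;
# a model's posterior probability is estimated by its share of the accepts.
dist = np.linalg.norm((stats - s_obs) / stats.std(axis=0), axis=1)
keep = dist <= np.quantile(dist, ACCEPT_FRACTION)
for model in (0, 1):
    print(f"P(model {model} | data) ~ {np.mean(labels[keep] == model):.3f}")
```

The curse of dimensionality mentioned in the abstract shows up in the distance computation: every added summary statistic inflates the distances between simulated and observed data, so a small, informative set of summaries is what keeps the acceptance step meaningful.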
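The paper's first question, cost-based filter selection, can be sketched in the same spirit. The ratio rule below (a univariate class-separation utility divided by measured wall-clock cost) is a hypothetical stand-in for the filter methods the paper actually evaluates; the candidate statistics and graph samples are likewise assumptions for illustration.

```python
# Hypothetical cost-aware filter: score each candidate summary statistic by a
# simple class-separation utility per unit of measured computation time, then
# keep the top k. This only illustrates the utility/cost trade-off; it is not
# the authors' selection procedure.
import time
import numpy as np
import networkx as nx

def cost_aware_filter(stat_fns, graphs, labels, k=2):
    labels = np.asarray(labels)
    ranked = []
    for name, fn in stat_fns.items():
        start = time.perf_counter()
        vals = np.array([fn(g) for g in graphs])
        cost = time.perf_counter() - start       # total cost over all graphs
        v0, v1 = vals[labels == 0], vals[labels == 1]
        utility = abs(v0.mean() - v1.mean()) / (vals.std() + 1e-12)
        ranked.append((utility / cost, name))
    ranked.sort(reverse=True)
    return [name for _, name in ranked[:k]]

# Candidate statistics with very different costs (an assumed set).
stat_fns = {
    "density": nx.density,                        # cheap
    "transitivity": nx.transitivity,              # moderate
    "avg_clustering": nx.average_clustering,      # moderate
    "avg_shortest_path": lambda g: nx.average_shortest_path_length(
        g.subgraph(max(nx.connected_components(g), key=len))),  # expensive
}

graphs = [nx.gnp_random_graph(100, 0.05, seed=i) for i in range(50)] + \
         [nx.barabasi_albert_graph(100, 3, seed=i) for i in range(50)]
labels = [0] * 50 + [1] * 50
print(cost_aware_filter(stat_fns, graphs, labels, k=2))
```

Statistics surviving such a filter would then feed the ABC comparison step above; the paper's finding is that this kind of cost screening retains computationally inexpensive summaries with minimal impact on classification accuracy.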
Related papers
- Compute-Constrained Data Selection [77.06528009072967]
We formalize the problem of data selection with a cost-aware utility function, and model the problem as trading off initial-selection cost for training gain.
We run a comprehensive sweep of experiments across multiple tasks, varying compute budget by scaling finetuning tokens, model sizes, and data selection compute.
arXiv Detail & Related papers (2024-10-21T17:11:21Z) - Adapt-$\infty$: Scalable Lifelong Multimodal Instruction Tuning via Dynamic Data Selection [89.42023974249122]
Adapt-$infty$ is a new multi-way and adaptive data selection approach for Lifelong Instruction Tuning.
We construct pseudo-skill clusters by grouping gradient-based sample vectors.
We select the best-performing data selector for each skill cluster from a pool of selector experts.
arXiv Detail & Related papers (2024-10-14T15:48:09Z) - Sample Complexity of Algorithm Selection Using Neural Networks and Its Applications to Branch-and-Cut [1.4624458429745086]
We build upon recent work in this line of research by considering the setup where, instead of selecting a single algorithm that has the best performance, we allow the possibility of selecting an algorithm based on the instance to be solved.
In particular, given a representative sample of instances, we learn a neural network that maps an instance of the problem to the most appropriate algorithm for that instance.
In other words, the neural network will take as input a mixed-integer optimization instance and output a decision that will result in a small branch-and-cut tree for that instance.
arXiv Detail & Related papers (2024-02-04T03:03:27Z) - Minimally Supervised Learning using Topological Projections in
Self-Organizing Maps [55.31182147885694]
We introduce a semi-supervised learning approach based on topological projections in self-organizing maps (SOMs).
Our proposed method first trains SOMs on unlabeled data; a minimal number of available labeled data points are then assigned to key best matching units (BMUs).
Our results indicate that the proposed minimally supervised model significantly outperforms traditional regression techniques.
arXiv Detail & Related papers (2024-01-12T22:51:48Z) - Towards Free Data Selection with General-Purpose Models [71.92151210413374]
A desirable data selection algorithm can efficiently choose the most informative samples to maximize the utility of limited annotation budgets.
Current approaches, represented by active learning methods, typically follow a cumbersome pipeline that iterates the time-consuming model training and batch data selection repeatedly.
FreeSel bypasses the heavy batch selection process, achieving a significant improvement in efficiency and being 530x faster than existing active learning methods.
arXiv Detail & Related papers (2023-09-29T15:50:14Z) - MILO: Model-Agnostic Subset Selection Framework for Efficient Model
Training and Tuning [68.12870241637636]
We propose MILO, a model-agnostic subset selection framework that decouples the subset selection from model training.
Our empirical results indicate that MILO can train models $3\times - 10\times$ faster and tune hyperparameters $20\times - 75\times$ faster than full-dataset training or tuning, without compromising performance.
arXiv Detail & Related papers (2023-01-30T20:59:30Z) - A Statistical-Modelling Approach to Feedforward Neural Network Model Selection [0.8287206589886881]
Feedforward neural networks (FNNs) can be viewed as non-linear regression models.
A novel model selection method is proposed using the Bayesian information criterion (BIC) for FNNs.
The choice of BIC over out-of-sample performance leads to an increased probability of recovering the true model.
arXiv Detail & Related papers (2022-07-09T11:07:04Z) - Approximate Bayesian Computation with Domain Expert in the Loop [13.801835670003008]
We introduce an active learning method for ABC statistics selection which reduces the domain expert's work considerably.
By involving the experts, we are able to handle misspecified models, unlike the existing dimension reduction methods.
Empirical results show better posterior estimates than existing methods when the simulation budget is limited.
arXiv Detail & Related papers (2022-01-28T12:58:51Z) - A Markov Decision Process Approach to Active Meta Learning [24.50189361694407]
In supervised learning, we fit a single statistical model to a given data set, assuming that the data is associated with a singular task.
In meta-learning, the data is associated with numerous tasks, and we seek a model that may perform well on all tasks simultaneously.
arXiv Detail & Related papers (2020-09-10T15:45:34Z) - The Devil is in Classification: A Simple Framework for Long-tail Object
Detection and Instance Segmentation [93.17367076148348]
We investigate performance drop of the state-of-the-art two-stage instance segmentation model Mask R-CNN on the recent long-tail LVIS dataset.
We unveil that a major cause is the inaccurate classification of object proposals.
We propose a simple calibration framework to more effectively alleviate classification head bias with a bi-level class balanced sampling approach.
arXiv Detail & Related papers (2020-07-23T12:49:07Z) - Convolutional Neural Networks as Summary Statistics for Approximate
Bayesian Computation [0.0]
This paper proposes a convolutional neural network architecture for automatically learning informative summary statistics of temporal responses.
We show that the proposed network can effectively circumvent the statistics selection problem of the preprocessing step for ABC inference.
arXiv Detail & Related papers (2020-01-31T10:46:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.