Receptive Field Size Optimization with Continuous Time Pooling
- URL: http://arxiv.org/abs/2011.00869v2
- Date: Fri, 6 Nov 2020 21:49:42 GMT
- Title: Receptive Field Size Optimization with Continuous Time Pooling
- Authors: Dóra Babicz, Soma Kontár, Márk Pető, András Fülöp,
Gergely Szabó, András Horváth
- Abstract summary: We present an altered version of the most commonly applied method, maximum pooling, in which pooling is, in theory, substituted by a continuous-time differential equation.
We evaluate the effect of continuous pooling on accuracy and computational cost using commonly applied network architectures and datasets.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The pooling operation is a cornerstone element of convolutional neural
networks. These elements generate receptive fields for neurons, in which local
perturbations should have minimal effect on the output activations, increasing
the robustness and invariance of the network. In this paper we present an
altered version of the most commonly applied method, maximum pooling, in which
pooling is, in theory, substituted by a continuous-time differential equation
that generates a location-sensitive pooling operation, more similar to
biological receptive fields. We show how this continuous method can be
approximated numerically using discrete operations that are ideally suited to
GPU execution. In our approach the kernel size is replaced by a diffusion
strength, a continuous-valued parameter, which can therefore be optimized by
gradient descent algorithms. We evaluate the effect of continuous pooling on
accuracy and computational cost using commonly applied network architectures
and datasets.
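The abstract does not spell out the discretization, so the sketch below is only a minimal, hypothetical illustration of the idea: it assumes the continuous process is a plain linear diffusion (heat) equation, integrates it with a few explicit Euler steps using a depthwise 3x3 Laplacian, and lets a learnable diffusion strength take over the role of the kernel size before the map is downsampled by a stride. The class name `DiffusionPool2d`, the Euler scheme, and the stride-based downsampling are assumptions made here for illustration; the authors' actual operator replaces maximum pooling and is location sensitive, so it may differ substantially.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DiffusionPool2d(nn.Module):
    """Hypothetical continuous-time pooling layer (not the authors' exact operator).

    The feature map is treated as the initial condition of a diffusion equation,
    du/dt = c * laplacian(u), integrated with a few explicit Euler steps and then
    downsampled. The diffusion strength c is a continuous, learnable parameter,
    so it can be tuned by gradient descent instead of picking a discrete kernel size.
    """

    def __init__(self, stride: int = 2, steps: int = 4):
        super().__init__()
        self.stride = stride
        self.steps = steps
        # Unconstrained parameter; mapped to (0, 0.25) in forward() so that the
        # explicit Euler scheme for the heat equation stays stable.
        self.raw_strength = nn.Parameter(torch.zeros(1))
        # Discrete 5-point Laplacian, applied depthwise to every channel.
        lap = torch.tensor([[0.0,  1.0, 0.0],
                            [1.0, -4.0, 1.0],
                            [0.0,  1.0, 0.0]])
        self.register_buffer("laplacian", lap.view(1, 1, 3, 3))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        c = 0.25 * torch.sigmoid(self.raw_strength)      # learnable diffusion strength
        channels = x.shape[1]
        kernel = self.laplacian.expand(channels, 1, 3, 3).contiguous()
        u = x
        for _ in range(self.steps):                      # explicit Euler integration
            lap_u = F.conv2d(F.pad(u, (1, 1, 1, 1), mode="replicate"),
                             kernel, groups=channels)
            u = u + c * lap_u
        # The stride plays the role of the pooling grid on the diffused map.
        return u[:, :, ::self.stride, ::self.stride]
```

Dropping such a layer in place of `nn.MaxPool2d(2)` in a standard architecture gives, roughly, the kind of accuracy-versus-cost comparison the abstract describes; the sigmoid mapping keeps the strength below 0.25, the stability bound of the explicit scheme for this Laplacian.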
Related papers
- Tensor network renormalization: application to dynamic correlation functions and non-hermitian systems [0.0]
We present the implementation of the Loop-TNR algorithm, which allows for the computation of dynamical correlation functions.
We highlight that the Loop-TNR algorithm can also be applied to investigate critical properties of non-Hermitian systems.
arXiv Detail & Related papers (2023-11-30T18:34:32Z)
- Randomized Polar Codes for Anytime Distributed Machine Learning [66.46612460837147]
We present a novel distributed computing framework that is robust to slow compute nodes, and is capable of both approximate and exact computation of linear operations.
We propose a sequential decoding algorithm designed to handle real valued data while maintaining low computational complexity for recovery.
We demonstrate the potential applications of this framework in various contexts, such as large-scale matrix multiplication and black-box optimization.
arXiv Detail & Related papers (2023-09-01T18:02:04Z)
- Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z)
- Efficient Dataset Distillation Using Random Feature Approximation [109.07737733329019]
We propose a novel algorithm that uses a random feature approximation (RFA) of the Neural Network Gaussian Process (NNGP) kernel.
Our algorithm provides at least a 100-fold speedup over KIP and can run on a single GPU.
Our new method, termed RFA Distillation (RFAD), performs competitively with KIP and other dataset condensation algorithms in accuracy over a range of large-scale datasets.
arXiv Detail & Related papers (2022-10-21T15:56:13Z)
- Structured Optimal Variational Inference for Dynamic Latent Space Models [16.531262817315696]
We consider a latent space model for dynamic networks, where our objective is to estimate the pairwise inner products plus the intercept of the latent positions.
To balance posterior inference and computational scalability, we consider a structured mean-field variational inference framework.
arXiv Detail & Related papers (2022-09-29T22:10:42Z)
- Scalable Optimal Transport in High Dimensions for Graph Distances, Embedding Alignment, and More [7.484063729015126]
We propose two effective log-linear time approximations of the cost matrix for optimal transport.
These approximations enable general log-linear time algorithms for entropy-regularized OT that perform well even for complex, high-dimensional spaces.
For graph distance regression we propose the graph transport network (GTN), which combines graph neural networks (GNNs) with enhanced Sinkhorn.
arXiv Detail & Related papers (2021-07-14T17:40:08Z)
- A Convergence Theory Towards Practical Over-parameterized Deep Neural Networks [56.084798078072396]
We take a step towards closing the gap between theory and practice by significantly improving the known theoretical bounds on both the network width and the convergence time.
We show that convergence to a global minimum is guaranteed for networks whose width is quadratic in the sample size and linear in their depth, in time logarithmic in both.
Our analysis and convergence bounds are derived via the construction of a surrogate network with fixed activation patterns that can be transformed at any time to an equivalent ReLU network of a reasonable size.
arXiv Detail & Related papers (2021-01-12T00:40:45Z)
- Self Normalizing Flows [65.73510214694987]
We propose a flexible framework for training normalizing flows by replacing expensive terms in the gradient by learned approximate inverses at each layer.
This reduces the computational complexity of each layer's exact update from $\mathcal{O}(D^3)$ to $\mathcal{O}(D^2)$.
We show experimentally that such models are remarkably stable and optimize to similar data likelihood values as their exact gradient counterparts.
arXiv Detail & Related papers (2020-11-14T09:51:51Z)
- Relative gradient optimization of the Jacobian term in unsupervised deep learning [9.385902422987677]
Learning expressive probabilistic models correctly describing the data is a ubiquitous problem in machine learning.
Deep density models have been widely used for this task, but their maximum likelihood based training requires estimating the log-determinant of the Jacobian.
We propose a new approach for exact training of such neural networks.
arXiv Detail & Related papers (2020-06-26T16:41:08Z)
- Multipole Graph Neural Operator for Parametric Partial Differential Equations [57.90284928158383]
One of the main challenges in using deep learning-based methods for simulating physical systems is formulating physics-based data in a structure suited to neural networks.
We propose a novel multi-level graph neural network framework that captures interaction at all ranges with only linear complexity.
Experiments confirm our multi-graph network learns discretization-invariant solution operators to PDEs and can be evaluated in linear time.
arXiv Detail & Related papers (2020-06-16T21:56:22Z)
- Regularized Pooling [12.387676601792899]
In convolutional neural networks (CNNs), pooling operations play important roles such as dimensionality reduction and deformation compensation.
We propose regularized pooling, which enables the value selection direction in the pooling operation to be spatially smooth across adjacent kernels (an illustrative sketch of this idea appears after this list).
arXiv Detail & Related papers (2020-05-06T09:02:17Z)
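As referenced in the Regularized Pooling entry above, the following is an illustrative sketch of spatially smoothed max pooling under stated assumptions: ordinary max pooling records the offset of the selected value inside each kernel, the offset fields are averaged over adjacent kernels with a simple box filter, and the output value is read back at the smoothed offset. The function name `regularized_max_pool2d`, the box filter, and the rounding step are choices made here for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def regularized_max_pool2d(x: torch.Tensor, kernel: int = 2, smooth: int = 3) -> torch.Tensor:
    """Illustrative sketch: max pooling with spatially smoothed selection offsets.

    Assumes the spatial size of `x` is divisible by `kernel` and `smooth` is odd.
    """
    n, c, h, w = x.shape
    # Standard max pooling, keeping the flat spatial index of each selected value.
    _, idx = F.max_pool2d(x, kernel, stride=kernel, return_indices=True)
    ph, pw = idx.shape[2], idx.shape[3]

    # Offset of each argmax inside its own kernel (values in [0, kernel)).
    drow = (idx // w) % kernel
    dcol = (idx % w) % kernel

    # Smooth the two offset fields across adjacent kernels with a box filter.
    pad = smooth // 2
    drow_s = F.avg_pool2d(drow.float(), smooth, stride=1, padding=pad,
                          count_include_pad=False).round().long()
    dcol_s = F.avg_pool2d(dcol.float(), smooth, stride=1, padding=pad,
                          count_include_pad=False).round().long()

    # Read the input value at the smoothed position inside each kernel.
    base_r = torch.arange(ph, device=x.device).view(1, 1, ph, 1) * kernel
    base_c = torch.arange(pw, device=x.device).view(1, 1, 1, pw) * kernel
    flat = (base_r + drow_s) * w + (base_c + dcol_s)
    return x.flatten(2).gather(2, flat.flatten(2)).view(n, c, ph, pw)
```

With `smooth=1` the averaging is a no-op and the function reduces to ordinary max pooling, which makes the strength of the smoothing easy to ablate.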