Analytic Learning of Convolutional Neural Network For Pattern
Recognition
- URL: http://arxiv.org/abs/2202.06504v1
- Date: Mon, 14 Feb 2022 06:32:21 GMT
- Title: Analytic Learning of Convolutional Neural Network For Pattern
Recognition
- Authors: Huiping Zhuang, Zhiping Lin, Yimin Yang and Kar-Ann Toh
- Abstract summary: Training convolutional neural networks (CNNs) with back-propagation (BP) is time-consuming and resource-intensive.
We propose an analytic convolutional neural network learning (ACnnL) method.
ACnnL builds a closed-form solution similar to its MLP counterpart, but differs in the regularization constraint.
- Score: 20.916630175697065
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training convolutional neural networks (CNNs) with back-propagation (BP) is
time-consuming and resource-intensive, particularly because the dataset must be
visited multiple times. In contrast, analytic learning attempts to obtain the
weights in a single epoch. However, existing attempts at analytic learning have
considered only the multilayer perceptron (MLP). In this article, we propose an
analytic convolutional neural network learning (ACnnL) method. Theoretically, we show
that ACnnL builds a closed-form solution similar to its MLP counterpart, but
differs in the regularization constraint. Consequently, we are able to explain,
to a certain extent, why CNNs usually generalize better than MLPs from the
perspective of implicit regularization. ACnnL is validated on classification
tasks over several benchmark datasets. Encouragingly, ACnnL trains CNNs
significantly faster than BP while reaching prediction accuracies reasonably
close to those obtained with BP. Moreover, our experiments reveal a unique
advantage of ACnnL in the small-sample scenario, where training data are scarce
or expensive.
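The abstract does not spell out the solver, but the general flavor of analytic learning is to obtain weights from a regularized least-squares problem solved in closed form rather than by iterative BP. The NumPy sketch below illustrates that flavor only; the function name, the ridge parameter `lam`, and the one-hot target encoding are illustrative assumptions, not the exact ACnnL procedure.

```python
import numpy as np

def closed_form_weights(X, Y, lam=1e-3):
    """Closed-form ridge regression: argmin_W ||X W - Y||^2 + lam * ||W||^2.

    X : (n_samples, n_features) layer inputs
    Y : (n_samples, n_classes)  regression targets (e.g. one-hot labels)
    Returns W with shape (n_features, n_classes).
    """
    d = X.shape[1]
    # Regularized normal equations: W = (X^T X + lam I)^{-1} X^T Y
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

# Toy usage: "training" is a single linear solve over one pass of the data.
rng = np.random.default_rng(0)
X = rng.standard_normal((128, 64))      # 128 samples, 64 features
labels = rng.integers(0, 10, size=128)  # 10 classes
Y = np.eye(10)[labels]                  # one-hot targets
W = closed_form_weights(X, Y)
pred = (X @ W).argmax(axis=1)
```

Because the weights come from a single linear solve, the data are visited once, which is the one-epoch property the abstract contrasts with BP; the paper's contribution is extending this kind of closed-form solution from MLPs to CNNs under a different regularization constraint.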
Related papers
- On the rates of convergence for learning with convolutional neural networks [9.772773527230134]
We study approximation and learning capacities of convolutional neural networks (CNNs) with one-side zero-padding and multiple channels.
We derive convergence rates for estimators based on CNNs in many learning problems.
It is also shown that the obtained rates for classification are minimax optimal in some common settings.
arXiv Detail & Related papers (2024-03-25T06:42:02Z)
- Benign Overfitting in Two-Layer ReLU Convolutional Neural Networks for XOR Data [24.86314525762012]
We show that a ReLU CNN trained by gradient descent can achieve near Bayes-optimal accuracy.
Our result demonstrates that CNNs have a remarkable capacity to efficiently learn XOR problems, even in the presence of highly correlated features.
arXiv Detail & Related papers (2023-10-03T11:31:37Z)
- How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z)
- Learn, Unlearn and Relearn: An Online Learning Paradigm for Deep Neural Networks [12.525959293825318]
We introduce Learn, Unlearn, and Relearn (LURE), an online learning paradigm for deep neural networks (DNNs).
LURE interchanges between the unlearning phase, which selectively forgets the undesirable information in the model, and the relearning phase, which emphasizes learning on generalizable features.
We show that our training paradigm provides consistent performance gains across datasets in both classification and few-shot settings.
arXiv Detail & Related papers (2023-03-18T16:45:54Z)
- Neural networks trained with SGD learn distributions of increasing complexity [78.30235086565388]
We show that neural networks trained using gradient descent initially classify their inputs using lower-order input statistics.
Only later in training do they exploit higher-order statistics.
We discuss the relation of this distributional simplicity bias (DSB) to other simplicity biases and consider its implications for the principle of universality in learning.
arXiv Detail & Related papers (2022-11-21T15:27:22Z)
- Separation of scales and a thermodynamic description of feature learning in some CNNs [2.28438857884398]
Deep neural networks (DNNs) are powerful tools for compressing and distilling information.
A common strategy for analyzing such complex systems is to identify slow degrees of freedom that average out the erratic behavior of the underlying fast microscopic variables.
Here, we identify such a separation of scales occurring in over-parameterized deep convolutional neural networks (CNNs) at the end of training.
The resulting thermodynamic theory of deep learning yields accurate predictions on several deep non-linear CNN toy models.
arXiv Detail & Related papers (2021-12-31T10:49:55Z)
- Rethinking Nearest Neighbors for Visual Classification [56.00783095670361]
k-NN is a lazy learning method that aggregates the distances between a test image and its top-k nearest neighbors in the training set.
We adopt k-NN with pre-trained visual representations produced by either supervised or self-supervised methods in two steps.
Via extensive experiments on a wide range of classification tasks, our study reveals the generality and flexibility of k-NN integration.
arXiv Detail & Related papers (2021-12-15T20:15:01Z)
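As a rough illustration of the k-NN-over-pretrained-features recipe summarized in the entry above (a generic NumPy sketch under assumed names, not the authors' code; the features are assumed to come from a frozen pre-trained backbone):

```python
import numpy as np

def knn_predict(test_feats, train_feats, train_labels, k=5):
    """Majority vote over the k nearest training features (L2 distance).

    Features are assumed to be extracted by a frozen, pre-trained backbone
    (supervised or self-supervised); the k-NN step itself needs no training.
    """
    # Pairwise squared distances, shape (n_test, n_train)
    d2 = ((test_feats[:, None, :] - train_feats[None, :, :]) ** 2).sum(-1)
    nn_idx = np.argsort(d2, axis=1)[:, :k]   # k nearest neighbors per test sample
    nn_labels = train_labels[nn_idx]         # (n_test, k) integer class labels
    # Majority vote per test sample
    return np.array([np.bincount(row).argmax() for row in nn_labels])
```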
- How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks [80.55378250013496]
We study how neural networks trained by gradient descent extrapolate what they learn outside the support of the training distribution.
Graph Neural Networks (GNNs) have shown some success in more complex tasks.
arXiv Detail & Related papers (2020-09-24T17:48:59Z)
- RIFLE: Backpropagation in Depth for Deep Transfer Learning through Re-Initializing the Fully-connected LayEr [60.07531696857743]
Fine-tuning a deep convolutional neural network (CNN) from a pre-trained model helps transfer knowledge learned from larger datasets to the target task.
We propose RIFLE, a strategy that deepens backpropagation in transfer learning settings by periodically re-initializing the fully-connected layer.
RIFLE brings meaningful updates to the weights of deep CNN layers and improves low-level feature learning.
arXiv Detail & Related papers (2020-07-07T11:27:43Z)
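A minimal sketch of the re-initialization step behind RIFLE, assuming a torchvision-style model whose classification head is an `nn.Linear` stored in `model.fc` and a hypothetical re-init period; the paper's exact schedule may differ:

```python
import torch.nn as nn

def maybe_reinit_fc(model: nn.Module, epoch: int, period: int = 10) -> None:
    """Every `period` epochs of fine-tuning, re-initialize the final
    fully-connected head so that gradients keep driving meaningful updates
    into the deeper CNN layers (the RIFLE idea, as summarized above)."""
    if epoch > 0 and epoch % period == 0:
        model.fc.reset_parameters()  # re-draw weights and bias of the FC head
```

In practice this would be called once per epoch inside the usual fine-tuning loop; handling of optimizer state for the re-initialized head is left out of the sketch.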
- Approximation and Non-parametric Estimation of ResNet-type Convolutional Neural Networks [52.972605601174955]
We show that a ResNet-type CNN can attain the minimax optimal error rates in important function classes.
We derive approximation and estimation error rates of the aforementioned type of CNNs for the Barron and Hölder classes.
arXiv Detail & Related papers (2019-03-24T19:42:39Z)