Related papers: Modeling the Nonsmoothness of Modern Neural Networks

URL: http://arxiv.org/abs/2103.14731v1
Date: Fri, 26 Mar 2021 20:55:19 GMT
Title: Modeling the Nonsmoothness of Modern Neural Networks
Authors: Runze Liu, Chau-Wai Wong, Huaiyu Dai
Abstract summary: We quantify the nonsmoothness using a feature named the sum of the magnitude of peaks (SMP). We envision that the nonsmoothness feature can potentially be used as a forensic tool for regression-based applications of neural networks.
Score: 35.93486244163653
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Modern neural networks have been successful in many regression-based tasks such as face recognition, facial landmark detection, and image generation. In this work, we investigate an intuitive but understudied characteristic of modern neural networks, namely, the nonsmoothness. The experiments using synthetic data confirm that such operations as ReLU and max pooling in modern neural networks lead to nonsmoothness. We quantify the nonsmoothness using a feature named the sum of the magnitude of peaks (SMP) and model the input-output relationships for building blocks of modern neural networks. Experimental results confirm that our model can accurately predict the statistical behaviors of the nonsmoothness as it propagates through such building blocks as the convolutional layer, the ReLU activation, and the max pooling layer. We envision that the nonsmoothness feature can potentially be used as a forensic tool for regression-based applications of neural networks.
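As a rough illustration of the idea in the abstract, the sketch below passes a smooth synthetic signal through ReLU and max pooling and scores each result with a peak-based roughness measure. The abstract does not give the SMP formula, so `peak_magnitude_sum` is a hypothetical stand-in (the sum of local maxima of the absolute second difference), not the authors' definition; the sine input, the pooling width, and all function names are likewise assumptions made for this example.

```python
import numpy as np


def relu(x):
    """Elementwise ReLU: max(x, 0)."""
    return np.maximum(x, 0.0)


def max_pool_1d(x, width=2):
    """Non-overlapping 1-D max pooling; trailing samples that do not
    fill a complete window are dropped."""
    n = (len(x) // width) * width
    return x[:n].reshape(-1, width).max(axis=1)


def peak_magnitude_sum(signal):
    """Assumed stand-in for SMP (NOT the paper's definition).

    Takes the absolute second difference of the signal and sums its
    strict local maxima, so sharp kinks contribute large terms.
    """
    d2 = np.abs(np.diff(signal, n=2))
    if d2.size < 3:
        return 0.0
    inner = d2[1:-1]
    is_peak = (inner > d2[:-2]) & (inner > d2[2:])
    return float(inner[is_peak].sum())


# Smooth synthetic input: one period of a sine wave.
t = np.linspace(0.0, 2.0 * np.pi, 200)
smooth = np.sin(t)

smp_smooth = peak_magnitude_sum(smooth)
smp_relu = peak_magnitude_sum(relu(smooth))
smp_pool = peak_magnitude_sum(max_pool_1d(smooth, width=2))

print("smooth input :", smp_smooth)
print("after ReLU   :", smp_relu)
print("after pooling:", smp_pool)
```

In this toy setup the ReLU output has a kink where the sine crosses zero, so its score comes out larger than that of the smooth input. How the score changes through a real convolutional layer or a trained network is the subject of the paper and is not reproduced here.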

Related papers

The sampling complexity of learning invertible residual neural networks [9.614718680817269]
It has been shown that determining a feedforward ReLU neural network to within high uniform accuracy from point samples suffers from the curse of dimensionality. We consider the question of whether the sampling complexity can be improved by restricting the specific neural network architecture. Our main result shows that the residual neural network architecture and invertibility do not help overcome the complexity barriers encountered with simpler feedforward architectures.
arXiv Detail & Related papers (2024-11-08T10:00:40Z)
Efficient and Flexible Neural Network Training through Layer-wise Feedback Propagation [49.44309457870649]
Layer-wise Feedback Propagation (LFP) is a novel training principle for neural network-like predictors. LFP decomposes a reward to individual neurons based on their respective contributions. Our method then implements a greedy approach, reinforcing helpful parts of the network and weakening harmful ones.
arXiv Detail & Related papers (2023-08-23T10:48:28Z)
Benign Overfitting for Two-layer ReLU Convolutional Neural Networks [60.19739010031304]
We establish algorithm-dependent risk bounds for learning two-layer ReLU convolutional neural networks with label-flipping noise. We show that, under mild conditions, the neural network trained by gradient descent can achieve near-zero training loss and Bayes optimal test risk.
arXiv Detail & Related papers (2023-03-07T18:59:38Z)
Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations. We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z)
Spiking neural network for nonlinear regression [68.8204255655161]
Spiking neural networks carry the potential for a massive reduction in memory and energy consumption. They introduce temporal and neuronal sparsity, which can be exploited by next-generation neuromorphic hardware. A framework for regression using spiking neural networks is proposed.
arXiv Detail & Related papers (2022-10-06T13:04:45Z)
SpikiLi: A Spiking Simulation of LiDAR based Real-time Object Detection for Autonomous Driving [0.0]
Spiking Neural Networks are a new neural network design approach that promises tremendous improvements in power efficiency, computation efficiency, and processing latency. We first illustrate the applicability of spiking neural networks to a complex deep learning task, namely LiDAR-based 3D object detection for automated driving.
arXiv Detail & Related papers (2022-06-06T20:05:17Z)
Optimal Learning Rates of Deep Convolutional Neural Networks: Additive Ridge Functions [19.762318115851617]
We consider the mean squared error analysis for deep convolutional neural networks. We show that, for additive ridge functions, convolutional neural networks followed by one fully connected layer with ReLU activation functions can reach optimal mini-max rates.
arXiv Detail & Related papers (2022-02-24T14:22:32Z)
Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs. By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
A Sparse Coding Interpretation of Neural Networks and Theoretical Implications [0.0]
Deep convolutional neural networks have achieved unprecedented performance in various computer vision tasks. We propose a sparse coding interpretation of neural networks that have ReLU activation. We derive a complete convolutional neural network without normalization and pooling.
arXiv Detail & Related papers (2021-08-14T21:54:47Z)
Sparse Flows: Pruning Continuous-depth Models [107.98191032466544]
We show that pruning improves generalization for neural ODEs in generative modeling. We also show that pruning finds minimal and efficient neural ODE representations with up to 98% fewer parameters compared to the original network, without loss of accuracy.
arXiv Detail & Related papers (2021-06-24T01:40:17Z)
Stochastic Recurrent Neural Network for Multistep Time Series Forecasting [0.0]
We leverage advances in deep generative models and the concept of state space models to propose an adaptation of the recurrent neural network for time series forecasting. Our model preserves the architectural workings of a recurrent neural network for which all relevant information is encapsulated in its hidden states, and this flexibility allows our model to be easily integrated into any deep architecture for sequential modelling.
arXiv Detail & Related papers (2021-04-26T01:43:43Z)
Flexible Transmitter Network [84.90891046882213]
Current neural networks are mostly built upon the MP model, which usually formulates the neuron as executing an activation function on the real-valued weighted aggregation of signals received from other neurons. We propose the Flexible Transmitter (FT) model, a novel bio-plausible neuron model with flexible synaptic plasticity. We present the Flexible Transmitter Network (FTNet), which is built on the most common fully-connected feed-forward architecture.
arXiv Detail & Related papers (2020-04-08T06:55:12Z)

This list is automatically generated from the titles and abstracts of the papers on this site.