Learning Sub-Patterns in Piecewise Continuous Functions
- URL: http://arxiv.org/abs/2010.15571v4
- Date: Wed, 15 Dec 2021 17:08:30 GMT
- Title: Learning Sub-Patterns in Piecewise Continuous Functions
- Authors: Anastasis Kratsios, Behnoosh Zamanlooy
- Abstract summary: Most gradient descent algorithms can optimize neural networks that are sub-differentiable in their parameters.
This paper focuses on the case where the discontinuities arise from distinct sub-patterns.
We propose a new discontinuous deep neural network model trainable via a decoupled two-step procedure.
- Score: 4.18804572788063
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most stochastic gradient descent algorithms can optimize neural networks that
are sub-differentiable in their parameters; however, this implies that the
neural network's activation function must exhibit a degree of continuity which
limits the neural network model's uniform approximation capacity to continuous
functions. This paper focuses on the case where the discontinuities arise from
distinct sub-patterns, each defined on different parts of the input space. We
propose a new discontinuous deep neural network model trainable via a decoupled
two-step procedure that avoids passing gradient updates through the network's
single, strategically placed discontinuous unit. We provide approximation
guarantees for our architecture in the space of bounded continuous functions
and universal approximation guarantees in the space of piecewise continuous
functions, which we introduce herein. We present a novel semi-supervised
two-step training procedure for our discontinuous deep learning model, tailored
to its structure, and we provide theoretical support for its effectiveness. The
performance of our model, trained with the proposed procedure, is evaluated
experimentally on both real-world financial datasets and synthetic datasets.
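As an illustration of the decoupled two-step idea, here is a minimal, hypothetical PyTorch sketch (not the authors' implementation): continuous sub-networks handle the sub-patterns, a single hard argmax unit supplies the discontinuity, and training fits the routing classifier and the sub-networks separately, so no gradient is ever propagated through the discontinuous unit. The part labels are assumed to be given (e.g. obtained semi-supervised from a few labeled points plus clustering); widths, sizes, and names are illustrative choices.

```python
# Minimal illustrative sketch (not the paper's reference implementation).
# Assumptions: 1-D input, a small number of sub-patterns, part labels given.
import torch
import torch.nn as nn

class SubPatternNet(nn.Module):
    """Continuous feed-forward sub-network for one sub-pattern."""
    def __init__(self, dim_in=1, dim_out=1, width=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim_in, width), nn.ReLU(),
            nn.Linear(width, width), nn.ReLU(),
            nn.Linear(width, dim_out),
        )
    def forward(self, x):
        return self.net(x)

class DiscontinuousModel(nn.Module):
    """Sub-networks combined through a single hard (discontinuous) selector."""
    def __init__(self, n_parts=2):
        super().__init__()
        self.subnets = nn.ModuleList([SubPatternNet() for _ in range(n_parts)])
        self.router = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, n_parts))
    def forward(self, x):
        part = self.router(x).argmax(dim=1)             # hard, non-differentiable selection
        outs = torch.stack([f(x) for f in self.subnets], dim=1)
        return outs[torch.arange(x.shape[0]), part]

def train_two_step(model, x, y, part_labels, epochs=200):
    """Step 1: fit the router as a classifier on (estimated) part labels.
    Step 2: fit each sub-network on the points routed to it.
    Gradients therefore never pass through the argmax unit."""
    ce, mse = nn.CrossEntropyLoss(), nn.MSELoss()
    opt_r = torch.optim.Adam(model.router.parameters(), lr=1e-2)
    for _ in range(epochs):
        opt_r.zero_grad(); ce(model.router(x), part_labels).backward(); opt_r.step()
    for k, subnet in enumerate(model.subnets):
        mask = part_labels == k                          # part_labels: LongTensor of shape (N,)
        opt_k = torch.optim.Adam(subnet.parameters(), lr=1e-2)
        for _ in range(epochs):
            opt_k.zero_grad(); mse(subnet(x[mask]), y[mask]).backward(); opt_k.step()
```

At inference the argmax router picks the sub-pattern, so the learned function can jump across part boundaries even though every trained component is continuous in its parameters.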
Related papers
- Nonlinear functional regression by functional deep neural network with kernel embedding [20.306390874610635]
We propose a functional deep neural network with an efficient and fully data-dependent dimension reduction method.
The architecture of our functional net consists of a kernel embedding step, a projection step, and a deep ReLU neural network for the prediction.
The utilization of smooth kernel embedding enables our functional net to be discretization invariant, efficient, and robust to noisy observations.
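A rough sketch of the kernel-embedding / projection / deep-ReLU pipeline described above, under assumed choices (Gaussian kernel, PCA projection, synthetic curves); the paper's exact construction differs.

```python
# Illustrative pipeline: kernel embedding -> data-dependent projection -> ReLU net.
import numpy as np
import torch
import torch.nn as nn

def gaussian_kernel_embedding(X, t_grid, s_grid, bandwidth=0.1):
    """Map discretized curves X[i, j] = x_i(t_j) to smooth embeddings on s_grid."""
    K = np.exp(-(s_grid[:, None] - t_grid[None, :]) ** 2 / (2 * bandwidth ** 2))
    return X @ K.T / len(t_grid)               # quadrature approximation of the integral

def pca_projection(E, n_components=8):
    """Data-dependent dimension reduction of the embedded curves."""
    E_centered = E - E.mean(axis=0, keepdims=True)
    _, _, Vt = np.linalg.svd(E_centered, full_matrices=False)
    return E_centered @ Vt[:n_components].T

# Deep ReLU network acting on the low-dimensional scores.
relu_net = nn.Sequential(nn.Linear(8, 64), nn.ReLU(),
                         nn.Linear(64, 64), nn.ReLU(),
                         nn.Linear(64, 1))

# Example: 100 noisy curves observed on 50 grid points.
t = np.linspace(0, 1, 50); s = np.linspace(0, 1, 30)
X = np.sin(2 * np.pi * t)[None, :] + 0.1 * np.random.randn(100, 50)
scores = torch.tensor(pca_projection(gaussian_kernel_embedding(X, t, s)),
                      dtype=torch.float32)
pred = relu_net(scores)                         # (100, 1) predictions
```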
arXiv Detail & Related papers (2024-01-05T16:43:39Z)
- Continual Learning via Sequential Function-Space Variational Inference [65.96686740015902]
We propose an objective derived by formulating continual learning as sequential function-space variational inference.
Compared to objectives that directly regularize neural network predictions, the proposed objective allows for more flexible variational distributions.
We demonstrate that, across a range of task sequences, neural networks trained via sequential function-space variational inference achieve better predictive accuracy than networks trained with related methods.
arXiv Detail & Related papers (2023-12-28T18:44:32Z)
- ENN: A Neural Network with DCT Adaptive Activation Functions [2.2713084727838115]
We present Expressive Neural Network (ENN), a novel model in which the non-linear activation functions are modeled using the Discrete Cosine Transform (DCT).
This parametrization keeps the number of trainable parameters low, is appropriate for gradient-based schemes, and adapts to different learning tasks.
ENN outperforms state-of-the-art benchmarks, with an accuracy gap above 40% in some scenarios.
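A hedged sketch of a DCT-type trainable activation: the non-linearity is a truncated cosine series on a fixed interval with a small number of learnable coefficients. The interval, initialization, and input clipping are assumptions of this sketch, not necessarily ENN's design.

```python
# Illustrative DCT-parameterized trainable activation (assumptions: truncated
# cosine basis on a fixed interval, inputs clipped to that interval).
import math
import torch
import torch.nn as nn

class DCTActivation(nn.Module):
    def __init__(self, n_coeffs=16, low=-3.0, high=3.0):
        super().__init__()
        self.low, self.high = low, high
        self.coeffs = nn.Parameter(torch.zeros(n_coeffs))
        with torch.no_grad():                  # start close to a smooth monotone ramp
            self.coeffs[1] = -1.0
    def forward(self, x):
        u = (x.clamp(self.low, self.high) - self.low) / (self.high - self.low)
        k = torch.arange(self.coeffs.numel(), device=x.device, dtype=x.dtype)
        basis = torch.cos(math.pi * k * u.unsqueeze(-1))    # (..., n_coeffs)
        return basis @ self.coeffs             # few trainable parameters per activation

# Drop-in use inside an otherwise ordinary network:
net = nn.Sequential(nn.Linear(2, 32), DCTActivation(),
                    nn.Linear(32, 32), DCTActivation(),
                    nn.Linear(32, 1))
```

Because the activation shape is learned through a handful of coefficients, it can adapt per task while keeping the parameter count low and remaining compatible with gradient-based training.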
arXiv Detail & Related papers (2023-07-02T21:46:30Z)
- A Functional-Space Mean-Field Theory of Partially-Trained Three-Layer Neural Networks [49.870593940818715]
We study the infinite-width limit of a type of three-layer NN model whose first layer is random and fixed.
Our theory accommodates different scaling choices of the model, resulting in two regimes of the mean-field (MF) limit that exhibit distinct behaviors.
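The model class being analyzed can be written down directly; below is a minimal sketch of the partially-trained setup (random, frozen first layer; only the later layers receive gradients), with widths and nonlinearity chosen arbitrarily for illustration, while the paper itself studies the infinite-width limit of this class.

```python
# Three-layer network whose first layer is random and fixed; only an
# illustration of the model class, not the paper's analysis.
import torch
import torch.nn as nn

d, m1, m2 = 10, 512, 512                        # input dim and the two hidden widths
first = nn.Linear(d, m1)
for p in first.parameters():
    p.requires_grad_(False)                     # random, fixed first layer
model = nn.Sequential(first, nn.Tanh(),
                      nn.Linear(m1, m2), nn.Tanh(),
                      nn.Linear(m2, 1))
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=1e-2)  # only layers 2 and 3 are trained
```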
arXiv Detail & Related papers (2022-10-28T17:26:27Z)
- Offline Reinforcement Learning with Differentiable Function Approximation is Provably Efficient [65.08966446962845]
Offline reinforcement learning, which aims at optimizing decision-making strategies with historical data, has been extensively applied in real-life applications.
We take a step forward by considering offline reinforcement learning with differentiable function class approximation (DFA).
Most importantly, we show offline differentiable function approximation is provably efficient by analyzing the pessimistic fitted Q-learning algorithm.
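A generic sketch of pessimistic fitted Q-iteration on offline data, using a linear function class with a covariance-based uncertainty penalty; the paper treats a far more general differentiable function class, so this only illustrates the pessimism mechanism.

```python
# Pessimistic fitted Q-iteration on a fixed offline dataset (illustrative only).
import numpy as np

def pessimistic_fitted_q(phi, dataset, n_actions, gamma=0.99, beta=1.0, iters=50):
    """phi(s, a) -> feature vector; dataset: list of (s, a, r, s_next) tuples."""
    d = phi(dataset[0][0], dataset[0][1]).shape[0]
    Phi = np.stack([phi(s, a) for s, a, _, _ in dataset])           # (n, d)
    rewards = np.array([r for _, _, r, _ in dataset])
    Sigma = Phi.T @ Phi + np.eye(d)                                  # regularized covariance
    Sigma_inv = np.linalg.inv(Sigma)
    bonus = lambda f: beta * np.sqrt(f @ Sigma_inv @ f)              # uncertainty width
    w = np.zeros(d)
    for _ in range(iters):
        # Pessimistic next-state value: penalized greedy value, floored at zero
        # (the floor assumes non-negative returns).
        v_next = np.array([
            max(max(phi(s2, a) @ w - bonus(phi(s2, a)) for a in range(n_actions)), 0.0)
            for _, _, _, s2 in dataset
        ])
        targets = rewards + gamma * v_next
        # Ridge regression of the Bellman targets onto the features.
        w = np.linalg.solve(Sigma, Phi.T @ targets)
    greedy = lambda s: int(np.argmax([phi(s, a) @ w - bonus(phi(s, a))
                                      for a in range(n_actions)]))
    return w, greedy
```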
arXiv Detail & Related papers (2022-10-03T07:59:42Z)
- Stabilizing Q-learning with Linear Architectures for Provably Efficient Learning [53.17258888552998]
This work proposes an exploration variant of the basic $Q$-learning protocol with linear function approximation.
We show that the performance of the algorithm degrades very gracefully under a novel and more permissive notion of approximation error.
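For concreteness, a sketch of online Q-learning with linear function approximation and an optimistic exploration bonus, assuming a classic gym-style environment; the paper's actual protocol and its stabilization technique are not reproduced here.

```python
# Online Q-learning with linear function approximation and an exploration
# bonus (illustrative only; assumes step() returns (obs, reward, done, info)).
import numpy as np

def linear_q_learning(env, phi, n_actions, episodes=500,
                      alpha=0.1, gamma=0.99, beta=0.5):
    d = phi(env.reset(), 0).shape[0]
    w = np.zeros(d)
    Sigma = np.eye(d)                                    # running feature covariance
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            # Act greedily w.r.t. bonus-inflated (optimistic) Q-values.
            q_opt = [phi(s, a) @ w
                     + beta * np.sqrt(phi(s, a) @ np.linalg.solve(Sigma, phi(s, a)))
                     for a in range(n_actions)]
            a = int(np.argmax(q_opt))
            s2, r, done, _ = env.step(a)
            target = r if done else r + gamma * max(phi(s2, b) @ w for b in range(n_actions))
            w += alpha * (target - phi(s, a) @ w) * phi(s, a)   # TD update
            Sigma += np.outer(phi(s, a), phi(s, a))
            s = s2
    return w
```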
arXiv Detail & Related papers (2022-06-01T23:26:51Z)
- On Feature Learning in Neural Networks with Global Convergence Guarantees [49.870593940818715]
We study the optimization of wide neural networks (NNs) via gradient flow (GF).
We show that when the input dimension is no less than the size of the training set, the training loss converges to zero at a linear rate under GF.
We also show empirically that, unlike in the Neural Tangent Kernel (NTK) regime, our multi-layer model exhibits feature learning and can achieve better generalization performance than its NTK counterpart.
arXiv Detail & Related papers (2022-04-22T15:56:43Z)
- Modern Non-Linear Function-on-Function Regression [8.231050911072755]
We introduce a new class of non-linear function-on-function regression models for functional data using neural networks.
We give two model fitting strategies, the Functional Direct Neural Network (FDNN) and the Functional Basis Neural Network (FBNN).
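A sketch of the basis-expansion flavor of function-on-function regression (FBNN-like): expand input and output curves in a fixed basis and learn a neural map between coefficient vectors. The Fourier basis, network sizes, and grid are assumptions of this sketch.

```python
# Basis-expansion approach to function-on-function regression (illustrative).
import numpy as np
import torch
import torch.nn as nn

def fourier_basis(t_grid, n_basis):
    """Real Fourier basis evaluated on a grid: shape (len(t_grid), n_basis)."""
    cols = [np.ones_like(t_grid)]
    for k in range(1, (n_basis + 1) // 2 + 1):
        cols += [np.sin(2 * np.pi * k * t_grid), np.cos(2 * np.pi * k * t_grid)]
    return np.stack(cols[:n_basis], axis=1)

def to_coefficients(Y, B):
    """Least-squares basis coefficients of each discretized curve (rows of Y)."""
    return np.linalg.lstsq(B, Y.T, rcond=None)[0].T

t = np.linspace(0, 1, 100)
B_in, B_out = fourier_basis(t, 7), fourier_basis(t, 5)

# Neural map from input-curve coefficients to output-curve coefficients.
coef_net = nn.Sequential(nn.Linear(7, 64), nn.ReLU(), nn.Linear(64, 5))

def predict_curves(X):
    """X: (n_samples, len(t)) discretized input curves -> predicted output curves."""
    c_in = torch.tensor(to_coefficients(X, B_in), dtype=torch.float32)
    c_out = coef_net(c_in).detach().numpy()
    return c_out @ B_out.T                      # evaluate predicted curves on the grid
```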
arXiv Detail & Related papers (2021-07-29T16:19:59Z)
- Non-linear Functional Modeling using Neural Networks [6.624726878647541]
We introduce a new class of non-linear models for functional data based on neural networks.
We propose two variations of our framework: a functional neural network with continuous hidden layers, and a second version that utilizes basis expansions and continuous hidden layers.
arXiv Detail & Related papers (2021-04-19T14:59:55Z)
- Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z)
- A Shooting Formulation of Deep Learning [19.51427735087011]
We introduce a shooting formulation which shifts the perspective from parameterizing a network layer-by-layer to parameterizing over optimal networks.
For scalability, we propose a novel particle-ensemble parametrization which fully specifies the optimal weight trajectory of the continuous-depth neural network.
arXiv Detail & Related papers (2020-06-18T07:36:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.