Penalized deep neural networks estimator with general loss functions
under weak dependence
- URL: http://arxiv.org/abs/2305.06230v1
- Date: Wed, 10 May 2023 15:06:53 GMT
- Title: Penalized deep neural networks estimator with general loss functions
under weak dependence
- Authors: William Kengne and Modou Wade
- Abstract summary: This paper develops sparse-penalized deep neural network predictors for learning weakly dependent processes.
Some simulation results are provided, and an application to the forecast of particulate matter in the Vitória metropolitan area is also considered.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper develops sparse-penalized deep neural network predictors for
learning weakly dependent processes, with a broad class of loss functions. We
deal with a general framework that includes regression estimation,
classification, time series prediction, etc. The $\psi$-weak dependence
structure is considered, and for the specific case of bounded observations,
$\theta_\infty$-coefficients are also used. In the $\theta_\infty$-weakly
dependent case, a non-asymptotic generalization bound within the class of deep
neural network predictors is provided. For learning both $\psi$- and
$\theta_\infty$-weakly dependent processes, oracle inequalities for the excess
risk of the sparse-penalized deep neural network estimators are established.
When the target function is sufficiently smooth, the convergence rate of the
excess risk is close to $\mathcal{O}(n^{-1/3})$. Some simulation results are
provided, and an application to the forecast of particulate matter in the
Vitória metropolitan area is also considered.
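For a concrete picture of the setting, the sketch below fits an $\ell_1$-penalized (lasso-type) feed-forward network to lagged values of a univariate series, one simple instance of a sparse-penalized deep neural network predictor. It is an illustrative assumption, not the authors' implementation: the lag order `p`, penalty weight `lam`, architecture, and squared-error loss are placeholders for the general losses and tuning studied in the paper.

```python
# Minimal sketch (illustrative, not the paper's code): a sparse-penalized
# feed-forward predictor for a univariate time series, using an L1 penalty
# on the network parameters as the sparsity-inducing term.
import torch
import torch.nn as nn

def make_lagged(series: torch.Tensor, p: int):
    """Build (X, y) pairs where each row of X holds the p previous values."""
    X = torch.stack([series[i:i + p] for i in range(len(series) - p)])
    y = series[p:].unsqueeze(1)
    return X, y

class MLP(nn.Module):
    def __init__(self, p: int, width: int = 64, depth: int = 3):
        super().__init__()
        layers, d = [], p
        for _ in range(depth):
            layers += [nn.Linear(d, width), nn.ReLU()]
            d = width
        layers.append(nn.Linear(d, 1))
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

def fit(series: torch.Tensor, p: int = 5, lam: float = 1e-3,
        epochs: int = 200, lr: float = 1e-3) -> MLP:
    X, y = make_lagged(series, p)
    model = MLP(p)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()  # stand-in for the paper's general loss function
    for _ in range(epochs):
        opt.zero_grad()
        # L1 penalty over all parameters (weights and biases, for simplicity)
        penalty = sum(w.abs().sum() for w in model.parameters())
        loss = loss_fn(model(X), y) + lam * penalty
        loss.backward()
        opt.step()
    return model
```

In practice the penalty weight `lam` would be chosen by validation on held-out observations, and the paper's theory covers more general (possibly non-smooth) losses than the squared error used here.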
Related papers
- Robust deep learning from weakly dependent data [0.0]
This paper considers robust deep learning from weakly dependent observations, with an unbounded loss function and unbounded input/output.
We derive a relationship between these bounds and $r$; when the data have moments of any order (that is, $r=\infty$), the convergence rate is close to some well-known results.
arXiv Detail & Related papers (2024-05-08T14:25:40Z) - On Excess Risk Convergence Rates of Neural Network Classifiers [8.329456268842227]
We study the performance of plug-in classifiers based on neural networks in a binary classification setting as measured by their excess risks.
We analyze the estimation and approximation properties of neural networks to obtain a dimension-free, uniform rate of convergence.
arXiv Detail & Related papers (2023-09-26T17:14:10Z) - Addressing caveats of neural persistence with deep graph persistence [54.424983583720675]
We find that the variance of network weights and spatial concentration of large weights are the main factors that impact neural persistence.
We propose an extension of the filtration underlying neural persistence to the whole neural network instead of single layers.
This yields our deep graph persistence measure, which implicitly incorporates persistent paths through the network and alleviates variance-related issues.
arXiv Detail & Related papers (2023-07-20T13:34:11Z) - Sparse-penalized deep neural networks estimator under weak dependence [0.0]
We consider the nonparametric regression and classification problems for $\psi$-weakly dependent processes.
A penalized estimation method for sparse deep neural networks is performed.
arXiv Detail & Related papers (2023-03-02T16:53:51Z) - Semantic Strengthening of Neuro-Symbolic Learning [85.6195120593625]
Neuro-symbolic approaches typically resort to fuzzy approximations of a probabilistic objective.
We show how to compute this efficiently for tractable circuits.
We test our approach on three tasks: predicting a minimum-cost path in Warcraft, predicting a minimum-cost perfect matching, and solving Sudoku puzzles.
arXiv Detail & Related papers (2023-02-28T00:04:22Z) - Excess risk bound for deep learning under weak dependence [0.0]
This paper considers deep neural networks for learning weakly dependent processes.
We derive the required depth, width and sparsity of a deep neural network to approximate any Hölder smooth function.
arXiv Detail & Related papers (2023-02-15T07:23:48Z) - Deep learning for $\psi$-weakly dependent processes [0.0]
We consider deep neural networks for learning $\psi$-weakly dependent processes.
The consistency of the empirical risk minimization algorithm in the class of deep neural networks predictors is established.
Some simulation results are provided, as well as an application to the US recession data.
arXiv Detail & Related papers (2023-02-01T09:31:15Z) - Sample Complexity of Nonparametric Off-Policy Evaluation on
Low-Dimensional Manifolds using Deep Networks [71.95722100511627]
We consider the off-policy evaluation problem of reinforcement learning using deep neural networks.
We show that, by choosing network size appropriately, one can leverage the low-dimensional manifold structure in the Markov decision process.
arXiv Detail & Related papers (2022-06-06T20:25:20Z) - On the Neural Tangent Kernel Analysis of Randomly Pruned Neural Networks [91.3755431537592]
We study how random pruning of the weights affects a neural network's neural tangent kernel (NTK).
In particular, this work establishes an equivalence of the NTKs between a fully-connected neural network and its randomly pruned version.
arXiv Detail & Related papers (2022-03-27T15:22:19Z) - Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity
on Pruned Neural Networks [79.74580058178594]
We analyze the performance of training a pruned neural network by analyzing the geometric structure of the objective function.
We show that the convex region near a desirable model with guaranteed generalization enlarges as the neural network model is pruned.
arXiv Detail & Related papers (2021-10-12T01:11:07Z) - Towards an Understanding of Benign Overfitting in Neural Networks [104.2956323934544]
Modern machine learning models often employ a huge number of parameters and are typically optimized to have zero training loss.
We examine how these benign overfitting phenomena occur in a two-layer neural network setting.
We show that it is possible for the two-layer ReLU network interpolator to achieve a near minimax-optimal learning rate.
arXiv Detail & Related papers (2021-06-06T19:08:53Z)