Related papers: Deep Learning: A Tutorial

URL: http://arxiv.org/abs/2310.06251v1
Date: Tue, 10 Oct 2023 01:55:22 GMT
Title: Deep Learning: A Tutorial
Authors: Nick Polson and Vadim Sokolov
Abstract summary: We provide a review of deep learning methods which provide insight into structured high-dimensional data. Deep learning uses layers of semi-affine input transformations to provide a predictive rule. Applying these layers of transformations leads to a set of attributes (or, features) to which probabilistic statistical methods can be applied.
Score: 0.8158530638728498
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Our goal is to provide a review of deep learning methods which provide insight into structured high-dimensional data. Rather than using shallow additive architectures common to most statistical models, deep learning uses layers of semi-affine input transformations to provide a predictive rule. Applying these layers of transformations leads to a set of attributes (or, features) to which probabilistic statistical methods can be applied. Thus, the best of both worlds can be achieved: scalable prediction rules fortified with uncertainty quantification, where sparse regularization finds the features.

Related papers

Statistically guided deep learning [10.619901778151336]
We present a theoretically well-founded deep learning algorithm for nonparametric regression. We show that a theoretical analysis of deep learning which takes into account simultaneously optimization, generalization and approximation can result in a new deep learning estimate.
arXiv Detail & Related papers (2025-04-11T12:36:06Z)
Meta-Statistical Learning: Supervised Learning of Statistical Inference [59.463430294611626]
This work demonstrates that the tools and principles driving the success of large language models (LLMs) can be repurposed to tackle distribution-level tasks. We propose meta-statistical learning, a framework inspired by multi-instance learning that reformulates statistical inference tasks as supervised learning problems.
arXiv Detail & Related papers (2025-02-17T18:04:39Z)
Prospector Heads: Generalized Feature Attribution for Large Models & Data [82.02696069543454]
We introduce prospector heads, an efficient and interpretable alternative to explanation-based attribution methods. We demonstrate how prospector heads enable improved interpretation and discovery of class-specific patterns in input data.
arXiv Detail & Related papers (2024-02-18T23:01:28Z)
Deep networks for system identification: a Survey [56.34005280792013]
System identification learns mathematical descriptions of dynamic systems from input-output data. The main aim of the identified model is to predict new data from previous observations. We discuss architectures commonly adopted in the literature, such as feedforward, convolutional, and recurrent networks.
arXiv Detail & Related papers (2023-01-30T12:38:31Z)
Modeling Uncertain Feature Representation for Domain Generalization [49.129544670700525]
We show that our method consistently improves the network generalization ability on multiple vision tasks. Our methods are simple yet effective and can be readily integrated into networks without additional trainable parameters or loss constraints.
arXiv Detail & Related papers (2023-01-16T14:25:02Z)
Deep Explainable Learning with Graph Based Data Assessing and Rule Reasoning [4.369058206183195]
We propose an end-to-end deep explainable learning approach that combines the advantages of deep models in noise handling with expert rule-based interpretability. The proposed method is tested in an industry production system, showing comparable prediction accuracy, much higher generalization stability and better interpretability.
arXiv Detail & Related papers (2022-11-09T05:58:56Z)
Merging Two Cultures: Deep and Statistical Learning [3.15863303008255]
Merging the two cultures of deep and statistical learning provides insights into structured high-dimensional data. We show that prediction, optimisation and uncertainty can be achieved using probabilistic methods at the output layer of the model.
arXiv Detail & Related papers (2021-10-22T02:57:21Z)
Rank-R FNN: A Tensor-Based Learning Model for High-Order Data Classification [69.26747803963907]
Rank-R Feedforward Neural Network (FNN) is a tensor-based nonlinear learning model that imposes Canonical/Polyadic decomposition on its parameters. First, it handles inputs as multilinear arrays, bypassing the need for vectorization, and can thus fully exploit the structural information along every data dimension. We establish the universal approximation and learnability properties of Rank-R FNN, and we validate its performance on real-world hyperspectral datasets.
arXiv Detail & Related papers (2021-04-11T16:37:32Z)
Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning [78.83598532168256]
Marginal-likelihood-based model selection is rarely used in deep learning due to estimation difficulties. Our work shows that marginal likelihoods can improve generalization and be useful when validation data is unavailable.
arXiv Detail & Related papers (2021-04-11T09:50:24Z)
Deep learning: a statistical viewpoint [120.94133818355645]
Deep learning has revealed some major surprises from a theoretical perspective. In particular, simple gradient methods easily find near-optimal solutions to non-convex optimization problems. We conjecture that specific principles underlie these phenomena.
arXiv Detail & Related papers (2021-03-16T16:26:36Z)
A Probabilistically Motivated Learning Rate Adaptation for Stochastic Optimization [20.77923050735746]
We provide a probabilistic motivation, in terms of Gaussian inference, for popular first-order methods. The inference allows us to relate the learning rate to a dimensionless quantity that can be automatically adapted during training. The resulting meta-algorithm is shown to adapt learning rates in a robust manner across a large range of initial values.
arXiv Detail & Related papers (2021-02-22T10:26:31Z)
Feature space approximation for kernel-based supervised learning [2.653409741248232]
The goal is to reduce the size of the training data, resulting in lower storage consumption and computational complexity. We demonstrate significant improvements in comparison to the computation of data-driven predictions involving the full training data set. The method is applied to classification and regression problems from different application areas such as image recognition, system identification, and oceanographic time series analysis.
arXiv Detail & Related papers (2020-11-25T11:23:58Z)
A Framework to Learn with Interpretation [2.3741312212138896]
We present a novel framework to jointly learn a predictive model and its associated interpretation model. We seek a small dictionary of high-level attribute functions that take as inputs the outputs of selected hidden layers. A detailed pipeline to visualize the learnt features is also developed.
arXiv Detail & Related papers (2020-10-19T09:26:28Z)

This list is automatically generated from the titles and abstracts of the papers on this site.