Mathematical Perspective of Machine Learning
- URL: http://arxiv.org/abs/2007.01503v1
- Date: Fri, 3 Jul 2020 05:26:02 GMT
- Title: Mathematical Perspective of Machine Learning
- Authors: Yarema Boryshchak
- Abstract summary: We take a closer look at some theoretical challenges of Machine Learning as a function approximation, gradient descent as the default optimization algorithm, limitations of fixed length and width networks and a different approach to RNNs from a mathematical perspective.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We take a closer look at some theoretical challenges of Machine Learning as a function approximation, gradient descent as the default optimization algorithm, limitations of fixed length and width networks and a different approach to RNNs from a mathematical perspective.
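Since the paper centers on gradient descent as the default optimization algorithm, a minimal reference sketch is given below; the least-squares objective, data, step size, and stopping rule are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Gradient descent on a least-squares loss L(w) = ||Xw - y||^2 / (2n).
# All problem data and hyperparameters here are illustrative placeholders.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
w_true = rng.normal(size=5)
y = X @ w_true + 0.01 * rng.normal(size=100)

w = np.zeros(5)
lr = 0.1                                  # fixed step size
for step in range(500):
    grad = X.T @ (X @ w - y) / len(y)     # gradient of the loss at the current iterate
    w -= lr * grad
    if np.linalg.norm(grad) < 1e-8:       # stop once the gradient is negligible
        break

print("recovered weights close to the truth:", np.allclose(w, w_true, atol=1e-2))
```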
Related papers
- Cross-Entropy Optimization for Hyperparameter Optimization in Stochastic Gradient-based Approaches to Train Deep Neural Networks [2.1046873879077794]
We present a cross-entropy optimization method for hyperparameter optimization of a learning algorithm.
The presented method can be applied to other areas of optimization problems in deep learning.
arXiv Detail & Related papers (2024-09-14T00:39:37Z)
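Below is a generic cross-entropy-method loop for tuning a single hyperparameter, in the spirit of the entry above; the placeholder validation loss, the Gaussian sampling distribution over the log learning rate, the elite fraction, and the iteration count are all assumptions and may differ from the paper's actual procedure.

```python
import numpy as np

# Cross-entropy method (CEM) over one hyperparameter: log10 of the learning rate.
# The "validation loss" is a stand-in for training a model and scoring it on held-out data.
rng = np.random.default_rng(0)

def validation_loss(log_lr):
    # hypothetical objective with a minimum near log_lr = -3 (i.e. lr around 1e-3)
    return (log_lr + 3.0) ** 2 + 0.05 * rng.normal()

mu, sigma = 0.0, 2.0                               # initial Gaussian over log10(learning rate)
for _ in range(15):
    samples = rng.normal(mu, sigma, size=50)       # draw candidate hyperparameters
    losses = np.array([validation_loss(s) for s in samples])
    elite = samples[np.argsort(losses)[:10]]       # keep the best 20% of candidates
    mu, sigma = elite.mean(), elite.std() + 1e-6   # refit the sampling distribution
print("suggested learning rate ~", 10 ** mu)
```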
- Metric Learning to Accelerate Convergence of Operator Splitting Methods for Differentiable Parametric Programming [46.26499759722771]
This paper shows how differentiable optimization can enable the end-to-end learning of proximal metrics.
Results illustrate a strong connection between the learned proximal metrics and active constraints at the optima.
arXiv Detail & Related papers (2024-04-01T03:23:43Z)
- A Gentle Introduction to Gradient-Based Optimization and Variational Inequalities for Machine Learning [46.98201017084005]
We provide a framework for gradient-based algorithms in machine learning.
We start with saddle points and monotone games, and proceed to general variational inequalities.
While we provide convergence proofs for several of the algorithms, our main focus is that of providing motivation and intuition.
arXiv Detail & Related papers (2023-09-09T21:36:51Z)
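The entry above starts from saddle points and monotone games. The toy sketch below, assuming the bilinear problem min_x max_y xy with its saddle at the origin, shows the contrast that motivates such algorithms: plain simultaneous gradient descent-ascent spirals away, while the extragradient method converges; the step size and iteration count are arbitrary choices.

```python
import numpy as np

# Toy saddle-point problem: min_x max_y  x*y  (unique saddle at the origin).
eta, steps = 0.1, 500

def F(x, y):
    # game operator (d/dx of x*y, -d/dy of x*y); the update direction for both players
    return np.array([y, -x])

# simultaneous gradient descent-ascent: known to spiral outward on this problem
z = np.array([1.0, 1.0])
for _ in range(steps):
    z = z - eta * F(*z)
print("descent-ascent distance from saddle:", np.linalg.norm(z))

# extragradient: take a look-ahead half step, then update with the gradient evaluated there
z = np.array([1.0, 1.0])
for _ in range(steps):
    z_half = z - eta * F(*z)
    z = z - eta * F(*z_half)
print("extragradient distance from saddle:", np.linalg.norm(z))
```

The look-ahead step is what damps the rotational component of the bilinear game, which is why the extragradient iterates contract toward the saddle while the naive iterates drift away.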
- Representation Learning with Multi-Step Inverse Kinematics: An Efficient and Optimal Approach to Rich-Observation RL [106.82295532402335]
Existing reinforcement learning algorithms suffer from computational intractability, strong statistical assumptions, and suboptimal sample complexity.
We provide the first computationally efficient algorithm that attains rate-optimal sample complexity with respect to the desired accuracy level.
Our algorithm, MusIK, combines systematic exploration with representation learning based on multi-step inverse kinematics.
arXiv Detail & Related papers (2023-04-12T14:51:47Z)
- Unsupervised Legendre-Galerkin Neural Network for Stiff Partial Differential Equations [9.659504024299896]
We propose an unsupervised machine learning algorithm based on the Legendre-Galerkin neural network to find an accurate approximation to the solution of different types of PDEs.
The proposed neural network is applied to the general 1D and 2D PDEs as well as singularly perturbed PDEs that possess boundary layer behavior.
arXiv Detail & Related papers (2022-07-21T00:47:47Z)
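As a rough illustration of the Legendre-Galerkin idea, the sketch below solves the toy problem -u''(x) = f(x) on [-1, 1] with zero boundary values in the standard basis phi_k = P_k - P_{k+2}; where the paper trains an unsupervised neural network, this sketch simply solves the collocation residual by least squares, only to show the basis and residual setup, and the test problem and basis size are assumptions.

```python
import numpy as np
from numpy.polynomial import legendre as leg

# Spectral sketch for -u''(x) = f(x) on [-1, 1] with u(-1) = u(1) = 0.
# The network in the paper is replaced here by a direct least-squares solve.
f = lambda x: np.pi ** 2 * np.sin(np.pi * x)   # forcing whose exact solution is sin(pi*x)
K, N = 12, 64                                  # number of basis functions, collocation points
x = np.linspace(-1.0, 1.0, N)

def basis(k):
    # phi_k = P_k - P_{k+2} vanishes at x = -1 and x = 1 (standard Legendre-Galerkin choice)
    c = np.zeros(k + 3)
    c[k], c[k + 2] = 1.0, -1.0
    return leg.legval(x, c), leg.legval(x, leg.legder(c, 2))

pairs = [basis(k) for k in range(K)]
Phi = np.column_stack([p[0] for p in pairs])     # basis values at the collocation points
Phi_xx = np.column_stack([p[1] for p in pairs])  # their second derivatives

coeffs, *_ = np.linalg.lstsq(-Phi_xx, f(x), rcond=None)   # minimize the PDE residual
u = Phi @ coeffs
print("max error against sin(pi*x):", np.abs(u - np.sin(np.pi * x)).max())
```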
- Stabilizing Q-learning with Linear Architectures for Provably Efficient Learning [53.17258888552998]
This work proposes an exploration variant of the basic $Q$-learning protocol with linear function approximation.
We show that the performance of the algorithm degrades very gracefully under a novel and more permissive notion of approximation error.
arXiv Detail & Related papers (2022-06-01T23:26:51Z)
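A minimal sketch of Q-learning with linear function approximation and epsilon-greedy exploration follows, run on a toy five-state chain; the MDP, the one-hot features, and the hyperparameters are illustrative assumptions and do not reproduce the paper's exploration variant.

```python
import numpy as np

# Linear Q-learning with epsilon-greedy exploration on a 5-state chain.
# Reward 1 is received on reaching the right end; one-hot features make the
# linear model exact here, so the sketch only shows the mechanics of the update.
rng = np.random.default_rng(0)
n_states, n_actions = 5, 2                 # actions: 0 = left, 1 = right
gamma, alpha, eps = 0.9, 0.1, 0.1

def feature(s, a):
    phi = np.zeros(n_states * n_actions)
    phi[s * n_actions + a] = 1.0
    return phi

def env_step(s, a):
    s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    reward = 1.0 if s_next == n_states - 1 else 0.0
    return s_next, reward, s_next == n_states - 1

w = np.ones(n_states * n_actions)          # optimistic initial values encourage exploration
for episode in range(500):
    s = 0
    for t in range(50):
        q = [w @ feature(s, a) for a in range(n_actions)]
        a = rng.integers(n_actions) if rng.random() < eps else int(np.argmax(q))
        s_next, r, done = env_step(s, a)
        target = r if done else r + gamma * max(w @ feature(s_next, b) for b in range(n_actions))
        w += alpha * (target - w @ feature(s, a)) * feature(s, a)   # temporal-difference update
        s = s_next
        if done:
            break
print("Q(start, right) vs Q(start, left):", w[1], w[0])
```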
- Scalable computation of prediction intervals for neural networks via matrix sketching [79.44177623781043]
Existing algorithms for uncertainty estimation require modifying the model architecture and training procedure.
This work proposes a new algorithm that can be applied to a given trained neural network and produces approximate prediction intervals.
arXiv Detail & Related papers (2022-05-06T13:18:31Z)
- A Dynamical View on Optimization Algorithms of Overparameterized Neural Networks [23.038631072178735]
We consider a broad class of optimization algorithms that are commonly used in practice.
As a consequence, we can leverage the convergence behavior of neural networks.
We believe our approach can also be extended to other optimization algorithms and network theory.
arXiv Detail & Related papers (2020-10-25T17:10:22Z)
- Stochastic Flows and Geometric Optimization on the Orthogonal Group [52.50121190744979]
We present a new class of geometrically-driven optimization algorithms on the orthogonal group $O(d)$.
We show that our methods can be applied in various fields of machine learning including deep, convolutional and recurrent neural networks, reinforcement learning, flows and metric learning.
arXiv Detail & Related papers (2020-03-30T15:37:50Z)
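To make the geometric picture concrete, here is a small sketch, assuming Riemannian gradient descent with a QR retraction on the nearest-orthogonal-matrix (Procrustes) objective ||X - A||_F^2 over O(d); the objective, step size, and iteration count are illustrative and are not taken from the paper.

```python
import numpy as np

# Riemannian gradient descent on the orthogonal group O(d) for min ||X - A||_F^2.
# The closed-form optimum is the orthogonal polar factor of A, used below as a check.
rng = np.random.default_rng(0)
d = 5
A = rng.normal(size=(d, d))

def retract(M):
    # map a matrix back onto O(d) using the Q factor of a QR decomposition
    Q, R = np.linalg.qr(M)
    return Q * np.sign(np.diag(R))   # fix column signs so the retraction varies continuously

X = retract(A)                       # start on the same connected component as the optimum
eta = 0.1
for _ in range(500):
    G = 2.0 * (X - A)                          # Euclidean gradient of the objective
    rgrad = X @ (X.T @ G - G.T @ X) / 2.0      # projection onto the tangent space at X
    X = retract(X - eta * rgrad)               # take the step, then return to the group

U, _, Vt = np.linalg.svd(A)
print("gap to the closed-form optimum U @ Vt:", np.linalg.norm(X - U @ Vt))
```

The tangent-space projection keeps each update a valid direction of motion on the group, and the QR retraction is one standard way to land exactly back on O(d) after every step.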
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.