Learning a Single Neuron with Gradient Methods
- URL: http://arxiv.org/abs/2001.05205v3
- Date: Sun, 27 Feb 2022 11:59:15 GMT
- Title: Learning a Single Neuron with Gradient Methods
- Authors: Gilad Yehudai and Ohad Shamir
- Abstract summary: We consider the fundamental problem of learning a single neuron $x \mapsto \sigma(w^\top x)$ using standard gradient methods.
We ask whether a more general result is attainable, under milder assumptions.
- Score: 39.291483556116454
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the fundamental problem of learning a single neuron $x
\mapsto\sigma(w^\top x)$ using standard gradient methods. As opposed to
previous works, which considered specific (and not always realistic) input
distributions and activation functions $\sigma(\cdot)$, we ask whether a more
general result is attainable, under milder assumptions. On the one hand, we
show that some assumptions on the distribution and the activation function are
necessary. On the other hand, we prove positive guarantees under mild
assumptions, which go beyond those studied in the literature so far. We also
point out and study the challenges in further strengthening and generalizing
our results.
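To make the setting concrete, here is a minimal, hedged sketch (not the paper's algorithm or analysis): plain gradient descent on the empirical square loss for a single ReLU neuron $x \mapsto \sigma(w^\top x)$, with standard Gaussian inputs and a noiseless unit-norm "teacher" neuron as one illustrative instance of the distributional and activation assumptions discussed in the abstract. All names (`w_star`, the step size, the iteration count) are illustrative choices, not values from the paper.

```python
import numpy as np

# Illustrative sketch: learn a single ReLU neuron x -> sigma(w^T x) by
# gradient descent on the square loss, with Gaussian inputs and a
# noiseless teacher. Hyperparameters are arbitrary, not from the paper.

rng = np.random.default_rng(0)
d, n = 5, 2000
w_star = rng.normal(size=d)
w_star /= np.linalg.norm(w_star)          # unit-norm "teacher" neuron

X = rng.normal(size=(n, d))               # inputs drawn from N(0, I_d)
sigma = lambda z: np.maximum(z, 0.0)      # ReLU activation
y = sigma(X @ w_star)                     # noiseless labels from the teacher

w = 0.1 * rng.normal(size=d)              # small random initialization
lr = 0.1

def sq_loss(w):
    return 0.5 * np.mean((sigma(X @ w) - y) ** 2)

init_loss = sq_loss(w)
for _ in range(500):
    pred = sigma(X @ w)
    # gradient of the empirical square loss; ReLU'(z) = 1{z > 0}
    grad = ((pred - y) * (X @ w > 0)) @ X / n
    w -= lr * grad
final_loss = sq_loss(w)
```

Note that the paper's point is precisely that such convergence is not automatic: for some input distributions and activations, gradient methods provably fail, so assumptions like the Gaussian input used here matter.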
Related papers
- Learning Narrow One-Hidden-Layer ReLU Networks [30.63086568341548]
This is the first algorithm that succeeds whenever $k$ is a constant.
We use a multi-scale analysis to argue that sufficiently close neurons can be collapsed together.
arXiv Detail & Related papers (2023-04-20T17:53:09Z) - Improved Regret for Efficient Online Reinforcement Learning with Linear
Function Approximation [69.0695698566235]
We study reinforcement learning with linear function approximation and adversarially changing cost functions.
We present a computationally efficient policy optimization algorithm for the challenging general setting of unknown dynamics and bandit feedback.
arXiv Detail & Related papers (2023-01-30T17:26:39Z) - Cardinality-Minimal Explanations for Monotonic Neural Networks [25.212444848632515]
In this paper, we investigate whether tractability can be regained by focusing on neural models implementing a monotonic function.
Although the relevant decision problems remain intractable in general, we show that they become solvable in favourable cases.
arXiv Detail & Related papers (2022-05-19T23:47:25Z) - Learning a Single Neuron for Non-monotonic Activation Functions [3.890410443467757]
Non-monotonic activation functions outperform the traditional monotonic ones in many applications.
We show that mild conditions on $\sigma$ are sufficient to guarantee learnability from polynomially many samples in polynomial time.
We also discuss how our positive results are related to existing negative results on training two-layer neural networks.
arXiv Detail & Related papers (2022-02-16T13:44:25Z) - Reinforcement Learning in Reward-Mixing MDPs [74.41782017817808]
We study episodic reinforcement learning in a reward-mixing Markov decision process (MDP).
We learn an $\epsilon$-optimal policy after exploring $\tilde{O}(\mathrm{poly}(H, \epsilon^{-1}) \cdot S^2 A^2)$ episodes, where $H$ is the time horizon and $S, A$ are the numbers of states and actions respectively.
arXiv Detail & Related papers (2021-10-07T18:55:49Z) - Learning a Single Neuron with Bias Using Gradient Descent [53.15475693468925]
We study the fundamental problem of learning a single neuron with a bias term.
We show that this is a significantly different and more challenging problem than the bias-less case.
arXiv Detail & Related papers (2021-06-02T12:09:55Z) - Causal Inference Under Unmeasured Confounding With Negative Controls: A
Minimax Learning Approach [84.29777236590674]
We study the estimation of causal parameters when not all confounders are observed and instead negative controls are available.
Recent work has shown how these can enable identification and efficient estimation via two so-called bridge functions.
arXiv Detail & Related papers (2021-03-25T17:59:19Z) - Approximation Schemes for ReLU Regression [80.33702497406632]
We consider the fundamental problem of ReLU regression.
The goal is to output the best-fitting ReLU with respect to square loss, given draws from some unknown distribution.
arXiv Detail & Related papers (2020-05-26T16:26:17Z) - Differentiable Bandit Exploration [38.81737411000074]
We learn such policies for an unknown distribution $\mathcal{P}$ using samples from $\mathcal{P}$.
Our approach is a form of meta-learning and exploits properties of $\mathcal{P}$ without making strong assumptions about its form.
arXiv Detail & Related papers (2020-02-17T05:07:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences.