Understanding Deep Learning via Decision Boundary
- URL: http://arxiv.org/abs/2206.01515v2
- Date: Sun, 24 Dec 2023 15:18:37 GMT
- Title: Understanding Deep Learning via Decision Boundary
- Authors: Shiye Lei, Fengxiang He, Yancheng Yuan, Dacheng Tao
- Abstract summary: We show that neural networks with lower decision boundary (DB) variability have better generalizability.
Two new notions, algorithm DB variability and $(\epsilon, \eta)$-data DB variability, are proposed to measure the decision boundary variability.
- Score: 81.49114762506287
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper discovers that neural networks with lower decision boundary
(DB) variability have better generalizability. Two new notions, algorithm DB
variability and $(\epsilon, \eta)$-data DB variability, are proposed to measure
the decision boundary variability from the algorithm and data perspectives.
Extensive experiments show significant negative correlations between the
decision boundary variability and the generalizability. From the theoretical
view, two lower bounds based on algorithm DB variability are proposed and do
not explicitly depend on the sample size. We also prove an upper bound of order
$\mathcal{O}\left(\frac{1}{\sqrt{m}}+\epsilon+\eta\log\frac{1}{\eta}\right)$
based on data DB variability. The bound is convenient to estimate without
requiring labels, and does not explicitly depend on the network size, which
is usually prohibitively large in deep learning.
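For intuition only, here is a minimal Python sketch under stated assumptions: it treats algorithm DB variability as the rate of disagreement among networks trained from different random seeds (a rough proxy for the paper's formal definition, not the paper's code), and evaluates the upper bound above numerically with the constants in $\mathcal{O}(\cdot)$ suppressed and the natural logarithm assumed. All helper names and the stub classifiers are hypothetical.

```python
# Hypothetical sketch: algorithm DB variability as seed-to-seed disagreement,
# plus a numeric evaluation of the 1/sqrt(m) + eps + eta*log(1/eta) bound.
import math
import numpy as np

def algorithm_db_variability(models, X_probe):
    """Fraction of probe points on which independently trained models
    disagree with the consensus label -- a proxy for DB variability."""
    preds = np.stack([m(X_probe) for m in models])        # (n_models, n_points)
    majority = np.apply_along_axis(
        lambda col: np.bincount(col).argmax(), 0, preds)  # per-point consensus
    return (preds != majority).mean()

def data_db_bound(m, eps, eta):
    """Value of 1/sqrt(m) + eps + eta*log(1/eta); constants suppressed,
    natural log assumed."""
    return 1.0 / math.sqrt(m) + eps + eta * math.log(1.0 / eta)

# Stub "trained networks" (constant classifiers) just to make the sketch run;
# in practice these would be networks trained with different random seeds.
models = [lambda X: np.zeros(len(X), dtype=int),
          lambda X: np.zeros(len(X), dtype=int),
          lambda X: np.ones(len(X), dtype=int)]
X_probe = np.random.randn(1000, 2)                        # unlabeled probe set
print(algorithm_db_variability(models, X_probe))          # 1/3: one model deviates
print(data_db_bound(m=10_000, eps=0.01, eta=0.05))        # ~0.17
```

Note that `X_probe` is unlabeled, which mirrors the abstract's point that the data-DB-variability bound is convenient to estimate without labels.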
Related papers
- Sharper Guarantees for Learning Neural Network Classifiers with Gradient Methods [43.32546195968771]
We study the data-dependent convergence and generalization behavior of gradient methods for neural networks with smooth activation.
Our results address the shortcomings of the well-established Rademacher complexity-based bounds.
We show that a large step-size significantly improves upon the NTK regime's results in classifying the XOR distribution.
arXiv Detail & Related papers (2024-10-13T21:49:29Z) - Variance-Aware Regret Bounds for Stochastic Contextual Dueling Bandits [53.281230333364505]
This paper studies the problem of contextual dueling bandits, where the binary comparison of dueling arms is generated from a generalized linear model (GLM).
We propose a new SupLinUCB-type algorithm that enjoys computational efficiency and a variance-aware regret bound of $\tilde{O}\big(d\sqrt{\sum_{t=1}^{T}\sigma_t^2} + d\big)$.
Our regret bound naturally aligns with the intuitive expectation: in scenarios where the comparison is deterministic, the algorithm only suffers an $\tilde{O}(d)$ regret.
arXiv Detail & Related papers (2023-10-02T08:15:52Z) - Typical and atypical solutions in non-convex neural networks with
discrete and continuous weights [2.7127628066830414]
We study the binary and continuous negative-margin perceptrons as simple non-convex network models learning random rules and associations.
Both models exhibit subdominant minimizers which are extremely flat and wide.
For both models, the generalization performance as a learning device is shown to be greatly improved by the existence of wide flat minimizers.
arXiv Detail & Related papers (2023-04-26T23:34:40Z) - Deep Neural Networks for Nonparametric Interaction Models with Diverging
Dimension [6.939768185086753]
We analyze a $k$th-order nonparametric interaction model in both the growing-dimension scenario ($d$ grows with $n$ but at a slower rate) and the high-dimension scenario ($d \gtrsim n$).
We show that under certain standard assumptions, debiased deep neural networks achieve a minimax optimal rate in terms of $(n, d)$.
arXiv Detail & Related papers (2023-02-12T04:19:39Z) - SUN: Exploring Intrinsic Uncertainties in Text-to-SQL Parsers [61.48159785138462]
This paper aims to improve the performance of text-to-SQL parsing by exploring the intrinsic uncertainties in neural-network-based approaches (called SUN).
Extensive experiments on five benchmark datasets demonstrate that our method significantly outperforms competitors and achieves new state-of-the-art results.
arXiv Detail & Related papers (2022-09-14T06:27:51Z) - Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation, so it can be integrated seamlessly with neural networks.
arXiv Detail & Related papers (2021-12-07T11:26:41Z) - Causal Order Identification to Address Confounding: Binary Variables [4.56877715768796]
This paper considers an extension of the linear non-Gaussian acyclic model (LiNGAM).
LiNGAM determines the causal order among variables from a dataset when the variables are expressed by a set of linear equations, including noise.
arXiv Detail & Related papers (2021-08-10T22:09:43Z) - Instance-optimality in optimal value estimation: Adaptivity via
variance-reduced Q-learning [99.34907092347733]
We analyze the problem of estimating optimal $Q$-value functions for a discounted Markov decision process with discrete states and actions.
Using a local minimax framework, we show that this functional arises in lower bounds on the accuracy of any estimation procedure.
In the other direction, we establish the sharpness of our lower bounds, up to factors logarithmic in the state and action spaces, by analyzing a variance-reduced version of $Q$-learning (a toy sketch of the variance-reduction step appears after this list).
arXiv Detail & Related papers (2021-06-28T00:38:54Z) - Analysis of Generalizability of Deep Neural Networks Based on the
Complexity of Decision Boundary [0.0]
We create the decision boundary complexity (DBC) score to define and measure the complexity of the decision boundary of deep neural network (DNN) models.
The DBC score is shown to provide an effective measure of decision boundary complexity and a quantitative indicator of the generalizability of DNNs.
arXiv Detail & Related papers (2020-09-16T23:25:52Z) - Effective Version Space Reduction for Convolutional Neural Networks [61.84773892603885]
In active learning, sampling bias could pose a serious inconsistency problem and hinder the algorithm from finding the optimal hypothesis.
We examine active learning with convolutional neural networks through the principled lens of version space reduction.
arXiv Detail & Related papers (2020-06-22T17:40:03Z)
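As a companion to the variance-reduced $Q$-learning entry above, the following is a toy, self-contained sketch of the variance-reduction (recentering) idea on a synthetic tabular MDP with generative-model access. The MDP, step-size schedule, and epoch/batch sizes are illustrative assumptions, not the paper's algorithm or constants.

```python
# Toy sketch of variance-reduced Q-learning: recenter the noisy one-sample
# Bellman update around a fixed reference table whose Bellman image is
# estimated once per epoch with a larger batch. Illustrative constants only.
import numpy as np

rng = np.random.default_rng(0)
S, A, gamma = 5, 3, 0.9
P = rng.dirichlet(np.ones(S), size=(S, A))  # P[s, a] = distribution over next states
R = rng.uniform(0.0, 1.0, size=(S, A))      # deterministic reward table

def paired_bellman(Q1, Q2, n):
    """Empirical Bellman operator applied to two Q-tables with the SAME
    n sampled transitions per (s, a), so their noise is correlated."""
    T1, T2 = np.zeros((S, A)), np.zeros((S, A))
    for s in range(S):
        for a in range(A):
            nxt = rng.choice(S, size=n, p=P[s, a])            # shared samples
            T1[s, a] = R[s, a] + gamma * Q1[nxt].max(axis=1).mean()
            T2[s, a] = R[s, a] + gamma * Q2[nxt].max(axis=1).mean()
    return T1, T2

Q = np.zeros((S, A))
for epoch in range(10):
    Q_ref = Q.copy()
    T_ref, _ = paired_bellman(Q_ref, Q_ref, n=1000)           # accurate reference image
    for k in range(1, 201):
        lam = 1.0 / (1.0 + (1.0 - gamma) * k)                 # rescaled linear step size
        T_Q, T_Qref = paired_bellman(Q, Q_ref, n=1)           # one fresh shared sample
        Q = (1.0 - lam) * Q + lam * (T_Q - T_Qref + T_ref)    # recentered update

# Exact Q* by value iteration, to sanity-check the estimate.
Q_star = np.zeros((S, A))
for _ in range(2000):
    Q_star = R + gamma * P @ Q_star.max(axis=1)
print("max |Q - Q*| =", np.abs(Q - Q_star).max())
```

Because `T_Q` and `T_Qref` share samples, their noise largely cancels in `T_Q - T_Qref`, leaving an update whose variance shrinks as `Q` approaches `Q_ref`; this is what enables the sharper, instance-dependent guarantees described in the entry.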