On the regularized risk of distributionally robust learning over deep
neural networks
- URL: http://arxiv.org/abs/2109.06294v1
- Date: Mon, 13 Sep 2021 20:10:39 GMT
- Title: On the regularized risk of distributionally robust learning over deep
neural networks
- Authors: Camilo Garcia Trillos and Nicolas Garcia Trillos
- Abstract summary: We study the relation between distributionally robust learning and different forms of regularization to enforce robustness of deep neural networks.
We motivate a family of scalable algorithms for the training of robust neural networks.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper we explore the relation between distributionally robust
learning and different forms of regularization to enforce robustness of deep
neural networks. In particular, starting from a concrete min-max
distributionally robust problem, and using tools from optimal transport theory,
we derive first order and second order approximations to the distributionally
robust problem in terms of appropriate regularized risk minimization problems.
In the context of deep ResNet models, we identify the structure of the
resulting regularization problems as mean-field optimal control problems where
the number and dimension of state variables is within a dimension-free factor
of the dimension of the original unrobust problem. Using the Pontryagin maximum
principles associated to these problems we motivate a family of scalable
algorithms for the training of robust neural networks. Our analysis recovers
some results and algorithms known in the literature (in settings explained
throughout the paper) and provides many other theoretical and algorithmic
insights that to our knowledge are novel. In our analysis we employ tools that
we deem useful for a future analysis of more general adversarial learning
problems.
Related papers
- The Boundaries of Verifiable Accuracy, Robustness, and Generalisation in Deep Learning [71.14237199051276]
We consider classical distribution-agnostic framework and algorithms minimising empirical risks.
We show that there is a large family of tasks for which computing and verifying ideal stable and accurate neural networks is extremely challenging.
arXiv Detail & Related papers (2023-09-13T16:33:27Z) - Approximation Power of Deep Neural Networks: an explanatory mathematical
survey [0.0]
The goal of this survey is to present an explanatory review of the approximation properties of deep neural networks.
We aim at understanding how and why deep neural networks outperform other classical linear and nonlinear approximation methods.
arXiv Detail & Related papers (2022-07-19T18:47:44Z) - Imbedding Deep Neural Networks [0.0]
Continuous depth neural networks, such as Neural ODEs, have refashioned the understanding of residual neural networks in terms of non-linear vector-valued optimal control problems.
We propose a new approach which explicates the network's depth' as a fundamental variable, thus reducing the problem to a system of forward-facing initial value problems.
arXiv Detail & Related papers (2022-01-31T22:00:41Z) - Intrinsic Dimension, Persistent Homology and Generalization in Neural
Networks [19.99615698375829]
We show that generalization error can be equivalently bounded in terms of a notion called the 'persistent homology dimension' (PHD)
We develop an efficient algorithm to estimate PHD in the scale of modern deep neural networks.
Our experiments show that the proposed approach can efficiently compute a network's intrinsic dimension in a variety of settings.
arXiv Detail & Related papers (2021-11-25T17:06:15Z) - Path Regularization: A Convexity and Sparsity Inducing Regularization
for Parallel ReLU Networks [75.33431791218302]
We study the training problem of deep neural networks and introduce an analytic approach to unveil hidden convexity in the optimization landscape.
We consider a deep parallel ReLU network architecture, which also includes standard deep networks and ResNets as its special cases.
arXiv Detail & Related papers (2021-10-18T18:00:36Z) - A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z) - Developing Constrained Neural Units Over Time [81.19349325749037]
This paper focuses on an alternative way of defining Neural Networks, that is different from the majority of existing approaches.
The structure of the neural architecture is defined by means of a special class of constraints that are extended also to the interaction with data.
The proposed theory is cast into the time domain, in which data are presented to the network in an ordered manner.
arXiv Detail & Related papers (2020-09-01T09:07:25Z) - Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective to represent a network into a complete graph for analysis.
By assigning learnable parameters to the edges which reflect the magnitude of connections, the learning process can be performed in a differentiable manner.
This learning process is compatible with existing networks and owns adaptability to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z) - Generalization bound of globally optimal non-convex neural network
training: Transportation map estimation by infinite dimensional Langevin
dynamics [50.83356836818667]
We introduce a new theoretical framework to analyze deep learning optimization with connection to its generalization error.
Existing frameworks such as mean field theory and neural tangent kernel theory for neural network optimization analysis typically require taking limit of infinite width of the network to show its global convergence.
arXiv Detail & Related papers (2020-07-11T18:19:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.