Related papers: On the regularized risk of distributionally robust learning over deep neural networks

On the regularized risk of distributionally robust learning over deep neural networks

URL: http://arxiv.org/abs/2109.06294v1
Date: Mon, 13 Sep 2021 20:10:39 GMT
Title: On the regularized risk of distributionally robust learning over deep neural networks
Authors: Camilo Garcia Trillos and Nicolas Garcia Trillos
Abstract summary: We study the relation between distributionally robust learning and different forms of regularization to enforce robustness of deep neural networks. We motivate a family of scalable algorithms for the training of robust neural networks.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In this paper we explore the relation between distributionally robust learning and different forms of regularization to enforce robustness of deep neural networks. In particular, starting from a concrete min-max distributionally robust problem, and using tools from optimal transport theory, we derive first order and second order approximations to the distributionally robust problem in terms of appropriate regularized risk minimization problems. In the context of deep ResNet models, we identify the structure of the resulting regularization problems as mean-field optimal control problems where the number and dimension of state variables is within a dimension-free factor of the dimension of the original unrobust problem. Using the Pontryagin maximum principles associated to these problems we motivate a family of scalable algorithms for the training of robust neural networks. Our analysis recovers some results and algorithms known in the literature (in settings explained throughout the paper) and provides many other theoretical and algorithmic insights that to our knowledge are novel. In our analysis we employ tools that we deem useful for a future analysis of more general adversarial learning problems.

Related papers

A simple algorithm for output range analysis for deep neural networks [0.0]
This paper presents a novel approach for the output range estimation problem in Deep Neural Networks (DNNs) by integrating a Simulated Annealing (SA) algorithm. The method effectively addresses the challenges by the lack of geometric information and non-linearity inherent inResNets.
arXiv Detail & Related papers (2024-07-02T22:47:40Z)
The Boundaries of Verifiable Accuracy, Robustness, and Generalisation in Deep Learning [71.14237199051276]
We consider classical distribution-agnostic framework and algorithms minimising empirical risks. We show that there is a large family of tasks for which computing and verifying ideal stable and accurate neural networks is extremely challenging.
arXiv Detail & Related papers (2023-09-13T16:33:27Z)
Approximation Power of Deep Neural Networks: an explanatory mathematical survey [0.0]
The goal of this survey is to present an explanatory review of the approximation properties of deep neural networks. We aim at understanding how and why deep neural networks outperform other classical linear and nonlinear approximation methods.
arXiv Detail & Related papers (2022-07-19T18:47:44Z)
Imbedding Deep Neural Networks [0.0]
Continuous depth neural networks, such as Neural ODEs, have refashioned the understanding of residual neural networks in terms of non-linear vector-valued optimal control problems. We propose a new approach which explicates the network's depth' as a fundamental variable, thus reducing the problem to a system of forward-facing initial value problems.
arXiv Detail & Related papers (2022-01-31T22:00:41Z)
Intrinsic Dimension, Persistent Homology and Generalization in Neural Networks [19.99615698375829]
We show that generalization error can be equivalently bounded in terms of a notion called the 'persistent homology dimension' (PHD) We develop an efficient algorithm to estimate PHD in the scale of modern deep neural networks. Our experiments show that the proposed approach can efficiently compute a network's intrinsic dimension in a variety of settings.
arXiv Detail & Related papers (2021-11-25T17:06:15Z)
Path Regularization: A Convexity and Sparsity Inducing Regularization for Parallel ReLU Networks [75.33431791218302]
We study the training problem of deep neural networks and introduce an analytic approach to unveil hidden convexity in the optimization landscape. We consider a deep parallel ReLU network architecture, which also includes standard deep networks and ResNets as its special cases.
arXiv Detail & Related papers (2021-10-18T18:00:36Z)
A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation. Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z)
Developing Constrained Neural Units Over Time [81.19349325749037]
This paper focuses on an alternative way of defining Neural Networks, that is different from the majority of existing approaches. The structure of the neural architecture is defined by means of a special class of constraints that are extended also to the interaction with data. The proposed theory is cast into the time domain, in which data are presented to the network in an ordered manner.
arXiv Detail & Related papers (2020-09-01T09:07:25Z)
Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective to represent a network into a complete graph for analysis. By assigning learnable parameters to the edges which reflect the magnitude of connections, the learning process can be performed in a differentiable manner. This learning process is compatible with existing networks and owns adaptability to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z)
Generalization bound of globally optimal non-convex neural network training: Transportation map estimation by infinite dimensional Langevin dynamics [50.83356836818667]
We introduce a new theoretical framework to analyze deep learning optimization with connection to its generalization error. Existing frameworks such as mean field theory and neural tangent kernel theory for neural network optimization analysis typically require taking limit of infinite width of the network to show its global convergence.
arXiv Detail & Related papers (2020-07-11T18:19:50Z)

This list is automatically generated from the titles and abstracts of the papers in this site.