Related papers: Deep Learning with Nonsmooth Objectives

Deep Learning with Nonsmooth Objectives

URL: http://arxiv.org/abs/2107.08800v1
Date: Wed, 14 Jul 2021 02:01:53 GMT
Title: Deep Learning with Nonsmooth Objectives
Authors: Vinesha Peiris, Nadezda Sukhorukova, Vera Roshchina
Abstract summary: We explore the potential for using a nonsmooth loss function based on the max-norm in the training of an artificial neural network. We hypothesise that this may lead to superior classification results in some special cases where the training data is either very small or unbalanced.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We explore the potential for using a nonsmooth loss function based on the max-norm in the training of an artificial neural network. We hypothesise that this may lead to superior classification results in some special cases where the training data is either very small or unbalanced. Our numerical experiments performed on a simple artificial neural network with no hidden layers (a setting immediately amenable to standard nonsmooth optimisation techniques) appear to confirm our hypothesis that uniform approximation based approaches may be more suitable for the datasets with reliable training data that either is limited size or biased in terms of relative cluster sizes.

Related papers

Private Training & Data Generation by Clustering Embeddings [74.00687214400021]
Differential privacy (DP) provides a robust framework for protecting individual data.<n>We introduce a novel principled method for DP synthetic image embedding generation.<n> Empirically, a simple two-layer neural network trained on synthetically generated embeddings achieves state-of-the-art (SOTA) classification accuracy.
arXiv Detail & Related papers (2025-06-20T00:17:14Z)
Just How Flexible are Neural Networks in Practice? [89.80474583606242]
It is widely believed that a neural network can fit a training set containing at least as many samples as it has parameters. In practice, however, we only find solutions via our training procedure, including the gradient and regularizers, limiting flexibility.
arXiv Detail & Related papers (2024-06-17T12:24:45Z)
An unfolding method based on conditional Invertible Neural Networks (cINN) using iterative training [0.0]
Generative networks like invertible neural networks(INN) enable a probabilistic unfolding. We introduce the iterative conditional INN(IcINN) for unfolding that adjusts for deviations between simulated training samples and data.
arXiv Detail & Related papers (2022-12-16T19:00:05Z)
Precision Machine Learning [5.15188009671301]
We compare various function approximation methods and study how they scale with increasing parameters and data. We find that neural networks can often outperform classical approximation methods on high-dimensional examples. We develop training tricks which enable us to train neural networks to extremely low loss, close to the limits allowed by numerical precision.
arXiv Detail & Related papers (2022-10-24T17:58:30Z)
Prototype-Anchored Learning for Learning with Imperfect Annotations [83.7763875464011]
It is challenging to learn unbiased classification models from imperfectly annotated datasets. We propose a prototype-anchored learning (PAL) method, which can be easily incorporated into various learning-based classification schemes. We verify the effectiveness of PAL on class-imbalanced learning and noise-tolerant learning by extensive experiments on synthetic and real-world datasets.
arXiv Detail & Related papers (2022-06-23T10:25:37Z)
Likelihood-Free Inference with Generative Neural Networks via Scoring Rule Minimization [0.0]
Inference methods yield posterior approximations for simulator models with intractable likelihood. Many works trained neural networks to approximate either the intractable likelihood or the posterior directly. Here, we propose to approximate the posterior with generative networks trained by Scoring Rule minimization.
arXiv Detail & Related papers (2022-05-31T13:32:55Z)
Benign Overfitting without Linearity: Neural Network Classifiers Trained by Gradient Descent for Noisy Linear Data [44.431266188350655]
We consider the generalization error of two-layer neural networks trained to generalize by gradient descent. We show that neural networks exhibit benign overfitting: they can be driven to zero training error, perfectly fitting any noisy training labels, and simultaneously achieve minimax optimal test error. In contrast to previous work on benign overfitting that require linear or kernel-based predictors, our analysis holds in a setting where both the model and learning dynamics are fundamentally nonlinear.
arXiv Detail & Related papers (2022-02-11T23:04:00Z)
Towards an Understanding of Benign Overfitting in Neural Networks [104.2956323934544]
Modern machine learning models often employ a huge number of parameters and are typically optimized to have zero training loss. We examine how these benign overfitting phenomena occur in a two-layer neural network setting. We show that it is possible for the two-layer ReLU network interpolator to achieve a near minimax-optimal learning rate.
arXiv Detail & Related papers (2021-06-06T19:08:53Z)
Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning [78.83598532168256]
Marginal-likelihood based model-selection is rarely used in deep learning due to estimation difficulties. Our work shows that marginal likelihoods can improve generalization and be useful when validation data is unavailable.
arXiv Detail & Related papers (2021-04-11T09:50:24Z)
Convergence rates for gradient descent in the training of overparameterized artificial neural networks with biases [3.198144010381572]
In recent years, artificial neural networks have developed into a powerful tool for dealing with a multitude of problems for which classical solution approaches. It is still unclear why randomly gradient descent algorithms reach their limits.
arXiv Detail & Related papers (2021-02-23T18:17:47Z)
Attribute-Guided Adversarial Training for Robustness to Natural Perturbations [64.35805267250682]
We propose an adversarial training approach which learns to generate new samples so as to maximize exposure of the classifier to the attributes-space. Our approach enables deep neural networks to be robust against a wide range of naturally occurring perturbations.
arXiv Detail & Related papers (2020-12-03T10:17:30Z)
Deep Magnification-Flexible Upsampling over 3D Point Clouds [103.09504572409449]
We propose a novel end-to-end learning-based framework to generate dense point clouds. We first formulate the problem explicitly, which boils down to determining the weights and high-order approximation errors. Then, we design a lightweight neural network to adaptively learn unified and sorted weights as well as the high-order refinements.
arXiv Detail & Related papers (2020-11-25T14:00:18Z)

This list is automatically generated from the titles and abstracts of the papers in this site.