A generalized quadratic loss for SVM and Deep Neural Networks
- URL: http://arxiv.org/abs/2102.07606v1
- Date: Mon, 15 Feb 2021 15:49:08 GMT
- Title: A generalized quadratic loss for SVM and Deep Neural Networks
- Authors: Filippo Portera
- Abstract summary: We consider some supervised binary classification tasks and a regression task, where SVM and Deep Learning currently exhibit the best generalization performances.
We extend the work of [3] on a generalized quadratic loss for learning problems, which exploits pattern correlations in order to concentrate the learning problem on input-space regions where patterns are more densely distributed.
- Score: 0.0
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: We consider some supervised binary classification tasks and a regression task, where SVM and Deep Learning currently exhibit the best generalization performances. We extend the work of [3] on a generalized quadratic loss for learning problems, which exploits pattern correlations in order to concentrate the learning problem on input-space regions where patterns are more densely distributed. From a shallow-methods point of view (e.g. SVM), since the mathematical derivation of problem (9) in [3] is incorrect, we restart from problem (8) in [3] and solve it with a procedure that iterates over the dual variables until the primal and dual objective functions converge. In addition, we propose another algorithm that attempts to solve the classification problem directly from the primal formulation. We also make use of Multiple Kernel Learning to improve generalization performance. Moreover, we introduce for the first time a custom loss that takes pattern correlations into consideration for both a shallow and a Deep Learning task. We propose some pattern selection criteria and report the results on 4 UCI data-sets for the SVM method. We also report results on a larger binary classification data-set based on Twitter, again drawn from UCI, combined with shallow Neural Networks, with and without the generalized quadratic loss. Finally, we test our loss with a Deep Neural Network on a larger regression task taken from UCI. We compare the results of our optimizers with the well-known solver SVMlight and with Keras multi-layer Neural Networks using both standard losses and a parameterized generalized quadratic loss, obtaining comparable results.
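The abstract does not state the loss explicitly; the sketch below illustrates one plausible reading, following the generalized quadratic loss of [3], in which the usual squared residuals are replaced by a correlation-weighted penalty of the form xi^T S xi, where xi are the residuals and S is a pattern-similarity matrix. Everything in the snippet (the RBF choice of S, the full-batch similarity matrix, and the names rbf_similarity, generalized_quadratic_loss, gamma) is an illustrative assumption, not the paper's actual implementation.

```python
# Hypothetical sketch of a correlation-weighted ("generalized") quadratic loss
# for a Keras regression model. The form xi^T S xi follows [3]; the RBF choice
# of S and all names here are illustrative assumptions, not the paper's code.

import numpy as np
import tensorflow as tf


def rbf_similarity(x, gamma=0.1):
    """S_ij = exp(-gamma * ||x_i - x_j||^2) over a batch of input patterns."""
    sq = tf.reduce_sum(tf.square(x), axis=1, keepdims=True)
    d2 = sq - 2.0 * tf.matmul(x, x, transpose_b=True) + tf.transpose(sq)
    return tf.exp(-gamma * d2)


def generalized_quadratic_loss(y_true, y_pred, s):
    """Return xi^T S xi / n with residuals xi = y_true - y_pred."""
    xi = tf.reshape(y_true - y_pred, [-1, 1])
    n = tf.cast(tf.shape(xi)[0], xi.dtype)
    return tf.squeeze(tf.matmul(xi, tf.matmul(s, xi), transpose_a=True)) / n


# Toy regression data; in practice S would be computed on each mini-batch.
x = np.random.randn(64, 8).astype("float32")
y = np.random.randn(64, 1).astype("float32")
x_t, y_t = tf.constant(x), tf.constant(y)
s = rbf_similarity(x_t)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1),
])
optimizer = tf.keras.optimizers.Adam(1e-3)

# Custom training loop, because the loss depends on the inputs through S.
for step in range(100):
    with tf.GradientTape() as tape:
        pred = model(x_t, training=True)
        loss = generalized_quadratic_loss(y_t, pred, s)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
```

Since an RBF similarity matrix is positive semi-definite, the penalty stays non-negative and reduces to the ordinary quadratic loss when S is the identity.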
Related papers
- What to Do When Your Discrete Optimization Is the Size of a Neural Network? [24.546550334179486]
Machine learning applications using neural networks involve solving discrete optimization problems.
Classical approaches used in discrete settings do not scale well to large neural networks.
We take continuation path (CP) methods as representative of the former and Monte Carlo (MC) methods as representative of the latter.
arXiv Detail & Related papers (2024-02-15T21:57:43Z)
- Learning To Dive In Branch And Bound [95.13209326119153]
We propose L2Dive to learn specific diving heuristics with graph neural networks.
We train generative models to predict variable assignments and leverage the duality of linear programs to make diving decisions.
arXiv Detail & Related papers (2023-01-24T12:01:45Z)
- Towards Better Out-of-Distribution Generalization of Neural Algorithmic Reasoning Tasks [51.8723187709964]
We study the OOD generalization of neural algorithmic reasoning tasks.
The goal is to learn an algorithm from input-output pairs using deep neural networks.
arXiv Detail & Related papers (2022-11-01T18:33:20Z)
- Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization [118.50301177912381]
We show that Adam can converge to different solutions of the objective with provably different errors, even with weight decay regularization.
We show that if the objective is convex and weight decay regularization is employed, any optimization algorithm, including Adam, will converge to the same solution.
arXiv Detail & Related papers (2021-08-25T17:58:21Z)
- Network Support for High-performance Distributed Machine Learning [17.919773898228716]
We propose a system model that captures both learning nodes (that perform computations) and information nodes (that provide data).
We then formulate the problem of selecting (i) which learning and information nodes should cooperate to complete the learning task, and (ii) the number of iterations to perform.
We devise an algorithm, named DoubleClimb, that can find a 1+1/|I|-competitive solution with cubic worst-case complexity.
arXiv Detail & Related papers (2021-02-05T19:38:57Z)
- Understanding Self-supervised Learning with Dual Deep Networks [74.92916579635336]
We propose a novel framework to understand contrastive self-supervised learning (SSL) methods that employ dual pairs of deep ReLU networks.
We prove that in each SGD update of SimCLR with various loss functions, the weights at each layer are updated by a covariance operator.
To further study what role the covariance operator plays and which features are learned in such a process, we model the data generation and augmentation processes through a hierarchical latent tree model (HLTM).
arXiv Detail & Related papers (2020-10-01T17:51:49Z)
- Fast Learning of Graph Neural Networks with Guaranteed Generalizability: One-hidden-layer Case [93.37576644429578]
Graph neural networks (GNNs) have made great progress recently on learning from graph-structured data in practice.
We provide a theoretically-grounded generalizability analysis of GNNs with one hidden layer for both regression and binary classification problems.
arXiv Detail & Related papers (2020-06-25T00:45:52Z)
- An Efficient Framework for Clustered Federated Learning [26.24231986590374]
We address the problem of federated learning (FL) where users are distributed into clusters.
We propose the Iterative Federated Clustering Algorithm (IFCA).
We show that our algorithm is efficient in non-convex problems such as neural networks.
arXiv Detail & Related papers (2020-06-07T08:48:59Z)
- Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks [50.42141893913188]
We study a distributed algorithm for large-scale AUC maximization with a deep neural network as the predictive model.
Our method requires a much smaller number of communication rounds in theory.
Our experiments on several datasets show the effectiveness of the method and also confirm the theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)