All your loss are belong to Bayes
- URL: http://arxiv.org/abs/2006.04633v2
- Date: Thu, 5 Nov 2020 07:05:16 GMT
- Title: All your loss are belong to Bayes
- Authors: Christian Walder and Richard Nock
- Abstract summary: Loss functions are a cornerstone of machine learning and the starting point of most algorithms.
We introduce a trick on squared Gaussian Processes to obtain a random process whose paths are compliant source functions.
Experimental results demonstrate substantial improvements over the state of the art.
- Score: 28.393499629583786
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Loss functions are a cornerstone of machine learning and the starting point
of most algorithms. Statistics and Bayesian decision theory have contributed,
via properness, to elicit over the past decades a wide set of admissible losses
in supervised learning, to which most popular choices belong (logistic, square,
Matsushita, etc.). Rather than making a potentially biased ad hoc choice of the
loss, there has recently been a boost in efforts to fit the loss to the domain
at hand while training the model itself. The key approaches fit a canonical
link, a function which monotonically relates the closed unit interval to R and
can provide a proper loss via integration. In this paper, we rely on a broader
view of proper composite losses and a recent construct from information
geometry, source functions, whose fitting alleviates constraints faced by
canonical links. We introduce a trick on squared Gaussian Processes to obtain a
random process whose paths are compliant source functions with many desirable
properties in the context of link estimation. Experimental results demonstrate
substantial improvements over the state of the art.
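The squared-GP trick summarized above can be illustrated with a minimal numerical sketch. This is only an illustration of the underlying idea, not the authors' construction: the RBF kernel, grid, and length scale below are arbitrary choices. Squaring a Gaussian Process path gives a non-negative function, and the running integral of a non-negative function is monotone non-decreasing, i.e. the kind of compliant source/link function the paper fits.

```python
import numpy as np

rng = np.random.default_rng(0)

# Grid over which we build the random link/source function.
t = np.linspace(0.0, 1.0, 200)

# Sample one path of a Gaussian Process with an RBF kernel on the grid.
def rbf_kernel(x, y, length_scale=0.2):
    return np.exp(-0.5 * (x[:, None] - y[None, :]) ** 2 / length_scale**2)

K = rbf_kernel(t, t) + 1e-8 * np.eye(t.size)  # jitter for numerical stability
f = rng.multivariate_normal(np.zeros(t.size), K)

# Squaring the GP path makes it non-negative everywhere ...
g = f ** 2

# ... so its running integral is monotone non-decreasing by construction,
# a valid candidate link/source function.
dt = t[1] - t[0]
link = np.cumsum(g) * dt

assert np.all(np.diff(link) >= 0)
```

Because monotonicity holds for every sampled path, the construction yields a random process whose paths are all admissible, which is what makes it convenient for link estimation.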
Related papers
- Causal Feature Selection via Transfer Entropy [59.999594949050596]
Causal discovery aims to identify causal relationships between features with observational data.
We introduce a new causal feature selection approach that relies on the forward and backward feature selection procedures.
We provide theoretical guarantees on the regression and classification errors for both the exact and the finite-sample cases.
arXiv Detail & Related papers (2023-10-17T08:04:45Z) - Accelerated Neural Network Training with Rooted Logistic Objectives [13.400503928962756]
We derive a novel sequence of strictly convex functions that are at least as strictly convex as the logistic loss.
Our results illustrate that training with the rooted loss function converges faster and yields performance improvements.
arXiv Detail & Related papers (2023-10-05T20:49:48Z) - Robust Outlier Rejection for 3D Registration with Variational Bayes [70.98659381852787]
We develop a novel variational non-local network-based outlier rejection framework for robust alignment.
We propose a voting-based inlier searching strategy to cluster the high-quality hypothetical inliers for transformation estimation.
arXiv Detail & Related papers (2023-04-04T03:48:56Z) - A survey and taxonomy of loss functions in machine learning [60.41650195728953]
Most state-of-the-art machine learning techniques revolve around the optimisation of loss functions.
This survey aims to provide a reference of the most essential loss functions for both beginner and advanced machine learning practitioners.
arXiv Detail & Related papers (2023-01-13T14:38:24Z) - The Geometry and Calculus of Losses [10.451984251615512]
We develop the theory of loss functions for binary and multiclass classification and class probability estimation problems.
The perspective provides three novel opportunities.
First, it enables the development of a fundamental relationship between losses and (anti)-norms that appears not to have been noticed before.
Second, it enables the development of a calculus of losses induced by the calculus of convex sets.
Third, the perspective leads to a natural theory of "polar" loss functions, which are derived from the polar dual of the convex set defining the loss.
arXiv Detail & Related papers (2022-09-01T05:57:19Z) - Gleo-Det: Deep Convolution Feature-Guided Detector with Local Entropy
Optimization for Salient Points [5.955667705173262]
We propose to achieve fine constraint based on the requirement of repeatability while coarse constraint with guidance of deep convolution features.
With the guidance of convolution features, we define the cost function from both positive and negative sides.
arXiv Detail & Related papers (2022-04-27T12:40:21Z) - Simple Stochastic and Online Gradient DescentAlgorithms for Pairwise
Learning [65.54757265434465]
Pairwise learning refers to learning tasks where the loss function depends on a pair of instances.
Online gradient descent (OGD) is a popular approach to handle streaming data in pairwise learning.
In this paper, we propose simple stochastic and online gradient descent methods for pairwise learning.
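The pairwise-learning setting summarized above can be sketched with an AUC-style pairwise hinge loss trained by online gradient descent. This is an illustration of the setting on synthetic data, not the paper's specific algorithms; the class means, dimension, and learning rate are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)

# Pairwise (AUC-style) hinge loss: a linear scorer should rank positive
# examples above negative ones.  OGD processes one pos/neg pair per step.
d = 5
w = np.zeros(d)
lr = 0.05

pos = rng.normal(loc=+0.5, size=(1000, d))  # positive class
neg = rng.normal(loc=-0.5, size=(1000, d))  # negative class

for xp, xn in zip(pos, neg):
    margin = w @ (xp - xn)
    if margin < 1.0:          # hinge subgradient step on this pair
        w += lr * (xp - xn)

# Empirical AUC: fraction of all pos/neg pairs ranked correctly.
scores_p = pos @ w
scores_n = neg @ w
auc = np.mean(scores_p[:, None] > scores_n[None, :])
```

Note the loss is defined on pairs `(xp, xn)` rather than single instances, which is exactly what distinguishes pairwise learning from standard pointwise supervised learning.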
arXiv Detail & Related papers (2021-11-23T18:10:48Z) - Ensemble of Loss Functions to Improve Generalizability of Deep Metric
Learning methods [0.609170287691728]
We propose novel approaches to combine different losses built on top of a shared deep feature extractor.
We evaluate our methods on some popular datasets from the machine vision domain in conventional Zero-Shot-Learning (ZSL) settings.
arXiv Detail & Related papers (2021-07-02T15:19:46Z) - Approximation Schemes for ReLU Regression [80.33702497406632]
We consider the fundamental problem of ReLU regression.
The goal is to output the best fitting ReLU with respect to square loss, given draws from some unknown distribution.
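The problem setup above can be sketched with plain (sub)gradient descent on the square loss of a single ReLU unit over synthetic data. This is only a baseline illustration of the setup, not the paper's approximation schemes; the ground-truth parameters, learning rate, and iteration count are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data: labels from a ground-truth ReLU plus small noise.
w_true, b_true = 2.0, -0.5
X = rng.normal(size=(500, 1))
y = np.maximum(w_true * X[:, 0] + b_true, 0.0) + 0.05 * rng.normal(size=500)

# Plain gradient descent on the square loss of one ReLU unit.
w, b = 0.1, 0.0
lr = 0.1
for _ in range(2000):
    z = w * X[:, 0] + b
    pred = np.maximum(z, 0.0)
    grad_out = (pred - y) * (z > 0)       # subgradient through the ReLU
    w -= lr * np.mean(grad_out * X[:, 0])
    b -= lr * np.mean(grad_out)

mse = np.mean((np.maximum(w * X[:, 0] + b, 0.0) - y) ** 2)
```

On this easy instance gradient descent recovers the planted ReLU; the paper's contribution concerns guarantees for the general problem, where the square-loss landscape is non-convex.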
arXiv Detail & Related papers (2020-05-26T16:26:17Z) - A Comparison of Metric Learning Loss Functions for End-To-End Speaker
Verification [4.617249742207066]
We compare several metric learning loss functions in a systematic manner on the VoxCeleb dataset.
We show that the additive angular margin loss function outperforms all other loss functions in the study.
Based on a combination of SincNet trainable features and the x-vector architecture, the network used in this paper brings us a step closer to a truly end-to-end speaker verification system.
arXiv Detail & Related papers (2020-03-31T08:36:07Z) - Supervised Learning: No Loss No Cry [51.07683542418145]
Supervised learning requires the specification of a loss function to minimise.
This paper revisits the SLIsotron algorithm of Kakade et al. (2011) through a novel lens.
We show how it provides a principled procedure for learning the loss.
arXiv Detail & Related papers (2020-02-10T05:30:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.