Real order total variation with applications to the loss functions in
learning schemes
- URL: http://arxiv.org/abs/2204.04582v1
- Date: Sun, 10 Apr 2022 02:44:04 GMT
- Title: Real order total variation with applications to the loss functions in
learning schemes
- Authors: Pan Liu, Xin Yang Lu, Kunlun He
- Abstract summary: We propose a loss function consisting of $r$-order (an)isotropic total variation semi-norms $TV^r$, $r \in \mathbb{R}^+$.
We focus on studying key theoretical properties of such loss functions, such as lower semi-continuity and compactness with respect to both the function and the order of derivative $r$.
- Score: 5.8868325478050165
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Loss functions are an essential part of modern data-driven approaches, such as
bi-level training schemes and machine learning. In this paper we propose a loss
function consisting of $r$-order (an)isotropic total variation semi-norms
$TV^r$, $r\in \mathbb{R}^+$, defined via the Riemann-Liouville (R-L) fractional
derivative. We focus on studying key theoretical properties of such loss
functions, such as lower semi-continuity and compactness with respect to both
the function and the order of derivative $r$.
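As a rough illustration of the objects involved, the order-$r$ Riemann-Liouville derivative can be approximated on a uniform grid with the Grünwald-Letnikov scheme, and the $TV^r$ semi-norm is then the (weighted) $L^1$ norm of that derivative. The Python sketch below is an illustrative discretization only, not the authors' own scheme; the function names and grid choices are assumptions:

```python
import numpy as np

def gl_fractional_derivative(f, r, h):
    """Grunwald-Letnikov approximation of the order-r
    Riemann-Liouville derivative of samples f on a uniform
    grid with spacing h (illustrative discretization)."""
    n = len(f)
    # Recurrence for the weights w_k = (-1)^k * binom(r, k).
    w = np.empty(n)
    w[0] = 1.0
    for k in range(1, n):
        w[k] = w[k - 1] * (k - 1 - r) / k
    d = np.zeros(n)
    for j in range(n):
        # Weighted sum over the history f[j], f[j-1], ..., f[0].
        d[j] = np.dot(w[: j + 1], f[j::-1]) / h**r
    return d

def tv_r(f, r, h):
    """Discrete r-order total variation semi-norm: the
    discrete L1 norm of the order-r derivative."""
    return np.sum(np.abs(gl_fractional_derivative(f, r, h))) * h
```

For $r = 1$ the weights reduce to the backward difference $(1, -1, 0, \dots)$, so $TV^1$ recovers the usual discrete total variation; for fractional $r$ the derivative at each point depends on the whole history of the function, which is characteristic of the Riemann-Liouville definition.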
Related papers
- $\alpha$-Divergence Loss Function for Neural Density Ratio Estimation [0.0]
An $\alpha$-divergence loss function ($\alpha$-Div) that offers concise implementation and stable optimization is proposed in this paper.
The stability of the proposed loss function is empirically demonstrated and the estimation accuracy of DRE tasks is investigated.
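For context on the divergence involved, the (Amari) $\alpha$-divergence between two discrete distributions has a direct closed form. The sketch below is an illustrative version of that formula only, not the sample-based $\alpha$-Div loss proposed in the paper:

```python
import numpy as np

def alpha_divergence(p, q, alpha):
    """Amari alpha-divergence between discrete distributions
    p and q, for alpha not in {0, 1}. As alpha approaches 1
    this recovers KL(p || q); as alpha approaches 0 it
    recovers KL(q || p)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return (1.0 - np.sum(p**alpha * q**(1.0 - alpha))) / (alpha * (1.0 - alpha))
```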
arXiv Detail & Related papers (2024-02-03T05:33:01Z) - Adam-like Algorithm with Smooth Clipping Attains Global Minima: Analysis
Based on Ergodicity of Functional SDEs [0.0]
We show that an Adam-type algorithm with smooth clipping attains the global minimizer of the regularized non-convex loss function.
We also apply the ergodic theory of functional SDEs to investigate approaches for learning the inverse temperature and the time.
arXiv Detail & Related papers (2023-11-29T14:38:59Z) - A Nearly Optimal and Low-Switching Algorithm for Reinforcement Learning
with General Function Approximation [66.26739783789387]
We propose a new algorithm, Monotonic Q-Learning with Upper Confidence Bound (MQL-UCB) for reinforcement learning.
MQL-UCB achieves minimax optimal regret of $\tilde{O}(d\sqrt{HK})$ when $K$ is sufficiently large and near-optimal policy switching cost.
Our work sheds light on designing provably sample-efficient and deployment-efficient Q-learning with nonlinear function approximation.
arXiv Detail & Related papers (2023-11-26T08:31:57Z) - A survey and taxonomy of loss functions in machine learning [60.41650195728953]
Most state-of-the-art machine learning techniques revolve around the optimisation of loss functions.
This survey aims to provide a reference of the most essential loss functions for both beginner and advanced machine learning practitioners.
arXiv Detail & Related papers (2023-01-13T14:38:24Z) - The Geometry and Calculus of Losses [10.451984251615512]
We develop the theory of loss functions for binary and multiclass classification and class probability estimation problems.
The perspective provides three novel opportunities.
It enables the development of a fundamental relationship between losses and (anti)-norms that appears to have not been noticed before.
Second, it enables the development of a calculus of losses induced by the calculus of convex sets.
Third, the perspective leads to a natural theory of "polar" loss functions, which are derived from the polar dual of the convex set defining the loss.
arXiv Detail & Related papers (2022-09-01T05:57:19Z) - The Computational Complexity of ReLU Network Training Parameterized by
Data Dimensionality [8.940054309023525]
We analyze the influence of the dimension $d$ of the training data on the computational complexity.
We prove that known brute-force strategies are essentially optimal.
In particular, we extend a known polynomial-time algorithm for constant $d$ and convex loss functions to a more general class of loss functions.
arXiv Detail & Related papers (2021-05-18T17:05:26Z) - Deep neural network approximation of analytic functions [91.3755431537592]
We derive an entropy bound for the spaces of neural networks with piecewise linear activation functions.
We derive an oracle inequality for the expected error of the considered penalized deep neural network estimators.
arXiv Detail & Related papers (2021-04-05T18:02:04Z) - On Function Approximation in Reinforcement Learning: Optimism in the
Face of Large State Spaces [208.67848059021915]
We study the exploration-exploitation tradeoff at the core of reinforcement learning.
In particular, we prove that the complexity of the function class $\mathcal{F}$ characterizes the complexity of the function.
Our regret bounds are independent of the number of episodes.
arXiv Detail & Related papers (2020-11-09T18:32:22Z) - $\sigma^2$R Loss: a Weighted Loss by Multiplicative Factors using
Sigmoidal Functions [0.9569316316728905]
We introduce a new loss function called squared reduction loss ($\sigma^2$R loss), which is regulated by a sigmoid function to inflate/deflate the error per instance.
Our loss has a clear intuition and geometric interpretation; we demonstrate the effectiveness of our proposal through experiments.
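A minimal sketch of the idea, assuming a sigmoid weight on the per-instance error; the steepness `k` and threshold `t` below are illustrative parameters, not the paper's exact formulation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigma2r_loss(y_true, y_pred, k=1.0, t=1.0):
    """Sigmoid-regulated squared loss in the spirit of the
    sigma^2 R loss: each instance's squared error is rescaled
    by a sigmoid factor of its magnitude, inflating errors
    above the threshold t and deflating errors below it."""
    err = y_pred - y_true
    weight = 2.0 * sigmoid(k * (np.abs(err) - t))
    return np.mean(weight * err**2)
```

The weight crosses 1 exactly at `|err| = t`, so instances with small residuals contribute less than under a plain squared loss, while large residuals are amplified.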
arXiv Detail & Related papers (2020-09-18T12:34:40Z) - Reinforcement Learning with General Value Function Approximation:
Provably Efficient Approach via Bounded Eluder Dimension [124.7752517531109]
We establish a provably efficient reinforcement learning algorithm with general value function approximation.
We show that our algorithm achieves a regret bound of $\widetilde{O}(\mathrm{poly}(dH)\sqrt{T})$ where $d$ is a complexity measure.
Our theory generalizes recent progress on RL with linear value function approximation and does not make explicit assumptions on the model of the environment.
arXiv Detail & Related papers (2020-05-21T17:36:09Z) - Automatic Differentiation in ROOT [62.997667081978825]
In mathematics and computer algebra, automatic differentiation (AD) is a set of techniques to evaluate the derivative of a function specified by a computer program.
This paper presents AD techniques available in ROOT, supported by Cling, to produce derivatives of arbitrary C/C++ functions.
arXiv Detail & Related papers (2020-04-09T09:18:50Z)
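The forward-mode flavor of AD can be illustrated with dual numbers: carrying a value and its derivative through every arithmetic operation yields exact derivatives, not finite-difference approximations. The Python sketch below conveys the general technique only; ROOT's Cling-based AD transforms C/C++ source and is not modeled here:

```python
import math

class Dual:
    """A dual number (val + dot * eps) for forward-mode AD:
    arithmetic on Dual propagates derivatives exactly."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot
    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val + o.val, self.dot + o.dot)
    __radd__ = __add__
    def __mul__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * o.val,
                    self.val * o.dot + self.dot * o.val)
    __rmul__ = __mul__

def d_sin(x):
    # Chain rule for a primitive: sin(u)' = cos(u) * u'
    return Dual(math.sin(x.val), math.cos(x.val) * x.dot)

def derivative(f, x):
    """Evaluate f and its exact derivative at x in one pass
    by seeding the derivative slot with 1."""
    return f(Dual(x, 1.0)).dot
```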
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.