Differentiable Segmentation of Sequences
- URL: http://arxiv.org/abs/2006.13105v2
- Date: Mon, 18 Jan 2021 11:11:04 GMT
- Title: Differentiable Segmentation of Sequences
- Authors: Erik Scharwächter and Jonathan Lennartz and Emmanuel Müller
- Abstract summary: We build on advances in learning continuous warping functions and propose a novel family of warping functions based on the two-sided power (TSP) distribution.
Our formulation includes the important class of segmented generalized linear models as a special case.
We use our approach to model the spread of COVID-19 with Poisson regression, apply it on a change point detection task, and learn classification models with concept drift.
- Score: 2.1485350418225244
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Segmented models are widely used to describe non-stationary sequential data
with discrete change points. Their estimation usually requires solving a mixed
discrete-continuous optimization problem, where the segmentation is the
discrete part and all other model parameters are continuous. A number of
estimation algorithms have been developed that are highly specialized for their
specific model assumptions. The dependence on non-standard algorithms makes it
hard to integrate segmented models in state-of-the-art deep learning
architectures that critically depend on gradient-based optimization techniques.
In this work, we formulate a relaxed variant of segmented models that enables
joint estimation of all model parameters, including the segmentation, with
gradient descent. We build on recent advances in learning continuous warping
functions and propose a novel family of warping functions based on the
two-sided power (TSP) distribution. TSP-based warping functions are
differentiable, have simple closed-form expressions, and can represent
segmentation functions exactly. Our formulation includes the important class of
segmented generalized linear models as a special case, which makes it highly
versatile. We use our approach to model the spread of COVID-19 with Poisson
regression, apply it on a change point detection task, and learn classification
models with concept drift. The experiments show that our approach effectively
learns all these tasks with standard algorithms for gradient descent.
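To make the construction concrete, the following is a minimal PyTorch sketch (hypothetical names, not the authors' code) of a TSP-based warping function used as a soft segment indicator in a two-segment Poisson regression. On [0, 1], the TSP CDF with mode m and shape n has a simple closed form: it reduces to the identity warp at n = 1 and approaches a hard step at m as n grows, which is how such warps can represent segmentations exactly. For simplicity the shape is held fixed here, and only the change point and segment log-rates are learned by gradient descent.

```python
import torch

def tsp_cdf(t, m, n):
    # Closed-form CDF of the two-sided power (TSP) distribution on [0, 1]
    # with mode m and shape n.  n = 1 gives the identity warp; as n grows,
    # the CDF tends to a hard step at m, i.e. an exact segmentation.
    left = m * (t / m).clamp(0.0, 1.0) ** n
    right = 1.0 - (1.0 - m) * ((1.0 - t) / (1.0 - m)).clamp(0.0, 1.0) ** n
    return torch.where(t <= m, left, right)

# Toy data: Poisson counts whose rate jumps at t = 0.6.
torch.manual_seed(0)
t = torch.linspace(0.0, 1.0, 200)
y = torch.poisson(torch.where(t < 0.6, torch.tensor(3.0), torch.tensor(12.0)))

eta = torch.zeros(2, requires_grad=True)    # log-rates of the two segments
m_raw = torch.zeros(1, requires_grad=True)  # unconstrained change point
opt = torch.optim.Adam([eta, m_raw], lr=0.05)
n_shape = 50.0                              # fixed sharpness of the soft step

for _ in range(500):
    m = torch.sigmoid(m_raw)                       # change point in (0, 1)
    w = tsp_cdf(t, m, n_shape)                     # soft segment membership
    log_rate = (1.0 - w) * eta[0] + w * eta[1]     # segmented GLM predictor
    loss = (log_rate.exp() - y * log_rate).mean()  # Poisson NLL (up to const.)
    opt.zero_grad(); loss.backward(); opt.step()

print(torch.sigmoid(m_raw).item())  # recovered change point, near 0.6
```

The point of the sketch is only that the whole objective is differentiable in all parameters, including the segmentation, so standard gradient-descent optimizers apply.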
Related papers
- Gradient Estimation and Variance Reduction in Stochastic and Deterministic Models [0.0]
This dissertation considers unconstrained, nonlinear optimization problems.
We focus on the gradient itself, that key quantity which enables the solution of such problems.
We present a new framework for calculating the gradient of problems involving both deterministic and stochastic elements.
arXiv Detail & Related papers (2024-05-14T14:41:58Z) - Scaling and renormalization in high-dimensional regression [72.59731158970894]
This paper presents a succinct derivation of the training and generalization performance of a variety of high-dimensional ridge regression models.
We provide an introduction and review of recent results on these topics, aimed at readers with backgrounds in physics and deep learning.
arXiv Detail & Related papers (2024-05-01T15:59:00Z) - Towards Better Certified Segmentation via Diffusion Models [62.21617614504225]
Segmentation models can be vulnerable to adversarial perturbations, which hinders their use in critical decision systems like healthcare or autonomous driving.
Recently, randomized smoothing has been proposed to certify segmentation predictions by adding Gaussian noise to the input to obtain theoretical guarantees.
In this paper, we address the problem of certifying segmentation prediction using a combination of randomized smoothing and diffusion models.
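For intuition, here is a minimal sketch of the plain randomized-smoothing vote for segmentation, i.e. a per-pixel majority over noisy copies, without the diffusion-model denoising step this paper adds; `model`, `sigma`, and `n_samples` are illustrative assumptions, not the paper's API.

```python
import torch
import torch.nn.functional as F

def smoothed_segmentation(model, x, num_classes, sigma=0.25, n_samples=100):
    # Monte-Carlo estimate of the smoothed segmenter: perturb the input with
    # i.i.d. Gaussian noise and take a per-pixel majority vote over the hard
    # predictions; certification bounds are derived from the vote margins.
    b, _, h, w = x.shape
    votes = torch.zeros(b, h, w, num_classes)
    for _ in range(n_samples):
        noisy = x + sigma * torch.randn_like(x)
        labels = model(noisy).argmax(dim=1)              # (B, H, W) labels
        votes += F.one_hot(labels, num_classes).float()  # accumulate votes
    return votes.argmax(dim=-1)                          # majority per pixel
```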
arXiv Detail & Related papers (2023-06-16T16:30:39Z) - Score-based Continuous-time Discrete Diffusion Models [102.65769839899315]
We extend diffusion models to discrete variables by introducing a Markov jump process where the reverse process denoises via a continuous-time Markov chain.
We show that an unbiased estimator can be obtained by simply matching the conditional marginal distributions.
We demonstrate the effectiveness of the proposed method on a set of synthetic and real-world music and image benchmarks.
arXiv Detail & Related papers (2022-11-30T05:33:29Z) - Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation, so it can be integrated seamlessly with neural networks.
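To illustrate the autodiff point, a plain (non-randomized) linear-chain DP can be written directly in PyTorch so that gradients of the log-partition function come for free; as a rough sketch, the RDP idea replaces the full sums over states below with sums over sampled subsets. Names here are illustrative, not the paper's API.

```python
import torch

def chain_log_partition(unary, trans):
    # Forward-algorithm DP for a linear-chain model.  unary: (T, S) per-step
    # scores; trans: (S, S) transition scores.  Only differentiable ops are
    # used, so autodiff yields exact gradients (the marginals) of log Z.
    alpha = unary[0]
    for step_scores in unary[1:]:
        alpha = step_scores + torch.logsumexp(alpha.unsqueeze(1) + trans, dim=0)
    return torch.logsumexp(alpha, dim=0)

unary = torch.randn(20, 8, requires_grad=True)
trans = torch.randn(8, 8, requires_grad=True)
chain_log_partition(unary, trans).backward()  # unary.grad holds the marginals
```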
arXiv Detail & Related papers (2021-12-07T11:26:41Z) - Scalable mixed-domain Gaussian process modeling and model reduction for longitudinal data [5.00301731167245]
We derive a basis function approximation scheme for mixed-domain covariance functions.
We show that we can approximate the exact GP model accurately in a fraction of the runtime.
We also demonstrate a scalable model reduction workflow for obtaining smaller and more interpretable models.
arXiv Detail & Related papers (2021-11-03T04:47:37Z) - Differentiable Spline Approximations [48.10988598845873]
Differentiable programming has significantly enhanced the scope of machine learning.
Standard differentiable programming methods (such as autodiff) typically require that the machine learning models be differentiable.
We show that leveraging this redesigned Jacobian in the form of a differentiable "layer" in predictive models leads to improved performance in diverse applications.
arXiv Detail & Related papers (2021-10-04T16:04:46Z) - Kernel Clustering with Sigmoid-based Regularization for Efficient Segmentation of Sequential Data [3.8326963933937885]
Sequence segmentation aims at partitioning a data sequence into several non-overlapping segments that may have nonlinear and complex structures.
A popular method for optimally solving this problem is dynamic programming (DP), which has quadratic computation and memory requirements.
Although many algorithms have been proposed to approximate the optimal segmentation, they have no guarantee on the quality of their solutions.
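For reference, the exact quadratic-cost DP that such methods approximate looks roughly as follows for a simple least-squares segmentation cost (a generic sketch, not the paper's kernel-based variant):

```python
import numpy as np

def dp_segmentation(x, k):
    # Exact O(k * T^2) least-squares segmentation of x into k segments.
    T = len(x)
    s1 = np.concatenate(([0.0], np.cumsum(x)))
    s2 = np.concatenate(([0.0], np.cumsum(x ** 2)))

    def sse(i, j):  # squared error of the best constant fit to x[i:j]
        n, s = j - i, s1[j] - s1[i]
        return s2[j] - s2[i] - s * s / n

    cost = np.full((k + 1, T + 1), np.inf)
    best = np.zeros((k + 1, T + 1), dtype=int)
    cost[0, 0] = 0.0
    for seg in range(1, k + 1):
        for j in range(seg, T + 1):
            for i in range(seg - 1, j):  # quadratic loop over split points
                c = cost[seg - 1, i] + sse(i, j)
                if c < cost[seg, j]:
                    cost[seg, j], best[seg, j] = c, i
    bounds, j = [], T                    # backtrack the segment start indices
    for seg in range(k, 0, -1):
        j = best[seg, j]
        bounds.append(j)
    return sorted(bounds)[1:]            # change points (drop the leading 0)

x = np.r_[np.zeros(50), np.ones(50)] + 0.1 * np.random.randn(100)
print(dp_segmentation(x, 2))             # approximately [50]
```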
arXiv Detail & Related papers (2021-06-22T04:32:21Z) - Gaussian Process Latent Class Choice Models [7.992550355579791]
We present a non-parametric class of probabilistic machine learning models within discrete choice models (DCMs).
The proposed model would assign individuals probabilistically to behaviorally homogeneous clusters (latent classes) using GPs.
The model is tested on two different mode choice applications and compared against different LCCM benchmarks.
arXiv Detail & Related papers (2021-01-28T19:56:42Z) - Probabilistic Circuits for Variational Inference in Discrete Graphical Models [101.28528515775842]
Inference in discrete graphical models with variational methods is difficult.
Many sampling-based methods have been proposed for estimating the Evidence Lower Bound (ELBO).
We propose a new approach that leverages the tractability of probabilistic circuit models, such as Sum Product Networks (SPNs).
We show that selective-SPNs are suitable as an expressive variational distribution, and prove that when the log-density of the target model is a polynomial, the corresponding ELBO can be computed analytically.
arXiv Detail & Related papers (2020-10-22T05:04:38Z)