Sequential Density Ratio Estimation for Simultaneous Optimization of
Speed and Accuracy
- URL: http://arxiv.org/abs/2006.05587v3
- Date: Sat, 6 Feb 2021 06:30:42 GMT
- Title: Sequential Density Ratio Estimation for Simultaneous Optimization of
Speed and Accuracy
- Authors: Akinori F. Ebihara, Taiki Miyagawa, Kazuyuki Sakurai, Hitoshi Imaoka
- Abstract summary: We propose the SPRT-TANDEM, a deep neural network-based SPRT algorithm that overcomes the above two obstacles.
In tests on one original and two public video databases, the SPRT-TANDEM achieves statistically significantly better classification accuracy than other baselines.
- Score: 11.470070927586017
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Classifying sequential data as early and as accurately as possible is a
challenging yet critical problem, especially when a sampling cost is high. One
algorithm that achieves this goal is the sequential probability ratio test
(SPRT), which is known as Bayes-optimal: it can keep the expected number of
data samples as small as possible, given the desired error upper-bound.
However, the original SPRT makes two critical assumptions that limit its
application in real-world scenarios: (i) samples are independently and
identically distributed, and (ii) the likelihood of the data being derived from
each class can be calculated precisely. Here, we propose the SPRT-TANDEM, a
deep neural network-based SPRT algorithm that overcomes the above two
obstacles. The SPRT-TANDEM sequentially estimates the log-likelihood ratio of
two alternative hypotheses by leveraging a novel Loss function for
Log-Likelihood Ratio estimation (LLLR) while allowing correlations up to $N
(\in \mathbb{N})$ preceding samples. In tests on one original and two public
video databases, Nosaic MNIST, UCF101, and SiW, the SPRT-TANDEM achieves
statistically significantly better classification accuracy than other baseline
classifiers, with a smaller number of data samples. The code and Nosaic MNIST
are publicly available at https://github.com/TaikiMiyagawa/SPRT-TANDEM.
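As a concrete illustration of the decision rule the abstract builds on, here is a minimal sketch of Wald's classic SPRT for i.i.d. samples. This is not the authors' SPRT-TANDEM implementation; the `sprt` and `coin_log_lr` names are hypothetical, and SPRT-TANDEM replaces the exact per-sample log-likelihood ratio below with a neural-network estimate that may depend on up to $N$ preceding samples.

```python
import math

def sprt(samples, log_lr, alpha=0.05, beta=0.05):
    """Classic SPRT: accumulate per-sample log-likelihood ratios and stop
    as soon as the running sum crosses either decision threshold.

    alpha: tolerated false-positive rate; beta: tolerated false-negative rate.
    Thresholds follow Wald's approximations A = log((1-beta)/alpha) and
    B = log(beta/(1-alpha)).
    """
    upper = math.log((1 - beta) / alpha)   # accept H1 when crossed
    lower = math.log(beta / (1 - alpha))   # accept H0 when crossed
    llr = 0.0
    for t, x in enumerate(samples, start=1):
        llr += log_lr(x)
        if llr >= upper:
            return "H1", t
        if llr <= lower:
            return "H0", t
    # Ran out of data without a decision: fall back to the sign of the LLR.
    return ("H1" if llr > 0 else "H0"), len(samples)

# Toy i.i.d. example: H1 says a coin lands heads with p=0.7, H0 says p=0.3.
def coin_log_lr(x, p1=0.7, p0=0.3):
    p_h1 = p1 if x == 1 else 1 - p1
    p_h0 = p0 if x == 1 else 1 - p0
    return math.log(p_h1 / p_h0)

decision, n_used = sprt([1, 1, 1, 1, 1], coin_log_lr)  # decides "H1" after 4 samples
```

The earlier the running LLR crosses a threshold, the fewer samples are consumed while the thresholds keep the error rates bounded, which is exactly the speed-accuracy trade-off the paper optimizes.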
Related papers
- Estimating the Robustness Radius for Randomized Smoothing with 100$\times$ Sample Efficiency [6.199300239433395]
This work demonstrates that reducing the number of samples by one or two orders of magnitude can still enable the computation of a slightly smaller robustness radius.
We provide the mathematical foundation for explaining the phenomenon while experimentally showing promising results on the standard CIFAR-10 and ImageNet datasets.
arXiv Detail & Related papers (2024-04-26T12:43:19Z)
- Probabilistic Contrastive Learning for Long-Tailed Visual Recognition [78.70453964041718]
Long-tailed distributions frequently emerge in real-world data, where a large number of minority categories contain a limited number of samples.
Recent investigations have revealed that supervised contrastive learning exhibits promising potential in alleviating the data imbalance.
We propose a novel probabilistic contrastive (ProCo) learning algorithm that estimates the data distribution of the samples from each class in the feature space.
arXiv Detail & Related papers (2024-03-11T13:44:49Z)
- Favour: FAst Variance Operator for Uncertainty Rating [0.034530027457862]
Bayesian Neural Networks (BNN) have emerged as a crucial approach for interpreting ML predictions.
By sampling from the posterior distribution, data scientists may estimate the uncertainty of an inference.
Previous work proposed propagating the first and second moments of the posterior directly through the network.
This method is even slower than sampling, so the propagated variance needs to be approximated.
Our contribution is a more principled variance propagation framework.
arXiv Detail & Related papers (2023-11-21T22:53:20Z)
- A Specialized Semismooth Newton Method for Kernel-Based Optimal Transport [92.96250725599958]
Kernel-based optimal transport (OT) estimators offer an alternative, functional estimation procedure to address OT problems from samples.
We show that our SSN method achieves a global convergence rate of $O(1/\sqrt{k})$, and a local quadratic convergence rate under standard regularity conditions.
arXiv Detail & Related papers (2023-10-21T18:48:45Z)
- Toward Asymptotic Optimality: Sequential Unsupervised Regression of Density Ratio for Early Classification [11.470070927586017]
Theoretically-inspired sequential density ratio estimation (SDRE) algorithms are proposed for the early classification of time series.
Two novel SPRT-based algorithms, B2Bsqrt-TANDEM and TANDEMformer, are designed to avoid the overnormalization problem for precise unsupervised regression of SDRs.
arXiv Detail & Related papers (2023-02-20T07:20:53Z)
- Adaptive Sketches for Robust Regression with Importance Sampling [64.75899469557272]
We introduce data structures for solving robust regression through stochastic gradient descent (SGD).
Our algorithm effectively runs $T$ steps of SGD with importance sampling while using sublinear space and just making a single pass over the data.
arXiv Detail & Related papers (2022-07-16T03:09:30Z)
- Transformers Can Do Bayesian Inference [28.936428431504165]
We present Prior-Data Fitted Networks (PFNs).
PFNs leverage large-scale machine learning techniques to approximate a large set of posteriors.
We demonstrate that PFNs can near-perfectly mimic Gaussian processes.
arXiv Detail & Related papers (2021-12-20T13:07:39Z)
- Noise-Resistant Deep Metric Learning with Probabilistic Instance Filtering [59.286567680389766]
Noisy labels are commonly found in real-world data, which cause performance degradation of deep neural networks.
We propose Probabilistic Ranking-based Instance Selection with Memory (PRISM) approach for DML.
PRISM calculates the probability of a label being clean, and filters out potentially noisy samples.
arXiv Detail & Related papers (2021-08-03T12:15:25Z)
- The Power of Log-Sum-Exp: Sequential Density Ratio Matrix Estimation for Speed-Accuracy Optimization [0.0]
We propose a model for multiclass classification of time series to make a prediction as early and as accurate as possible.
Our overall architecture for early classification, MSPRT-TANDEM, statistically significantly outperforms baseline models on four datasets.
arXiv Detail & Related papers (2021-05-28T07:21:58Z)
- Improved, Deterministic Smoothing for L1 Certified Robustness [119.86676998327864]
We propose a non-additive and deterministic smoothing method, Deterministic Smoothing with Splitting Noise (DSSN).
In contrast to uniform additive smoothing, the SSN certification does not require the random noise components used to be independent.
This is the first work to provide deterministic "randomized smoothing" for a norm-based adversarial threat model.
arXiv Detail & Related papers (2021-03-17T21:49:53Z)
- Breaking the Sample Size Barrier in Model-Based Reinforcement Learning with a Generative Model [50.38446482252857]
This paper is concerned with the sample efficiency of reinforcement learning, assuming access to a generative model (or simulator).
We first consider $\gamma$-discounted infinite-horizon Markov decision processes (MDPs) with state space $\mathcal{S}$ and action space $\mathcal{A}$.
We prove that a plain model-based planning algorithm suffices to achieve minimax-optimal sample complexity given any target accuracy level.
arXiv Detail & Related papers (2020-05-26T17:53:18Z)
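One entry above, "The Power of Log-Sum-Exp," names the log-sum-exp function in its title. As background, here is a numerically stable implementation of that primitive; this is a generic sketch, not the paper's density ratio matrix estimator.

```python
import math

def log_sum_exp(xs):
    """Stable log(sum(exp(x) for x in xs)).

    Subtracting the maximum before exponentiating maps the largest term
    to exp(0) = 1, so nothing overflows even for large inputs.
    """
    m = max(xs)
    if math.isinf(m) and m < 0:    # all terms are -inf: the sum is 0
        return m
    return m + math.log(sum(math.exp(x - m) for x in xs))

# Naive evaluation of log(exp(1000) + exp(1000)) overflows to inf;
# the stable form returns 1000 + log(2) exactly as expected.
val = log_sum_exp([1000.0, 1000.0])
```

The max-shift trick is the standard way to combine log-domain likelihood terms, which is why it appears in sequential likelihood-ratio methods.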
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.