Sequential Density Ratio Estimation for Simultaneous Optimization of
Speed and Accuracy
- URL: http://arxiv.org/abs/2006.05587v3
- Date: Sat, 6 Feb 2021 06:30:42 GMT
- Title: Sequential Density Ratio Estimation for Simultaneous Optimization of
Speed and Accuracy
- Authors: Akinori F. Ebihara, Taiki Miyagawa, Kazuyuki Sakurai, Hitoshi Imaoka
- Abstract summary: We propose the SPRT-TANDEM, a deep neural network-based SPRT algorithm that overcomes the above two obstacles.
In tests on one original and two public video databases, the SPRT-TANDEM achieves statistically significantly better classification accuracy than other baselines.
- Score: 11.470070927586017
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Classifying sequential data as early and as accurately as possible is a
challenging yet critical problem, especially when a sampling cost is high. One
algorithm that achieves this goal is the sequential probability ratio test
(SPRT), which is known as Bayes-optimal: it can keep the expected number of
data samples as small as possible, given the desired error upper-bound.
However, the original SPRT makes two critical assumptions that limit its
application in real-world scenarios: (i) samples are independently and
identically distributed, and (ii) the likelihood of the data being derived from
each class can be calculated precisely. Here, we propose the SPRT-TANDEM, a
deep neural network-based SPRT algorithm that overcomes the above two
obstacles. The SPRT-TANDEM sequentially estimates the log-likelihood ratio of
two alternative hypotheses by leveraging a novel Loss function for
Log-Likelihood Ratio estimation (LLLR) while allowing correlations up to $N
(\in \mathbb{N})$ preceding samples. In tests on one original and two public
video databases, Nosaic MNIST, UCF101, and SiW, the SPRT-TANDEM achieves
statistically significantly better classification accuracy than other baseline
classifiers, with a smaller number of data samples. The code and Nosaic MNIST
are publicly available at https://github.com/TaikiMiyagawa/SPRT-TANDEM.
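To make the mechanism concrete, here is a minimal sketch of the classical i.i.d. SPRT that the paper generalizes: per-sample log-likelihood ratios are accumulated and the test stops as soon as either of Wald's thresholds (derived from the desired error bounds) is crossed. This is an illustration only, not the SPRT-TANDEM implementation; the equal-variance Gaussian hypotheses and the parameter values are assumptions made for the example.

```python
import math

def sprt_gaussian(samples, mu0=0.0, mu1=1.0, sigma=1.0,
                  alpha=0.05, beta=0.05):
    """Classical i.i.d. SPRT for H1: N(mu1, sigma^2) vs. H0: N(mu0, sigma^2).

    Wald's thresholds: upper = log((1 - beta) / alpha),
                       lower = log(beta / (1 - alpha)).
    Returns (decision, samples_used); decision is None if the data
    run out before either threshold is crossed.
    """
    upper = math.log((1 - beta) / alpha)   # cross above: accept H1
    lower = math.log(beta / (1 - alpha))   # cross below: accept H0
    llr = 0.0
    for n, x in enumerate(samples, start=1):
        # Per-sample log-likelihood ratio log p1(x)/p0(x)
        # for equal-variance Gaussians.
        llr += ((x - mu0) ** 2 - (x - mu1) ** 2) / (2 * sigma ** 2)
        if llr >= upper:
            return "H1", n
        if llr <= lower:
            return "H0", n
    return None, len(samples)
```

The SPRT-TANDEM replaces the i.i.d. per-sample ratio above with a network-estimated log-likelihood ratio that may depend on up to $N$ preceding samples; the threshold-crossing stopping rule is what both share.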
Related papers
- Statistical Properties of Deep Neural Networks with Dependent Data [0.0]
This paper establishes statistical properties of deep neural network (DNN) estimators under dependent data.
The framework provided also offers potential for research into other DNN architectures and time-series applications.
arXiv Detail & Related papers (2024-10-14T21:46:57Z)
- Flexible Heteroscedastic Count Regression with Deep Double Poisson Networks [4.58556584533865]
We propose the Deep Double Poisson Network (DDPN) to produce accurate, input-conditional uncertainty representations.
DDPN vastly outperforms existing discrete models.
It can be applied to a variety of count regression datasets.
arXiv Detail & Related papers (2024-06-13T16:02:03Z)
- Probabilistic Contrastive Learning for Long-Tailed Visual Recognition [78.70453964041718]
Long-tailed distributions frequently emerge in real-world data, where a large number of minority categories contain a limited number of samples.
Recent investigations have revealed that supervised contrastive learning exhibits promising potential in alleviating the data imbalance.
We propose a novel probabilistic contrastive (ProCo) learning algorithm that estimates the data distribution of the samples from each class in the feature space.
arXiv Detail & Related papers (2024-03-11T13:44:49Z)
- Favour: FAst Variance Operator for Uncertainty Rating [0.034530027457862]
Bayesian Neural Networks (BNN) have emerged as a crucial approach for interpreting ML predictions.
By sampling from the posterior distribution, data scientists may estimate the uncertainty of an inference.
Previous work proposed propagating the first and second moments of the posterior directly through the network.
This method is even slower than sampling, so the propagated variance needs to be approximated.
Our contribution is a more principled variance propagation framework.
arXiv Detail & Related papers (2023-11-21T22:53:20Z)
- A Specialized Semismooth Newton Method for Kernel-Based Optimal Transport [92.96250725599958]
Kernel-based optimal transport (OT) estimators offer an alternative, functional estimation procedure to address OT problems from samples.
We show that our SSN method achieves a global convergence rate of $O(1/\sqrt{k})$, and a local quadratic convergence rate under standard regularity conditions.
arXiv Detail & Related papers (2023-10-21T18:48:45Z)
- Toward Asymptotic Optimality: Sequential Unsupervised Regression of Density Ratio for Early Classification [11.470070927586017]
Theoretically-inspired sequential density ratio estimation (SDRE) algorithms are proposed for the early classification of time series.
Two novel SPRT-based algorithms, B2Bsqrt-TANDEM and TANDEMformer, are designed to avoid the overnormalization problem for precise unsupervised regression of SDRs.
arXiv Detail & Related papers (2023-02-20T07:20:53Z)
- Data Subsampling for Bayesian Neural Networks [0.0]
Penalty Bayesian Neural Networks (PBNNs) are a new algorithm that allows evaluation of the likelihood using subsampled batch data.
We show that PBNN achieves good predictive performance even for small mini-batch sizes of data.
arXiv Detail & Related papers (2022-10-17T14:43:35Z)
- Adaptive Sketches for Robust Regression with Importance Sampling [64.75899469557272]
We introduce data structures for solving robust regression through stochastic gradient descent (SGD).
Our algorithm effectively runs $T$ steps of SGD with importance sampling while using sublinear space and making just a single pass over the data.
arXiv Detail & Related papers (2022-07-16T03:09:30Z)
- Transformers Can Do Bayesian Inference [56.99390658880008]
We present Prior-Data Fitted Networks (PFNs).
PFNs leverage in-context learning in large-scale machine learning models to approximate a large set of posteriors.
We demonstrate that PFNs can near-perfectly mimic Gaussian processes and also enable efficient Bayesian inference for intractable problems.
arXiv Detail & Related papers (2021-12-20T13:07:39Z)
- Improved, Deterministic Smoothing for L1 Certified Robustness [119.86676998327864]
We propose a non-additive and deterministic smoothing method, Deterministic Smoothing with Splitting Noise (DSSN).
In contrast to uniform additive smoothing, the DSSN certification does not require the random noise components used to be independent.
This is the first work to provide deterministic "randomized smoothing" for a norm-based adversarial threat model.
arXiv Detail & Related papers (2021-03-17T21:49:53Z)
- Breaking the Sample Size Barrier in Model-Based Reinforcement Learning with a Generative Model [50.38446482252857]
This paper is concerned with the sample efficiency of reinforcement learning, assuming access to a generative model (or simulator).
We first consider $\gamma$-discounted infinite-horizon Markov decision processes (MDPs) with state space $\mathcal{S}$ and action space $\mathcal{A}$.
We prove that a plain model-based planning algorithm suffices to achieve minimax-optimal sample complexity given any target accuracy level.
arXiv Detail & Related papers (2020-05-26T17:53:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.