Sequential Density Ratio Estimation for Simultaneous Optimization of
Speed and Accuracy
- URL: http://arxiv.org/abs/2006.05587v3
- Date: Sat, 6 Feb 2021 06:30:42 GMT
- Title: Sequential Density Ratio Estimation for Simultaneous Optimization of
Speed and Accuracy
- Authors: Akinori F. Ebihara, Taiki Miyagawa, Kazuyuki Sakurai, Hitoshi Imaoka
- Abstract summary: We propose the SPRT-TANDEM, a deep neural network-based SPRT algorithm that overcomes the above two obstacles.
In tests on one original and two public video databases, the SPRT-TANDEM achieves statistically significantly better classification accuracy than other baselines.
- Score: 11.470070927586017
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Classifying sequential data as early and as accurately as possible is a
challenging yet critical problem, especially when a sampling cost is high. One
algorithm that achieves this goal is the sequential probability ratio test
(SPRT), which is known as Bayes-optimal: it can keep the expected number of
data samples as small as possible, given the desired error upper-bound.
However, the original SPRT makes two critical assumptions that limit its
application in real-world scenarios: (i) samples are independently and
identically distributed, and (ii) the likelihood of the data being derived from
each class can be calculated precisely. Here, we propose the SPRT-TANDEM, a
deep neural network-based SPRT algorithm that overcomes the above two
obstacles. The SPRT-TANDEM sequentially estimates the log-likelihood ratio of
two alternative hypotheses by leveraging a novel Loss function for
Log-Likelihood Ratio estimation (LLLR) while allowing correlations up to $N
(\in \mathbb{N})$ preceding samples. In tests on one original and two public
video databases, Nosaic MNIST, UCF101, and SiW, the SPRT-TANDEM achieves
statistically significantly better classification accuracy than other baseline
classifiers, with a smaller number of data samples. The code and Nosaic MNIST
are publicly available at https://github.com/TaikiMiyagawa/SPRT-TANDEM.
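As a concrete illustration of the decision rule the abstract builds on, here is a minimal sketch of Wald's classic SPRT for i.i.d. samples. This is not the authors' SPRT-TANDEM implementation; the `sprt` and `coin_log_lr` names are hypothetical, and SPRT-TANDEM replaces the exact per-sample log-likelihood ratio below with a neural-network estimate that may depend on up to $N$ preceding samples.

```python
import math

def sprt(samples, log_lr, alpha=0.05, beta=0.05):
    """Classic SPRT: accumulate per-sample log-likelihood ratios and stop
    as soon as the running sum crosses either decision threshold.

    alpha: tolerated false-positive rate; beta: tolerated false-negative rate.
    Thresholds follow Wald's approximations A = log((1-beta)/alpha) and
    B = log(beta/(1-alpha)).
    """
    upper = math.log((1 - beta) / alpha)   # accept H1 when crossed
    lower = math.log(beta / (1 - alpha))   # accept H0 when crossed
    llr = 0.0
    for t, x in enumerate(samples, start=1):
        llr += log_lr(x)
        if llr >= upper:
            return "H1", t
        if llr <= lower:
            return "H0", t
    # Ran out of data without a decision: fall back to the sign of the LLR.
    return ("H1" if llr > 0 else "H0"), len(samples)

# Toy i.i.d. example: H1 says a coin lands heads with p=0.7, H0 says p=0.3.
def coin_log_lr(x, p1=0.7, p0=0.3):
    p_h1 = p1 if x == 1 else 1 - p1
    p_h0 = p0 if x == 1 else 1 - p0
    return math.log(p_h1 / p_h0)

decision, n_used = sprt([1, 1, 1, 1, 1], coin_log_lr)  # decides "H1" after 4 samples
```

The earlier the running LLR crosses a threshold, the fewer samples are consumed while the thresholds keep the error rates bounded, which is exactly the speed-accuracy trade-off the paper optimizes.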
Related papers
- Estimating the Robustness Radius for Randomized Smoothing with 100$\times$ Sample Efficiency [6.199300239433395]
This work demonstrates that reducing the number of samples by one or two orders of magnitude can still enable the computation of a slightly smaller robustness radius.
We provide the mathematical foundation for explaining the phenomenon while experimentally showing promising results on the standard CIFAR-10 and ImageNet datasets.
arXiv Detail & Related papers (2024-04-26T12:43:19Z)
- Probabilistic Contrastive Learning for Long-Tailed Visual Recognition [78.70453964041718]
Long-tailed distributions frequently emerge in real-world data, where a large number of minority categories contain a limited number of samples.
Recent investigations have revealed that supervised contrastive learning exhibits promising potential in alleviating the data imbalance.
We propose a novel probabilistic contrastive (ProCo) learning algorithm that estimates the data distribution of the samples from each class in the feature space.
arXiv Detail & Related papers (2024-03-11T13:44:49Z)
- Favour: FAst Variance Operator for Uncertainty Rating [0.034530027457862]
Bayesian Neural Networks (BNN) have emerged as a crucial approach for interpreting ML predictions.
By sampling from the posterior distribution, data scientists may estimate the uncertainty of an inference.
Previous work proposed propagating the first and second moments of the posterior directly through the network.
This method is even slower than sampling, so the propagated variance needs to be approximated.
Our contribution is a more principled variance propagation framework.
arXiv Detail & Related papers (2023-11-21T22:53:20Z)
- A Specialized Semismooth Newton Method for Kernel-Based Optimal Transport [92.96250725599958]
Kernel-based optimal transport (OT) estimators offer an alternative, functional estimation procedure to address OT problems from samples.
We show that our SSN method achieves a global convergence rate of $O(1/\sqrt{k})$, and a local quadratic convergence rate under standard regularity conditions.
arXiv Detail & Related papers (2023-10-21T18:48:45Z)
- Toward Asymptotic Optimality: Sequential Unsupervised Regression of Density Ratio for Early Classification [11.470070927586017]
Theoretically-inspired sequential density ratio estimation (SDRE) algorithms are proposed for the early classification of time series.
Two novel SPRT-based algorithms, B2Bsqrt-TANDEM and TANDEMformer, are designed to avoid the overnormalization problem for precise unsupervised regression of SDRs.
arXiv Detail & Related papers (2023-02-20T07:20:53Z)
- Adaptive Sketches for Robust Regression with Importance Sampling [64.75899469557272]
We introduce data structures for solving robust regression through stochastic gradient descent (SGD).
Our algorithm effectively runs $T$ steps of SGD with importance sampling while using sublinear space and just making a single pass over the data.
arXiv Detail & Related papers (2022-07-16T03:09:30Z)
- Transformers Can Do Bayesian Inference [28.936428431504165]
We present Prior-Data Fitted Networks (PFNs).
PFNs leverage large-scale machine learning techniques to approximate a large set of posteriors.
We demonstrate that PFNs can near-perfectly mimic Gaussian processes.
arXiv Detail & Related papers (2021-12-20T13:07:39Z)
- Noise-Resistant Deep Metric Learning with Probabilistic Instance Filtering [59.286567680389766]
Noisy labels are commonly found in real-world data, which cause performance degradation of deep neural networks.
We propose Probabilistic Ranking-based Instance Selection with Memory (PRISM) approach for DML.
PRISM calculates the probability of a label being clean, and filters out potentially noisy samples.
arXiv Detail & Related papers (2021-08-03T12:15:25Z)
- The Power of Log-Sum-Exp: Sequential Density Ratio Matrix Estimation for Speed-Accuracy Optimization [0.0]
We propose a model for multiclass classification of time series to make a prediction as early and as accurate as possible.
Our overall architecture for early classification, MSPRT-TANDEM, statistically significantly outperforms baseline models on four datasets.
arXiv Detail & Related papers (2021-05-28T07:21:58Z)
- Improved, Deterministic Smoothing for L1 Certified Robustness [119.86676998327864]
We propose a non-additive and deterministic smoothing method, Deterministic Smoothing with Splitting Noise (DSSN).
In contrast to uniform additive smoothing, the SSN certification does not require the random noise components used to be independent.
This is the first work to provide deterministic "randomized smoothing" for a norm-based adversarial threat model.
arXiv Detail & Related papers (2021-03-17T21:49:53Z)
- Breaking the Sample Size Barrier in Model-Based Reinforcement Learning with a Generative Model [50.38446482252857]
This paper is concerned with the sample efficiency of reinforcement learning, assuming access to a generative model (or simulator).
We first consider $\gamma$-discounted infinite-horizon Markov decision processes (MDPs) with state space $\mathcal{S}$ and action space $\mathcal{A}$.
We prove that a plain model-based planning algorithm suffices to achieve minimax-optimal sample complexity given any target accuracy level.
arXiv Detail & Related papers (2020-05-26T17:53:18Z)
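One entry above, "The Power of Log-Sum-Exp," names the log-sum-exp function in its title. As background, here is a numerically stable implementation of that primitive; this is a generic sketch, not the paper's density ratio matrix estimator.

```python
import math

def log_sum_exp(xs):
    """Stable log(sum(exp(x) for x in xs)).

    Subtracting the maximum before exponentiating maps the largest term
    to exp(0) = 1, so nothing overflows even for large inputs.
    """
    m = max(xs)
    if math.isinf(m) and m < 0:    # all terms are -inf: the sum is 0
        return m
    return m + math.log(sum(math.exp(x - m) for x in xs))

# Naive evaluation of log(exp(1000) + exp(1000)) overflows to inf;
# the stable form returns 1000 + log(2) exactly as expected.
val = log_sum_exp([1000.0, 1000.0])
```

The max-shift trick is the standard way to combine log-domain likelihood terms, which is why it appears in sequential likelihood-ratio methods.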
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.