Toward Asymptotic Optimality: Sequential Unsupervised Regression of
Density Ratio for Early Classification
- URL: http://arxiv.org/abs/2302.09810v1
- Date: Mon, 20 Feb 2023 07:20:53 GMT
- Title: Toward Asymptotic Optimality: Sequential Unsupervised Regression of
Density Ratio for Early Classification
- Authors: Akinori F. Ebihara, Taiki Miyagawa, Kazuyuki Sakurai, Hitoshi Imaoka
- Abstract summary: Theoretically-inspired sequential density ratio estimation (SDRE) algorithms are proposed for the early classification of time series.
Two novel SPRT-based algorithms, B2Bsqrt-TANDEM and TANDEMformer, are designed to avoid the overnormalization problem for precise unsupervised regression of SDRs.
- Score: 11.470070927586017
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Theoretically-inspired sequential density ratio estimation (SDRE) algorithms
are proposed for the early classification of time series. Conventional SDRE
algorithms can fail to estimate density ratios (DRs) precisely due to the internal
overnormalization problem, which prevents the DR-based sequential algorithm,
Sequential Probability Ratio Test (SPRT), from reaching its asymptotic Bayes
optimality. Two novel SPRT-based algorithms, B2Bsqrt-TANDEM and TANDEMformer,
are designed to avoid the overnormalization problem for precise unsupervised
regression of SDRs. The two algorithms achieve statistically significant
reductions in DR estimation error on an artificial sequential Gaussian dataset
and in classification error on real datasets (SiW, UCF101, and HMDB51). The
code is available at:
https://github.com/Akinori-F-Ebihara/LLR_saturation_problem.
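For context on the decision rule the abstract builds on, here is a minimal Python sketch of Wald's classical SPRT, assuming per-step log-likelihood ratio (LLR) increments are already available. In the paper's setting these increments would be estimated by an SDRE network such as B2Bsqrt-TANDEM or TANDEMformer; the function name, parameters, and toy example below are illustrative assumptions, not taken from the paper or its repository.

```python
import math
import random
from typing import Iterable, Optional, Tuple

def sprt(llr_steps: Iterable[float],
         alpha: float = 0.01,
         beta: float = 0.01) -> Tuple[Optional[int], int]:
    """Wald's SPRT over a stream of per-step LLR increments.

    alpha/beta are target false-positive/false-negative rates; the
    thresholds follow Wald's approximations. Returns (decision, steps):
    decision is 1 (accept H1), 0 (accept H0), or None if the stream
    ends before either threshold is crossed.
    """
    upper = math.log((1.0 - beta) / alpha)  # cross above: accept H1
    lower = math.log(beta / (1.0 - alpha))  # cross below: accept H0
    cum_llr, steps = 0.0, 0
    for llr in llr_steps:
        steps += 1
        cum_llr += llr  # accumulate evidence one observation at a time
        if cum_llr >= upper:
            return 1, steps  # early decision for H1
        if cum_llr <= lower:
            return 0, steps  # early decision for H0
    return None, steps  # undecided: data ran out first

# Toy usage: H1 = N(+0.5, 1) vs H0 = N(-0.5, 1). For this pair the
# per-sample LLR is log p1(x)/p0(x) = x, so samples double as LLRs.
decision, steps = sprt((random.gauss(0.5, 1.0) for _ in range(1000)))
print(decision, steps)  # typically decides 1 well before 1000 samples
```

Wald's thresholds control both error rates while minimizing the expected number of observations, which is exactly the speed-accuracy trade-off the paper targets; precise LLR estimates are what make the test approach its asymptotic Bayes optimality.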
Related papers
- Towards Understanding the Generalizability of Delayed Stochastic
Gradient Descent [63.43247232708004]
Gradient descent performed asynchronously plays a crucial role in training large-scale machine learning models.
Existing generalization error bounds are rather pessimistic and cannot reveal the correlation between asynchronous delays and generalization.
Our theoretical results indicate that asynchronous delays reduce the generalization error of the delayed SGD algorithm.
arXiv Detail & Related papers (2023-08-18T10:00:27Z)
- Normalized/Clipped SGD with Perturbation for Differentially Private
Non-Convex Optimization [94.06564567766475]
DP-SGD and DP-NSGD mitigate the risk of large models memorizing sensitive training data.
We show that these two algorithms achieve similar best accuracy while DP-NSGD is comparatively easier to tune than DP-SGD.
arXiv Detail & Related papers (2022-06-27T03:45:02Z)
- Anomaly Rule Detection in Sequence Data [2.3757190901941736]
We present a new anomaly detection framework called DUOS that enables Discovery of Utility-aware Outlier Sequential rules from a set of sequences.
In this work, we incorporate both the anomalousness and utility of a group, and then introduce the concept of the utility-aware outlier sequential rule (UOSR).
arXiv Detail & Related papers (2021-11-29T23:52:31Z)
- Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality [131.45028999325797]
We develop a doubly robust off-policy actor-critic algorithm (DR-Off-PAC) for discounted MDPs.
DR-Off-PAC adopts a single-timescale structure, in which both the actor and the critics are updated simultaneously with a constant stepsize.
We study the finite-time convergence rate and characterize the sample complexity for DR-Off-PAC to attain an $\epsilon$-accurate optimal policy.
arXiv Detail & Related papers (2021-02-23T18:56:13Z)
- Stochastic Hard Thresholding Algorithms for AUC Maximization [49.00683387735522]
We develop stochastic hard thresholding algorithms for AUC maximization in distributed classification.
We conduct experiments to show the efficiency and effectiveness of the proposed algorithms.
arXiv Detail & Related papers (2020-11-04T16:49:29Z)
- Variance-Reduced Off-Policy TDC Learning: Non-Asymptotic Convergence
Analysis [27.679514676804057]
We develop a variance reduction scheme for the two time-scale TDC algorithm in the off-policy setting.
Experiments demonstrate that the proposed variance-reduced TDC achieves a smaller convergence error than both the conventional TDC and the variance-reduced TD.
arXiv Detail & Related papers (2020-10-26T01:33:05Z)
- Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth
Nonlinear TD Learning [145.54544979467872]
We propose two single-timescale single-loop algorithms that require only one data point each step.
Our results are expressed in the form of simultaneous primal- and dual-side convergence.
arXiv Detail & Related papers (2020-08-23T20:36:49Z)
- Sequential Density Ratio Estimation for Simultaneous Optimization of
Speed and Accuracy [11.470070927586017]
We propose the SPRT-TANDEM, a deep neural network-based SPRT algorithm that overcomes two obstacles in applying the classical SPRT.
In tests on one original and two public video databases, the SPRT-TANDEM achieves statistically significantly better classification accuracy than other baselines.
arXiv Detail & Related papers (2020-06-10T01:05:00Z)
- A Robust Matching Pursuit Algorithm Using Information Theoretic Learning [37.968665739578185]
A new orthogonal matching pursuit (OMP) algorithm is developed based on information theoretic learning (ITL).
The experimental results on both simulated and real-world data consistently demonstrate the superiority of the proposed OMP algorithm in data recovery, image reconstruction, and classification.
arXiv Detail & Related papers (2020-05-10T01:36:00Z)
- Simple and Effective Prevention of Mode Collapse in Deep One-Class
Classification [93.2334223970488]
We propose two regularizers to prevent hypersphere collapse in deep SVDD.
The first regularizer is based on injecting random noise via the standard cross-entropy loss.
The second regularizer penalizes the minibatch variance when it becomes too small; a hypothetical sketch of such a penalty is given after this list.
arXiv Detail & Related papers (2020-01-24T03:44:47Z)
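As noted in the deep one-class classification entry above, here is a minimal PyTorch sketch of the idea behind the second regularizer: a deep-SVDD-style compactness term plus a hinge penalty that activates only when the minibatch variance of the squared distances becomes too small. The function name, the hinge form, and all constants are assumptions for illustration and may differ from the paper's actual formulation.

```python
import torch

def svdd_loss_with_variance_reg(embeddings: torch.Tensor,
                                center: torch.Tensor,
                                var_floor: float = 0.1,
                                weight: float = 1.0) -> torch.Tensor:
    """Deep-SVDD-style loss with a hypothetical variance regularizer.

    embeddings: (batch, dim) encoder outputs; center: fixed hypersphere
    center of shape (dim,). The first term pulls points toward the
    center; the second term grows once the minibatch variance of the
    squared distances drops below var_floor, discouraging the trivial
    collapsed solution where every input maps onto the center.
    """
    sq_dists = ((embeddings - center) ** 2).sum(dim=1)  # squared radii
    compactness = sq_dists.mean()                       # standard SVDD pull
    variance_penalty = torch.relu(var_floor - sq_dists.var())  # fires only near collapse
    return compactness + weight * variance_penalty
```

In training, this penalty would simply be added to the encoder's objective so that gradients push the representation away from the constant (collapsed) solution while the compactness term keeps normal data near the center.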
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.