New bootstrap tests for categorical time series. A comparative study
- URL: http://arxiv.org/abs/2305.00465v1
- Date: Sun, 30 Apr 2023 12:35:28 GMT
- Title: New bootstrap tests for categorical time series. A comparative study
- Authors: Ángel López-Oriona, José Antonio Vilar Fernández and Pierpaolo D'Urso
- Abstract summary: We propose three tests relying on a dissimilarity measure between categorical processes.
Tests are constructed by considering three specific distances evaluating discrepancy between the marginal distributions and the serial dependence patterns of both processes.
An application involving biological sequences highlights the usefulness of the proposed techniques.
- Score: 4.869045108760265
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The problem of testing the equality of the generating processes of two
categorical time series is addressed in this work. To this aim, we propose
three tests relying on a dissimilarity measure between categorical processes.
Particular versions of these tests are constructed by considering three
specific distances evaluating discrepancy between the marginal distributions
and the serial dependence patterns of both processes. Proper estimates of these
dissimilarities are an essential element of the constructed tests, which are
based on the bootstrap. Specifically, a parametric bootstrap method assuming
the true generating models and extensions of the moving blocks bootstrap and
the stationary bootstrap are considered. The approaches are assessed in a broad
simulation study including several types of categorical models with different
degrees of complexity. Advantages and disadvantages of each one of the methods
are properly discussed according to their behavior under the null and the
alternative hypothesis. The impact that some important input parameters have on
the results of the tests is also analyzed. An application involving biological
sequences highlights the usefulness of the proposed techniques.
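The general shape of such a test can be illustrated with a minimal sketch. It is not the authors' procedure: the dissimilarity used here (squared Euclidean distance between estimated marginal and lag-1 joint probabilities), the pooling of both series to mimic the null hypothesis, and all function names are assumptions made for illustration only. The resampling step is a plain moving blocks bootstrap.

```python
import random
from collections import Counter

def features(series, n_cat):
    """Estimated marginal probabilities followed by lag-1 joint probabilities."""
    n = len(series)
    marg_counts = Counter(series)
    pair_counts = Counter(zip(series[:-1], series[1:]))
    marg = [marg_counts[a] / n for a in range(n_cat)]
    joint = [pair_counts[(a, b)] / (n - 1)
             for a in range(n_cat) for b in range(n_cat)]
    return marg + joint

def dissimilarity(x, y, n_cat):
    """Illustrative distance (an assumption, not one of the paper's three)."""
    fx, fy = features(x, n_cat), features(y, n_cat)
    return sum((u - v) ** 2 for u, v in zip(fx, fy))

def moving_blocks_resample(pool, length, block_len, rng):
    """Concatenate randomly chosen overlapping blocks, then trim to `length`."""
    out = []
    while len(out) < length:
        start = rng.randrange(len(pool) - block_len + 1)
        out.extend(pool[start:start + block_len])
    return out[:length]

def bootstrap_test(x, y, n_cat, block_len=10, n_boot=200, seed=0):
    """Return the observed dissimilarity and a bootstrap p-value."""
    rng = random.Random(seed)
    x, y = list(x), list(y)
    observed = dissimilarity(x, y, n_cat)
    # Assumption: drawing both pseudo-series from the pooled data
    # approximates the null of a common generating process.
    pool = x + y
    exceed = 0
    for _ in range(n_boot):
        xb = moving_blocks_resample(pool, len(x), block_len, rng)
        yb = moving_blocks_resample(pool, len(y), block_len, rng)
        if dissimilarity(xb, yb, n_cat) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    p_value = (exceed + 1) / (n_boot + 1)
    return observed, p_value

# Usage: two series over categories {0, 1, 2} with very different marginals.
rng = random.Random(1)
x = rng.choices([0, 1, 2], weights=[0.8, 0.1, 0.1], k=300)
y = rng.choices([0, 1, 2], weights=[0.1, 0.1, 0.8], k=300)
stat, p = bootstrap_test(x, y, n_cat=3)
print(stat, p)
```

The block length and the number of bootstrap replicates are the kind of input parameters whose impact the abstract mentions; the values used here are arbitrary defaults. A stationary bootstrap variant would replace the fixed `block_len` with random (geometrically distributed) block lengths.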
Related papers
- Deep anytime-valid hypothesis testing [29.273915933729057]
We propose a general framework for constructing powerful, sequential hypothesis tests for nonparametric testing problems.
We develop a principled approach of leveraging the representation capability of machine learning models within the testing-by-betting framework.
Empirical results on synthetic and real-world datasets demonstrate that tests instantiated using our general framework are competitive against specialized baselines.
arXiv Detail & Related papers (2023-10-30T09:46:19Z)
- A framework for paired-sample hypothesis testing for high-dimensional data [7.400168551191579]
We put forward the idea that scoring functions can be produced by the decision rules defined by the bisecting hyperplanes of the line segments connecting each pair of instances.
First, we estimate the bisecting hyperplanes for each pair of instances and an aggregated rule derived through the Hodges-Lehmann estimator.
arXiv Detail & Related papers (2023-09-28T09:17:11Z)
- Bootstrapped Edge Count Tests for Nonparametric Two-Sample Inference Under Heterogeneity [5.8010446129208155]
We develop a new nonparametric testing procedure that accurately detects differences between the two samples.
A comprehensive simulation study and an application to detecting user behaviors in online games demonstrate the excellent non-asymptotic performance of the proposed test.
arXiv Detail & Related papers (2023-04-26T22:25:44Z)
- A Comprehensive Survey on Test-Time Adaptation under Distribution Shifts [117.72709110877939]
Test-time adaptation (TTA) has the potential to adapt a pre-trained model to unlabeled data during testing, before making predictions.
We categorize TTA into several distinct groups based on the form of test data, namely, test-time domain adaptation, test-time batch adaptation, and online test-time adaptation.
arXiv Detail & Related papers (2023-03-27T16:32:21Z)
- A Multiple kernel testing procedure for non-proportional hazards in factorial designs [4.358626952482687]
We propose a Multiple kernel testing procedure to infer survival data when several factors are of interest simultaneously.
Our method is able to deal with complex data and can be seen as an alternative to the omnipresent Cox model when assumptions such as proportionality cannot be justified.
arXiv Detail & Related papers (2022-06-15T01:53:49Z)
- Nonparametric Conditional Local Independence Testing [69.31200003384122]
Conditional local independence is an independence relation among continuous time processes.
No nonparametric test of conditional local independence has been available.
We propose such a nonparametric test based on double machine learning.
arXiv Detail & Related papers (2022-03-25T10:31:02Z)
- Cluster-and-Conquer: A Framework For Time-Series Forecasting [94.63501563413725]
We propose a three-stage framework for forecasting high-dimensional time-series data.
Our framework is highly general, allowing for any time-series forecasting and clustering method to be used in each step.
When instantiated with simple linear autoregressive models, we are able to achieve state-of-the-art results on several benchmark datasets.
arXiv Detail & Related papers (2021-10-26T20:41:19Z)
- A Statistical Analysis of Summarization Evaluation Metrics using Resampling Methods [60.04142561088524]
We find that the confidence intervals are rather wide, demonstrating high uncertainty in how reliable automatic metrics truly are.
Although many metrics fail to show statistical improvements over ROUGE, two recent works, QAEval and BERTScore, do in some evaluation settings.
arXiv Detail & Related papers (2021-03-31T18:28:14Z)
- Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)
- Marginal likelihood computation for model selection and hypothesis testing: an extensive review [66.37504201165159]
This article provides a comprehensive study of the state-of-the-art of the topic.
We highlight limitations, benefits, connections and differences among the different techniques.
Problems and possible solutions with the use of improper priors are also described.
arXiv Detail & Related papers (2020-05-17T18:31:58Z)
- A Bootstrap-based Method for Testing Network Similarity [0.0]
This paper studies the matched network inference problem.
The goal is to determine if two networks, defined on a common set of nodes, exhibit a specific form of similarity.
Two notions of similarity are considered: (i) equality, i.e., testing whether the networks arise from the same random graph model, and (ii) scaling, i.e., testing whether their probability matrices are proportional for some unknown scaling constant.
arXiv Detail & Related papers (2019-11-15T20:50:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.