Fusion of classical and quantum kernels enables accurate and robust two-sample tests
- URL: http://arxiv.org/abs/2511.20941v1
- Date: Wed, 26 Nov 2025 00:25:17 GMT
- Title: Fusion of classical and quantum kernels enables accurate and robust two-sample tests
- Authors: Yu Terada, Yugo Ogio, Ken Arai, Hiroyuki Tezuka, Yu Tanaka,
- Abstract summary: We propose a novel hybrid testing strategy that fuses classical and quantum kernels. This approach creates a powerful and adaptive test by combining the domain-specific inductive biases of classical kernels with the unique expressive power of quantum kernels.
- Score: 0.8178073457017482
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Two-sample tests, which discriminate whether two sets of samples come from the same distribution, are widely used across the sciences and machine learning, for example to evaluate the effectiveness of drugs or to run A/B tests on different marketing strategies. Kernel-based procedures for hypothesis testing have been proposed that embed the data into a reproducing kernel Hilbert space (RKHS), efficiently disentangling high-dimensional complex structure to obtain accurate results in a model-free way. While the choice of kernel plays a crucial role in their performance, little is understood about how to choose kernels, especially for small datasets. Here we aim to construct a hypothesis test that is effective even for small datasets, building on MMD-FUSE, a theoretically grounded kernel-based testing framework based on the maximum mean discrepancy (MMD). To this end, we enhance the MMD-FUSE framework by incorporating quantum kernels and propose a novel hybrid testing strategy that fuses classical and quantum kernels. This approach creates a powerful and adaptive test by combining the domain-specific inductive biases of classical kernels with the unique expressive power of quantum kernels. We evaluate our method on various synthetic and real-world clinical datasets, and our experiments reveal two key findings: 1) with appropriate hyperparameter tuning, MMD-FUSE with quantum kernels consistently improves test power over classical counterparts, especially for small and high-dimensional data; 2) the proposed hybrid framework demonstrates remarkable robustness, adapting to different data characteristics and achieving high test power across diverse scenarios. These results highlight the potential of quantum-inspired and hybrid kernel strategies for building more effective statistical tests, offering a versatile tool for data analysis where sample sizes are limited.
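The basic ingredients are standard enough to illustrate. Below is a minimal, hypothetical sketch (not the authors' code): an unbiased MMD^2 statistic calibrated by a permutation test, evaluated once with a classical Gaussian kernel and once with a classically simulated "quantum" fidelity kernel from a product angle-encoding feature map; the data, bandwidth, and encoding are all assumptions for illustration.

```python
# A minimal, hypothetical sketch (not the authors' code): an unbiased MMD^2
# statistic calibrated by a permutation test, with a classical Gaussian kernel
# and a classically simulated "quantum" fidelity kernel from a product
# angle-encoding feature map.
import numpy as np

rng = np.random.default_rng(0)

def gaussian_kernel(X, Y, bandwidth=1.0):
    # k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2))
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * bandwidth ** 2))

def quantum_fidelity_kernel(X, Y):
    # Product angle encoding |phi(x)> = tensor_j (cos(x_j/2), sin(x_j/2));
    # the state fidelity |<phi(x)|phi(y)>|^2 equals prod_j cos^2((x_j - y_j)/2).
    diff = X[:, None, :] - Y[None, :, :]
    return np.prod(np.cos(diff / 2) ** 2, axis=-1)

def mmd2_unbiased(K_xx, K_yy, K_xy):
    # Unbiased estimate of MMD^2 from within- and between-sample Gram blocks.
    n, m = K_xx.shape[0], K_yy.shape[0]
    np.fill_diagonal(K_xx, 0.0)   # the blocks passed below are copies, so this is safe
    np.fill_diagonal(K_yy, 0.0)
    return (K_xx.sum() / (n * (n - 1))
            + K_yy.sum() / (m * (m - 1))
            - 2 * K_xy.mean())

def permutation_test(X, Y, kernel, n_perm=500):
    Z, n = np.vstack([X, Y]), len(X)
    K = kernel(Z, Z)              # compute the pooled Gram matrix only once
    def stat(idx):
        a, b = idx[:n], idx[n:]
        return mmd2_unbiased(K[np.ix_(a, a)], K[np.ix_(b, b)], K[np.ix_(a, b)])
    observed = stat(np.arange(len(Z)))
    null = np.array([stat(rng.permutation(len(Z))) for _ in range(n_perm)])
    return observed, (np.sum(null >= observed) + 1) / (n_perm + 1)

# Toy "small, high-dimensional" setting: a mean shift in 20 dimensions.
X = rng.normal(0.0, 1.0, size=(30, 20))
Y = rng.normal(0.3, 1.0, size=(30, 20))
for name, k in [("gaussian", gaussian_kernel), ("quantum fidelity", quantum_fidelity_kernel)]:
    mmd2, p = permutation_test(X, Y, k)
    print(f"{name:16s}  MMD^2 = {mmd2:+.4f}  permutation p-value = {p:.3f}")
```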
Related papers
- Theoretical Convergence of SMOTE-Generated Samples [47.26889442476884]
We provide a rigorous theoretical analysis of SMOTE's convergence properties. We prove that the synthetic random variable Z converges in probability to the underlying random variable X. Lower values of the nearest neighbor rank lead to faster convergence.
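For context, the synthetic variable Z analysed in that paper is produced by the usual SMOTE interpolation step, sketched below with plain NumPy; the dataset and neighbourhood size are hypothetical.

```python
# Minimal SMOTE interpolation sketch (plain NumPy, not the paper's analysis code):
# a synthetic point Z = x + U * (x_neighbour - x) with U ~ Uniform(0, 1).
import numpy as np

rng = np.random.default_rng(0)

def smote_sample(X_minority, k=5):
    i = rng.integers(len(X_minority))
    x = X_minority[i]
    # Distances to the other minority points; pick one of the k nearest.
    d = np.linalg.norm(X_minority - x, axis=1)
    d[i] = np.inf
    neighbour = X_minority[rng.choice(np.argsort(d)[:k])]
    return x + rng.uniform() * (neighbour - x)

X_min = rng.normal(size=(50, 3))     # hypothetical minority-class sample
print(smote_sample(X_min))
```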
arXiv Detail & Related papers (2026-01-05T09:19:45Z) - DUAL: Learning Diverse Kernels for Aggregated Two-sample and Independence Testing [21.083713063070586]
We propose an aggregated statistic that explicitly incorporates kernel diversity based on the covariance between different kernels. This motivates a testing framework with selection inference, which leverages information from the training phase to select kernels with strong individual performance.
arXiv Detail & Related papers (2025-10-13T08:30:42Z) - MDNS: Masked Diffusion Neural Sampler via Stochastic Optimal Control [48.504188275208556]
We study the problem of learning a neural sampler to generate samples from discrete state spaces where the target probability mass function $\pi \propto \mathrm{e}^{-U}$ is known up to a normalizing constant. We propose the Masked Diffusion Neural Sampler (MDNS), a novel framework for discrete neural samplers that aligns measures through a family of learning objectives.
arXiv Detail & Related papers (2025-08-14T14:27:16Z) - Kernel Two-Sample Testing via Directional Components Analysis [0.0]
We propose a novel kernel-based two-sample test to identify and utilize well-estimated directional components in the reproducing kernel Hilbert space (RKHS). By focusing on these directions and aggregating information across multiple kernels, the proposed test achieves higher power and improved robustness, especially in high-dimensional and unbalanced sample settings.
arXiv Detail & Related papers (2025-08-12T02:04:55Z) - A Scalable Nyström-Based Kernel Two-Sample Test with Permutations [9.849635250118912]
Two-sample hypothesis testing is a fundamental problem in statistics and machine learning. In this work, we use a Nyström approximation of the maximum mean discrepancy (MMD) to design a computationally efficient and practical testing algorithm.
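A rough sketch of the Nyström idea (my assumptions, not the paper's implementation): map both samples through an m-landmark Nyström feature map for a Gaussian kernel, then take the distance between the mean feature vectors as an approximate MMD.

```python
# Nyström feature map for the Gaussian kernel: phi(x) = K_mm^{-1/2} k_m(x),
# so that MMD is approximated by || mean phi(X) - mean phi(Y) || at O(n m) cost.
import numpy as np

rng = np.random.default_rng(0)

def gaussian_kernel(A, B, bw=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * bw ** 2))

def nystrom_features(Z, landmarks, bw=1.0, jitter=1e-8):
    K_mm = gaussian_kernel(landmarks, landmarks, bw) + jitter * np.eye(len(landmarks))
    w, V = np.linalg.eigh(K_mm)                      # K_mm^{-1/2} via eigendecomposition
    K_inv_sqrt = V @ np.diag(1.0 / np.sqrt(np.maximum(w, jitter))) @ V.T
    return gaussian_kernel(Z, landmarks, bw) @ K_inv_sqrt   # shape (n, m)

X = rng.normal(0.0, 1.0, size=(500, 10))
Y = rng.normal(0.2, 1.0, size=(500, 10))
landmarks = np.vstack([X, Y])[rng.choice(1000, size=50, replace=False)]
mmd_approx = np.linalg.norm(nystrom_features(X, landmarks).mean(0)
                            - nystrom_features(Y, landmarks).mean(0))
print(f"Nystrom-approximated MMD: {mmd_approx:.4f}")
```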
arXiv Detail & Related papers (2025-02-19T09:22:48Z) - Learning Representations for Independence Testing [13.842061060076004]
First, we show how to construct powerful tests with finite-sample validity using variational estimators of mutual information. Second, we establish a close connection between these variational mutual-information-based tests and tests based on the Hilbert-Schmidt Independence Criterion (HSIC). Finally, we show how to select a representation that maximizes the power of the test, rather than the statistic itself.
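For reference, a standard biased HSIC estimator can be sketched as follows; this only illustrates the kernel-independence criterion mentioned above, not the paper's variational-MI estimators or learned representations.

```python
# Biased HSIC estimator: HSIC = trace(K H L H) / n^2 with centring matrix H.
import numpy as np

rng = np.random.default_rng(0)

def gaussian_gram(A, bw=1.0):
    d2 = ((A[:, None, :] - A[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * bw ** 2))

def hsic_biased(X, Y, bw=1.0):
    n = len(X)
    H = np.eye(n) - np.ones((n, n)) / n          # centring matrix
    K, L = gaussian_gram(X, bw), gaussian_gram(Y, bw)
    return np.trace(K @ H @ L @ H) / n ** 2      # near zero when X and Y are independent

X = rng.normal(size=(200, 2))
Y_dep = X @ rng.normal(size=(2, 2)) + 0.1 * rng.normal(size=(200, 2))
Y_ind = rng.normal(size=(200, 2))
print("dependent  :", hsic_biased(X, Y_dep))
print("independent:", hsic_biased(X, Y_ind))
```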
arXiv Detail & Related papers (2024-09-10T22:18:07Z) - Collaborative non-parametric two-sample testing [55.98760097296213]
The goal is to identify nodes where the null hypothesis $p_v = q_v$ should be rejected.
We propose the non-parametric collaborative two-sample testing (CTST) framework that efficiently leverages the graph structure.
Our methodology integrates elements from f-divergence estimation, Kernel Methods, and Multitask Learning.
arXiv Detail & Related papers (2024-02-08T14:43:56Z) - MMD-FUSE: Learning and Combining Kernels for Two-Sample Testing Without
Data Splitting [28.59390881834003]
We propose novel statistics which maximise the power of a two-sample test based on the Maximum Mean Discrepancy (MMD).
We show how these kernels can be chosen in a data-dependent but permutation-independent way, in a well-calibrated test, avoiding data splitting.
We highlight the applicability of our MMD-FUSE test on both synthetic low-dimensional and real-world high-dimensional data, and compare its performance in terms of power against current state-of-the-art kernel tests.
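A minimal sketch of the fusion idea follows (hedged: the exact MMD-FUSE normalisation, kernel weighting, and theoretical guarantees are in the paper; here a plain log-mean-exp soft-max over unbiased MMD^2 estimates from a hypothetical bank of Gaussian bandwidths is calibrated by a single permutation test).

```python
# Soft-max fusion of per-kernel MMD^2 statistics, calibrated by permutation.
import numpy as np

rng = np.random.default_rng(1)

def gram(A, B, bw):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * bw ** 2))

def mmd2(K, a, b):
    # Unbiased MMD^2 from sub-blocks of the pooled Gram matrix (fancy indexing copies).
    K_aa, K_bb, K_ab = K[np.ix_(a, a)], K[np.ix_(b, b)], K[np.ix_(a, b)]
    np.fill_diagonal(K_aa, 0.0); np.fill_diagonal(K_bb, 0.0)
    n, m = len(a), len(b)
    return K_aa.sum() / (n * (n - 1)) + K_bb.sum() / (m * (m - 1)) - 2 * K_ab.mean()

def fuse(K_list, idx, n):
    stats = np.array([mmd2(K, idx[:n], idx[n:]) for K in K_list])
    return np.log(np.exp(stats).mean())          # log-mean-exp soft-max over kernels

def fused_test(X, Y, bandwidths=(0.5, 1.0, 2.0, 4.0), n_perm=500):
    Z, n = np.vstack([X, Y]), len(X)
    K_list = [gram(Z, Z, bw) for bw in bandwidths]
    obs = fuse(K_list, np.arange(len(Z)), n)
    null = np.array([fuse(K_list, rng.permutation(len(Z)), n) for _ in range(n_perm)])
    return (np.sum(null >= obs) + 1) / (n_perm + 1)

X = rng.normal(0.0, 1.0, size=(40, 10))
Y = rng.normal(0.25, 1.0, size=(40, 10))
print("fused permutation p-value:", fused_test(X, Y))
```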
arXiv Detail & Related papers (2023-06-14T23:13:03Z) - Quantifying & Modeling Multimodal Interactions: An Information
Decomposition Framework [89.8609061423685]
We propose an information-theoretic approach to quantify the degree of redundancy, uniqueness, and synergy relating input modalities with an output task.
To validate our estimation of this partial information decomposition (PID), we conduct extensive experiments both on synthetic datasets where the PID is known and on large-scale multimodal benchmarks.
We demonstrate their usefulness in (1) quantifying interactions within multimodal datasets, (2) quantifying interactions captured by multimodal models, (3) principled approaches for model selection, and (4) three real-world case studies.
arXiv Detail & Related papers (2023-02-23T18:59:05Z) - FaDIn: Fast Discretized Inference for Hawkes Processes with General
Parametric Kernels [82.53569355337586]
This work offers an efficient solution to temporal point processes inference using general parametric kernels with finite support.
The method's effectiveness is evaluated by modeling the occurrence of stimuli-induced patterns from brain signals recorded with magnetoencephalography (MEG). Results show that the proposed approach leads to improved estimation of pattern latency compared with the state-of-the-art.
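As a rough illustration of the discretisation (not FaDIn itself), a finite-support parametric kernel evaluated on a fixed grid turns the Hawkes excitation term into a discrete convolution of event counts with kernel values; all parameters below are hypothetical.

```python
# Discretised Hawkes intensity lambda(t) = mu + sum_{t_i <= t} phi(t - t_i)
# with a finite-support kernel evaluated on a regular grid.
import numpy as np

dt, T = 0.01, 10.0                                   # grid step and horizon
grid = np.arange(0.0, T, dt)
mu, alpha, beta, support = 0.3, 0.8, 2.0, 1.0        # hypothetical kernel parameters

# Finite-support kernel phi(s) = alpha * beta * exp(-beta * s) on [0, support)
lags = np.arange(0.0, support, dt)
phi = alpha * beta * np.exp(-beta * lags)

events = np.array([1.2, 1.25, 3.7, 6.0, 6.05, 6.1])  # toy event times
counts = np.histogram(events, bins=np.append(grid, T))[0]

# The excitation term on the grid is a discrete convolution of counts with phi
# (in this crude sketch an event also excites its own bin).
intensity = mu + np.convolve(counts, phi)[: len(grid)]
print(f"max intensity {intensity.max():.3f}, mean {intensity.mean():.3f}")
```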
arXiv Detail & Related papers (2022-10-10T12:35:02Z) - Hybrid Random Features [60.116392415715275]
We propose a new class of random feature methods for linearizing softmax and Gaussian kernels called hybrid random features (HRFs).
HRFs automatically adapt the quality of kernel estimation to provide most accurate approximation in the defined regions of interest.
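For orientation, plain random Fourier features for a Gaussian kernel are sketched below; the paper's hybrid random features adaptively combine several such mechanisms, which is not reproduced here.

```python
# Random Fourier features: z(x) = sqrt(2/D) cos(W x + b) with W ~ N(0, 1/bw^2),
# b ~ Uniform(0, 2*pi), so that z(x)·z(y) ≈ exp(-||x - y||^2 / (2 bw^2)).
import numpy as np

rng = np.random.default_rng(0)

def rff(X, W, b):
    return np.sqrt(2.0 / W.shape[1]) * np.cos(X @ W + b)

d, D, bw = 5, 2000, 1.0
W = rng.normal(0.0, 1.0 / bw, size=(d, D))
b = rng.uniform(0.0, 2 * np.pi, size=D)

x, y = rng.normal(size=d), rng.normal(size=d)
exact = np.exp(-np.sum((x - y) ** 2) / (2 * bw ** 2))
approx = rff(x[None, :], W, b) @ rff(y[None, :], W, b).T
print(f"exact {exact:.4f}  vs  RFF approximation {approx[0, 0]:.4f}")
```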
arXiv Detail & Related papers (2021-10-08T20:22:59Z) - Learning Deep Kernels for Non-Parametric Two-Sample Tests [50.92621794426821]
We propose a class of kernel-based two-sample tests, which aim to determine whether two sets of samples are drawn from the same distribution.
Our tests are constructed from kernels parameterized by deep neural nets, trained to maximize test power.
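The deep-kernel form used there can be sketched as follows (hedged: a fixed random feature extractor stands in for the trained network; in the paper the extractor and kernel parameters are learned by maximising an estimate of test power).

```python
# Deep kernel of the form k(x, y) = [(1 - eps) * kappa(phi(x), phi(y)) + eps] * q(x, y),
# with Gaussian kappa and q and a feature extractor phi (here an untrained stand-in).
import numpy as np

rng = np.random.default_rng(0)

def gaussian(A, B, bw):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * bw ** 2))

def deep_kernel(X, Y, W, eps=0.1, bw_feat=1.0, bw_raw=1.0):
    phi = lambda Z: np.tanh(Z @ W)               # stand-in "network" (illustration only)
    return ((1 - eps) * gaussian(phi(X), phi(Y), bw_feat) + eps) * gaussian(X, Y, bw_raw)

d, h = 10, 32
W = rng.normal(size=(d, h)) / np.sqrt(d)         # untrained weights, for illustration
X = rng.normal(0.0, 1.0, size=(50, d))
Y = rng.normal(0.2, 1.0, size=(50, d))
print("mean cross-kernel value:", deep_kernel(X, Y, W).mean())
```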
arXiv Detail & Related papers (2020-02-21T03:54:23Z)