On the connection between least squares, regularization, and classical shadows
- URL: http://arxiv.org/abs/2310.16921v3
- Date: Thu, 22 Aug 2024 18:15:30 GMT
- Title: On the connection between least squares, regularization, and classical shadows
- Authors: Zhihui Zhu, Joseph M. Lukens, Brian T. Kirby,
- Abstract summary: We show that both RLS and CS can be viewed as regularizers for the underdetermined regime.
We evaluate RLS and CS from three distinct angles: the tradeoff in bias and variance, mismatch between the expected and actual measurement distributions, and the interplay between the number of measurements and number of shots per measurement.
- Score: 17.633238342851925
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Classical shadows (CS) offer a resource-efficient means to estimate quantum observables, circumventing the need for exhaustive state tomography. Here, we clarify and explore the connection between CS techniques and least squares (LS) and regularized least squares (RLS) methods commonly used in machine learning and data analysis. By formal identification of LS and RLS "shadows" completely analogous to those in CS -- namely, point estimators calculated from the empirical frequencies of single measurements -- we show that both RLS and CS can be viewed as regularizers for the underdetermined regime, replacing the pseudoinverse with invertible alternatives. Through numerical simulations, we evaluate RLS and CS from three distinct angles: the tradeoff in bias and variance, mismatch between the expected and actual measurement distributions, and the interplay between the number of measurements and number of shots per measurement. Compared to CS, RLS attains lower variance at the expense of bias, is robust to distribution mismatch, and is more sensitive to the number of shots for a fixed number of state copies -- differences that can be understood from the distinct approaches taken to regularization. Conceptually, our integration of LS, RLS, and CS under a unifying "shadow" umbrella aids in advancing the overall picture of CS techniques, while practically our results highlight the tradeoffs intrinsic to these measurement approaches, illuminating the circumstances under which either RLS or CS would be preferred, such as unverified randomness for the former or unbiased estimation for the latter.
Related papers
- Shuffled Linear Regression via Spectral Matching [6.24954299842136]
Shuffled linear regression seeks to estimate latent features through a linear transformation.
This problem extends traditional least-squares (LS) and Least Absolute Shrinkage and Selection Operator (LASSO) approaches.
We propose a spectral matching method that efficiently resolves permutations.
arXiv Detail & Related papers (2024-09-30T16:26:40Z) - Empowering Snapshot Compressive Imaging: Spatial-Spectral State Space Model with Across-Scanning and Local Enhancement [51.557804095896174]
We introduce a State Space Model with Across-Scanning and Local Enhancement, named ASLE-SSM, that employs a Spatial-Spectral SSM for global-local balanced context encoding and cross-channel interaction promoting.
Experimental results illustrate ASLE-SSM's superiority over existing state-of-the-art methods, with an inference speed 2.4 times faster than Transformer-based MST and saving 0.12 (M) of parameters.
arXiv Detail & Related papers (2024-08-01T15:14:10Z) - Selecting the Number of Communities for Weighted Degree-Corrected Stochastic Block Models [5.117940794592611]
We investigate how to select the number of communities for weighted networks without a full likelihood modeling.
We propose a novel weighted degree-corrected block model (DCSBM), in which the mean adjacency matrix is modeled as the same as in standard DCSBM.
Our method of selecting the number of communities is based on a sequential testing framework, and in each step the weighted DCSBM is fitted via some spectral clustering method.
arXiv Detail & Related papers (2024-06-08T03:47:38Z) - Low-Rank Approximation of Structural Redundancy for Self-Supervised Learning [2.3072402651280517]
We study the data-generating mechanism for reconstructive SSL to shed light on its effectiveness.
With an infinite amount of labeled samples, we provide a sufficient and necessary condition for perfect linear approximation.
Motivated by the condition, we propose to approximate the redundant component by a low-rank factorization.
arXiv Detail & Related papers (2024-02-10T04:45:27Z) - Sample Complexity of the Sign-Perturbed Sums Identification Method:
Scalar Case [0.0]
Sign-Perturbed Sum (SPS) is a powerful finite-sample system identification algorithm.
This paper studies the behaviour of SPS confidence intervals.
arXiv Detail & Related papers (2024-01-28T22:44:41Z) - A U-turn on Double Descent: Rethinking Parameter Counting in Statistical
Learning [68.76846801719095]
We show that double descent appears exactly when and where it occurs, and that its location is not inherently tied to the threshold p=n.
This provides a resolution to tensions between double descent and statistical intuition.
arXiv Detail & Related papers (2023-10-29T12:05:39Z) - An Information-Theoretic Perspective on Variance-Invariance-Covariance Regularization [52.44068740462729]
We present an information-theoretic perspective on the VICReg objective.
We derive a generalization bound for VICReg, revealing its inherent advantages for downstream tasks.
We introduce a family of SSL methods derived from information-theoretic principles that outperform existing SSL techniques.
arXiv Detail & Related papers (2023-03-01T16:36:25Z) - Compound Batch Normalization for Long-tailed Image Classification [77.42829178064807]
We propose a compound batch normalization method based on a Gaussian mixture.
It can model the feature space more comprehensively and reduce the dominance of head classes.
The proposed method outperforms existing methods on long-tailed image classification.
arXiv Detail & Related papers (2022-12-02T07:31:39Z) - Coarse-to-Fine Sparse Transformer for Hyperspectral Image Reconstruction [138.04956118993934]
We propose a novel Transformer-based method, coarse-to-fine sparse Transformer (CST)
CST embedding HSI sparsity into deep learning for HSI reconstruction.
In particular, CST uses our proposed spectra-aware screening mechanism (SASM) for coarse patch selecting. Then the selected patches are fed into our customized spectra-aggregation hashing multi-head self-attention (SAH-MSA) for fine pixel clustering and self-similarity capturing.
arXiv Detail & Related papers (2022-03-09T16:17:47Z) - Context-Specific Likelihood Weighting [0.0]
We introduce context-specific likelihood weighting (CS-LW) for approximate inference.
Unlike the standard likelihood weighting, CS-LW is based on partial assignments of random variables.
We empirically show that CS-LW is competitive with state-of-the-art algorithms for approximate inference.
arXiv Detail & Related papers (2021-01-24T20:23:14Z) - Accelerated Convergence for Counterfactual Learning to Rank [65.63997193915257]
We show that convergence rate of SGD approaches with IPS-weighted gradients suffers from the large variance introduced by the IPS weights.
We propose a novel learning algorithm, called CounterSample, that has provably better convergence than standard IPS-weighted gradient descent methods.
We prove that CounterSample converges faster and complement our theoretical findings with empirical results.
arXiv Detail & Related papers (2020-05-21T12:53:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.