Predictive Performance of Deep Quantum Data Re-uploading Models
- URL: http://arxiv.org/abs/2505.20337v1
- Date: Sat, 24 May 2025 13:11:31 GMT
- Title: Predictive Performance of Deep Quantum Data Re-uploading Models
- Authors: Xin Wang, Han-Xiao Tao, Re-Bing Wu
- Abstract summary: This study reveals a fundamental limitation in predictive performance when deep encoding layers are employed within the data re-uploading model. We theoretically demonstrate that when processing high-dimensional data with limited-qubit data re-uploading models, their predictive performance progressively degenerates to near random-guessing levels.
- Score: 4.852613028421959
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Quantum machine learning models incorporating data re-uploading circuits have garnered significant attention due to their exceptional expressivity and trainability. However, their ability to generate accurate predictions on unseen data, referred to as the predictive performance, remains insufficiently investigated. This study reveals a fundamental limitation in predictive performance when deep encoding layers are employed within the data re-uploading model. Concretely, we theoretically demonstrate that when processing high-dimensional data with limited-qubit data re-uploading models, their predictive performance progressively degenerates to near random-guessing levels as the number of encoding layers increases. In this context, the repeated data uploading cannot mitigate the performance degradation. These findings are validated through experiments on both synthetic linearly separable datasets and real-world datasets. Our results demonstrate that when processing high-dimensional data, the quantum data re-uploading models should be designed with wider circuit architectures rather than deeper and narrower ones.
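To make the setting in the abstract concrete, the following is a minimal sketch of a single-qubit data re-uploading model that encodes one feature per layer, so a deeper circuit corresponds to higher-dimensional input on a fixed, narrow register. It is a toy for exploration, not the authors' construction: the gate choices (RY encoding, RZ and RY trainable rotations), the Gaussian input distribution, and the helper names `reupload_expectation` and `mean_output` are all assumptions made for illustration.

```python
import cmath
import math
import random
import statistics


def ry(angle):
    """Single-qubit rotation about the Y axis."""
    c, s = math.cos(angle / 2), math.sin(angle / 2)
    return [[complex(c), complex(-s)], [complex(s), complex(c)]]


def rz(angle):
    """Single-qubit rotation about the Z axis."""
    return [[cmath.exp(-0.5j * angle), 0j], [0j, cmath.exp(0.5j * angle)]]


def apply(gate, state):
    """Multiply a 2x2 gate into a 2-component state vector."""
    return [
        gate[0][0] * state[0] + gate[0][1] * state[1],
        gate[1][0] * state[0] + gate[1][1] * state[1],
    ]


def reupload_expectation(features, params):
    """Return <Z> after one encoding layer per feature.

    Layer l encodes features[l] with RY, then applies the trainable
    rotations RZ(params[l][0]) and RY(params[l][1]). These gate choices
    are assumptions of this sketch, not taken from the paper.
    """
    state = [1 + 0j, 0j]  # start in |0>
    for x, (a, b) in zip(features, params):
        state = apply(ry(x), state)  # data-encoding gate
        state = apply(rz(a), state)  # trainable gates
        state = apply(ry(b), state)
    return abs(state[0]) ** 2 - abs(state[1]) ** 2


def mean_output(num_layers, num_samples=200, seed=0):
    """Average <Z> over random Gaussian inputs for one fixed random parameter set."""
    rng = random.Random(seed)
    params = [
        (rng.uniform(-math.pi, math.pi), rng.uniform(-math.pi, math.pi))
        for _ in range(num_layers)
    ]
    outputs = []
    for _ in range(num_samples):
        features = [rng.gauss(0.0, 1.0) for _ in range(num_layers)]
        outputs.append(reupload_expectation(features, params))
    return statistics.fmean(outputs)


if __name__ == "__main__":
    # Print the data-averaged output for a few circuit depths.
    for depth in (1, 4, 16, 64):
        print(depth, mean_output(depth))
```

Running the script prints the data-averaged output at several depths, which gives one rough way to probe how the model's output depends on the number of encoding layers. It does not reproduce the paper's theoretical bounds or experiments.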
Related papers
- A Theoretical Perspective: How to Prevent Model Collapse in Self-consuming Training Loops [55.07063067759609]
High-quality data is essential for training large generative models, yet the vast reservoir of real data available online has become nearly depleted. Models increasingly generate their own data for further training, forming Self-consuming Training Loops (STLs). Some models degrade or even collapse, while others successfully avoid these failures, leaving a significant gap in theoretical understanding.
arXiv Detail & Related papers (2025-02-26T06:18:13Z)
- Computationally and Memory-Efficient Robust Predictive Analytics Using Big Data [0.0]
This study navigates through the challenges of data uncertainties, storage limitations, and predictive data-driven modeling using big data.
We utilize Robust Principal Component Analysis (RPCA) for effective noise reduction and outlier elimination, and Optimal Sensor Placement (OSP) for efficient data compression and storage.
arXiv Detail & Related papers (2024-03-27T22:39:08Z)
- Towards Theoretical Understandings of Self-Consuming Generative Models [56.84592466204185]
This paper tackles the emerging challenge of training generative models within a self-consuming loop.
We construct a theoretical framework to rigorously evaluate how this training procedure impacts the data distributions learned by future models.
We present results for kernel density estimation, delivering nuanced insights such as the impact of mixed data training on error propagation.
arXiv Detail & Related papers (2024-02-19T02:08:09Z)
- Learning Defect Prediction from Unrealistic Data [57.53586547895278]
Pretrained models of code have become popular choices for code understanding and generation tasks.
Such models tend to be large and require commensurate volumes of training data.
It has become popular to train models with far larger but less realistic datasets, such as functions with artificially injected bugs.
Models trained on such data tend to only perform well on similar data, while underperforming on real world programs.
arXiv Detail & Related papers (2023-11-02T01:51:43Z)
- Synthetic data, real errors: how (not) to publish and use synthetic data [86.65594304109567]
We show how the generative process affects the downstream ML task.
We introduce Deep Generative Ensemble (DGE) to approximate the posterior distribution over the generative process model parameters.
arXiv Detail & Related papers (2023-05-16T07:30:29Z)
- ClusterQ: Semantic Feature Distribution Alignment for Data-Free Quantization [111.12063632743013]
We propose a new and effective data-free quantization method termed ClusterQ.
To obtain high inter-class separability of semantic features, we cluster and align the feature distribution statistics.
We also incorporate the intra-class variance to solve class-wise mode collapse.
arXiv Detail & Related papers (2022-04-30T06:58:56Z)
- How Well Do Sparse Imagenet Models Transfer? [75.98123173154605]
Transfer learning is a classic paradigm by which models pretrained on large "upstream" datasets are adapted to yield good results on "downstream" datasets.
In this work, we perform an in-depth investigation of this phenomenon in the context of convolutional neural networks (CNNs) trained on the ImageNet dataset.
We show that sparse models can match or even outperform the transfer performance of dense models, even at high sparsities.
arXiv Detail & Related papers (2021-11-26T11:58:51Z)
- Improving Neural Networks for Time Series Forecasting using Data Augmentation and AutoML [0.0]
This paper presents an easy-to-implement data augmentation method to significantly improve the performance of neural networks.
It shows that data augmentation, when paired with Automated Machine Learning techniques such as Neural Architecture Search, can help to find the best neural architecture for a given time series.
arXiv Detail & Related papers (2021-03-02T19:20:49Z)
- Synthesizing Irreproducibility in Deep Networks [2.28438857884398]
Modern-day deep networks suffer from irreproducibility (also referred to as nondeterminism or underspecification).
We show that even with a single nonlinearity and for very simple data and models, irreproducibility occurs.
Model complexity and the choice of nonlinearity also play significant roles in making deep models irreproducible.
arXiv Detail & Related papers (2021-02-21T21:51:28Z)
- On the performance of deep learning models for time series classification in streaming [0.0]
This work assesses the performance of different types of deep architectures for data streaming classification.
We evaluate models such as multi-layer perceptrons, recurrent, convolutional and temporal convolutional neural networks over several time-series datasets.
arXiv Detail & Related papers (2020-03-05T11:41:29Z)
- Forecasting Industrial Aging Processes with Machine Learning Methods [0.0]
We evaluate a wider range of data-driven models, comparing some traditional stateless models to more complex recurrent neural networks.
Our results show that recurrent models produce near perfect predictions when trained on larger datasets.
arXiv Detail & Related papers (2020-02-05T13:06:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.