Related papers: Data Compression using Rank-1 Lattices for Parameter Estimation in Machine Learning

Data Compression using Rank-1 Lattices for Parameter Estimation in Machine Learning

URL: http://arxiv.org/abs/2409.13453v1
Date: Fri, 20 Sep 2024 12:35:24 GMT
Title: Data Compression using Rank-1 Lattices for Parameter Estimation in Machine Learning
Authors: Michael Gnewuch, Kumar Harsha, Marcin Wnuk,
Abstract summary: Mean squared error and regularized versions of it are standard loss functions in supervised machine learning. We present algorithms to reduce extensive data sets to a smaller size using rank-1 lattices.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: The mean squared error and regularized versions of it are standard loss functions in supervised machine learning. However, calculating these losses for large data sets can be computationally demanding. Modifying an approach of J. Dick and M. Feischl [Journal of Complexity 67 (2021)], we present algorithms to reduce extensive data sets to a smaller size using rank-1 lattices. Rank-1 lattices are quasi-Monte Carlo (QMC) point sets that are, if carefully chosen, well-distributed in a multidimensional unit cube. The compression strategy in the preprocessing step assigns every lattice point a pair of weights depending on the original data and responses, representing its relative importance. As a result, the compressed data makes iterative loss calculations in optimization steps much faster. We analyze the errors of our QMC data compression algorithms and the cost of the preprocessing step for functions whose Fourier coefficients decay sufficiently fast so that they lie in certain Wiener algebras or Korobov spaces. In particular, we prove that our approach can lead to arbitrary high convergence rates as long as the functions are sufficiently smooth.

Related papers

Low-rank computation of the posterior mean in Multi-Output Gaussian Processes [0.6445605125467574]
We consider the case of multi-output Gaussian processes (MOGP) and present low-rank approaches for efficiently computing the posterior mean of a MOGP.
arXiv Detail & Related papers (2025-04-30T11:19:58Z)
Worth Their Weight: Randomized and Regularized Block Kaczmarz Algorithms without Preprocessing [2.542838926315103]
We analyze the convergence of the randomized block Kaczmarz method (RBK) when the data is sampled uniformly, showing that its iterates converge in a Monte Carlo sense to a $textit$ least-squares solution. We resolve these issues by incorporating regularization into the RBK. Numerical experiments, including examples arising from natural gradient optimization, suggest that the regularized algorithm, ReBlocK, outperforms minibatch gradient descent for realistic problems that exhibit fast singular value decay.
arXiv Detail & Related papers (2025-02-02T19:23:46Z)
Compression of Structured Data with Autoencoders: Provable Benefit of Nonlinearities and Depth [83.15263499262824]
We prove that gradient descent converges to a solution that completely disregards the sparse structure of the input. We show how to improve upon Gaussian performance for the compression of sparse data by adding a denoising function to a shallow architecture. We validate our findings on image datasets, such as CIFAR-10 and MNIST.
arXiv Detail & Related papers (2024-02-07T16:32:29Z)
Learning-Augmented K-Means Clustering Using Dimensional Reduction [1.7243216387069678]
We propose a solution to reduce the dimensionality of the dataset using Principal Component Analysis (PCA) PCA is well-established in the literature and has become one of the most useful tools for data modeling, compression, and visualization.
arXiv Detail & Related papers (2024-01-06T12:02:33Z)
Stochastic Optimization for Non-convex Problem with Inexact Hessian Matrix, Gradient, and Function [99.31457740916815]
Trust-region (TR) and adaptive regularization using cubics have proven to have some very appealing theoretical properties. We show that TR and ARC methods can simultaneously provide inexact computations of the Hessian, gradient, and function values.
arXiv Detail & Related papers (2023-10-18T10:29:58Z)
CORE: Common Random Reconstruction for Distributed Optimization with Provable Low Communication Complexity [110.50364486645852]
Communication complexity has become a major bottleneck for speeding up training and scaling up machine numbers. We propose Common Om REOm, which can be used to compress information transmitted between machines.
arXiv Detail & Related papers (2023-09-23T08:45:27Z)
Prediction and compression of lattice QCD data using machine learning algorithms on quantum annealer [4.987315310656657]
We present regression and compression algorithms for lattice QCD data. In the regression algorithm, we encode the correlation between the input and output variables into a sparse coding machine learning algorithm. In the compression algorithm, we define a mapping from lattice QCD data of floating-point numbers to the binary coefficients that closely reconstruct the input data.
arXiv Detail & Related papers (2021-12-03T19:04:35Z)
SreaMRAK a Streaming Multi-Resolution Adaptive Kernel Algorithm [60.61943386819384]
Existing implementations of KRR require that all the data is stored in the main memory. We propose StreaMRAK - a streaming version of KRR. We present a showcase study on two synthetic problems and the prediction of the trajectory of a double pendulum.
arXiv Detail & Related papers (2021-08-23T21:03:09Z)
Hybrid Stochastic-Deterministic Minibatch Proximal Gradient: Less-Than-Single-Pass Optimization with Nearly Optimal Generalization [83.80460802169999]
We show that HSDMPG can attain an $mathcalObig (1/sttnbig)$ which is at the order of excess error on a learning model. For loss factors, we prove that HSDMPG can attain an $mathcalObig (1/sttnbig)$ which is at the order of excess error on a learning model.
arXiv Detail & Related papers (2020-09-18T02:18:44Z)
Provably Convergent Working Set Algorithm for Non-Convex Regularized Regression [0.0]
This paper proposes a working set algorithm for non-regular regularizers with convergence guarantees. Our results demonstrate high gain compared to the full problem solver for both block-coordinates or a gradient solver.
arXiv Detail & Related papers (2020-06-24T07:40:31Z)
Effective Dimension Adaptive Sketching Methods for Faster Regularized Least-Squares Optimization [56.05635751529922]
We propose a new randomized algorithm for solving L2-regularized least-squares problems based on sketching. We consider two of the most popular random embeddings, namely, Gaussian embeddings and the Subsampled Randomized Hadamard Transform (SRHT)
arXiv Detail & Related papers (2020-06-10T15:00:09Z)
Online stochastic gradient descent on non-convex losses from high-dimensional inference [2.2344764434954256]
gradient descent (SGD) is a popular algorithm for optimization problems in high-dimensional tasks. In this paper we produce an estimator of non-trivial correlation from data. We illustrate our approach by applying it to a set of tasks such as phase retrieval, and estimation for generalized models.
arXiv Detail & Related papers (2020-03-23T17:34:06Z)

This list is automatically generated from the titles and abstracts of the papers in this site.