Related papers: Concentration of Non-Isotropic Random Tensors with Applications to Learning and Empirical Risk Minimization

Concentration of Non-Isotropic Random Tensors with Applications to Learning and Empirical Risk Minimization

URL: http://arxiv.org/abs/2102.04259v4
Date: Tue, 11 Feb 2025 10:29:23 GMT
Title: Concentration of Non-Isotropic Random Tensors with Applications to Learning and Empirical Risk Minimization
Authors: Mathieu Even, Laurent Massoulié,
Abstract summary: Dimension is an inherent bottleneck to some modern learning tasks, where optimization methods suffer from the size of the data.<n>We develop tools that aim at reducing these dimensional costs by a dependency on an effective dimension rather than the ambient one.<n>We show the importance of taking advantage of non-isotropic properties in learning problems with the following applications.
Score: 13.572602792770292
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Dimension is an inherent bottleneck to some modern learning tasks, where optimization methods suffer from the size of the data. In this paper, we study non-isotropic distributions of data and develop tools that aim at reducing these dimensional costs by a dependency on an effective dimension rather than the ambient one. Based on non-asymptotic estimates of the metric entropy of ellipsoids -- that prove to generalize to infinite dimensions -- and on a chaining argument, our uniform concentration bounds involve an effective dimension instead of the global dimension, improving over existing results. We show the importance of taking advantage of non-isotropic properties in learning problems with the following applications: i) we improve state-of-the-art results in statistical preconditioning for communication-efficient distributed optimization, ii) we introduce a non-isotropic randomized smoothing for non-smooth optimization. Both applications cover a class of functions that encompasses empirical risk minization (ERM) for linear models.

Related papers

Fréchet Cumulative Covariance Net for Deep Nonlinear Sufficient Dimension Reduction with Random Objects [22.156257535146004]
We introduce a new statistical dependence measure termed Fr'echet Cumulative Covariance (FCCov) and develop a novel nonlinear SDR framework based on FCCov. Our approach is not only applicable to complex non-Euclidean data, but also exhibits robustness against outliers. We prove that our method with squared Frobenius norm regularization achieves unbiasedness at the $sigma$-field level.
arXiv Detail & Related papers (2025-02-21T10:55:50Z)
Symmetry-Preserving Diffusion Models via Target Symmetrization [43.83899968118655]
We propose a novel approach that enforces equivariance through a symmetrized loss function. Our method uses Monte Carlo sampling to estimate the average, incurring minimal computational overhead. Experiments show improved sample quality compared to existing methods.
arXiv Detail & Related papers (2025-02-14T03:26:57Z)
Assumption-Lean Post-Integrated Inference with Negative Control Outcomes [0.0]
We introduce a robust post-integrated inference (PII) method that adjusts for latent heterogeneity using negative control outcomes. Our method extends to projected direct effect estimands, accounting for hidden mediators, confounders, and moderators. The proposed doubly robust estimators are consistent and efficient under minimal assumptions and potential misspecification.
arXiv Detail & Related papers (2024-10-07T12:52:38Z)
A Universal Class of Sharpness-Aware Minimization Algorithms [57.29207151446387]
We introduce a new class of sharpness measures, leading to new sharpness-aware objective functions. We prove that these measures are textitly expressive, allowing any function of the training loss Hessian matrix to be represented by appropriate hyper and determinants.
arXiv Detail & Related papers (2024-06-06T01:52:09Z)
Self-Supervised Dataset Distillation for Transfer Learning [77.4714995131992]
We propose a novel problem of distilling an unlabeled dataset into a set of small synthetic samples for efficient self-supervised learning (SSL) We first prove that a gradient of synthetic samples with respect to a SSL objective in naive bilevel optimization is textitbiased due to randomness originating from data augmentations or masking. We empirically validate the effectiveness of our method on various applications involving transfer learning.
arXiv Detail & Related papers (2023-10-10T10:48:52Z)
A Metaheuristic for Amortized Search in High-Dimensional Parameter Spaces [0.0]
We propose a new metaheuristic that drives dimensionality reductions from feature-informed transformations. DR-FFIT implements an efficient sampling strategy that facilitates a gradient-free parameter search in high-dimensional spaces. Our test data show that DR-FFIT boosts the performances of random-search and simulated-annealing against well-established metaheuristics.
arXiv Detail & Related papers (2023-09-28T14:25:14Z)
Nonparametric Linear Feature Learning in Regression Through Regularisation [0.0]
We propose a novel method for joint linear feature learning and non-parametric function estimation. By using alternative minimisation, we iteratively rotate the data to improve alignment with leading directions. We establish that the expected risk of our method converges to the minimal risk under minimal assumptions and with explicit rates.
arXiv Detail & Related papers (2023-07-24T12:52:55Z)
Distributed Sketching for Randomized Optimization: Exact Characterization, Concentration and Lower Bounds [54.51566432934556]
We consider distributed optimization methods for problems where forming the Hessian is computationally challenging. We leverage randomized sketches for reducing the problem dimensions as well as preserving privacy and improving straggler resilience in asynchronous distributed systems.
arXiv Detail & Related papers (2022-03-18T05:49:13Z)
Robust learning of data anomalies with analytically-solvable entropic outlier sparsification [0.0]
Outlier Sparsification (EOS) is proposed as a robust computational strategy for the detection of data anomalies. The performance of EOS is compared to a range of commonly-used tools on synthetic problems and on partially-mislabeled supervised classification problems from biomedicine.
arXiv Detail & Related papers (2021-12-22T10:13:29Z)
Slice Sampling for General Completely Random Measures [74.24975039689893]
We present a novel Markov chain Monte Carlo algorithm for posterior inference that adaptively sets the truncation level using auxiliary slice variables. The efficacy of the proposed algorithm is evaluated on several popular nonparametric models.
arXiv Detail & Related papers (2020-06-24T17:53:53Z)
Effective Dimension Adaptive Sketching Methods for Faster Regularized Least-Squares Optimization [56.05635751529922]
We propose a new randomized algorithm for solving L2-regularized least-squares problems based on sketching. We consider two of the most popular random embeddings, namely, Gaussian embeddings and the Subsampled Randomized Hadamard Transform (SRHT)
arXiv Detail & Related papers (2020-06-10T15:00:09Z)
Deep Dimension Reduction for Supervised Representation Learning [51.10448064423656]
We propose a deep dimension reduction approach to learning representations with essential characteristics. The proposed approach is a nonparametric generalization of the sufficient dimension reduction method. We show that the estimated deep nonparametric representation is consistent in the sense that its excess risk converges to zero.
arXiv Detail & Related papers (2020-06-10T14:47:43Z)
Optimal statistical inference in the presence of systematic uncertainties using neural network optimization based on binned Poisson likelihoods with nuisance parameters [0.0]
This work presents a novel strategy to construct the dimensionality reduction with neural networks for feature engineering. We discuss how this approach results in an estimate of the parameters of interest that is close to optimal.
arXiv Detail & Related papers (2020-03-16T13:27:18Z)
Distributed Averaging Methods for Randomized Second Order Optimization [54.51566432934556]
We consider distributed optimization problems where forming the Hessian is computationally challenging and communication is a bottleneck. We develop unbiased parameter averaging methods for randomized second order optimization that employ sampling and sketching of the Hessian. We also extend the framework of second order averaging methods to introduce an unbiased distributed optimization framework for heterogeneous computing systems.
arXiv Detail & Related papers (2020-02-16T09:01:18Z)

This list is automatically generated from the titles and abstracts of the papers in this site.