Topology-Preserving Scaling in Data Augmentation
- URL: http://arxiv.org/abs/2411.19512v1
- Date: Fri, 29 Nov 2024 07:11:28 GMT
- Title: Topology-Preserving Scaling in Data Augmentation
- Authors: Vu-Anh Le, Mehmet Dik,
- Abstract summary: We propose an algorithmic framework for dataset normalization in data augmentation pipelines.<n>Our contributions provide a rigorous mathematical framework for dataset normalization in data augmentation pipelines.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We propose an algorithmic framework for dataset normalization in data augmentation pipelines that preserves topological stability under non-uniform scaling transformations. Given a finite metric space \( X \subset \mathbb{R}^n \) with Euclidean distance \( d_X \), we consider scaling transformations defined by scaling factors \( s_1, s_2, \ldots, s_n > 0 \). Specifically, we define a scaling function \( S \) that maps each point \( x = (x_1, x_2, \ldots, x_n) \in X \) to \[ S(x) = (s_1 x_1, s_2 x_2, \ldots, s_n x_n). \] Our main result establishes that the bottleneck distance \( d_B(D, D_S) \) between the persistence diagrams \( D \) of \( X \) and \( D_S \) of \( S(X) \) satisfies: \[ d_B(D, D_S) \leq (s_{\max} - s_{\min}) \cdot \operatorname{diam}(X), \] where \( s_{\min} = \min_{1 \leq i \leq n} s_i \), \( s_{\max} = \max_{1 \leq i \leq n} s_i \), and \( \operatorname{diam}(X) \) is the diameter of \( X \). Based on this theoretical guarantee, we formulate an optimization problem to minimize the scaling variability \( \Delta_s = s_{\max} - s_{\min} \) under the constraint \( d_B(D, D_S) \leq \epsilon \), where \( \epsilon > 0 \) is a user-defined tolerance. We develop an algorithmic solution to this problem, ensuring that data augmentation via scaling transformations preserves essential topological features. We further extend our analysis to higher-dimensional homological features, alternative metrics such as the Wasserstein distance, and iterative or probabilistic scaling scenarios. Our contributions provide a rigorous mathematical framework for dataset normalization in data augmentation pipelines, ensuring that essential topological characteristics are maintained despite scaling transformations.
Related papers
- Emergent Distance and Metricity of Mutual Information in 1D Quantum Chains [0.0]
Calibrating with the Euclidean benchmark (I(r)propto r-2mapsto d_E(r)propto r) makes the triangle-inequality test parameter-free and scale-invariant.<n>We establish a criterion linking the decay of (I(r)) to metric behavior of (d_E(r)): power laws (I(r)) with (0Xle 2) yield subadditivity (metric scaling), while exponential clustering leads to superaddit
arXiv Detail & Related papers (2025-07-13T19:00:06Z) - Bregman Linearized Augmented Lagrangian Method for Nonconvex Constrained Stochastic Zeroth-order Optimization [9.482573620753442]
We propose a Bregman linearized augmented Lagrangian method that utilizes zeroth-order estimators combined with variance technique.
Results show that the complexity of the proposed method can achieve a dimensional dependency dependency lower than required (O(d)) without requiring additional assumptions.
arXiv Detail & Related papers (2025-04-13T02:44:47Z) - Two-Timescale Gradient Descent Ascent Algorithms for Nonconvex Minimax Optimization [77.3396841985172]
We provide a unified analysis of two-timescale gradient ascent (TTGDA) for solving structured non minimax optimization problems.<n>Our contribution is to design TTGDA algorithms are effective beyond the setting.
arXiv Detail & Related papers (2024-08-21T20:14:54Z) - The Sample Complexity Of ERMs In Stochastic Convex Optimization [13.896417716930687]
We show that in fact $tildeO(fracdepsilon+frac1epsilon2)$ data points are also sufficient.
We further generalize the result and show that a similar upper bound holds for all convex bodies.
arXiv Detail & Related papers (2023-11-09T14:29:25Z) - An Oblivious Stochastic Composite Optimization Algorithm for Eigenvalue
Optimization Problems [76.2042837251496]
We introduce two oblivious mirror descent algorithms based on a complementary composite setting.
Remarkably, both algorithms work without prior knowledge of the Lipschitz constant or smoothness of the objective function.
We show how to extend our framework to scale and demonstrate the efficiency and robustness of our methods on large scale semidefinite programs.
arXiv Detail & Related papers (2023-06-30T08:34:29Z) - ReSQueing Parallel and Private Stochastic Convex Optimization [59.53297063174519]
We introduce a new tool for BFG convex optimization (SCO): a Reweighted Query (ReSQue) estimator for the gradient of a function convolved with a (Gaussian) probability density.
We develop algorithms achieving state-of-the-art complexities for SCO in parallel and private settings.
arXiv Detail & Related papers (2023-01-01T18:51:29Z) - Private Isotonic Regression [54.32252900997422]
We consider the problem of isotonic regression over a partially ordered set (poset) $mathcalX$ and for any Lipschitz loss function.
We obtain a pure-DP algorithm that has an expected excess empirical risk of roughly $mathrmwidth(mathcalX) cdot log|mathcalX| / n$, where $mathrmwidth(mathcalX)$ is the width of the poset.
We show that the bounds above are essentially the best that can be
arXiv Detail & Related papers (2022-10-27T05:08:07Z) - Optimal Extragradient-Based Bilinearly-Coupled Saddle-Point Optimization [116.89941263390769]
We consider the smooth convex-concave bilinearly-coupled saddle-point problem, $min_mathbfxmax_mathbfyF(mathbfx) + H(mathbfx,mathbfy)$, where one has access to first-order oracles for $F$, $G$ as well as the bilinear coupling function $H$.
We present a emphaccelerated gradient-extragradient (AG-EG) descent-ascent algorithm that combines extragrad
arXiv Detail & Related papers (2022-06-17T06:10:20Z) - Statistical Inference of Constrained Stochastic Optimization via Sketched Sequential Quadratic Programming [53.63469275932989]
We consider online statistical inference of constrained nonlinear optimization problems.
We apply the Sequential Quadratic Programming (StoSQP) method to solve these problems.
arXiv Detail & Related papers (2022-05-27T00:34:03Z) - Last iterate convergence of SGD for Least-Squares in the Interpolation
regime [19.05750582096579]
We study the noiseless model in the fundamental least-squares setup.
We assume that an optimum predictor fits perfectly inputs and outputs $langle theta_*, phi(X) rangle = Y$, where $phi(X)$ stands for a possibly infinite dimensional non-linear feature map.
arXiv Detail & Related papers (2021-02-05T14:02:20Z) - Parameter-free Stochastic Optimization of Variationally Coherent
Functions [19.468067110814808]
We design and analyze an algorithm for first-order optimization of a class of functions on $mathbbRdilon.
It is the first to achieve both these at the same time.
arXiv Detail & Related papers (2021-01-30T15:05:34Z) - Optimal Mean Estimation without a Variance [103.26777953032537]
We study the problem of heavy-tailed mean estimation in settings where the variance of the data-generating distribution does not exist.
We design an estimator which attains the smallest possible confidence interval as a function of $n,d,delta$.
arXiv Detail & Related papers (2020-11-24T22:39:21Z) - Thresholded Lasso Bandit [70.17389393497125]
Thresholded Lasso bandit is an algorithm that estimates the vector defining the reward function as well as its sparse support.
We establish non-asymptotic regret upper bounds scaling as $mathcalO( log d + sqrtT )$ in general, and as $mathcalO( log d + sqrtT )$ under the so-called margin condition.
arXiv Detail & Related papers (2020-10-22T19:14:37Z) - Hybrid Stochastic-Deterministic Minibatch Proximal Gradient:
Less-Than-Single-Pass Optimization with Nearly Optimal Generalization [83.80460802169999]
We show that HSDMPG can attain an $mathcalObig (1/sttnbig)$ which is at the order of excess error on a learning model.
For loss factors, we prove that HSDMPG can attain an $mathcalObig (1/sttnbig)$ which is at the order of excess error on a learning model.
arXiv Detail & Related papers (2020-09-18T02:18:44Z) - Tensor optimal transport, distance between sets of measures and tensor
scaling [0.0]
This is a linear programming problem on $d$-tensors.
We show that this algorithm can be viewed as a partial minimization algorithm of a strictly convex function.
arXiv Detail & Related papers (2020-05-02T23:49:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.