Related papers: Two-Dimensional Drift Analysis: Optimizing Two Functions Simultaneously Can Be Hard

URL: http://arxiv.org/abs/2203.14547v2
Date: Wed, 10 May 2023 07:47:50 GMT
Title: Two-Dimensional Drift Analysis: Optimizing Two Functions Simultaneously Can Be Hard
Authors: Duri Janett, Johannes Lengler
Abstract summary: We show how to use drift analysis in the case of two random variables. We analyze a minimal example, TwoLinear, of a dynamic environment that can be hard.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In this paper we show how to use drift analysis in the case of two random variables $X_1, X_2$, when the drift is approximately given by $A\cdot (X_1,X_2)^T$ for a matrix $A$. The non-trivial case is that $X_1$ and $X_2$ impede each other's progress, and we give a full characterization of this case. As an application, we develop and analyze a minimal example TwoLinear of a dynamic environment that can be hard. The environment consists of two linear functions $f_1$ and $f_2$ with positive weights $1$ and $n$, and in each generation selection is based on one of them at random. The two functions differ only in the set of positions that have weight $1$ and $n$. We show that the $(1+1)$-EA with mutation rate $\chi/n$ is efficient for small $\chi$ on TwoLinear, but does not find the shared optimum in polynomial time for large $\chi$.
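
The setup described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code. It makes several assumptions the abstract does not state: $f_1$ puts weight $n$ on the first half of the bit positions and weight $1$ on the second half, $f_2$ swaps the two halves, the start point is uniformly random, and ties are accepted. The names `two_linear_value`, `mutate`, and `one_plus_one_ea` are hypothetical.

```python
import random


def two_linear_value(x, which):
    """Evaluate f_1 (which == 1) or f_2 (which == 2) on the bit list x.

    Assumption: f_1 gives weight n to the first half of the positions and
    weight 1 to the second half; f_2 uses the swapped weights.
    """
    n = len(x)
    half = n // 2
    first, second = sum(x[:half]), sum(x[half:])
    if which == 1:
        return n * first + second
    return first + n * second


def mutate(x, chi, rng):
    """Standard bit mutation: flip each bit independently with probability chi/n."""
    p = chi / len(x)
    return [b ^ 1 if rng.random() < p else b for b in x]


def one_plus_one_ea(n, chi, max_generations, seed=0):
    """Run the (1+1)-EA on the dynamic environment.

    Returns the number of generations used to reach the all-ones string
    (the shared optimum of f_1 and f_2), or None if the budget runs out.
    """
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n)]  # assumed uniform random start
    for generation in range(max_generations):
        if sum(x) == n:
            return generation
        y = mutate(x, chi, rng)
        which = rng.choice((1, 2))  # selection uses f_1 or f_2 at random
        if two_linear_value(y, which) >= two_linear_value(x, which):
            x = y  # accept the offspring if it is at least as good
    return max_generations if sum(x) == n else None


if __name__ == "__main__":
    for chi in (0.5, 1.0, 3.0):
        print(chi, one_plus_one_ea(n=50, chi=chi, max_generations=200_000))
```

Because the two functions weight different halves heavily, an offspring that gains a bit in the heavy half while losing several in the light half is accepted under one function even though it moves away from the shared optimum. Small experiments with this sketch only illustrate the mechanism. They do not reproduce the asymptotic thresholds in $\chi$ proved in the paper.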

Related papers

Sparse Linear Regression is Easy on Random Supports [20.128442161507582]
We are given as input a design matrix $X \in \mathbb{R}^{N \times d}$ and measurements or labels $y \in \mathbb{R}^N$. We find that if the support of $w^*$ is chosen at random, we can get prediction error $\epsilon$ with roughly $N = O(k \log d/\epsilon)$ samples.
arXiv Detail & Related papers (2025-11-09T03:48:21Z)
LevAttention: Time, Space, and Streaming Efficient Algorithm for Heavy Attentions [54.54897832889028]
We show that for any $K$, there is a "universal set" $U \subset [n]$ of size independent of $n$, such that for any $Q$ and any row $i$, the large attention scores $A_{i,j}$ in row $i$ of $A$ all have $j \in U$. We empirically show the benefits of our scheme for vision transformers, showing how to train new models that use our universal set while training as well.
arXiv Detail & Related papers (2024-10-07T19:47:13Z)
Optimal Sketching for Residual Error Estimation for Matrix and Vector Norms [50.15964512954274]
We study the problem of residual error estimation for matrix and vector norms using a linear sketch. We demonstrate that this gives a substantial advantage empirically, for roughly the same sketch size and accuracy as in previous work. We also show an $\Omega(k^{2/p} n^{1-2/p})$ lower bound for the sparse recovery problem, which is tight up to a $\mathrm{poly}(\log n)$ factor.
arXiv Detail & Related papers (2024-08-16T02:33:07Z)
Outlier Robust Multivariate Polynomial Regression [27.03423421704806]
We are given a set of random samples $(\mathbf{x}_i, y_i) \in [-1,1]^n \times \mathbb{R}$ that are noisy versions of $(\mathbf{x}_i, p(\mathbf{x}_i))$. The goal is to output a $\hat{p}$ within an $\ell_\infty$-distance of at most $O(\sigma)$ from $p$.
arXiv Detail & Related papers (2024-03-14T15:04:45Z)
A Whole New Ball Game: A Primal Accelerated Method for Matrix Games and Minimizing the Maximum of Smooth Functions [44.655316553524855]
We design algorithms for minimizing $\max_{i\in[n]} f_i(x)$ over a $d$-dimensional Euclidean or simplex domain. When each $f_i$ is $1$-Lipschitz and $1$-smooth, our method computes an $\epsilon$-approximate solution.
arXiv Detail & Related papers (2023-11-17T22:07:18Z)
Optimal Query Complexities for Dynamic Trace Estimation [59.032228008383484]
We consider the problem of minimizing the number of matrix-vector queries needed for accurate trace estimation in the dynamic setting where our underlying matrix is changing slowly. We provide a novel binary tree summation procedure that simultaneously estimates all $m$ traces up to $\epsilon$ error with $\delta$ failure probability. Our lower bounds (1) give the first tight bounds for Hutchinson's estimator in the matrix-vector product model with Frobenius norm error even in the static setting, and (2) are the first unconditional lower bounds for dynamic trace estimation.
arXiv Detail & Related papers (2022-09-30T04:15:44Z)
Self-adjusting Population Sizes for the $(1, \lambda)$-EA on Monotone Functions [7.111443975103329]
We study the $(1,\lambda)$-EA with mutation rate $c/n$ for $c \le 1$, where the population size is adaptively controlled with the $(1:s+1)$-success rule. We show that this setup with $c=1$ is efficient on OneMax for $s<1$, but inefficient if $s \ge 18$.
arXiv Detail & Related papers (2022-04-01T15:46:12Z)
TURF: A Two-factor, Universal, Robust, Fast Distribution Learning Algorithm [64.13217062232874]
One of its most powerful and successful modalities approximates every distribution to an $\ell$ distance essentially at most a constant times larger than its closest $t$-piece degree-$d$. We provide a method that estimates this number near-optimally, hence helps approach the best possible approximation.
arXiv Detail & Related papers (2022-02-15T03:49:28Z)
Mediated Uncoupled Learning: Learning Functions without Direct Input-output Correspondences [80.95776331769899]
We consider the task of predicting $Y$ from $X$ when we have no paired data of them. A naive approach is to predict $U$ from $X$ using $S_X$ and then $Y$ from $U$ using $S_Y$. We propose a new method that avoids predicting $U$ but directly learns $Y = f(X)$ by training $f(X)$ with $S_X$ to predict $h(U)$.
arXiv Detail & Related papers (2021-07-16T22:13:29Z)
Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$. We propose estimators for this problem under two settings: (i) $X$ is L4-L2 hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number and $\epsilon$ has bounded variance and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is
arXiv Detail & Related papers (2020-07-16T06:44:44Z)
Learning Mixtures of Spherical Gaussians via Fourier Analysis [0.5381004207943596]
We find that a bound on the sample and computational complexity was previously unknown when $\omega(1) \leq d \leq O(\log k)$. These authors also show that the sample of complexity of a random mixture of gaussians in a ball of radius $d$ in $d$ dimensions, when $d$ is $\Theta(\sqrt{d})$ in $d$ dimensions, when $d$ is at least $\mathrm{poly}(k, \frac{1}{\delta})$.
arXiv Detail & Related papers (2020-04-13T08:06:29Z)

This list is automatically generated from the titles and abstracts of the papers on this site.