Chi-square and normal inference in high-dimensional multi-task
regression
- URL: http://arxiv.org/abs/2107.07828v1
- Date: Fri, 16 Jul 2021 11:19:49 GMT
- Title: Chi-square and normal inference in high-dimensional multi-task
regression
- Authors: Pierre C Bellec, Gabriel Romon
- Abstract summary: The paper proposes chi-square and normal methodologies for the unknown coefficient matrix $B^*$ of size $p\times T$ in a Multi-Task (MT) linear model.
- Score: 7.310043452300736
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The paper proposes chi-square and normal inference methodologies for the
unknown coefficient matrix $B^*$ of size $p\times T$ in a Multi-Task (MT)
linear model with $p$ covariates, $T$ tasks and $n$ observations under a
row-sparse assumption on $B^*$. The row-sparsity $s$, dimension $p$ and number
of tasks $T$ are allowed to grow with $n$. In the high-dimensional regime
$p\ggg n$, in order to leverage row-sparsity, the MT Lasso is considered.
We build upon the MT Lasso with a de-biasing scheme to correct for the bias
induced by the penalty. This scheme requires the introduction of a new
data-driven object, coined the interaction matrix, that captures effective
correlations between the noise vectors and the residuals across tasks. This
matrix is positive semi-definite (psd), of size $T\times T$ and can be computed efficiently.
The interaction matrix lets us derive asymptotic normal and $\chi^2_T$
results under Gaussian design and $\frac{sT+s\log(p/s)}{n}\to0$ which
corresponds to consistency in Frobenius norm. These asymptotic distribution
results yield valid confidence intervals for single entries of $B^*$ and valid
confidence ellipsoids for single rows of $B^*$, for both known and unknown
design covariance $\Sigma$. While previous proposals in grouped-variables
regression require row-sparsity $s\lesssim\sqrt n$ up to constants depending on
$T$ and logarithmic factors in $n,p$, the de-biasing scheme using the
interaction matrix provides confidence intervals and $\chi^2_T$ confidence
ellipsoids under the conditions ${\min(T^2,\log^8p)}/{n}\to 0$ and $$
\frac{sT+s\log(p/s)+\|\Sigma^{-1}e_j\|_0\log p}{n}\to0, \quad
\frac{\min(s,\|\Sigma^{-1}e_j\|_0)}{\sqrt n} \sqrt{[T+\log(p/s)]\log p}\to 0,
$$ allowing row-sparsity $s\ggg\sqrt n$ when $\|\Sigma^{-1}e_j\|_0 \sqrt T\lll
\sqrt{n}$ up to logarithmic factors.
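The pipeline the abstract describes, fitting the MT Lasso under a row-sparse penalty and then forming a $T\times T$ psd matrix from the residuals, can be sketched in numpy. Everything here is an illustrative assumption: the proximal-gradient solver, the penalty level `lam`, and the dimensions are not from the paper, and the residual Gram matrix `M` is only a stand-in of the right shape and psd type for the paper's interaction matrix, whose exact data-driven formula is given in the paper itself.

```python
import numpy as np

def multi_task_lasso(X, Y, lam, n_iter=500):
    """Row-sparse MT Lasso via proximal gradient (ISTA):
    min_B (1/(2n)) ||Y - XB||_F^2 + lam * sum_j ||B[j, :]||_2."""
    n, p = X.shape
    T = Y.shape[1]
    B = np.zeros((p, T))
    step = n / np.linalg.norm(X, 2) ** 2   # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        G = X.T @ (X @ B - Y) / n          # gradient of the smooth part
        B_half = B - step * G
        row_norms = np.linalg.norm(B_half, axis=1, keepdims=True)
        shrink = np.maximum(0.0, 1.0 - step * lam / np.maximum(row_norms, 1e-12))
        B = shrink * B_half                # row-wise group soft-thresholding
    return B

rng = np.random.default_rng(0)
n, p, T, s = 200, 50, 3, 4                 # illustrative sizes, s-row-sparse truth
X = rng.standard_normal((n, p))            # Gaussian design
B_star = np.zeros((p, T))
B_star[:s] = rng.standard_normal((s, T))
Y = X @ B_star + 0.1 * rng.standard_normal((n, T))

B_hat = multi_task_lasso(X, Y, lam=0.05)
F = Y - X @ B_hat                          # n x T residual matrix
M = F.T @ F / n                            # T x T psd matrix built from residuals
```

The row-wise soft-thresholding is what enforces row-sparsity: a whole row of $B$ is zeroed when its norm falls below the threshold, which is the grouped analogue of the scalar Lasso shrinkage.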
Related papers
- Optimal Sketching for Residual Error Estimation for Matrix and Vector Norms [50.15964512954274]
We study the problem of residual error estimation for matrix and vector norms using a linear sketch.
We demonstrate that this gives a substantial advantage empirically, for roughly the same sketch size and accuracy as in previous work.
We also show an $\Omega(k^{2/p} n^{1-2/p})$ lower bound for the sparse recovery problem, which is tight up to a $\mathrm{poly}(\log n)$ factor.
arXiv Detail & Related papers (2024-08-16T02:33:07Z)
- A Nearly-Optimal Bound for Fast Regression with $\ell_\infty$ Guarantee [16.409210914237086]
Given a matrix $A\in \mathbb{R}^{n\times d}$ and a vector $b\in \mathbb{R}^n$, we consider the regression problem with $\ell_\infty$ guarantees.
We show that in order to obtain such an $\ell_\infty$ guarantee for $\ell_2$ regression, one has to use sketching matrices that are dense.
We also develop a novel analytical framework for $\ell_\infty$ guarantee regression that utilizes the Oblivious Coordinate-wise Embedding (OCE) property.
arXiv Detail & Related papers (2023-02-01T05:22:40Z)
- Optimal Query Complexities for Dynamic Trace Estimation [59.032228008383484]
We consider the problem of minimizing the number of matrix-vector queries needed for accurate trace estimation in the dynamic setting where our underlying matrix is changing slowly.
We provide a novel binary tree summation procedure that simultaneously estimates all $m$ traces up to $\epsilon$ error with $\delta$ failure probability.
Our lower bounds (1) give the first tight bounds for Hutchinson's estimator in the matrix-vector product model with Frobenius norm error even in the static setting, and (2) are the first unconditional lower bounds for dynamic trace estimation.
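Hutchinson's estimator, named in the lower bounds above, approximates $\mathrm{tr}(A)$ using only matrix-vector products. A minimal static-setting sketch follows; the matrix $A$, the probe count `m`, and the seed are all illustrative choices, not values from the paper.

```python
import numpy as np

def hutchinson_trace(matvec, n, m, rng):
    """Estimate tr(A) from m matrix-vector products with Rademacher probes:
    E[g^T A g] = tr(A) when g has i.i.d. +/-1 entries."""
    total = 0.0
    for _ in range(m):
        g = rng.choice([-1.0, 1.0], size=n)
        total += g @ matvec(g)             # one matrix-vector query per probe
    return total / m

rng = np.random.default_rng(1)
n = 20
Q = rng.standard_normal((n, n))
A = 10.0 * np.eye(n) + 0.1 * (Q + Q.T)     # symmetric test matrix
est = hutchinson_trace(lambda v: A @ v, n, m=4000, rng=rng)
```

The estimator's variance scales with the off-diagonal Frobenius mass of $A$, which is why the dynamic setting (a slowly changing $A$) admits savings: successive estimates can share probes.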
arXiv Detail & Related papers (2022-09-30T04:15:44Z)
- A spectral least-squares-type method for heavy-tailed corrupted regression with unknown covariance & heterogeneous noise [2.019622939313173]
We revisit heavy-tailed corrupted least-squares linear regression, assuming a corrupted $n$-sized label-feature sample containing at most $\epsilon n$ arbitrary outliers.
We propose a near-optimal computationally tractable estimator, based on the power method, assuming no knowledge of $(\Sigma,\Xi)$ nor of the operator norm of $\Xi$.
arXiv Detail & Related papers (2022-09-06T23:37:31Z)
- Active Sampling for Linear Regression Beyond the $\ell_2$ Norm [70.49273459706546]
We study active sampling algorithms for linear regression, which aim to query only a small number of entries of a target vector.
We show that this dependence on $d$ is optimal, up to logarithmic factors.
We also provide the first total sensitivity upper bound $O(d^{\max\{1,p/2\}}\log^2 n)$ for loss functions with at most degree $p$ growth.
arXiv Detail & Related papers (2021-11-09T00:20:01Z)
- Spectral properties of sample covariance matrices arising from random matrices with independent non identically distributed columns [50.053491972003656]
It was previously shown that the functionals $\mathrm{tr}(AR(z))$, for $R(z) = (\frac{1}{n}XX^T - zI_p)^{-1}$ and $A\in \mathcal{M}_p$ deterministic, have a standard deviation of order $O(\|A\|_* / \sqrt n)$.
Here, we show a bound on $\|\mathbb{E}[R(z)] - \tilde R(z)\|_F$.
arXiv Detail & Related papers (2021-09-06T14:21:43Z)
- Sparse sketches with small inversion bias [79.77110958547695]
Inversion bias arises when averaging estimates of quantities that depend on the inverse covariance.
We develop a framework for analyzing inversion bias, based on our proposed concept of an $(\epsilon,\delta)$-unbiased estimator for random matrices.
We show that when the sketching matrix $S$ is dense and has i.i.d. sub-gaussian entries, the estimator is $(\epsilon,\delta)$-unbiased for $(A^\top A)^{-1}$ with a sketch of size $m=O(d+\sqrt d/\epsilon)$.
arXiv Detail & Related papers (2020-11-21T01:33:15Z)
- The Average-Case Time Complexity of Certifying the Restricted Isometry Property [66.65353643599899]
In compressed sensing, the restricted isometry property (RIP) on $M\times N$ sensing matrices guarantees efficient reconstruction of sparse vectors.
We investigate the exact average-case time complexity of certifying the RIP property for $M\times N$ matrices with i.i.d. $\mathcal{N}(0,1/M)$ entries.
arXiv Detail & Related papers (2020-05-22T16:55:01Z)
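Certifying RIP is hard because the restricted isometry constant $\delta_k$ quantifies over all $\binom{N}{k}$ column subsets, so the naive certificate is exponential in $k$. A brute-force sketch (with illustrative sizes, not ones from the paper) makes the definition concrete:

```python
import numpy as np
from itertools import combinations

def rip_constant(A, k):
    """Brute-force restricted isometry constant delta_k of A: the smallest
    delta with (1-delta)||x||^2 <= ||Ax||^2 <= (1+delta)||x||^2 for all
    k-sparse x.  Checks all C(N, k) column subsets, hence exponential in k."""
    N = A.shape[1]
    delta = 0.0
    for S in combinations(range(N), k):
        G = A[:, S].T @ A[:, S]            # k x k Gram matrix of the submatrix
        eig = np.linalg.eigvalsh(G)
        delta = max(delta, abs(eig[0] - 1.0), abs(eig[-1] - 1.0))
    return delta

rng = np.random.default_rng(2)
M, N, k = 100, 12, 2
A = rng.normal(0.0, 1.0 / np.sqrt(M), size=(M, N))  # i.i.d. N(0, 1/M) entries
delta_k = rip_constant(A, k)
```

With variance $1/M$ the columns have unit norm in expectation, so for $M$ large relative to $k\log(N/k)$ the computed $\delta_k$ is small with high probability, even though certifying this efficiently is exactly the hard problem the paper studies.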
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information listed here and is not responsible for any consequences of its use.