Demystifying Spectral Feature Learning for Instrumental Variable Regression
- URL: http://arxiv.org/abs/2506.10899v2
- Date: Mon, 22 Sep 2025 20:31:39 GMT
- Title: Demystifying Spectral Feature Learning for Instrumental Variable Regression
- Authors: Dimitri Meunier, Antoine Moulin, Jakub Wornbard, Vladimir R. Kostic, Arthur Gretton
- Abstract summary: We derive a generalization error bound for a two-stage least squares estimator based on spectral features. We show that performance depends on two key factors, leading to a clear taxonomy of outcomes.
- Score: 21.514407689434364
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We address the problem of causal effect estimation in the presence of hidden confounders, using nonparametric instrumental variable (IV) regression. A leading strategy employs spectral features - that is, learned features spanning the top eigensubspaces of the operator linking treatments to instruments. We derive a generalization error bound for a two-stage least squares estimator based on spectral features, and gain insights into the method's performance and failure modes. We show that performance depends on two key factors, leading to a clear taxonomy of outcomes. In a good scenario, the approach is optimal. This occurs with strong spectral alignment, meaning the structural function is well-represented by the top eigenfunctions of the conditional operator, coupled with this operator's slow eigenvalue decay, indicating a strong instrument. Performance degrades in a bad scenario: spectral alignment remains strong, but rapid eigenvalue decay (indicating a weaker instrument) demands significantly more samples for effective feature learning. Finally, in the ugly scenario, weak spectral alignment causes the method to fail, regardless of the eigenvalues' characteristics. Our synthetic experiments empirically validate this taxonomy.
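To make the pipeline described in the abstract concrete, here is a minimal sketch of two-stage least squares on spectral features. It is not the authors' implementation. It assumes fixed polynomial dictionaries in place of learned features, approximates the top singular functions of the conditional expectation operator by an SVD of the whitened empirical cross-covariance, and uses a jointly Gaussian toy design with structural function h(x) = x. All function names and constants are hypothetical.

```python
import numpy as np

def poly_features(v, degree=3):
    # Assumed fixed dictionary (monomials 1, v, ..., v^degree), standing in for learned features.
    return np.vander(v, degree + 1, increasing=True)

def whiten(F, ridge=1e-8):
    # Return W such that F @ W has approximately identity empirical second moment.
    C = F.T @ F / F.shape[0]
    evals, evecs = np.linalg.eigh(C + ridge * np.eye(C.shape[1]))
    return evecs / np.sqrt(evals)

def spectral_features(Fx, Fz, d):
    # SVD of the whitened cross-covariance: the top-d singular vectors give
    # coefficient matrices for the treatment-side and instrument-side features.
    n = Fx.shape[0]
    Wx, Wz = whiten(Fx), whiten(Fz)
    cross = (Fz @ Wz).T @ (Fx @ Wx) / n
    U, s, Vt = np.linalg.svd(cross)
    return Wx @ Vt[:d].T, Wz @ U[:, :d], s

def two_stage_least_squares(phi_x, psi_z, y):
    # Stage 1: regress treatment features on instrument features.
    A, *_ = np.linalg.lstsq(psi_z, phi_x, rcond=None)
    # Stage 2: regress the outcome on the predicted treatment features.
    beta, *_ = np.linalg.lstsq(psi_z @ A, y, rcond=None)
    return beta

# Toy data (an assumption of this sketch): hidden confounder u, instrument z.
rng = np.random.default_rng(0)
n = 20000
z = rng.normal(size=n)
u = rng.normal(size=n)
x = z + 0.5 * u + 0.5 * rng.normal(size=n)
y = x + 2.0 * u + 0.1 * rng.normal(size=n)  # structural function h(x) = x

Fx, Fz = poly_features(x), poly_features(z)
Bx, Bz, s = spectral_features(Fx, Fz, d=2)
beta = two_stage_least_squares(Fx @ Bx, Fz @ Bz, y)

# Compare against ordinary least squares, which ignores the confounder.
grid = np.linspace(-1.0, 1.0, 101)
h_iv = poly_features(grid) @ Bx @ beta
ols_coef, *_ = np.linalg.lstsq(np.column_stack([np.ones(n), x]), y, rcond=None)
h_ols = ols_coef[0] + ols_coef[1] * grid
rmse_iv = np.sqrt(np.mean((h_iv - grid) ** 2))
rmse_ols = np.sqrt(np.mean((h_ols - grid) ** 2))

print("estimated singular values:", s)
print("RMSE of spectral 2SLS:", rmse_iv)
print("RMSE of OLS:", rmse_ols)
```

In this toy design the structural function lies in the span of the leading singular functions, which corresponds to the strong-alignment case. Shrinking the coefficient of `z` in `x` lowers the correlation between treatment and instrument, so the estimated singular values should decay faster, loosely mimicking a weaker instrument.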
Related papers
- Canonical correlation regression with noisy data [1.8620637029128544]
We analyze a family of estimators based on two stage least squares with spectral regularization.
As a theoretical contribution, we derive upper and lower bounds on estimation error, proving optimality of the method with noisy data.
As a practical contribution, we provide guidance on which types of spectral regularization to use in different regimes.
arXiv Detail & Related papers (2025-12-27T20:08:15Z) - Outcome-Aware Spectral Feature Learning for Instrumental Variable Regression [37.76825470697479]
We introduce Augmented Spectral Feature Learning, a framework that makes the feature learning process outcome-aware.
We provide a theoretical analysis of this framework and validate our approach on challenging benchmarks.
arXiv Detail & Related papers (2025-11-30T14:54:03Z) - STNet: Spectral Transformation Network for Solving Operator Eigenvalue Problem [10.27238431947351]
Operator eigenvalue problems play a critical role in various scientific fields and engineering applications.
Recent deep learning methods provide an efficient approach to address this challenge by iteratively updating neural networks.
We propose the Spectral Transformation Network (STNet), which consistently outperforms existing learning-based methods.
arXiv Detail & Related papers (2025-10-28T01:43:54Z) - Nonparametric Instrumental Variable Inference with Many Weak Instruments [38.841210420855276]
We study inference on linear functionals in the nonparametric instrumental variable (NPIV) problem with a discretely-valued instrument.
We construct automatic debiased machine learning estimators for linear functionals of both the structural function and its minimum-norm projection.
arXiv Detail & Related papers (2025-05-12T16:36:55Z) - Spectral Estimators for Multi-Index Models: Precise Asymptotics and Optimal Weak Recovery [21.414505380263016]
We focus on recovering the subspace spanned by the signals via spectral estimators.
Our main technical contribution is a precise characterization of the performance of spectral methods.
Our analysis unveils a phase transition phenomenon in which, as the sample complexity grows, eigenvalues escape from the bulk of the spectrum.
arXiv Detail & Related papers (2025-02-03T18:08:30Z) - Point-Calibrated Spectral Neural Operators [54.13671100638092]
We introduce Point-Calibrated Spectral Transform, which learns operator mappings by approximating functions with the point-level adaptive spectral basis.
Point-Calibrated Spectral Neural Operators learn operator mappings by approximating functions with the point-level adaptive spectral basis.
arXiv Detail & Related papers (2024-10-15T08:19:39Z) - On the Dynamics Under the Unhinged Loss and Beyond [104.49565602940699]
We introduce the unhinged loss, a concise loss function that offers more mathematical opportunities to analyze closed-form dynamics.
The unhinged loss allows for considering more practical techniques, such as time-varying learning rates and feature normalization.
arXiv Detail & Related papers (2023-12-13T02:11:07Z) - Hodge-Aware Contrastive Learning [101.56637264703058]
Simplicial complexes prove effective in modeling data with multiway dependencies.
We develop a contrastive self-supervised learning approach for processing simplicial data.
arXiv Detail & Related papers (2023-09-14T00:40:07Z) - Boosting Differentiable Causal Discovery via Adaptive Sample Reweighting [62.23057729112182]
Differentiable score-based causal discovery methods learn a directed acyclic graph from observational data.
We propose a model-agnostic framework to boost causal discovery performance by dynamically learning the adaptive weights for the Reweighted Score function, ReScore.
arXiv Detail & Related papers (2023-03-06T14:49:59Z) - Sharp Spectral Rates for Koopman Operator Learning [27.820383937933034]
We present for the first time non-asymptotic learning bounds for the Koopman eigenvalues and eigenfunctions.
Our results shed new light on the emergence of spurious eigenvalues.
arXiv Detail & Related papers (2023-02-03T21:19:56Z) - Spectral Feature Augmentation for Graph Contrastive Learning and Beyond [64.78221638149276]
We present a novel spectral feature augmentation for contrastive learning on graphs (and images).
For each data view, we estimate a low-rank approximation per feature map and subtract that approximation from the map to obtain its complement.
This is achieved by the incomplete power iteration proposed herein, a non-standard power iteration regime that yields two valuable byproducts (after merely one or two iterations).
Experiments on graph/image datasets show that our spectral feature augmentation outperforms baselines.
arXiv Detail & Related papers (2022-12-02T08:48:11Z) - Spectral Decomposition Representation for Reinforcement Learning [100.0424588013549]
We propose an alternative spectral method, Spectral Decomposition Representation (SPEDER), that extracts a state-action abstraction from the dynamics without inducing spurious dependence on the data collection policy.
A theoretical analysis establishes the sample efficiency of the proposed algorithm in both the online and offline settings.
An experimental investigation demonstrates superior performance over current state-of-the-art algorithms across several benchmarks.
arXiv Detail & Related papers (2022-08-19T19:01:30Z) - Harmless interpolation in regression and classification with structured features [21.064512161584872]
Overparametrized neural networks tend to perfectly fit noisy training data yet generalize well on test data.
We present a general and flexible framework for upper bounding regression and classification risk in a reproducing kernel Hilbert space.
arXiv Detail & Related papers (2021-11-09T15:12:26Z) - Bias-Variance Tradeoffs in Single-Sample Binary Gradient Estimators [100.58924375509659]
The straight-through (ST) estimator gained popularity due to its simplicity and efficiency.
Several techniques were proposed to improve over ST while keeping the same low computational complexity.
We conduct a theoretical analysis of the bias and variance of these methods in order to understand tradeoffs and verify originally claimed properties.
arXiv Detail & Related papers (2021-10-07T15:16:07Z) - Fine-grained Generalization Analysis of Vector-valued Learning [28.722350261462463]
We start the generalization analysis of regularized vector-valued learning algorithms by presenting bounds with a mild dependency on the output dimension and a fast rate on the sample size.
To understand the interaction between optimization and learning, we further use our results to derive the first bounds for descent with vector-valued functions.
As a byproduct, we derive a Rademacher complexity bound for loss function classes defined in terms of a general strongly convex function.
arXiv Detail & Related papers (2021-04-29T07:57:34Z)