C-kNN-LSH: A Nearest-Neighbor Algorithm for Sequential Counterfactual Inference
- URL: http://arxiv.org/abs/2602.02371v1
- Date: Mon, 02 Feb 2026 17:35:57 GMT
- Title: C-kNN-LSH: A Nearest-Neighbor Algorithm for Sequential Counterfactual Inference
- Authors: Jing Wang, Jie Shen, Qiaomin Xie, Jeremy C Weiss
- Abstract summary: Estimating causal effects from longitudinal trajectories is central to understanding the progression of complex conditions. We introduce C-kNN-LSH, a nearest-neighbor framework for sequential causal inference.
- Score: 18.510457926165373
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Estimating causal effects from longitudinal trajectories is central to understanding the progression of complex conditions, such as comorbidities and Long COVID recovery, and to optimizing clinical decision-making. We introduce C-kNN-LSH, a nearest-neighbor framework for sequential causal inference designed to handle such high-dimensional, confounded settings. By using locality-sensitive hashing, we efficiently identify "clinical twins" with similar covariate histories, enabling local estimation of conditional treatment effects across evolving disease states. To mitigate bias from irregular sampling and shifting patient recovery profiles, we integrate the neighborhood estimator with a doubly-robust correction. Theoretical analysis guarantees that our estimator is consistent and second-order robust to nuisance error. Evaluated on a real-world Long COVID cohort with 13,511 participants, C-kNN-LSH demonstrates superior performance in capturing recovery heterogeneity and estimating policy values compared to existing baselines.
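The pipeline the abstract outlines (hash covariates, retrieve "clinical twins", apply a doubly-robust correction) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single decision point rather than a sequential setting, uses sign-random-projection hashing as one possible LSH family, takes the propensity as known (0.5) on synthetic data, and fits linear outcome models. All function names (`lsh_codes`, `twins`, `fit_linear`, `local_dr_effect`) are hypothetical.

```python
import numpy as np

def lsh_codes(X, n_bits=4, seed=0):
    """Sign-random-projection LSH: one integer bucket code per row of X."""
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((X.shape[1], n_bits))  # random hyperplanes
    bits = (X @ planes > 0).astype(np.int64)            # side of each hyperplane
    return bits @ (1 << np.arange(n_bits))              # pack bits into an integer

def twins(X, codes, i, k):
    """Indices of up to k rows nearest to X[i] among rows sharing its bucket."""
    cand = np.flatnonzero(codes == codes[i])
    cand = cand[cand != i]                              # exclude the query itself
    dist = np.linalg.norm(X[cand] - X[i], axis=1)
    return cand[np.argsort(dist)[:k]]

def fit_linear(X, y):
    """Least-squares outcome model with intercept; returns a predictor."""
    design = lambda Z: np.column_stack([np.ones(len(Z)), Z])
    coef, *_ = np.linalg.lstsq(design(X), y, rcond=None)
    return lambda Z: design(Z) @ coef

def local_dr_effect(A, Y, mu0, mu1, e, nb):
    """Average doubly-robust (AIPW) pseudo-outcomes over neighbor indices nb."""
    a, y = A[nb], Y[nb]
    psi = (mu1[nb] - mu0[nb]
           + a * (y - mu1[nb]) / e[nb]
           - (1 - a) * (y - mu0[nb]) / (1 - e[nb]))
    return psi.mean()

# Synthetic data (assumption): randomized binary treatment, true effect 2.0.
rng = np.random.default_rng(0)
n, d = 2000, 5
X = rng.standard_normal((n, d))
A = rng.integers(0, 2, size=n)
Y = X[:, 0] + 2.0 * A + 0.1 * rng.standard_normal(n)

# Nuisance estimates: per-arm linear outcome models and a known propensity.
mu1 = fit_linear(X[A == 1], Y[A == 1])(X)
mu0 = fit_linear(X[A == 0], Y[A == 0])(X)
e = np.full(n, 0.5)

codes = lsh_codes(X)
nb = twins(X, codes, i=0, k=25)
tau_hat = local_dr_effect(A, Y, mu0, mu1, e, nb)
print(f"{nb.size} twins, local effect estimate {tau_hat:.2f}")
```

Hashing restricts the exact distance computation to one bucket, which is what keeps the neighbor search cheap. The correction terms add back the weighted residuals of the outcome models, so the local estimate remains reasonable when either the outcome model or the propensity is well specified.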
Related papers
- Conditional Counterfactual Mean Embeddings: Doubly Robust Estimation and Learning Rates [0.0]
We propose a framework that embeds conditional distributions of counterfactual outcomes into a reproducing kernel Hilbert space (RKHS). Under this framework, we develop a two-stage meta-estimator for CCME that accommodates any RKHS-valued regression in each stage. Our experiments demonstrate that our estimators accurately recover distributional features, including the multimodal structure of conditional counterfactual distributions.
arXiv Detail & Related papers (2026-02-04T16:40:29Z) - Direct Doubly Robust Estimation of Conditional Quantile Contrasts [8.035129168712334]
The conditional quantile comparator (CQC) offers quantile-level summaries akin to the conditional quantile treatment effect (CQTE). Current estimation is limited by the need to first estimate the difference in conditional cumulative distribution functions and then invert it. We propose the first direct estimator of the CQC, allowing for explicit modelling and parameterisation.
arXiv Detail & Related papers (2026-01-27T14:46:25Z) - Learning Causality for Longitudinal Data [1.2691047660244335]
This thesis develops methods for causal inference and causal representation learning (CRL) in high-dimensional, time-varying data. The first contribution introduces the Causal Dynamic Variational Autoencoder (CDVAE), a model for estimating Individual Treatment Effects (ITEs). The second contribution proposes an efficient framework for long-term counterfactual regression based on RNNs enhanced with Contrastive Predictive Coding (CPC) and InfoMax. The third contribution advances CRL by addressing how latent causes manifest in observed variables.
arXiv Detail & Related papers (2025-12-04T16:51:49Z) - Learning bounds for doubly-robust covariate shift adaptation [8.24901041136559]
Distribution shift between the training domain and the test domain poses a key challenge for machine learning. The doubly-robust (DR) estimator combines density ratio estimation with a pilot regression model. This paper establishes the first non-asymptotic learning bounds for the DR estimator.
arXiv Detail & Related papers (2025-11-14T06:46:23Z) - Revisiting Zeroth-Order Optimization: Minimum-Variance Two-Point Estimators and Directionally Aligned Perturbations [57.179679246370114]
We identify the distribution of random perturbations that minimizes the estimator's variance as the perturbation stepsize tends to zero. Our findings reveal that such desired perturbations can align directionally with the true gradient, instead of maintaining a fixed length.
arXiv Detail & Related papers (2025-10-22T19:06:39Z) - A Hybrid Enumeration Framework for Optimal Counterfactual Generation in Post-Acute COVID-19 Heart Failure [1.9794615806637272]
We present a counterfactual inference framework for individualized risk estimation and intervention analysis. The framework combines exact enumeration with optimization-based methods, including the Nearest Instance Counterfactual Explanations (NICE) and Multi-Objective Counterfactuals (MOC) algorithms.
arXiv Detail & Related papers (2025-10-21T17:35:12Z) - Classifying Clinical Outcome of Epilepsy Patients with Ictal Chirp Embeddings [0.23020018305241333]
This study presents a pipeline leveraging t-Distributed Stochastic Neighbor Embedding (t-SNE) for interpretable visualizations of chirp features across diverse outcome scenarios. The dataset comprises chirp-based temporal, spectral, and frequency metrics. Using t-SNE, local neighborhood relationships were preserved while addressing the crowding problem.
arXiv Detail & Related papers (2025-08-19T03:14:41Z) - In-Context Parametric Inference: Point or Distribution Estimators? [66.22308335324239]
Our experiments indicate that amortized point estimators generally outperform posterior inference, though the latter remains competitive in some low-dimensional problems.
arXiv Detail & Related papers (2025-02-17T10:00:24Z) - A Partial Initialization Strategy to Mitigate the Overfitting Problem in CATE Estimation with Hidden Confounding [44.874826691991565]
Estimating the conditional average treatment effect (CATE) from observational data plays a crucial role in areas such as e-commerce, healthcare, and economics. Existing studies mainly rely on the strong ignorability assumption that there are no hidden confounders. Data collected from randomized controlled trials (RCTs) do not suffer from confounding but are usually limited by a small sample size.
arXiv Detail & Related papers (2025-01-15T15:58:16Z) - Doubly Robust Proximal Causal Learning for Continuous Treatments [56.05592840537398]
We propose a kernel-based doubly robust causal learning estimator for continuous treatments.
We show that its oracle form is a consistent approximation of the influence function.
We then provide a comprehensive convergence analysis in terms of the mean square error.
arXiv Detail & Related papers (2023-09-22T12:18:53Z) - L-C2ST: Local Diagnostics for Posterior Approximations in Simulation-Based Inference [63.22081662149488]
L-C2ST allows for a local evaluation of the posterior estimator at any given observation.
It offers theoretically grounded and easy-to-interpret diagnostics.
On standard SBI benchmarks, L-C2ST provides comparable results to C2ST and outperforms alternative local approaches.
arXiv Detail & Related papers (2023-06-06T10:53:26Z) - Cross-Site Severity Assessment of COVID-19 from CT Images via Domain Adaptation [64.59521853145368]
Early and accurate severity assessment of Coronavirus disease 2019 (COVID-19) based on computed tomography (CT) images greatly helps the estimation of intensive care unit events.
To augment the labeled data and improve the generalization ability of the classification model, it is necessary to aggregate data from multiple sites.
This task faces several challenges including class imbalance between mild and severe infections, domain distribution discrepancy between sites, and presence of heterogeneous features.
arXiv Detail & Related papers (2021-09-08T07:56:51Z) - Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality [131.45028999325797]
We develop a doubly robust off-policy AC (DR-Off-PAC) for discounted MDP.
DR-Off-PAC adopts a single timescale structure, in which both actor and critics are updated simultaneously with constant stepsize.
We study the finite-time convergence rate and characterize the sample complexity for DR-Off-PAC to attain an $\epsilon$-accurate optimal policy.
arXiv Detail & Related papers (2021-02-23T18:56:13Z)