Related papers: Personalized Treatment Effect Estimation from Unstructured Data

URL: http://arxiv.org/abs/2507.20993v1
Date: Mon, 28 Jul 2025 16:52:31 GMT
Title: Personalized Treatment Effect Estimation from Unstructured Data
Authors: Henri Arno, Thomas Demeester
Abstract summary: We introduce an approximate 'plug-in' method trained directly on the neural representations of unstructured data. We then introduce two theoretically grounded estimators that leverage structured measurements of the confounders during training. Our experiments on two benchmark datasets show that the plug-in method, directly trainable on large unstructured datasets, achieves strong empirical performance across all settings.
Score: 8.468367158186007
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Existing methods for estimating personalized treatment effects typically rely on structured covariates, limiting their applicability to unstructured data. Yet, leveraging unstructured data for causal inference has considerable application potential, for instance in healthcare, where clinical notes or medical images are abundant. To this end, we first introduce an approximate 'plug-in' method trained directly on the neural representations of unstructured data. However, when these fail to capture all confounding information, the method may be subject to confounding bias. We therefore introduce two theoretically grounded estimators that leverage structured measurements of the confounders during training, but allow estimating personalized treatment effects purely from unstructured inputs, while avoiding confounding bias. When these structured measurements are only available for a non-representative subset of the data, these estimators may suffer from sampling bias. To address this, we further introduce a regression-based correction that accounts for the non-uniform sampling, assuming the sampling mechanism is known or can be well-estimated. Our experiments on two benchmark datasets show that the plug-in method, directly trainable on large unstructured datasets, achieves strong empirical performance across all settings, despite its simplicity.
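The abstract does not spell out the form of the plug-in method, so the following is only a minimal sketch of one common reading: fit one outcome regression per treatment arm on the representation vectors and take the difference of the two predictions. The ridge regressors, the synthetic "embeddings", and all names (`fit_ridge`, `plug_in_cate`, `phi`) are illustrative assumptions, not the authors' implementation. Treatment is randomized in the toy data, so the sketch says nothing about the confounding bias the abstract warns about.

```python
import numpy as np

def fit_ridge(X, y, lam=1e-2):
    """Closed-form ridge regression with an intercept column appended."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    A = Xb.T @ Xb + lam * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)

def predict(w, X):
    """Apply ridge weights (last entry is the intercept) to new inputs."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ w

def plug_in_cate(phi, t, y, phi_new, lam=1e-2):
    """Difference of two outcome regressions, one fitted per treatment arm."""
    w1 = fit_ridge(phi[t == 1], y[t == 1], lam)  # treated arm
    w0 = fit_ridge(phi[t == 0], y[t == 0], lam)  # control arm
    return predict(w1, phi_new) - predict(w0, phi_new)

# Synthetic stand-in for neural representations of unstructured inputs
# (hypothetical data; a real pipeline would use encoder outputs here).
rng = np.random.default_rng(0)
n, d = 2000, 5
phi = rng.normal(size=(n, d))
t = rng.binomial(1, 0.5, size=n)              # randomized treatment
true_cate = 1.0 + 2.0 * phi[:, 0]             # effect varies with phi[:, 0]
y = phi @ np.ones(d) + t * true_cate + 0.1 * rng.normal(size=n)

cate_hat = plug_in_cate(phi, t, y, phi)
print("mean absolute error:", np.mean(np.abs(cate_hat - true_cate)))
```

Because the toy outcome is linear in `phi` and treatment is assigned at random, the two regressions are well specified and the estimated effects should track `true_cate` closely; with real representations that miss confounding information, this difference can be biased.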

Related papers

Biased Generalization in Diffusion Models [4.602851365305176]
Generalization in generative modeling is defined as the ability to learn an underlying distribution from a finite dataset and produce novel samples. In practice, training is often stopped at the minimum of the test loss, taken as an operational indicator of generalization. We challenge this viewpoint by identifying a phase of biased generalization during training, in which the model continues to decrease the test loss while favoring samples with anomalously high proximity to training data.
arXiv Detail & Related papers (2026-03-03T19:25:33Z)
Diffusion Reconstruction-based Data Likelihood Estimation for Core-Set Selection [32.39319533553288]
We propose a novel, theoretically grounded approach to estimate data likelihood via reconstruction deviation. We establish a formal connection between reconstruction error and data likelihood, grounded in the Evidence Lower Bound (ELBO) of Markovian diffusion processes. Experiments on ImageNet demonstrate that reconstruction deviation offers an effective scoring criterion.
arXiv Detail & Related papers (2025-11-24T16:25:34Z)
Understanding Data Influence with Differential Approximation [63.817689230826595]
We introduce a new formulation to approximate a sample's influence by accumulating the differences in influence between consecutive learning steps, which we term Diff-In. By employing second-order approximations, we approximate these difference terms with high accuracy while eliminating the need for model convexity required by existing methods. Our theoretical analysis demonstrates that Diff-In achieves significantly lower approximation error compared to existing influence estimators.
arXiv Detail & Related papers (2025-08-20T11:59:32Z)
Simulating Biases for Interpretable Fairness in Offline and Online Classifiers [0.35998666903987897]
Mitigation methods are critical to ensure that model outcomes are adjusted to be fair. We develop a framework for synthetic dataset generation with controllable bias injection. In experiments, both offline and online learning approaches are employed.
arXiv Detail & Related papers (2025-07-14T11:04:24Z)
Robust Molecular Property Prediction via Densifying Scarce Labeled Data [51.55434084913129]
In drug discovery, compounds most critical for advancing research often lie beyond the training set. We propose a novel meta-learning-based approach that leverages unlabeled data to interpolate between in-distribution (ID) and out-of-distribution (OOD) data. We demonstrate significant performance gains on challenging real-world datasets.
arXiv Detail & Related papers (2025-06-13T15:27:40Z)
A Unifying Framework for Robust and Efficient Inference with Unstructured Data [2.07180164747172]
This paper presents a general framework for conducting efficient inference on parameters derived from unstructured data. We formalize this approach with MAR-S, a framework that unifies and extends existing methods for debiased inference. Within this framework, we develop robust and efficient estimators for both descriptive and causal estimands.
arXiv Detail & Related papers (2025-05-01T04:11:25Z)
A Partial Initialization Strategy to Mitigate the Overfitting Problem in CATE Estimation with Hidden Confounding [44.874826691991565]
Estimating the conditional average treatment effect (CATE) from observational data plays a crucial role in areas such as e-commerce, healthcare, and economics. Existing studies mainly rely on the strong ignorability assumption that there are no hidden confounders. Data collected from randomized controlled trials (RCT) do not suffer from confounding but are usually limited by a small sample size.
arXiv Detail & Related papers (2025-01-15T15:58:16Z)
Two Is Better Than One: Aligned Representation Pairs for Anomaly Detection [56.57122939745213]
Anomaly detection focuses on identifying samples that deviate from the norm. Recent self-supervised methods have successfully learned such representations by employing prior knowledge about anomalies to create synthetic outliers during training. We address this limitation with our new approach Con$$, which leverages prior knowledge about symmetries in normal samples to observe the data in different contexts.
arXiv Detail & Related papers (2024-05-29T07:59:06Z)
Geometry-Aware Instrumental Variable Regression [56.16884466478886]
We propose a transport-based IV estimator that takes into account the geometry of the data manifold through data-derivative information. We provide a simple plug-and-play implementation of our method that performs on par with related estimators in standard settings.
arXiv Detail & Related papers (2024-05-19T17:49:33Z)
Data Attribution for Diffusion Models: Timestep-induced Bias in Influence Estimation [53.27596811146316]
Diffusion models operate over a sequence of timesteps instead of instantaneous input-output relationships in previous contexts. We present Diffusion-TracIn, which incorporates these temporal dynamics, and observe that samples' loss gradient norms are highly dependent on timestep. We introduce Diffusion-ReTrac as a re-normalized adaptation that enables the retrieval of training samples more targeted to the test sample of interest.
arXiv Detail & Related papers (2024-01-17T07:58:18Z)
Mitigating Dataset Bias by Using Per-sample Gradient [9.290757451344673]
We propose PGD (Per-sample Gradient-based Debiasing), which comprises three steps: training a model on uniform batch sampling, setting the importance of each sample in proportion to the norm of the sample gradient, and training the model using importance-batch sampling. Compared with existing baselines on various synthetic and real-world datasets, the proposed method showed state-of-the-art accuracy for the classification task.
arXiv Detail & Related papers (2022-05-31T11:41:02Z)
Spectral Clustering with Variance Information for Group Structure Estimation in Panel Data [7.712669451925186]
We first conduct a local analysis which reveals that the variances of the individual coefficient estimates contain useful information for the estimation of group structure. We then propose a method to estimate unobserved groupings for general panel data models that explicitly account for the variance information.
arXiv Detail & Related papers (2022-01-05T19:16:16Z)
Imputation-Free Learning from Incomplete Observations [73.15386629370111]
We introduce the importance guided stochastic gradient descent (IGSGD) method to train inference from inputs containing missing values without imputation. We employ reinforcement learning (RL) to adjust the gradients used to train the models via back-propagation. Our imputation-free predictions outperform the traditional two-step imputation-based predictions using state-of-the-art imputation methods.
arXiv Detail & Related papers (2021-07-05T12:44:39Z)
Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning [78.83598532168256]
Marginal-likelihood based model-selection is rarely used in deep learning due to estimation difficulties. Our work shows that marginal likelihoods can improve generalization and be useful when validation data is unavailable.
arXiv Detail & Related papers (2021-04-11T09:50:24Z)
Weight-of-evidence 2.0 with shrinkage and spline-binning [3.925373521409752]
We propose a formalized, data-driven and powerful method to transform categorical predictors. We extend upon the weight-of-evidence approach and propose to estimate the proportions using shrinkage estimators. We present the results of a series of experiments in a fraud detection setting, which illustrate the effectiveness of the presented approach.
arXiv Detail & Related papers (2021-01-05T13:13:16Z)
Performance metrics for intervention-triggering prediction models do not reflect an expected reduction in outcomes from using the model [71.9860741092209]
Clinical researchers often select among and evaluate risk prediction models. Standard metrics calculated from retrospective data are only related to model utility under certain assumptions. When predictions are delivered repeatedly throughout time, the relationship between standard metrics and utility is further complicated.
arXiv Detail & Related papers (2020-06-02T16:26:49Z)

This list is automatically generated from the titles and abstracts of the papers on this site.