Double and Single Descent in Causal Inference with an Application to High-Dimensional Synthetic Control
- URL: http://arxiv.org/abs/2305.00700v3
- Date: Thu, 12 Oct 2023 21:25:46 GMT
- Title: Double and Single Descent in Causal Inference with an Application to High-Dimensional Synthetic Control
- Authors: Jann Spiess, Guido Imbens, Amar Venugopal
- Abstract summary: In over-parameterized machine-learning models, there may be so many free parameters that the model fits the training data perfectly.
We document the performance of high-dimensional synthetic control estimators with many control units.
We find that adding control units can help improve imputation performance even beyond the point where the pre-treatment fit is perfect.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Motivated by a recent literature on the double-descent phenomenon in machine
learning, we consider highly over-parameterized models in causal inference,
including synthetic control with many control units. In such models, there may
be so many free parameters that the model fits the training data perfectly. We
first investigate high-dimensional linear regression for imputing wage data and
estimating average treatment effects, where we find that models with many more
covariates than sample size can outperform simple ones. We then document the
performance of high-dimensional synthetic control estimators with many control
units. We find that adding control units can help improve imputation
performance even beyond the point where the pre-treatment fit is perfect. We
provide a unified theoretical perspective on the performance of these
high-dimensional models. Specifically, we show that more complex models can be
interpreted as model-averaging estimators over simpler ones, which we link to
an improvement in average performance. This perspective yields concrete
insights into the use of synthetic control when control units are many relative
to the number of pre-treatment periods.
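To make the over-parameterized mechanism concrete, below is a minimal simulation sketch (not code from the paper; the factor model, numbers of periods, and noise level are illustrative assumptions). It computes minimum-norm least-squares synthetic-control weights via the pseudoinverse, so that once the number of control units exceeds the number of pre-treatment periods the pre-treatment fit is exact, and it tracks post-period imputation error as control units are added.

```python
# Minimal double-descent sketch for synthetic control (illustrative only; not
# the authors' code). All settings below -- factor model, T0, T1, noise -- are
# assumptions made for this example.
import numpy as np

rng = np.random.default_rng(0)
T0, T1, k, noise = 20, 20, 3, 0.5   # pre/post periods, latent factors, noise sd

def post_period_rmse(n_controls: int, reps: int = 200) -> float:
    """Imputation error of the minimum-norm synthetic-control estimator."""
    errs = []
    for _ in range(reps):
        F = rng.normal(size=(T0 + T1, k))            # common factors over time
        L = rng.normal(size=(k, n_controls + 1))     # loadings; column 0 = treated unit
        Y = F @ L + noise * rng.normal(size=(T0 + T1, n_controls + 1))
        y_treated, X = Y[:, 0], Y[:, 1:]
        # Minimum-norm least squares: for n_controls > T0 this interpolates the
        # pre-treatment outcomes exactly (pinv returns the min-norm solution).
        w = np.linalg.pinv(X[:T0]) @ y_treated[:T0]
        y0_post = F[T0:] @ L[:, 0]                   # noiseless untreated outcome
        errs.append(np.sqrt(np.mean((X[T0:] @ w - y0_post) ** 2)))
    return float(np.mean(errs))

for n in [5, 10, 20, 40, 80, 160]:   # interpolation threshold at n = T0 = 20
    print(f"controls = {n:3d}   post-period RMSE = {post_period_rmse(n):.3f}")
```

Under these assumed settings, the error typically peaks near the interpolation threshold (controls roughly equal to pre-treatment periods) and falls again as controls grow, the double-descent pattern the abstract describes.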
Related papers
- SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models [85.67096251281191]
We present an innovative approach to model fusion called zero-shot Sparse MIxture of Low-rank Experts (SMILE) construction.
SMILE allows for the upscaling of source models into an MoE model without extra data or further training.
We conduct extensive experiments across diverse scenarios, such as image classification and text generation tasks, using full fine-tuning and LoRA fine-tuning.
arXiv Detail & Related papers (2024-08-19T17:32:15Z)
- EMR-Merging: Tuning-Free High-Performance Model Merging [55.03509900949149]
We show that Elect, Mask & Rescale-Merging (EMR-Merging) achieves outstanding performance compared to existing merging methods.
EMR-Merging is tuning-free, requiring neither additional data nor further training, while still delivering impressive performance.
arXiv Detail & Related papers (2024-05-23T05:25:45Z)
- Data-driven Nonlinear Model Reduction using Koopman Theory: Integrated Control Form and NMPC Case Study [56.283944756315066]
We propose generic model structures combining delay-coordinate encoding of measurements and full-state decoding to integrate reduced Koopman modeling and state estimation.
A case study demonstrates that our approach provides accurate control models and enables real-time capable nonlinear model predictive control of a high-purity cryogenic distillation column.
arXiv Detail & Related papers (2024-01-09T11:54:54Z)
- Model-agnostic Body Part Relevance Assessment for Pedestrian Detection [4.405053430046726]
We present a framework for using sampling-based explanation models in a computer vision context, applied to body-part relevance assessment for pedestrian detection.
We introduce a novel sampling-based method, similar to KernelSHAP, that is more robust at small sample sizes and therefore more efficient for explainability analyses on large-scale datasets.
arXiv Detail & Related papers (2023-11-27T10:10:25Z)
- QualEval: Qualitative Evaluation for Model Improvement [82.73561470966658]
We propose QualEval, which augments quantitative scalar metrics with automated qualitative evaluation as a vehicle for model improvement.
QualEval uses a powerful LLM reasoner and our novel flexible linear programming solver to generate human-readable insights.
We demonstrate that leveraging its insights improves the performance of the Llama 2 model by up to 15 percentage points.
arXiv Detail & Related papers (2023-11-06T00:21:44Z)
- Understanding Parameter Sharing in Transformers [53.75988363281843]
Previous work on Transformers has focused on sharing parameters in different layers, which can improve the performance of models with limited parameters by increasing model depth.
We show that the success of this approach can be largely attributed to better convergence, with only a small part due to the increased model complexity.
Experiments on 8 machine translation tasks show that our model achieves competitive performance with only half the model complexity of parameter sharing models.
arXiv Detail & Related papers (2023-06-15T10:48:59Z)
- Learning a model is paramount for sample efficiency in reinforcement learning control of PDEs [5.488334211013093]
We show that learning an actuated model in parallel to training the RL agent significantly reduces the total amount of required data sampled from the real system.
We also show that iteratively updating the model is of major importance to avoid biases in the RL training.
arXiv Detail & Related papers (2023-02-14T16:14:39Z)
- End-to-End Learning of Hybrid Inverse Dynamics Models for Precise and Compliant Impedance Control [16.88250694156719]
We present a novel hybrid model formulation that enables us to identify fully physically consistent inertial parameters of a rigid body dynamics model.
We compare our approach against state-of-the-art inverse dynamics models on a 7-degree-of-freedom manipulator.
arXiv Detail & Related papers (2022-05-27T07:39:28Z)
- MoEfication: Conditional Computation of Transformer Models for Efficient Inference [66.56994436947441]
Transformer-based pre-trained language models achieve superior performance on most NLP tasks thanks to their large parameter capacity, but that capacity also incurs a huge computation cost.
We explore accelerating large-model inference through conditional computation based on the sparse-activation phenomenon.
We propose to transform a large model into its mixture-of-experts (MoE) version with equal model size, namely MoEfication.
arXiv Detail & Related papers (2021-10-05T02:14:38Z)
- A Semiparametric Approach to Interpretable Machine Learning [9.87381939016363]
Black box models in machine learning have demonstrated excellent predictive performance in complex problems and high-dimensional settings.
Their lack of transparency and interpretability restricts the applicability of such models in critical decision-making processes.
We propose a novel approach to trading off interpretability and performance in prediction models using ideas from semiparametric statistics.
arXiv Detail & Related papers (2020-06-08T16:38:15Z)
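As a deliberately generic illustration of the semiparametric idea in the last entry, the sketch below fits Robinson's partially linear model by partialling out: a black-box learner absorbs the nuisance features, while the coefficients on the interpretable features remain linear. This is a standard recipe, not the specific estimator proposed in that paper; all variable names and settings here are illustrative assumptions.

```python
# Partially linear model via partialling out (Robinson-style), as a generic
# example of trading interpretability against black-box flexibility.
# Not the estimator of the paper above; data-generating process is made up.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)
n = 2000
x = rng.normal(size=(n, 2))                 # interpretable features (linear part)
z = rng.uniform(-2, 2, size=(n, 3))         # nuisance features (black-box part)
g = np.sin(z[:, 0]) * z[:, 1] ** 2          # unknown nonlinear nuisance function
y = x @ np.array([1.0, -0.5]) + g + rng.normal(scale=0.3, size=n)

def residualize(target):
    """Remove the part of `target` predictable from z with a black-box learner."""
    model = GradientBoostingRegressor().fit(z, target)
    return target - model.predict(z)

# Residualize the outcome and each interpretable feature on z, then run OLS
# on the residuals to recover the interpretable linear coefficients.
y_res = residualize(y)
x_res = np.column_stack([residualize(x[:, j]) for j in range(x.shape[1])])
beta_hat, *_ = np.linalg.lstsq(x_res, y_res, rcond=None)
print("interpretable coefficients:", beta_hat)   # close to [1.0, -0.5]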