Mining Causality: AI-Assisted Search for Instrumental Variables
- URL: http://arxiv.org/abs/2409.14202v1
- Date: Sat, 21 Sep 2024 17:19:29 GMT
- Title: Mining Causality: AI-Assisted Search for Instrumental Variables
- Authors: Sukjin Han
- Abstract summary: We propose using large language models to search for new IVs through narratives and counterfactual reasoning.
We demonstrate how to construct prompts to search for potentially valid IVs.
We apply our method to three well-known examples in economics: returns to schooling, production functions, and peer effects.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The instrumental variables (IVs) method is a leading empirical strategy for causal inference. Finding IVs is a heuristic and creative process, and justifying its validity (especially exclusion restrictions) is largely rhetorical. We propose using large language models (LLMs) to search for new IVs through narratives and counterfactual reasoning, similar to how a human researcher would. The stark difference, however, is that LLMs can accelerate this process exponentially and explore an extremely large search space. We demonstrate how to construct prompts to search for potentially valid IVs. We argue that multi-step prompting is useful and role-playing prompts are suitable for mimicking the endogenous decisions of economic agents. We apply our method to three well-known examples in economics: returns to schooling, production functions, and peer effects. We then extend our strategy to finding (i) control variables in regression and difference-in-differences and (ii) running variables in regression discontinuity designs.
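To make the prompting strategy concrete, here is a minimal sketch of the multi-step, role-playing approach for the returns-to-schooling example. `complete` is a hypothetical placeholder for any LLM completion API, and the prompt wording is illustrative rather than the paper's exact prompts.

```python
# Minimal sketch of the multi-step, role-playing prompt strategy described
# in the abstract, applied to the returns-to-schooling example. `complete`
# is a hypothetical placeholder for any LLM completion API.

def complete(prompt: str) -> str:
    """Placeholder for an LLM call; swap in a real client to use this."""
    print("--- prompt ---\n" + prompt + "\n")
    return "(LLM response would appear here)"

ROLE = ("You are a young adult deciding how many years of schooling to "
        "obtain. Your later-life wage is the outcome of interest.")

def search_for_ivs(treatment: str = "years of schooling",
                   outcome: str = "wages", n: int = 10) -> str:
    # Step 1 (role-playing): mimic the endogenous agent to elicit factors
    # that shift the treatment decision.
    candidates = complete(
        f"{ROLE}\nList {n} external factors that would change your choice of "
        f"{treatment} but plausibly have no direct effect on {outcome}."
    )
    # Step 2 (counterfactual screening): probe the exclusion restriction
    # for each candidate factor.
    return complete(
        f"For each candidate below, reason counterfactually: holding "
        f"{treatment} fixed, would changing the candidate still move "
        f"{outcome}? Discard any candidate where the answer is yes.\n"
        f"{candidates}"
    )

search_for_ivs()
```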
Related papers
- Using LLMs for Explaining Sets of Counterfactual Examples to Final Users [0.0]
In automated decision-making scenarios, causal inference methods can analyze the underlying data-generation process.
Counterfactual examples explore hypothetical scenarios where a minimal number of factors are altered.
We propose a novel multi-step pipeline that uses counterfactuals to generate natural language explanations of actions that will lead to a change in outcome.
arXiv Detail & Related papers (2024-08-27T15:13:06Z)
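As a rough sketch of such a pipeline (not the authors' implementation), one step might diff the factual and counterfactual instances and ask an LLM to narrate the changes. The loan scenario, names, and prompt text below are illustrative assumptions.

```python
# Hedged sketch: turn a counterfactual example (the minimal feature changes
# that flip a model's decision) into a natural language explanation via an
# LLM. The loan scenario and prompt text are illustrative assumptions.

def complete(prompt: str) -> str:
    """Placeholder for an LLM call."""
    print(prompt)
    return "(explanation would appear here)"

def explain(factual: dict, counterfactual: dict, outcome: str) -> str:
    # Step 1: diff the two instances to find the altered factors.
    changes = [f"- change {k} from {factual[k]} to {v}"
               for k, v in counterfactual.items() if factual.get(k) != v]
    # Step 2: ask the LLM to narrate the changes as actionable advice.
    return complete(
        f"A loan applicant received the outcome: {outcome}. The smallest set "
        "of changes that would flip the decision is:\n" + "\n".join(changes) +
        "\nExplain these actions to the applicant in plain language."
    )

explain({"income": 30_000, "debt": 12_000},
        {"income": 30_000, "debt": 5_000},
        outcome="application denied")
```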
- Evaluating Interventional Reasoning Capabilities of Large Language Models [58.52919374786108]
Large language models (LLMs) can estimate causal effects under interventions on different parts of a system.
We conduct empirical analyses to evaluate whether LLMs can accurately update their knowledge of a data-generating process in response to an intervention.
We create benchmarks that span diverse causal graphs (e.g., confounding, mediation) and variable types, and enable a study of intervention-based reasoning.
arXiv Detail & Related papers (2024-04-08T14:15:56Z)
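A self-contained toy of the interventional question such benchmarks probe: under the assumed graph U → X, U → Y, X → Y, the observational conditional P(Y=1 | X=1) differs from the interventional P(Y=1 | do(X=1)).

```python
# Toy illustration of intervention-based reasoning: under confounding,
# P(Y=1 | X=1) differs from P(Y=1 | do(X=1)). The causal graph
# U -> X, U -> Y, X -> Y is assumed purely for illustration.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
u = rng.binomial(1, 0.5, n)                 # unobserved confounder
x_obs = rng.binomial(1, 0.2 + 0.6 * u)      # X depends on U

def draw_y(x):
    return rng.binomial(1, 0.1 + 0.3 * x + 0.4 * u)

y_obs = draw_y(x_obs)
# Observational conditional: inflated because X=1 units tend to have U=1.
print("P(Y=1 | X=1)     =", round(y_obs[x_obs == 1].mean(), 3))              # ~0.72
# Interventional: set X=1 for everyone, severing the U -> X edge.
print("P(Y=1 | do(X=1)) =", round(draw_y(np.ones(n, dtype=int)).mean(), 3))  # ~0.60
```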
- An Incomplete Loop: Deductive, Inductive, and Abductive Learning in Large Language Models [99.31449616860291]
Modern language models (LMs) can learn to perform new tasks in different ways.
In instruction following, the target task is described explicitly in natural language; in few-shot prompting, the task is specified implicitly through examples.
In instruction inference, LMs are presented with in-context examples and are then prompted to generate a natural language task description.
arXiv Detail & Related papers (2024-04-03T19:31:56Z)
- An Examination on the Effectiveness of Divide-and-Conquer Prompting in Large Language Models [28.139780691709266]
We provide a theoretical analysis of the divide-and-conquer (DaC) prompting strategy, which helps identify the specific tasks where DaC prompting brings a performance boost with a theoretical guarantee.
We present two cases (large integer arithmetic and fact verification) where experimental results align with our theoretical analysis.
arXiv Detail & Related papers (2024-02-08T02:37:30Z)
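The decompose-solve-merge pattern behind DaC prompting can be sketched on the paper's large-integer-arithmetic case. Here each sub-problem is solved locally; in DaC prompting, each would instead be its own LLM prompt, with a final prompt merging the sub-results.

```python
# The decompose-solve-merge pattern behind divide-and-conquer prompting,
# shown on large-integer addition. Each sub-problem is solved locally here;
# in DaC prompting, each would be its own LLM prompt.

def add_large(a: str, b: str, chunk: int = 4) -> str:
    width = ((max(len(a), len(b)) + chunk - 1) // chunk) * chunk
    a, b = a.zfill(width), b.zfill(width)
    # Decompose: split both operands into aligned chunks (the sub-problems).
    pieces = [(a[i:i + chunk], b[i:i + chunk]) for i in range(0, width, chunk)]
    # Solve each sub-problem, then merge right to left, propagating carries.
    out, carry = [], 0
    for pa, pb in reversed(pieces):
        carry, digits = divmod(int(pa) + int(pb) + carry, 10 ** chunk)
        out.append(str(digits).zfill(chunk))
    if carry:
        out.append(str(carry))
    return "".join(reversed(out)).lstrip("0") or "0"

assert add_large("987654321987654321", "123456789123456789") == \
    str(987654321987654321 + 123456789123456789)
```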
- How FaR Are Large Language Models From Agents with Theory-of-Mind? [69.41586417697732]
We propose a new evaluation paradigm for large language models (LLMs): Thinking for Doing (T4D).
T4D requires models to connect inferences about others' mental states to actions in social scenarios.
We introduce a zero-shot prompting framework, Foresee and Reflect (FaR), which provides a reasoning structure that encourages LLMs to anticipate future challenges.
arXiv Detail & Related papers (2023-10-04T06:47:58Z)
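A guess at the shape of a Foresee-and-Reflect style zero-shot prompt: the model first anticipates future challenges, then reflects to choose an action. The template text is an illustrative assumption, not FaR verbatim.

```python
# Illustrative zero-shot prompt in the Foresee-and-Reflect style: anticipate
# challenges first, then reflect to pick an action. Template text is an
# assumption, not the FaR paper's exact wording.

FAR_TEMPLATE = """\
{scenario}

Foresee: for each person involved, infer their mental state and list the
challenges they are likely to face next.

Reflect: given those anticipated challenges, which of the following actions
helps most, and why?
{actions}

Answer with the chosen action and a one-sentence justification."""

print(FAR_TEMPLATE.format(
    scenario=("Anne put her keys in the drawer and left the room; "
              "Bob then moved them to the shelf."),
    actions="(a) Tell Anne the keys moved. (b) Say nothing. (c) Ask Bob why.",
))
```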
- Query-Dependent Prompt Evaluation and Optimization with Offline Inverse RL [62.824464372594576]
We aim to enhance the arithmetic reasoning ability of Large Language Models (LLMs) through zero-shot prompt optimization.
We identify query dependency as a previously overlooked objective in such optimization.
We introduce Prompt-OIRL, which harnesses offline inverse reinforcement learning to draw insights from offline prompting demonstration data.
arXiv Detail & Related papers (2023-09-13T01:12:52Z)
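The query-dependent idea can be sketched as: fit a reward model offline on logged (query, prompt, success) triples, then choose the prompt it scores highest for each new query. The features and linear model below are toy assumptions, far simpler than Prompt-OIRL's actual setup.

```python
# Rough sketch of query-dependent prompt selection via an offline-learned
# reward model. Features and the linear model are toy assumptions.
import numpy as np

prompts = ["Let's think step by step.",
           "Answer directly.",
           "Break the problem into sub-questions."]

def features(query: str, prompt_id: int) -> np.ndarray:
    # Toy features: query length, digit count, and a one-hot prompt id.
    one_hot = np.eye(len(prompts))[prompt_id]
    return np.concatenate(([len(query), sum(c.isdigit() for c in query)], one_hot))

# Offline log of (query, prompt_id, observed reward).
log = [("What is 17 * 24?", 0, 1.0), ("What is 17 * 24?", 1, 0.0),
       ("Name the capital of France.", 1, 1.0), ("Name the capital of France.", 2, 0.0)]

X = np.array([features(q, p) for q, p, _ in log])
r = np.array([reward for _, _, reward in log])
w = np.linalg.lstsq(X, r, rcond=None)[0]        # least-squares reward model

def best_prompt(query: str) -> str:
    return prompts[int(np.argmax([features(query, i) @ w for i in range(len(prompts))]))]

print(best_prompt("What is 123 + 456?"))
```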
- Hypothesis Search: Inductive Reasoning with Language Models [39.03846394586811]
Recent work evaluates large language models on inductive reasoning tasks by prompting them directly, a setting known as "in-context learning."
This works well for straightforward inductive tasks but performs poorly on complex tasks such as the Abstraction and Reasoning Corpus (ARC).
In this work, we propose to improve the inductive reasoning ability of LLMs by generating explicit hypotheses at multiple levels of abstraction.
arXiv Detail & Related papers (2023-09-11T17:56:57Z)
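A minimal sketch of the generate-then-test loop: propose hypotheses at a high level of abstraction, realize each as a program, and keep the first one consistent with all training examples. Both helper functions stand in for LLM calls and return canned outputs here.

```python
# Sketch of hypothesis search: propose natural language hypotheses, realize
# each as a program, keep the first one consistent with the examples.
# `propose_hypotheses` and `implement` stand in for LLM calls.

def propose_hypotheses(examples):
    return ["the output doubles the input", "the output adds two"]

def implement(hypothesis):
    # An LLM would translate the hypothesis into code; canned mapping here.
    return {"the output doubles the input": lambda x: 2 * x,
            "the output adds two": lambda x: x + 2}[hypothesis]

def hypothesis_search(examples):
    for h in propose_hypotheses(examples):
        f = implement(h)
        if all(f(x) == y for x, y in examples):   # verify against examples
            return h, f
    return None, None

h, f = hypothesis_search([(1, 2), (3, 6), (5, 10)])
print(h)   # -> "the output doubles the input"
```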
- Endogenous Macrodynamics in Algorithmic Recourse [52.87956177581998]
Existing work on Counterfactual Explanations (CE) and Algorithmic Recourse (AR) has largely focused on single individuals in a static environment.
We show that many of the existing methodologies can be collectively described by a generalized framework.
We then argue that the existing framework does not account for a hidden external cost of recourse that reveals itself only when studying the endogenous dynamics of recourse at the group level.
arXiv Detail & Related papers (2023-08-16T07:36:58Z)
- Instrumental Variables in Causal Inference and Machine Learning: A Survey [26.678154268037595]
Causal inference is a process of using assumptions to draw conclusions about the causal relationships between variables based on data.
A growing literature in both causal inference and machine learning proposes to use Instrumental Variables (IV) to address confounding by unobserved variables.
This paper serves as the first effort to systematically and comprehensively introduce and discuss the IV methods and their applications in both causal inference and machine learning.
arXiv Detail & Related papers (2022-12-12T08:59:04Z)
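To make the IV setup concrete, here is a standard two-stage least squares (2SLS) estimate on simulated data (the data-generating process is assumed purely for illustration): Z shifts X, U confounds X and Y, OLS is biased, and 2SLS recovers the true effect β = 2.

```python
# Two-stage least squares (2SLS) on simulated data: Z shifts X, U confounds
# X and Y, OLS is biased upward, 2SLS recovers the true effect beta = 2.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
z = rng.normal(size=n)                  # instrument: relevant and excluded
u = rng.normal(size=n)                  # unobserved confounder
x = 0.8 * z + u + rng.normal(size=n)    # endogenous regressor
y = 2.0 * x + 1.5 * u + rng.normal(size=n)

ols = (x @ y) / (x @ x)                 # absorbs the U channel -> biased
# Stage 1: project X on Z.  Stage 2: regress Y on the fitted values.
x_hat = z * ((z @ x) / (z @ z))
tsls = (x_hat @ y) / (x_hat @ x_hat)
print(f"OLS:  {ols:.3f}")               # noticeably above 2
print(f"2SLS: {tsls:.3f}")              # close to 2
```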
- Identifying Causal Effects using Instrumental Time Series: Nuisance IV and Correcting for the Past [12.49477539101379]
We consider IV regression in time series models, such as vector auto-regressive (VAR) processes.
Direct applications of i.i.d. techniques are generally inconsistent as they do not correctly adjust for dependencies in the past.
We provide methods, prove their consistency, and show how the inferred causal effect can be used for distribution generalization.
arXiv Detail & Related papers (2022-03-11T16:29:48Z)
- Deterministic and Discriminative Imitation (D2-Imitation): Revisiting Adversarial Imitation for Sample Efficiency [61.03922379081648]
We propose an off-policy, sample-efficient approach that requires no adversarial training or min-max optimization.
Our empirical results show that D2-Imitation is effective in achieving good sample efficiency, outperforming several off-policy extension approaches of adversarial imitation.
arXiv Detail & Related papers (2021-12-11T19:36:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.