Causal Inference cheatsheet based on Matheus Facure's book on Causal Inference. Created by ChatGPT.

### Overview of Causal Inference Concepts

| Concept | Description | Example | Reference |
|---|---|---|---|
| Causal Inference | Determining cause-and-effect relationships between variables. | Assessing the impact of a new drug on patient recovery. | Link |
| Treatment/Intervention | The variable or action being studied for its effect on an outcome. | A new teaching method. | Link |
| Outcome | The variable that the treatment/intervention is expected to influence. | Student test scores. | Link |
| Confounder | A variable that influences both the treatment and the outcome, potentially biasing the estimated effect. | Age in a study linking exercise to heart health. | Link |
| Randomized Controlled Trial (RCT) | Participants are randomly assigned to treatment or control groups so that the groups are comparable on average. | Testing a new medication by randomly assigning patients to receive either the medication or a placebo. | Link |
| Observational Study | The researcher observes treatments and outcomes without controlling who receives the treatment. | Studying the effect of smoking on lung cancer through observational data. | Link |
| Counterfactual | The hypothetical scenario of what would have happened to the same units under a different treatment condition. | What would the unemployment rate have been if a stimulus package had not been implemented? | Link |
| Selection Bias | Bias introduced when the units studied, or the units that receive the treatment, differ systematically from the population of interest. | Studying only healthy volunteers for a new drug might overestimate its effectiveness. | Link |
| Instrumental Variables (IV) | Variables that affect the treatment and influence the outcome only through the treatment, used to estimate causal relationships when controlled experiments are not feasible. | Using distance to the nearest college as an instrument for education level in earnings studies. | Link |
| Difference-in-Differences (DiD) | Compares the changes in outcomes over time between a treatment group and a control group. | Evaluating the impact of a new law by comparing affected and unaffected regions before and after the law is implemented. | Link |
| Regression Discontinuity (RD) | Uses a cutoff or threshold to assign treatment and compares those just above and below the cutoff to estimate causal effects. | Estimating the effect of a scholarship program on student performance by comparing students around the eligibility cutoff. | Link |
| Propensity Score Matching | Matches treated and untreated units with similar propensity scores (the probability of receiving the treatment given observed covariates) to estimate the treatment effect. | Comparing outcomes of patients receiving different treatments by matching on demographic and clinical characteristics. | Link |
| Synthetic Control Method | Constructs a weighted combination of control units to create a synthetic control group for comparison with the treated unit. | Evaluating the impact of a policy change in one country by comparing it to a synthetic control group constructed from other countries. | Link |
| Mediation Analysis | Examines how an intermediate variable mediates the relationship between an independent variable and a dependent variable. | Analyzing how stress reduction mediates the relationship between exercise and mental health. | Link |
| Natural Experiment | Uses naturally occurring events or circumstances that mimic random assignment to estimate causal effects. | Studying the impact of a natural disaster on economic outcomes. | Link |
| Heterogeneous Treatment Effects | Examines how treatment effects vary across different subgroups or contexts. | Investigating whether a job training program has different effects based on participants' age or education level. | Link |
| Panel Data and Fixed Effects | Uses data collected over time on the same units to control for unobserved variables that do not change over time. | Evaluating the impact of education policies by analyzing student performance data over multiple years. | Link |
| Synthetic Difference-in-Differences (SDID) | Combines synthetic control and difference-in-differences methods to estimate treatment effects. | Evaluating the impact of a new law by comparing the treated region to a synthetic control region over time. | Link |
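
The counterfactual, confounder, and selection bias rows above can be tied together in potential-outcomes notation. As a sketch, write $Y_1$ and $Y_0$ for a unit's outcome with and without treatment and $T$ for the treatment indicator (these symbols are notation assumed here, not taken from the table):

```latex
\begin{align*}
\text{ATE} &= E[Y_1 - Y_0] \\
\text{ATT} &= E[Y_1 - Y_0 \mid T = 1] \\
\underbrace{E[Y \mid T = 1] - E[Y \mid T = 0]}_{\text{observed difference}}
  &= \underbrace{E[Y_1 - Y_0 \mid T = 1]}_{\text{ATT}}
   + \underbrace{E[Y_0 \mid T = 1] - E[Y_0 \mid T = 0]}_{\text{bias}}
\end{align*}
```

The bias term is zero when treated and untreated units would have had the same average outcome without treatment, which is what randomization delivers in expectation and what the other methods try to approximate.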

### Key Assumptions

| Assumption | Description | Example | Reference |
|---|---|---|---|
| Ignorability/Exchangeability | Given a set of observed covariates, the potential outcomes are independent of the treatment assignment. | Assuming no unmeasured confounders in a study linking diet to heart disease. | Link |
| Stable Unit Treatment Value Assumption (SUTVA) | There are no interference effects between units, and each unit has a single version of treatment. | One person's vaccination does not directly affect another's health outcome in the study. | Link |
| Common Support/Overlap | There is a sufficient overlap in covariate distributions between the treatment and control groups to make comparisons possible. | In a study comparing different teaching methods, students in all groups have similar background characteristics. | Link |
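
As a minimal sketch of how ignorability and overlap work together, the example below uses a small hand-made dataset (hypothetical values, with `age_group` as the only confounder). It first checks that every stratum contains both treated and untreated units, then compares the naive difference in means with a stratified estimate. The function names are illustrative, not from any library.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (age_group, treated, outcome). Older units are treated
# more often and have lower outcomes, so age_group acts as a confounder.
rows = [
    ("young", 1, 12.0), ("young", 0, 10.0), ("young", 0, 10.0), ("young", 0, 10.0),
    ("older", 1, 7.0), ("older", 1, 7.0), ("older", 1, 7.0), ("older", 0, 5.0),
]


def strata_without_overlap(rows):
    """Return strata that are missing either treated or untreated units."""
    arms_seen = defaultdict(set)
    for stratum, treated, _ in rows:
        arms_seen[stratum].add(treated)
    return [s for s, arms in arms_seen.items() if arms != {0, 1}]


def naive_difference(rows):
    """Raw difference in mean outcomes, ignoring the confounder."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return mean(treated) - mean(control)


def stratified_difference(rows):
    """Average the within-stratum differences, weighted by stratum size."""
    by_stratum = defaultdict(list)
    for stratum, t, y in rows:
        by_stratum[stratum].append((t, y))
    estimate = 0.0
    for units in by_stratum.values():
        treated = [y for t, y in units if t == 1]
        control = [y for t, y in units if t == 0]
        estimate += len(units) / len(rows) * (mean(treated) - mean(control))
    return estimate


print("strata lacking overlap:", strata_without_overlap(rows))
print("naive difference:", naive_difference(rows))
print("stratified difference:", stratified_difference(rows))
```

In this toy data the naive comparison has the wrong sign, while the stratified estimate recovers the within-group difference. The adjustment is only valid if the stratifying covariates capture all confounding, which the data alone cannot confirm.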

### Important Methods

**Randomized Controlled Trials (RCTs)**

- **Purpose**: Establish causal relationships by randomly assigning treatment.
- **Example**: Testing the effectiveness of a new drug.
- **Key Point**: Randomization makes the treatment and control groups comparable on average, on both observed and unobserved characteristics.
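
A minimal sketch of the two steps in an RCT analysis: random assignment, then a difference in means with a standard error. The outcome values are hypothetical, and `randomize` and `difference_in_means` are illustrative helper names.

```python
import random
from math import sqrt
from statistics import mean, variance


def randomize(unit_ids, n_treated, seed=0):
    """Randomly pick n_treated units for treatment; the rest are controls."""
    rng = random.Random(seed)
    treated = set(rng.sample(unit_ids, n_treated))
    control = set(unit_ids) - treated
    return treated, control


def difference_in_means(treated_outcomes, control_outcomes):
    """Point estimate and standard error for the average treatment effect."""
    estimate = mean(treated_outcomes) - mean(control_outcomes)
    std_error = sqrt(
        variance(treated_outcomes) / len(treated_outcomes)
        + variance(control_outcomes) / len(control_outcomes)
    )
    return estimate, std_error


unit_ids = list(range(10))
treated_ids, control_ids = randomize(unit_ids, n_treated=5)

# Hypothetical outcomes measured after the experiment.
treated_outcomes = [7, 9, 8, 10, 6]
control_outcomes = [5, 6, 4, 7, 3]
estimate, std_error = difference_in_means(treated_outcomes, control_outcomes)
print(f"estimate = {estimate}, standard error = {std_error}")
```

Fixing the seed makes the assignment reproducible, which helps when the assignment needs to be audited later.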

**Instrumental Variables (IV)**

- **Purpose**: Estimate causal relationships when controlled experiments are not feasible.
- **Example**: Using proximity to a college as an instrument for education in earnings studies.
- **Key Point**: The instrument must change the treatment (relevance) and affect the outcome only through the treatment (exclusion restriction).
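
For a binary instrument, one simple IV estimator is the Wald ratio: the instrument's effect on the outcome divided by its effect on the treatment. The sketch below uses hypothetical data constructed so that an unobserved factor inflates the naive comparison; the column layout and function names are assumptions made for illustration.

```python
from statistics import mean

# Hypothetical records: (instrument z, treatment t, outcome y).
# Outcomes were built as y = 4 + 2*t + 3*u, where the unobserved u is
# balanced across z but more common among treated units.
rows = [
    (1, 1, 9.0), (1, 1, 9.0), (1, 1, 6.0), (1, 0, 4.0),
    (0, 1, 9.0), (0, 0, 7.0), (0, 0, 4.0), (0, 0, 4.0),
]


def group_mean(rows, column, z_value):
    """Mean of one column among units with the given instrument value."""
    return mean(r[column] for r in rows if r[0] == z_value)


def wald_estimate(rows):
    """Reduced form divided by first stage."""
    first_stage = group_mean(rows, 1, 1) - group_mean(rows, 1, 0)
    reduced_form = group_mean(rows, 2, 1) - group_mean(rows, 2, 0)
    if first_stage == 0:
        raise ValueError("instrument does not move the treatment")
    return reduced_form / first_stage


def naive_estimate(rows):
    """Treated-minus-untreated difference, ignoring the instrument."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return mean(treated) - mean(control)


print("naive:", naive_estimate(rows))
print("wald:", wald_estimate(rows))
```

Here the naive estimate overstates the effect built into the data, while the Wald ratio recovers it. A first stage close to zero makes the ratio unstable, which is the weak-instrument problem.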

**Difference-in-Differences (DiD)**

- **Purpose**: Compare changes in outcomes over time between a treatment group and a control group.
- **Example**: Evaluating the impact of a policy change by comparing affected and unaffected regions before and after the policy implementation.
- **Key Point**: Assumes parallel trends: without the intervention, the treatment and control groups would have followed the same trend. Similar pre-intervention trends support this assumption but do not prove it.
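
In the simplest two-group, two-period case, the DiD estimate is the treated group's change minus the control group's change. A minimal sketch with hypothetical outcome values:

```python
from statistics import mean

# Hypothetical outcomes by (group, period).
outcomes = {
    ("treated", "pre"): [9.0, 10.0, 11.0],
    ("treated", "post"): [14.0, 15.0, 16.0],
    ("control", "pre"): [7.0, 8.0, 9.0],
    ("control", "post"): [9.0, 10.0, 11.0],
}


def did_estimate(outcomes):
    """(treated post - treated pre) - (control post - control pre)."""
    change_treated = mean(outcomes[("treated", "post")]) - mean(outcomes[("treated", "pre")])
    change_control = mean(outcomes[("control", "post")]) - mean(outcomes[("control", "pre")])
    return change_treated - change_control


print("DiD estimate:", did_estimate(outcomes))
```

The control group's change stands in for what would have happened to the treated group without the intervention, so the estimate is only as credible as that substitution.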

**Regression Discontinuity (RD)**

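- **Purpose**: Estimate causal effects using a cutoff or threshold for treatment assignment.
- **Example**: Assessing the effect of a scholarship program by comparing students just above and below the eligibility cutoff.
- **Key Point**: Units just above and below the threshold should be similar apart from treatment, so the estimate is local to the cutoff and relies on units not being able to precisely manipulate which side they fall on.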
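
One common way to estimate the jump at the cutoff is to fit a separate line on each side within a bandwidth and compare the two fitted values at the cutoff. The sketch below does this with plain Python on hypothetical, noise-free data; the bandwidth, the data-generating rule, and the helper names are all assumptions made for illustration.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for one regressor; returns (intercept, slope)."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return y_bar - slope * x_bar, slope


def rd_estimate(scores, outcomes, cutoff, bandwidth):
    """Difference between right and left fitted values at the cutoff."""
    pairs = list(zip(scores, outcomes))
    # Center the running variable so each intercept is the fit at the cutoff.
    left = [(s - cutoff, y) for s, y in pairs if cutoff - bandwidth <= s < cutoff]
    right = [(s - cutoff, y) for s, y in pairs if cutoff <= s < cutoff + bandwidth]
    left_intercept, _ = fit_line(*zip(*left))
    right_intercept, _ = fit_line(*zip(*right))
    return right_intercept - left_intercept


# Hypothetical data: outcome rises with the score and jumps by 5 at score 60.
scores = list(range(50, 70))
outcomes = [20 + 0.5 * s + (5 if s >= 60 else 0) for s in scores]
jump = rd_estimate(scores, outcomes, cutoff=60, bandwidth=5)
print("estimated jump at cutoff:", jump)
```

With real data the estimate usually changes with the bandwidth, so it is worth reporting results for several bandwidths.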

**Propensity Score Matching**

- **Purpose**: Estimate treatment effects by matching treated and untreated units with similar propensity scores.
- **Example**: Comparing outcomes of patients receiving different treatments by matching on demographic and clinical characteristics.
- **Key Point**: Reduces bias by balancing observed covariates between groups. It does not correct for unobserved confounders.
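
The matching step can be sketched as nearest-neighbor matching with a caliper. This example assumes the propensity scores have already been estimated elsewhere (for instance with a logistic regression); the scores, outcomes, and the `match_att` helper are hypothetical.

```python
from statistics import mean


def match_att(treated, controls, caliper=0.05):
    """Nearest-neighbor matching on the propensity score, with replacement.

    treated, controls: lists of (propensity_score, outcome) pairs.
    Treated units with no control within the caliper are dropped.
    Returns (estimated effect on the matched treated units, number matched).
    """
    differences = []
    for score_t, outcome_t in treated:
        score_c, outcome_c = min(controls, key=lambda c: abs(c[0] - score_t))
        if abs(score_c - score_t) <= caliper:
            differences.append(outcome_t - outcome_c)
    if not differences:
        return None, 0
    return mean(differences), len(differences)


# Hypothetical (propensity score, outcome) pairs.
treated = [(0.80, 12.0), (0.60, 10.0), (0.95, 15.0)]
controls = [(0.78, 9.0), (0.62, 8.0), (0.30, 5.0), (0.20, 4.0)]

effect, n_matched = match_att(treated, controls)
print(f"matched {n_matched} of {len(treated)} treated units, effect = {effect}")
```

The treated unit with score 0.95 has no comparable control and is dropped, which is a common-support problem showing up in practice. The estimate then describes only the matched treated units.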

**Synthetic Control Method**

- **Purpose**: Create a synthetic control group for comparison with the treated unit.
- **Example**: Evaluating the impact of a policy change by comparing the treated region or country to a synthetic control group constructed from other regions or countries.
- **Key Point**: Constructs a weighted combination of control units that matches the treated unit before the intervention.
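
A minimal sketch of the weighting idea, restricted to two donor units so that a grid search over a single weight is enough. With more donors a constrained optimizer would normally be used instead. All series and names below are hypothetical.

```python
def synthetic_weight(treated_pre, donor_a_pre, donor_b_pre, steps=100):
    """Weight w on donor A (1 - w on donor B) that best fits the pre-period.

    Searches w over a grid in [0, 1], so the weights are non-negative
    and sum to one.
    """
    def loss(w):
        return sum(
            (y - (w * a + (1 - w) * b)) ** 2
            for y, a, b in zip(treated_pre, donor_a_pre, donor_b_pre)
        )

    return min((i / steps for i in range(steps + 1)), key=loss)


# Hypothetical outcome series before and after the intervention.
treated_pre, treated_post = [17.5, 18.75, 20.0], [24.25, 25.5]
donor_a_pre, donor_a_post = [10.0, 12.0, 14.0], [16.0, 18.0]
donor_b_pre, donor_b_post = [20.0, 21.0, 22.0], [23.0, 24.0]

w = synthetic_weight(treated_pre, donor_a_pre, donor_b_pre)
synthetic_post = [w * a + (1 - w) * b for a, b in zip(donor_a_post, donor_b_post)]
effects = [y - s for y, s in zip(treated_post, synthetic_post)]
print("weight on donor A:", w)
print("estimated effect per post period:", effects)
```

The weights are chosen using pre-intervention data only, so the post-intervention gap between the treated unit and its synthetic version is the effect estimate.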

**Mediation Analysis**

- **Purpose**: Examine how an intermediate variable mediates the relationship between an independent variable and a dependent variable.
- **Example**: Analyzing how stress reduction mediates the relationship between exercise and mental health.
- **Key Point**: Identifies pathways through which the treatment affects the outcome.
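
One simple way to split an effect into pathways is the linear product-of-coefficients approach: the indirect effect is (treatment to mediator) times (mediator to outcome, holding treatment fixed), and the direct effect is what remains. The sketch below uses hypothetical noise-free data and assumes linear relationships with no unmeasured confounding of the mediator and outcome; other mediation methods exist.

```python
from statistics import mean


def slope(xs, ys):
    """OLS slope of ys on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )


def residuals(xs, ys):
    """Residuals of ys after a simple regression on xs."""
    b = slope(xs, ys)
    a = mean(ys) - b * mean(xs)
    return [y - (a + b * x) for x, y in zip(xs, ys)]


# Hypothetical data built as mediator ~ 2*treatment and
# outcome = 1 + 1*treatment + 2*mediator.
treatment = [0, 0, 0, 1, 1, 1]
mediator = [1.0, 2.0, 3.0, 3.0, 4.0, 5.0]
outcome = [3.0, 5.0, 7.0, 8.0, 10.0, 12.0]

# Effect of treatment on the mediator.
a = slope(treatment, mediator)
# Effect of the mediator on the outcome, holding treatment fixed
# (partialling treatment out of both variables).
b = slope(residuals(treatment, mediator), residuals(treatment, outcome))
# Direct effect: what is left after removing the mediator's contribution.
direct = slope(treatment, [y - b * m for m, y in zip(mediator, outcome)])
indirect = a * b
total = slope(treatment, outcome)
print(f"direct = {direct}, indirect = {indirect}, total = {total}")
```

In a linear model the total effect equals the direct effect plus the indirect effect, which gives a quick consistency check on the decomposition.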

**Panel Data and Fixed Effects**

- **Purpose**: Control for unobserved variables that do not change over time by using data collected over multiple time periods.
- **Example**: Evaluating the impact of education policies by analyzing student performance data over multiple years.
- **Key Point**: Removes bias from time-invariant unobserved variables. Unobserved variables that change over time can still bias the estimate.
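
The fixed effects estimator can be computed with the within transformation: subtract each unit's own mean from its regressor and outcome, then regress the demeaned values. The sketch below uses a hypothetical two-unit panel in which unit-level differences push the pooled estimate in the wrong direction.

```python
from statistics import mean


def slope(xs, ys):
    """OLS slope of ys on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )


# Hypothetical panel built as y = unit_effect + 2*x, where unit A has a
# high unit effect and low x, and unit B the reverse.
panel = {
    "A": {"x": [0.0, 1.0, 2.0], "y": [10.0, 12.0, 14.0]},
    "B": {"x": [3.0, 4.0, 5.0], "y": [6.0, 8.0, 10.0]},
}


def pooled_slope(panel):
    """Ignore the panel structure and regress y on x across all rows."""
    xs = [x for unit in panel.values() for x in unit["x"]]
    ys = [y for unit in panel.values() for y in unit["y"]]
    return slope(xs, ys)


def within_slope(panel):
    """Demean x and y within each unit, then regress."""
    xs, ys = [], []
    for unit in panel.values():
        x_bar, y_bar = mean(unit["x"]), mean(unit["y"])
        xs += [x - x_bar for x in unit["x"]]
        ys += [y - y_bar for y in unit["y"]]
    return slope(xs, ys)


print("pooled slope:", pooled_slope(panel))
print("within (fixed effects) slope:", within_slope(panel))
```

Demeaning removes anything constant within a unit, which is why the within slope recovers the relationship built into the data while the pooled slope does not.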

**Synthetic Difference-in-Differences (SDID)**

- **Purpose**: Combine synthetic control and difference-in-differences methods to estimate treatment effects.
- **Example**: Evaluating the impact of a new law by comparing the treated region to a synthetic control region over time.
- **Key Point**: Integrates the strengths of both methods for more robust causal inference.
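
Once unit weights and pre-period time weights are available, the combined estimate can be written as a weighted double difference. The sketch below covers only that final step for a single treated unit: the weights are supplied by hand as hypothetical values, whereas in practice they are estimated from pre-intervention data. The helper names are illustrative.

```python
from statistics import mean


def weighted_change(pre, post, time_weights):
    """Post-period average minus a weighted average of the pre-period."""
    weighted_pre = sum(w * y for w, y in zip(time_weights, pre))
    return mean(post) - weighted_pre


def weighted_did(treated, controls, unit_weights, time_weights):
    """Treated unit's change minus the unit-weighted change of the controls."""
    treated_change = weighted_change(treated["pre"], treated["post"], time_weights)
    control_change = sum(
        unit_weights[name] * weighted_change(c["pre"], c["post"], time_weights)
        for name, c in controls.items()
    )
    return treated_change - control_change


# Hypothetical outcome series.
treated = {"pre": [10.0, 11.0, 12.0], "post": [16.0, 17.0]}
controls = {
    "A": {"pre": [8.0, 9.0, 10.0], "post": [11.0, 12.0]},
    "B": {"pre": [12.0, 13.0, 14.0], "post": [15.0, 16.0]},
}

# Assumed weights: non-negative and summing to one in each dimension.
unit_weights = {"A": 0.5, "B": 0.5}
time_weights = [0.0, 0.5, 0.5]

estimate = weighted_did(treated, controls, unit_weights, time_weights)
print("weighted double-difference estimate:", estimate)
```

With equal unit weights and equal time weights this reduces to the plain difference-in-differences calculation, which is one way to see how the two methods relate.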

### Practical Implementation Tips

- **Data Quality**: Ensure high-quality, accurate, and relevant data.
- **Model Validation**: Validate models using out-of-sample tests and robustness checks.
- **Assumption Testing**: Test key assumptions such as common support and no interference between units.
- **Sensitivity Analysis**: Conduct sensitivity analyses to check the robustness of the results to different assumptions and specifications.