| Child | Observed nbhd | \(Y^{\text{rich}}\) | \(Y^{\text{poor}}\) |
|---|---|---|---|
| Anna | Rich nbhd | €42k | ? |
| Bo | Rich nbhd | €39k | ? |
| Carlos | Poor nbhd | ? | €40k |
| Daniel | Poor nbhd | ? | €28k |
| Eva | Poor nbhd | ? | €30k |
| Finn | Rich nbhd | €28k | ? |
| Grace | Rich nbhd | €39k | ? |
| Hassan | Poor nbhd | ? | €26k |
What Would Have Happened Otherwise?
This pattern holds across thousands of children.
Is it causal?
One explanation for Anna’s €14k advantage:
If that arrow is real, neighborhoods are a policy lever.
But an arrow is a hypothesis. What else could draw the same €14k gap?
What we want to know: What would Anna have earned if she had grown up in Daniel’s neighborhood?
We see Anna in one life. The other, the counterfactual, is missing.
Half the table is missing. That missing half is the whole problem.
| Child | Observed nbhd | \(Y^{\text{rich}}\) | \(Y^{\text{poor}}\) |
|---|---|---|---|
| Anna | Rich nbhd | €42k | ? |
| Bo | Rich nbhd | €39k | ? |
| Carlos | Poor nbhd | ? | €40k |
| Daniel | Poor nbhd | ? | €28k |
| Eva | Poor nbhd | ? | €30k |
| Finn | Rich nbhd | €28k | ? |
| Grace | Rich nbhd | €39k | ? |
| Hassan | Poor nbhd | ? | €26k |
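
As a rough illustration, the table can be encoded with `None` standing in for every missing potential outcome. The sketch below (variable names are ours, not from the lecture) shows that no individual effect can be computed, while the observed group means can still be compared.

```python
# Observed table: (child, neighborhood, Y_rich, Y_poor) in €k; None = never observed.
children = [
    ("Anna",   "rich", 42, None),
    ("Bo",     "rich", 39, None),
    ("Carlos", "poor", None, 40),
    ("Daniel", "poor", None, 28),
    ("Eva",    "poor", None, 30),
    ("Finn",   "rich", 28, None),
    ("Grace",  "rich", 39, None),
    ("Hassan", "poor", None, 26),
]

# An individual effect needs both potential outcomes for the same child.
computable = [name for name, _, y_rich, y_poor in children
              if y_rich is not None and y_poor is not None]

# What the data do allow: a comparison of observed group means.
rich = [y for _, nbhd, y, _ in children if nbhd == "rich"]
poor = [y for _, nbhd, _, y in children if nbhd == "poor"]
naive_gap = sum(rich) / len(rich) - sum(poor) / len(poor)

print(computable)  # [] : no child has both columns filled
print(naive_gap)   # 6.0 (mean 37 in rich vs mean 31 in poor neighborhoods)
```

The €6k gap is a comparison across different children, not the average of any child's own \(Y^{\text{rich}} - Y^{\text{poor}}\).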
Goal: estimate how an intervention (treatment, explanatory variable, independent variable, or predictor) affects an outcome (response or dependent variable).
Two formalisations describe the same missing-counterfactual story.
Pearl, structural causal models
Causal claims represented as DAGs.
Identification = closing back-door paths.
Neyman–Rubin, potential outcomes
A causal effect is the gap between two potential outcomes of the same unit; the average effect is \(\mathbb{E}[Y^{\text{rich}} - Y^{\text{poor}}]\).
Identification = recovering the (average) missing information.
| Unit | X | Y(0) | Y(1) |
|---|---|---|---|
| Anna | 1 | ? | €42k |
| Bo | 1 | ? | €39k |
| Daniel | 0 | €28k | ? |
| Eva | 0 | €30k | ? |
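
One way to write down what this table shows is the standard consistency (or "switching") equation. It is added here as a sketch under the usual assumption that each unit's observed outcome equals the potential outcome for the treatment it actually received:

\[ Y_i = X_i\,Y_i(1) + (1 - X_i)\,Y_i(0) \]

For Anna, \(X = 1\), so her observed €42k is \(Y(1)\) and her \(Y(0)\) stays a question mark. For Daniel, \(X = 0\), so his observed €28k is \(Y(0)\).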
Estimand is the target (the dish you want to make). Estimator is the method (the recipe). Estimate is the result (the plate you get).
Confusing them is the most common mistake in applied work.
| | What it means | Neighborhood example |
|---|---|---|
| Estimand | The causal quantity we want | \(\mathbb{E}[Y^{\text{rich}} - Y^{\text{poor}}] = \mathbb{E}[Y^{\text{rich}}] - \mathbb{E}[Y^{\text{poor}}]\) |
| Estimator | The recipe we use | regression, matching, DiD, IV |
| Estimate | The number we get | “€6k higher earnings” |
The estimand comes from your goal and the counterfactual question, not from the method.
Different estimators can target the same estimand, under their assumptions.
If those assumptions fail, you still get an estimate. Just not the one you ordered.
\[ \underbrace{\mathbb{E}[Y^{\text{rich}}] - \mathbb{E}[Y^{\text{poor}}]}_{\text{what we want}} \;\;\neq\;\; \underbrace{\mathbb{E}[Y \mid \text{rich nbhd}] - \mathbb{E}[Y \mid \text{poor nbhd}]}_{\text{observable}} \]
The fundamental problem: Causal inference requires averaging over a column that does not exist. To recover Anna’s missing counterfactual, we need a design and assumptions.
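A standard way to see why the observable difference misses the target is to add and subtract the unobserved term \(\mathbb{E}[Y^{\text{poor}} \mid \text{rich nbhd}]\). This decomposition is a textbook identity, sketched here in the lecture's notation and assuming each child's observed outcome is the potential outcome for the neighborhood they grew up in:

\[ \mathbb{E}[Y \mid \text{rich nbhd}] - \mathbb{E}[Y \mid \text{poor nbhd}] \;=\; \underbrace{\mathbb{E}[Y^{\text{rich}} - Y^{\text{poor}} \mid \text{rich nbhd}]}_{\text{effect for children in rich nbhds}} \;+\; \underbrace{\mathbb{E}[Y^{\text{poor}} \mid \text{rich nbhd}] - \mathbb{E}[Y^{\text{poor}} \mid \text{poor nbhd}]}_{\text{selection bias}} \]

The second term is zero only if children in rich neighborhoods would have earned the same, on average, as children in poor neighborhoods had they grown up there. Even then, the first term is the effect for children in rich neighborhoods, which equals the population average only if effects do not differ between the two groups.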
Regression with controls, lm(y ~ x + controls), is not outside causal inference. It just hands you the hardest assumption to defend.
\[ Y_i = \alpha + \color{#c44e52}{\tau}\,\color{#0b789d}{T_i} + \beta' \color{#6f7681}{X_i} + \varepsilon_i \]
Model → impute → compare (G-computation, Robins). Unbiased only if the controls close every back-door path, the model is correctly specified, and we do not control for bad variables like colliders or mediators.
\(\widehat{\text{income}} \;=\; \underbrace{\text{€}28\text{k}}_{\hat\alpha} \;+\; \underbrace{\text{€}0\text{k}}_{\hat\beta}\cdot\text{rich} \;+\; \underbrace{\text{€}12\text{k}}_{\hat\gamma}\cdot\text{wealth}\)
Drop family wealth from the model and the groups are not comparable. Rich-neighborhood kids get compared to poor-neighborhood kids who are mostly less wealthy. The neighborhood coefficient absorbs this difference.
\(\widehat{\text{income}} \;=\; \underbrace{\text{€}31\text{k}}_{\hat\alpha} \;+\; \underbrace{\text{€}6\text{k}}_{\hat\beta}\cdot\text{rich}\)
You still get a number, just not the one you want.
The model still runs. The numbers are still numbers. They just don’t mean what you think.
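A toy reconstruction of the two regressions above, on eight made-up children built so that the true neighborhood effect is zero and family wealth adds €12k. The data and the small `ols` helper are illustrative assumptions, not the lecture's data or code; in R the two fits would correspond to `lm(income ~ rich + wealth)` and `lm(income ~ rich)`.

```python
def ols(X, y):
    """Least squares via the normal equations (X'X) b = X'y, Gauss-Jordan solve."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]

# Hypothetical children: 3 of 4 rich-nbhd kids are wealthy, 1 of 4 poor-nbhd kids.
rich   = [1, 1, 1, 1, 0, 0, 0, 0]
wealth = [1, 1, 1, 0, 1, 0, 0, 0]
income = [28 + 0 * r + 12 * w for r, w in zip(rich, wealth)]  # true nbhd effect = 0

with_wealth    = ols([[1, r, w] for r, w in zip(rich, wealth)], income)
without_wealth = ols([[1, r] for r in rich], income)

print([round(b, 1) for b in with_wealth])     # [28.0, 0.0, 12.0]
print([round(b, 1) for b in without_wealth])  # [31.0, 6.0]
```

Dropping wealth moves the neighborhood coefficient from €0k to €6k even though nothing about the children changed: the €6k is 12 × (0.75 − 0.25), the wealth effect times the difference in the share of wealthy families.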
Tip. Predicting future outcomes rules out reverse causality and reduces the risk of collider bias.
Regression breaks easily because confounders are everywhere: schools, parental attitudes, pollution, etc. We can never be sure we are measuring them all.
The ideal: randomize neighborhoods. Randomization makes the groups exchangeable, so every confounder, measured or not, is balanced across groups in expectation.
The observed difference in income between people growing up in rich vs poor neighborhoods is now an unbiased estimate of the average causal effect.
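A small simulation can make the exchangeability argument concrete. Everything below is a hypothetical setup (a made-up true effect `TAU`, a single confounder called wealth, invented probabilities); it is a sketch of the logic, not a model of any real data.

```python
import random

random.seed(1)
TAU = 2.0  # hypothetical true effect of a rich neighborhood, in €k

def mean(xs):
    return sum(xs) / len(xs)

def observed_gap(assign):
    """Rich-minus-poor difference in mean observed income under an assignment rule."""
    rich, poor = [], []
    for _ in range(20_000):
        wealthy = random.random() < 0.5
        y_poor = 28 + 12 * wealthy + random.gauss(0, 3)  # potential outcome, poor nbhd
        y_rich = y_poor + TAU                            # potential outcome, rich nbhd
        if assign(wealthy):
            rich.append(y_rich)  # only one potential outcome is ever observed
        else:
            poor.append(y_poor)
    return mean(rich) - mean(poor)

# Self-selection: wealthy families are more likely to live in rich neighborhoods.
self_selected = observed_gap(lambda wealthy: random.random() < (0.75 if wealthy else 0.25))
# Randomization: a coin flip that ignores wealth and everything else.
randomized = observed_gap(lambda wealthy: random.random() < 0.5)

print(round(self_selected, 2))  # far above TAU: wealth is mixed into the gap
print(round(randomized, 2))     # close to TAU
```

Under self-selection the gap is expected to be about TAU + 6, because wealth differs across groups; under the coin flip the wealth shares are equal in expectation and only TAU remains.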
In the 1990s, in the Moving to Opportunity (MTO) experiment, the U.S. Department of Housing and Urban Development (HUD) randomly assigned 4,604 families in high-poverty public housing to receive different housing-voucher offers or no voucher offer.
Simplified design
One potential estimand
Chetty, Hendren & Katz (2016), Figure 2A. Y-axis: experimental-vs-control ITT on adult income ($). X-axis: child’s age at random assignment. Younger movers benefit; the effect vanishes for kids assigned at older ages.
Note that this estimand (the effect of being offered a voucher) differs from what we actually want to know: the effect of moving to a lower-poverty neighborhood. The paper uses instrumental variables to estimate the latter.
Designs are different ways of rebuilding the missing counterfactual. The source of variation, not the method, is what makes a design credible.
Every design looks for variation in treatment X that is plausibly exogenous with respect to Y’s potential outcomes, variation that doesn’t come from the same forces driving Y.
An instrument is a source of variation that shifts whether someone receives the treatment, but affects the outcome only through that treatment. The MTO voucher is an instrument: it does not directly raise your future income. The estimand is different: we estimate the effect of moving for families whose move was caused by the voucher, not the effect of merely being offered a voucher.
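The logic can be sketched with a simulated voucher experiment and the simplest IV estimator, the Wald ratio (effect of the offer on income divided by effect of the offer on moving). All numbers are hypothetical: half the families are assumed to be compliers, compliers are assumed to differ at baseline, and `TAU` is a made-up effect of moving.

```python
import random

random.seed(2)
TAU = 4.0  # hypothetical effect of moving for compliers, in €k

def mean(xs):
    return sum(xs) / len(xs)

rows = []  # (offered, moved, income)
for _ in range(40_000):
    complier = random.random() < 0.5   # moves if and only if offered a voucher
    offered = random.random() < 0.5    # Z: randomized voucher offer
    moved = offered and complier       # X: never-takers stay regardless
    baseline = 30 + 5 * complier + random.gauss(0, 3)  # compliers differ at baseline
    income = baseline + TAU * moved    # exclusion: the offer matters only via moving
    rows.append((offered, moved, income))

def by(index, value, column):
    """Mean of `column` among rows whose field `index` equals `value`."""
    return mean([r[column] for r in rows if r[index] == value])

itt = by(0, True, 2) - by(0, False, 2)          # effect of the offer on income
first_stage = by(0, True, 1) - by(0, False, 1)  # effect of the offer on moving
wald = itt / first_stage                        # IV estimate: effect for compliers
naive = by(1, True, 2) - by(1, False, 2)        # movers vs non-movers

print(round(itt, 2), round(first_stage, 2), round(wald, 2), round(naive, 2))
```

In this setup the ITT is diluted (about half of TAU, since only half the families respond to the offer), the Wald ratio rescales it back to the complier effect, and the naive movers-vs-stayers comparison is inflated because movers are a selected group.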
The red dashed arrow is the exclusion restriction: Z must affect Y only through X.
A regression discontinuity design uses a rule with a sharp cutoff to create a near-experiment (a school-district boundary, an income threshold for a housing voucher, a test-score line). Units just above and just below the cutoff are assumed to be similar, but only one side receives the treatment.
Lecture 2 returns to this as a spatial boundary design (school districts, administrative borders).
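A minimal sharp-RDD sketch on simulated data: fit a separate line on each side of the cutoff within a bandwidth, and read off the gap between the two lines at the cutoff. The jump `TAU`, the bandwidth `H`, and the data-generating process are all assumptions made for illustration.

```python
import random

random.seed(3)
TAU = 3.0  # hypothetical jump in the outcome at the cutoff
H = 0.2    # bandwidth: how close to the cutoff a unit must be to be used

def fit_line(xs, ys):
    """Simple OLS of y on x; returns (intercept, slope)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my - slope * mx, slope

data = []
for _ in range(20_000):
    score = random.uniform(-1, 1)  # running variable, cutoff at 0
    treated = score >= 0           # sharp rule: treatment switches on at the cutoff
    y = 30 + 5 * score + TAU * treated + random.gauss(0, 2)
    data.append((score, y))

left = [(s, y) for s, y in data if -H <= s < 0]
right = [(s, y) for s, y in data if 0 <= s <= H]
a_left, _ = fit_line(*zip(*left))
a_right, _ = fit_line(*zip(*right))
rdd_estimate = a_right - a_left    # gap between the two fitted lines at the cutoff

print(round(rdd_estimate, 2))      # close to TAU in this simulation
```

The estimate speaks only to units near the cutoff. It also relies on units not being able to sort themselves to one side, which the simulation builds in by drawing the score at random.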
A difference-in-differences design compares how outcomes change over time in treated neighborhoods versus similar untreated neighborhoods. Example: A city builds a new metro line in 2018. The neighborhoods it touches see better child outcomes by 2024. The neighborhoods it doesn’t touch see no change.
DiD is closely related to fixed-effects (within) models: both exploit within-unit variation over time.
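The 2×2 arithmetic behind DiD, with made-up group means for the metro example (the scores and the neighborhood labels are hypothetical, chosen so that the comparison group also changes a little):

```python
# Hypothetical mean child-outcome scores (made-up numbers, not from any study).
means = {
    ("metro", 2018): 50.0, ("metro", 2024): 58.0,        # neighborhoods the line touches
    ("no_metro", 2018): 55.0, ("no_metro", 2024): 57.0,  # comparison neighborhoods
}

change_treated = means[("metro", 2024)] - means[("metro", 2018)]        # 8.0
change_control = means[("no_metro", 2024)] - means[("no_metro", 2018)]  # 2.0

# Parallel trends: without the metro, treated areas would also have changed by 2.0.
counterfactual_2024 = means[("metro", 2018)] + change_control           # 52.0
did = change_treated - change_control                                   # 6.0

print(counterfactual_2024, did)  # 52.0 6.0
```

The comparison group's change supplies the missing counterfactual trend. A post-only comparison (58 vs 57) would suggest an effect of 1, and a before-after comparison in treated areas alone would suggest 8; DiD gives 6 only if the parallel-trends bet holds.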
A fixed-effects design compares units to themselves, or to very similar units within the same group. It asks whether outcomes change when exposure changes within the same unit (e.g. family, school, neighborhood, or cohort). This removes stable background differences, so identification comes from the remaining within-unit variation, such as siblings moving at different ages or cohorts facing different peer mixes.
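A sibling-comparison sketch of the within-unit idea, on simulated families. The setup is hypothetical: an unobserved, stable family background raises both exposure and income, and `TAU` is a made-up gain per year of exposure.

```python
import random

random.seed(4)
TAU = 1.0  # hypothetical gain per year spent in the better neighborhood

def slope(xs, ys):
    """OLS slope of y on x."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))

years, income, d_years, d_income = [], [], [], []
for fam in range(5_000):
    family = random.gauss(0, 4)  # unobserved, stable family background
    pair = []
    for sib in range(2):         # two siblings per family
        exposure = 8 + 0.5 * family + random.uniform(-4, 4)  # advantaged families: more
        y = 30 + family + TAU * exposure + random.gauss(0, 2)
        pair.append((exposure, y))
        years.append(exposure)
        income.append(y)
    # Within-family contrast: the shared family term cancels in the difference.
    d_years.append(pair[0][0] - pair[1][0])
    d_income.append(pair[0][1] - pair[1][1])

pooled = slope(years, income)      # mixes the exposure effect with family background
within = slope(d_years, d_income)  # uses sibling differences only

print(round(pooled, 2), round(within, 2))  # pooled is well above TAU; within is near TAU
```

Differencing removes only what is constant within the family. Anything that changes between siblings and also affects income (a time-varying confounder) would still bias the within estimate.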
When the world hasn’t run an experiment, and there is no shock and no shifter, what’s left? Find units that look the same on what we measured.
Two designs share this source:
Matching tries to find an untreated twin for every treated unit.
Among the poor-neighborhood children, find the ones who look most like Anna (same family income, same parents’ education) and compare.
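A minimal nearest-neighbor matching sketch on made-up data, matching on a single covariate (family income) with replacement. The numbers are constructed so that the true effect is €2k and family income is the only confounder; real applications match on many covariates and cannot verify that assumption.

```python
# Made-up data: (family income in €k, adult earnings in €k).
# Constructed so that earnings = 20 + 0.5 * family_income, plus 2 in a rich nbhd.
rich_kids = [(40, 42.0), (50, 47.0), (60, 52.0)]
poor_kids = [(20, 30.0), (30, 35.0), (40, 40.0), (50, 45.0), (60, 50.0)]

def nearest(unit, pool):
    """Return the pool member closest to `unit` on the matching covariate."""
    return min(pool, key=lambda other: abs(other[0] - unit[0]))

# For each rich-nbhd child, impute the missing Y^poor with the twin's observed outcome.
gaps = [y - nearest((x, y), poor_kids)[1] for x, y in rich_kids]
att_matched = sum(gaps) / len(gaps)

naive = (sum(y for _, y in rich_kids) / len(rich_kids)
         - sum(y for _, y in poor_kids) / len(poor_kids))

print(att_matched)  # 2.0
print(naive)        # 7.0
```

Matching recovers the €2k here only because the twins differ on nothing else. The estimate is an effect for the rich-neighborhood children (the matched population), not for everyone.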
A common misread is that all designs estimate the same estimand. They don’t. The method picks the population.
| Design | …only as good as |
|---|---|
| RCT | randomization actually held |
| IV | relevance AND exclusion |
| DiD | the parallel trends bet |
| RDD | no sorting at the cutoff (continuity) |
| Fixed effects | no time-varying confounders |
| Matching | the observed covariates |
| Regression | the controls + functional form |
Every design is a way of approximating the missing counterfactual.
“If the estimates you get are not the estimates you want, the fault lies in the econometrician, not the econometrics.” (Angrist & Pischke)
Goal: practice choosing a credible comparison.
Teams: each team gets one identification strategy/design: Matching/IPW · Fixed effects · DiD · IV · RDD · Experiment
Step 1: Design it (10 min). Pick a causal question and propose a design. Answer the 5 questions:
Step 2: Defend it (3 min per team). Explain your design. Other teams challenge the assumptions.
Step 3: Recap within groups (5 min). What made each design credible? What could break it?
You want to study the causal impact of giving money to some people in some postcodes. Anna lives in a treated postcode; Daniel lives just across the street in an untreated one.
Space stresses all assumptions:
If your treatment is a policy intervention, the setup is often:
The SoDa materials at causalpolicy.nl are a strong next step for exactly that setting:
So this lecture gives the general counterfactual backbone. causalpolicy.nl then picks up the common policy-evaluation case in more depth.