This site hosts the materials for the workshop Causal inference with geospatial data. Unless noted otherwise, the materials are CC-BY-4.0 licensed.

Instructor
Javier Garcia-Bernardo
Utrecht University
Maintained by
ODISSEI Social Data Science (SoDa)
soda@odissei-data.nl

Causal inference with geospatial data

Materials for a 4-hour hands-on workshop on causal inference with geospatial data. It is aimed at social scientists who are comfortable with regression but want a clearer way to think about causal claims, the assumptions behind them, and why spatial data make those assumptions harder to satisfy. It focuses on breadth and intuition.

The workshop is two short lectures plus a worked Python practical on a real case study: does livestock density raise ammonia (NH₃) concentrations?

The website for the workshop is here. The GitHub repository is https://github.com/sodascience/workshop_geocausal.

One workshop backbone

Throughout the lectures and the practical we keep returning to the same five questions. This is the whole workshop in one checklist:

  1. What is the treatment?
  2. What is the estimand?
  3. What is the comparison?
  4. What assumption makes that comparison credible?
  5. Why might that assumption fail?
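
For a binary treatment, the first four questions can be written compactly in potential-outcomes notation. This is a generic textbook sketch, not taken from the lecture slides; the symbols D_i (treatment), Y_i(1) and Y_i(0) (potential outcomes), and X_i (measured covariates) are introduced here for illustration only.

```latex
\begin{aligned}
\text{1. Treatment:}  \quad & D_i \in \{0, 1\} \\
\text{2. Estimand:}   \quad & \tau = \mathbb{E}\left[ Y_i(1) - Y_i(0) \right] \\
\text{3. Comparison:} \quad & \mathbb{E}\left[ Y_i \mid D_i = 1, X_i \right] - \mathbb{E}\left[ Y_i \mid D_i = 0, X_i \right] \\
\text{4. Assumption:} \quad & \left( Y_i(1), Y_i(0) \right) \perp D_i \mid X_i
\end{aligned}
```

Under line 4 (plus overlap, so that both treatment values occur at every X_i), averaging the comparison in line 3 over X_i recovers the estimand in line 2. Question 5 asks where line 4 breaks, for instance when an unmeasured variable that varies over space drives both D_i and Y_i.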

Schedule and materials

| Duration | Activity | Content | Link |
|---|---|---|---|
| 45 min | Lecture 1 | Counterfactuals, estimands, exogenous variation, causal designs | lecture 1 |
| 15 min | Break | | |
| 40 min | Lecture 2 | Spatial confounding, spillovers, scale, why spatial models are not causal designs | lecture 2 |
| 60 min | Practical | Maps and association → confounders and Moran’s I → spatial models → DiD with farm gains → spillover-aware interpretation | practical notebook |

The lectures are reveal.js slides — open the .html files directly in a browser, no setup required. Their source is in the matching .qmd files.

The practical

The practical is a worked example, not a hidden causal proof. Working from the single grid dataset, participants move through:

  1. maps and descriptive association
  2. controls and residual spatial clustering (Moran’s I)
  3. spatial lag / error / Durbin models
  4. a difference-in-differences with farm-gain vs no-change cells (2020 → 2024)
  5. why spillovers make that DiD fragile, and how to read a Spatial Durbin model
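
Step 2 relies on Moran’s I to check whether regression residuals cluster in space. The sketch below shows the statistic on a toy 4 × 4 grid using only the standard library. It assumes rook contiguity (cells sharing an edge) and binary weights; the function names and the toy values are made up for illustration, and the practical notebook may compute the statistic differently, for example with a dedicated spatial statistics library.

```python
def rook_neighbors(n_rows, n_cols):
    """Map each cell index to the indices of the cells that share an edge with it."""
    neighbors = {}
    for r in range(n_rows):
        for c in range(n_cols):
            adjacent = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    adjacent.append(rr * n_cols + cc)
            neighbors[r * n_cols + c] = adjacent
    return neighbors


def morans_i(values, neighbors):
    """Global Moran's I with binary weights (w_ij = 1 if j is a neighbor of i).

    I = n * sum_ij(w_ij * z_i * z_j) / (S0 * sum_i(z_i ** 2)),
    where z_i = value_i - mean and S0 is the sum of all weights.
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(len(adj) for adj in neighbors.values())
    cross = sum(z[i] * z[j] for i, adj in neighbors.items() for j in adj)
    denom = sum(zi * zi for zi in z)
    if denom == 0 or s0 == 0:
        raise ValueError("Moran's I is undefined for constant values or no neighbors")
    return n * cross / (s0 * denom)


# Toy "residuals" (hypothetical numbers, not workshop data).
grid = rook_neighbors(4, 4)
clustered = [1.0, 1.0, 0.0, 0.0] * 4  # left half high, right half low
checkerboard = [float((r + c) % 2) for r in range(4) for c in range(4)]

print(morans_i(clustered, grid))      # roughly 0.67: similar values are neighbors
print(morans_i(checkerboard, grid))   # -1.0: every neighbor has the opposite value
```

A value near zero would suggest no spatial pattern left in the residuals; a clearly positive value, as in the first toy case, suggests that the controls have not absorbed everything that varies over space.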

The takeaway: a map shows where, a regression shows what correlates, and a design tells you what would need to be true for a causal claim.
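
To make "a design" concrete, here is a minimal sketch of the 2 × 2 difference-in-differences comparison from step 4, written with the standard library only. The group labels, the function name and all numbers are hypothetical and chosen for easy arithmetic; they are not results from the workshop data.

```python
from statistics import mean


def did_estimate(rows, treated="gain", control="no_change", pre=2020, post=2024):
    """Two-group, two-period difference-in-differences.

    rows: iterable of (group, period, outcome) tuples.
    Returns (treated post - treated pre) - (control post - control pre).
    """
    rows = list(rows)

    def cell_mean(group, period):
        return mean(y for g, p, y in rows if g == group and p == period)

    change_treated = cell_mean(treated, post) - cell_mean(treated, pre)
    change_control = cell_mean(control, post) - cell_mean(control, pre)
    return change_treated - change_control


# Made-up NH3 readings for four grid cells observed in both years.
toy_rows = [
    ("gain", 2020, 10.0), ("gain", 2020, 12.0),
    ("gain", 2024, 14.0), ("gain", 2024, 16.0),
    ("no_change", 2020, 8.0), ("no_change", 2020, 10.0),
    ("no_change", 2024, 9.0), ("no_change", 2024, 11.0),
]

print(did_estimate(toy_rows))  # 3.0 with these made-up numbers
```

The number is only causal if the no-change cells show what would have happened to the farm-gain cells without the gain (parallel trends). If ammonia drifts from gaining cells into neighboring no-change cells, the control change is itself partly caused by the treatment, which is why step 5 treats this DiD as fragile.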

Both practicals are also available as reactive marimo notebooks (practical/*.py) — see Quick start.

Quick start

You need uv, a fast Python package and environment manager. uv reads pyproject.toml and uv.lock and builds the exact environment automatically the first time you run something — there is no separate “create a venv” step.

1. Install uv

macOS:

curl -LsSf https://astral.sh/uv/install.sh | sh
# or, with Homebrew:  brew install uv

Linux (Ubuntu / Debian):

sudo apt update && sudo apt install -y curl git   # only if they are missing
curl -LsSf https://astral.sh/uv/install.sh | sh

Windows (PowerShell):

powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"

Alternatively, on any platform: pip install uv (or pipx install uv). See the uv install docs for details. Restart your terminal afterwards, then check it works:

uv --version

2. Get the materials

Clone the repository (or download it as a ZIP and unzip it):

git clone https://github.com/sodascience/workshop_geocausal.git
cd workshop_geocausal

3. Open the practical

From the project root, open the notebook in JupyterLab. This is the recommended route if you have no experience with marimo:

uv run jupyter lab practical/practical_grid_nh3.ipynb

The first run downloads and installs all dependencies (this can take a few minutes); later runs reuse that environment and start much faster. JupyterLab opens in your browser; run the cells from top to bottom.

Prefer the classic interface? Use uv run jupyter notebook instead of jupyter lab.

Would you rather use marimo?

uv run marimo edit practical/practical_grid_nh3.py

(marimo edit lets you run and change cells; uv run marimo run practical/practical_grid_nh3.py opens it read-only as an app.)

Data

The practical uses a single, ready-to-use file, data/final/workshop_grid_1km.csv, which was built from several sources.
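
Before opening the notebook, it can help to confirm that the file is where the practical expects it. The sketch below uses only the standard library; the helper name summarize_csv and the UTF-8 encoding are assumptions made here, and no column names are assumed.

```python
import csv
from pathlib import Path


def summarize_csv(path):
    """Return (column_names, n_rows) for a CSV file that starts with a header row."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        n_rows = sum(1 for _ in reader)          # data rows, header excluded
        columns = list(reader.fieldnames or [])  # empty list for an empty file
    return columns, n_rows


if __name__ == "__main__":
    # Path taken from this README; run from the project root.
    grid_path = Path("data/final/workshop_grid_1km.csv")
    if grid_path.exists():
        columns, n_rows = summarize_csv(grid_path)
        print(f"{n_rows} rows, {len(columns)} columns")
        print(columns)
    else:
        print(f"{grid_path} not found; are you in the project root?")
```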

Further reading

A short, opinionated list. Start with the background texts for the ideas; the papers below are the concrete examples used in the lectures.

Causal inference — background

Geospatial statistics & spatial causal inference — background

Papers referenced in the materials (clear geographic causal designs, by identification strategy)

Rebuilding the lectures (optional)

The rendered lecture HTML is already included. To rebuild from source you need Quarto and, for the DAG figures, the Python graphviz package plus the Graphviz dot system binary:

quarto render lectures/1_intro_causality/1_intro_causality.qmd --to revealjs
quarto render lectures/2_geocausality/2_geocausality.qmd --to revealjs
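
If a render fails, a quick check of the three prerequisites named above can narrow down the cause. This is a minimal sketch: it assumes the executables are called quarto and dot on your PATH and that the Python package is importable as graphviz; the function name is made up for illustration.

```python
import importlib.util
import shutil


def missing_rebuild_tools():
    """Return the rebuild prerequisites that cannot be found on this machine."""
    missing = []
    # System executables: the Quarto CLI and the Graphviz `dot` binary.
    for executable in ("quarto", "dot"):
        if shutil.which(executable) is None:
            missing.append(executable)
    # Python bindings, looked up in the interpreter that runs this script.
    if importlib.util.find_spec("graphviz") is None:
        missing.append("graphviz (Python package)")
    return missing


if __name__ == "__main__":
    missing = missing_rebuild_tools()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All rebuild prerequisites found.")
```

Run it with the same interpreter that Quarto will use; otherwise the graphviz package check may not reflect the environment that renders the slides.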

Contact

Developed and maintained by the ODISSEI Social Data Science (SoDa) team.

Questions? Email soda@odissei-data.nl, or contact the instructor Javier Garcia-Bernardo (j.garciabernardo@uu.nl) directly.