Causal Inference through the lens of probabilistic programming

Dr. Juan Orduz

Machine Learning & Deep Learning & Statistics
Python Skill Intermediate
Domain Expertise Intermediate

Why should you use a Probabilistic Programming Language (PPL) for Causal Inference? Because causal problems are inherently about uncertainty and structure—two things PPLs handle natively.

In this session, we will demonstrate how to translate causal diagrams (DAGs) directly into code, using PyMC and NumPyro to estimate causal effects with rigorous uncertainty quantification. We will cover three distinct levels of complexity, drawing on real-world examples and recent research:

  1. The "Simple" Case: Enhancing A/B Tests. Even in randomized experiments, PPLs provide massive value. We will show how to:

    • Use Prior Predictive Checks to prevent "silly" estimates (Twyman's Law) by incorporating domain knowledge into priors (e.g., preventing the model from predicting a 1000% lift). We will also describe how to perform a power analysis in a Bayesian framework.

    • Implement Bayesian CUPED to reduce variance and increase statistical power without collecting more data. We can combine these variance-reduction methods with smarter priors as described above.

  2. The Observational Challenge: Confounding & Structure. When we can't randomize, we must adjust. We will explore (through concrete examples):

    • Backdoor Adjustment: Show how PPLs implement the "do-operator" to estimate Average Treatment Effects (ATE) in the presence of observed confounders.

    • Multilevel Causal Models: Demonstrate how to use multilevel models to account for time-invariant unobserved confounders. We will discuss the pros and cons compared with similar methods, such as fixed effects.

  3. The Frontier: Deep Latent Variable Models. What if confounders are unobserved? We will introduce advanced methods combining Deep Learning with Probabilistic Programming:

    • An introduction to the Causal Effect Variational Autoencoder (CEVAE).
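To make the backdoor adjustment idea from the second level concrete, here is a minimal sketch in plain Python. It is not the PyMC or NumPyro code from the talk. It assumes a hypothetical three-node DAG (Z -> T, Z -> Y, T -> Y) with a binary observed confounder Z, and all names and simulation parameters below are illustrative assumptions. It compares the naive difference in means with the adjustment formula E[Y | do(T=t)] = sum_z P(z) E[Y | T=t, Z=z].

```python
import random


def simulate(n, seed=0):
    """Simulate a hypothetical confounded dataset: Z -> T, Z -> Y, T -> Y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0          # binary confounder
        p_treat = 0.8 if z == 1 else 0.2            # Z drives treatment uptake
        t = 1 if rng.random() < p_treat else 0
        y = 1.0 * t + 2.0 * z + rng.gauss(0.0, 1.0)  # true effect of T is 1.0
        rows.append((z, t, y))
    return rows


def mean(values):
    return sum(values) / len(values)


def naive_difference(rows):
    """Difference in means that ignores the confounder (biased here)."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return mean(treated) - mean(control)


def backdoor_ate(rows):
    """Backdoor adjustment: average within-stratum contrasts, weighted by P(z)."""
    n = len(rows)
    ate = 0.0
    for z_value in (0, 1):
        stratum = [row for row in rows if row[0] == z_value]
        treated = [y for _, t, y in stratum if t == 1]
        control = [y for _, t, y in stratum if t == 0]
        ate += (len(stratum) / n) * (mean(treated) - mean(control))
    return ate


rows = simulate(20_000, seed=0)
naive = naive_difference(rows)
adjusted = backdoor_ate(rows)
print(f"naive difference in means: {naive:.2f}")
print(f"backdoor-adjusted ATE:     {adjusted:.2f}")
```

In this simulation the naive contrast absorbs the effect of Z on Y, while the stratified estimate should land close to the simulated effect of 1.0. A probabilistic program would encode the same structure as a generative model and return a posterior distribution over the ATE rather than a single number.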

By the end of this talk, you will understand how to view causal inference not as a collection of isolated statistical tricks, but as a coherent modeling process powered by probabilistic programming.

Dr. Juan Orduz

Mathematician (Ph.D., Humboldt Universität zu Berlin) and data scientist. I am interested in interdisciplinary applications of mathematical methods, particularly time series analysis, Bayesian methods, and causal inference. Active open source developer (PyMC, PyMC-Marketing, and NumPyro, among others). For more info, please visit my personal website https://juanitorduz.github.io