Causality in the Time of Cholera

Exercise Sheet 1

Instructions (read aloud in your group): Your group’s task is to use the facts described below to devise a test of the theory that a waterborne organism/pathogen is responsible for infecting humans with cholera. Your test must be feasible with the technology and resources available at the time. Your test must also be falsifiable; that is, it must be possible that you find evidence inconsistent with the waterborne germ theory of cholera’s spread. The questions on this worksheet will walk you through developing your test. You do not have to use all of the facts below in devising your test, but try not to use information beyond what you are told below.

In your reading, you learned about how John Snow was able to convince the people of London that cholera was a waterborne disease and not spread by miasma (a great lesson in the fact that not all data science requires machine learning :)).

In the late summer of 1854, Snow noticed that numerous cholera cases clustered around the Broad Street pump in the neighborhood of Soho in London. Like nearly all of the water supply of London, the pump was a public source of drinking water, cooking water and water for all other household chores for anyone who wanted it. Indoor plumbing was almost completely unheard of at this time, which meant that individuals from all walks of life obtained their water from public pumps like the Broad Street pump that were located on street corners throughout the city.

Despite the clustering of cholera cases, the workers in the brewery directly across the street from the pump did not contract cholera during the 1854 outbreak. Those who worked in other food service industries, like the bakery on the same block, did contract cholera, however, as did individuals involved in other retail trades. When asked, those who worked in the brewery noted that they mostly drank beer for hydration, not water, and that all of the water they obtained to brew their beer was boiled as part of the brewing process.

John Snow famously realized, as a result of this fact, that water was carrying cholera. How?

In this exercise, you must map the facts of this case to our counter-factual conceptual framework! Hopefully this will not only help you understand the counter-factual framework, but also allow you to think more precisely about what assumptions John Snow was implicitly making.
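
As a reminder of the notation, here is a minimal sketch assuming the standard binary-treatment setup. The symbols \(Y_i(1)\), \(Y_i(0)\), and \(\tau_i\) are not defined on this sheet and are assumptions here; your reading may use different ones. For a unit \(i\) with treatment indicator \(T_i \in \{0, 1\}\), write \(Y_i(1)\) for the outcome the unit would have with treatment and \(Y_i(0)\) for the outcome it would have without it. The unit-level causal effect is

\[
\tau_i = Y_i(1) - Y_i(0),
\]

but only one of the two potential outcomes is ever observed for any unit:

\[
Y_i = T_i \, Y_i(1) + (1 - T_i) \, Y_i(0).
\]

Any comparison between treated and untreated groups therefore stands in for a counterfactual that cannot be observed directly, which is why the questions below ask what you would compare and what you would have to assume.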

Question 1:

Using the language of the potential outcomes framework, identify the outcome of interest (\(Y\)).

Question 2:

Who or what do you want to observe? Do you want to observe something about individuals? Groups? Places? Things? (This is your unit of analysis.)

Question 3:

What is the treatment (\(T\)) in John Snow’s analysis?

Question 4:

Given the observations you’ve made of the outcome \(Y\), who or what do you want to compare? That is, how do you want to group your unit of analysis so that you can compare between groups? Will you make comparisons across space? Across time? Both? (This is the fundamental component of your test.)

Question 5:

If the waterborne germ theory of disease is correct, what differences (or lack of differences) do you expect to see in the comparisons of outcomes between the groups you defined in Question 4?

Question 6:

If the waterborne germ theory of disease is incorrect, what differences (or lack of differences) do you expect to see in the comparisons of outcomes between the groups you defined in Question 4?

Question 7:

In order to make this inference valid, what assumptions was Snow (implicitly) making about brewers and non-brewers?

Question 8:

If you observe the differences (or lack of differences) in outcomes between groups that support the waterborne germ theory of disease, is there anything else that could be going on that could also or alternatively account for the differences? What worries you? (This is a question about potential confounders.)

Question 9:

If you don’t observe the differences (or lack of differences) in outcomes between groups that support the waterborne germ theory of disease, is there anything else that could be going on that could also or alternatively account for what you observe instead? What worries you? (This is a question about potential confounders.)
