# Counter-Factuals and Experimental Ideals

These exercises are adapted with permission from exercises written by Adriane Fresh.

## What is a Counterfactual?

Consider the causal relationship “X caused Y,” where X is the causal factor of interest (the independent variable) and Y is the outcome of interest (the dependent variable). A counterfactual statement follows from such a causal statement: it is a logical statement about what outcome would have occurred if X had not occurred. For X to have caused Y, the following must logically be true: *Had X not occurred, Y also would not have occurred.* That italicized statement is the counterfactual statement.

You may find it helpful to first form a counterfactual question (implicit in the statement above) that leads you to the counterfactual statement. This counterfactual question takes the following form: What would have occurred had X not occurred?

## What is a randomized experiment?

A randomized experiment is the gold standard for making high-quality causal inferences, given that we live in a world in which the counterfactual condition is always impossible to observe. In a randomized experiment we randomly assign our units of study (individuals, classrooms, states, etc.) to (typically) two groups. One group receives the causal factor of interest (usually referred to as a “treatment,” with the group referred to as the “treatment group”), and the other group does not (referred to as the “control group”). The outcome of interest is measured in each group, and the estimated causal effect of having received the treatment is the difference in average outcome between the two groups.
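To make the difference-in-averages idea concrete, here is a minimal simulation sketch. Everything in it is an assumption made up for illustration: the function name `run_experiment`, the baseline outcome distribution, and the true effect of 2.0. Because we invent the data ourselves, we know the true effect and can check whether the comparison of group averages recovers it.

```python
import random
import statistics


def run_experiment(n_units=10_000, true_effect=2.0, seed=0):
    """Simulate a two-group randomized experiment and return the
    difference in average outcome (treatment minus control)."""
    rng = random.Random(seed)

    # Hypothetical outcome each unit would have WITHOUT the treatment.
    baseline = [rng.gauss(10.0, 3.0) for _ in range(n_units)]

    # Random assignment: shuffle the unit indices, treat the first half.
    units = list(range(n_units))
    rng.shuffle(units)
    treated = set(units[: n_units // 2])

    # Treated units get the (assumed) true effect added to their baseline.
    treatment_outcomes = [baseline[i] + true_effect for i in units if i in treated]
    control_outcomes = [baseline[i] for i in units if i not in treated]

    return statistics.mean(treatment_outcomes) - statistics.mean(control_outcomes)


estimate = run_experiment()
print(f"estimated effect: {estimate:.2f} (true effect used in the simulation: 2.00)")
```

The estimate will not equal the true effect exactly, since random assignment leaves some chance variation, but with thousands of units it should land close to it.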

The fact that the groups are determined randomly ensures (by the Law of Large Numbers, a pretty cool statistical law) that, as long as the groups are large enough, the entire control group on average represents the counterfactual for the entire treatment group, even though no individual in one group is a counterfactual for any individual in the other. That is, the two groups are the same, on average, except for the presence or absence of the causal factor of interest. And thus, any differences on average between the groups in the outcome of interest (beyond chance variation) can only have arisen as a consequence of the causal factor of interest.
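The role of the Law of Large Numbers can be illustrated with a small sketch. It assumes a single made-up background characteristic (age, drawn uniformly between 18 and 80) and hypothetical helper names `covariate_gap` and `average_gap`. Units are split into two groups at random, and we measure how far apart the groups' average ages are for different sample sizes.

```python
import random
import statistics


def covariate_gap(n_units, seed=0):
    """Absolute difference in mean age between two randomly formed groups."""
    rng = random.Random(seed)
    ages = [rng.uniform(18, 80) for _ in range(n_units)]  # made-up covariate
    rng.shuffle(ages)  # random assignment: first half vs. second half
    half = n_units // 2
    return abs(statistics.mean(ages[:half]) - statistics.mean(ages[half:]))


def average_gap(n_units, n_repeats=200):
    """Average the gap over many independent randomizations."""
    return statistics.mean(covariate_gap(n_units, seed=s) for s in range(n_repeats))


for n in (10, 100, 1000):
    print(f"{n:>5} units: average age gap between groups = {average_gap(n):.2f} years")
```

With only a handful of units the two groups can differ noticeably by chance. As the number of units grows, the gap shrinks toward zero, which is why larger randomized groups are better stand-ins for each other's counterfactual.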

## Identifying Counter-Factuals

Instructions: Take the causal statement given and identify (a) the causal factor of interest, and (b) the outcome of interest. Then state (c) the counterfactual question, and (d) the counterfactual statement that must be true for the causal statement to be true. Finally, even if it is absolutely infeasible in the real world, very briefly describe the randomized experiment that would represent the best way of approximating the counterfactual condition.

Example:

You are advising NATO military commanders on how best to defeat insurgents in Afghanistan. There is a concern that attempts to kill insurgents by bombing villages may be counter-productive because it may be upsetting civilians, making them more likely to house insurgents. A member of your team makes the following causal statement: “Afghan villages that are bombed by coalition forces during the War in Afghanistan are more likely to harbor insurgents.” Break down this causal statement.

Causal factor: Bombing by coalition forces

Dependent variable: Harboring insurgents

Counterfactual question: How likely would villages have been to harbor insurgents had they not been bombed by coalition forces?

Counterfactual statement: Had a village not been bombed by coalition forces, it would have been less likely to harbor insurgents.

Idealized hypothetical even-if-farfetched-and-wildly-unethical randomized experiment: Randomly assign some villages to be bombed and others not. Observe any differences in insurgency operations out of the two groups of villages.

Give a reason why it might be a problem to just look at the correlation between your causal factor and dependent variable in the world. In other words, what’s a reason there may be baseline differences or differential treatment effects?

Coalition forces are likely to target communities harboring insurgents, so bombing and support are likely to be positively correlated even if bombing doesn’t increase or decrease civilian support for militants.
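The targeting problem described above can be sketched in a short simulation. All numbers are invented for illustration (30% of villages harbor insurgents; targeted bombing hits 60% of harboring villages and 10% of the rest), and `harboring_gap` is a hypothetical helper. Crucially, the simulation assumes bombing has no effect at all on harboring, so any gap between bombed and unbombed villages comes purely from how villages were selected for bombing.

```python
import random
import statistics


def harboring_gap(targeted, n_villages=20_000, seed=0):
    """Share of bombed villages harboring insurgents minus the share of
    unbombed villages doing so, when bombing has NO causal effect."""
    rng = random.Random(seed)
    bombed, unbombed = [], []
    for _ in range(n_villages):
        # Harboring is fixed before any bombing and is never changed by it.
        harbors = rng.random() < 0.3
        if targeted:
            # Assumed targeting rule: harboring villages are bombed more often.
            p_bomb = 0.6 if harbors else 0.1
        else:
            # Randomized assignment: a coin flip, unrelated to harboring.
            p_bomb = 0.5
        group = bombed if rng.random() < p_bomb else unbombed
        group.append(int(harbors))
    return statistics.mean(bombed) - statistics.mean(unbombed)


print(f"gap under targeted bombing:   {harboring_gap(targeted=True):.2f}")
print(f"gap under randomized bombing: {harboring_gap(targeted=False):.2f}")
```

Under targeting, bombed villages look far more likely to harbor insurgents even though bombing does nothing in this simulated world. Under random assignment the gap is close to zero, which matches the true (assumed) effect.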

(Side note: the randomized version of this clearly can’t be done and would be insanely unethical, so the question may seem unanswerable. There are nonetheless ways to answer this kind of question causally; see, for example, really anything by Jason Lyall.)

## Exercise 1:

You work for an auto insurance company. You have noticed some of your customers have bought a device that alerts drivers when they are driving in an unsafe manner. You have noticed that the people with this device tend to get in fewer accidents. You are wondering whether to pay to give these devices to your customers for free. Someone on your team makes the following statement: “Having this device reduces accidents.” Break down this causal statement.

Causal factor:

Dependent variable:

Counterfactual question:

Counterfactual statement:

Idealized hypothetical even-if-farfetched randomized experiment:

Give a reason why it might be a problem to just look at the correlation between your causal factor and dependent variable in the world. In other words, what’s a reason there may be baseline differences or differential treatment effects?

## Exercise 2:

Following Britain’s vote to leave the European Union, many observers in other countries worry that support for Brexit was caused by high levels of immigration. If this is true, they might limit immigration to try to prevent these attitudes from forming in their own countries. You have been called in to consult. Your boss states: “Living in proximity to immigrants caused individuals in Great Britain to be more likely to vote in favor of leaving the EU in the 2016 Brexit vote.” Break down this causal statement.

Causal factor:

Dependent variable:

Counterfactual question:

Counterfactual statement:

Idealized hypothetical even-if-farfetched randomized experiment:

Give a reason why it might be a problem to just look at the correlation between your causal factor and dependent variable in the world. In other words, what’s a reason there may be baseline differences or differential treatment effects?

## Exercise 3:

You have been hired by an international aid organization to help prioritize aid spending. Someone on your team makes the following statement: “Offering free bed-nets to families in rural Kenya will reduce malaria infection rates.” Break down this causal statement.

Causal factor:

Dependent variable:

Counterfactual question:

Counterfactual statement:

Idealized hypothetical even-if-farfetched randomized experiment:

Give a reason why it might be a problem to just look at the correlation between your causal factor and dependent variable in the world. In other words, what’s a reason there may be baseline differences or differential treatment effects?