By Asad Zaman
[Ed note: A longer version of this analysis can be found in Asad Zaman and Taseer Salahuddin (2020) “Causality, Confounding, and Simpson’s Paradox” International Econometric Review, Vol 2, Issue 1 (forthcoming in April)]
Statistics and Econometrics today are done without any essential reference to causality, which is much like trying to figure out how birds fly without taking their wings into account. Chapter 2 of Judea Pearl’s (2018) The Book of Why: The New Science of Cause and Effect tells the bizarre story of how the discipline of statistics inflicted causal blindness on itself, with far-reaching effects for all sciences that depend on data. This article elaborates and explains the introductory chapter of Pearl, Glymour, & Jewell (2016) Causal Inference in Statistics: A Primer. The first step towards understanding causality is a detailed analysis of Simpson’s Paradox. The analysis is presented in five points, summarized here, with more detail available if you follow the links:
Simpson’s Paradox 1: Suppose that there are only two departments at Berkeley, and that they have different admission ratios for women. In Humanities 40% of female applicants are admitted, while in Engineering 80% are admitted. What will be the overall admission ratio of women to Berkeley? The overall admission ratio is a weighted average of 40% and 80%, where the weights are the proportions of females who apply to the two departments. Similarly, if 20% of male applicants are admitted to Humanities while 60% are admitted to Engineering, then the overall admission ratio for men is a weighted average of 20% and 60%, with weights given by the proportions of males who apply to the two departments. This is what leads to the possibility of Simpson’s Paradox. As the numbers have been set up, both Engineering and Humanities favour females, who have much higher admission ratios than males in each department. If males apply mostly to Engineering, then the overall admission ratio for men will be close to 60%. If females apply mostly to Humanities, their overall admission ratio will be close to 40%. So, looking at the overall ratios, it will appear that admissions favour males, who have the higher overall admission ratio. The key question is: which of these comparisons is correct? Does Berkeley discriminate against males, as the departmental admission ratios indicate? Or does it discriminate against females, as the overall admission ratios indicate? The main lesson from the analysis in this article is that the answer cannot be determined from the numbers alone. Either answer can be correct, depending on the hidden and unobservable causal structures of the real world which generate the data.
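The weighted-average arithmetic can be checked with a short sketch. The departmental admission rates are the ones given above. The application shares (80% of women applying to Humanities, 80% of men applying to Engineering) are assumptions chosen purely for illustration; they are not figures from the Berkeley data.

```python
# Admission rates by gender and department, as stated in the text.
RATES = {
    "female": {"humanities": 0.40, "engineering": 0.80},
    "male": {"humanities": 0.20, "engineering": 0.60},
}

# ASSUMPTION (illustrative only): how each gender's applications
# are split between the two departments.
SHARES = {
    "female": {"humanities": 0.80, "engineering": 0.20},
    "male": {"humanities": 0.20, "engineering": 0.80},
}


def overall_rate(rates, shares):
    """Weighted average of departmental rates, weighted by application shares."""
    return sum(rates[dept] * shares[dept] for dept in rates)


female_overall = overall_rate(RATES["female"], SHARES["female"])
male_overall = overall_rate(RATES["male"], SHARES["male"])

print(f"female overall admission rate: {female_overall:.2f}")
print(f"male overall admission rate:   {male_overall:.2f}")
```

Under these assumed shares the overall rate is 48% for women and 52% for men, even though women have the higher rate in each department taken separately.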
Simpson’s Paradox 2: Here I elaborate on the discussion of the Berkeley admissions paradox in Bickel et al. (1975). Their explanation can be understood as a causal path diagram in which gender affects choice of department, and both gender and choice of department affect the admissions rate. With this causal structure, gender is a confounding variable when it comes to departmental admission ratios. These must be calculated conditionally on gender, that is, separately for men and women. However, departments are NOT a confounding factor when it comes to the effect of gender on the admissions rate. Gender affects admissions through two channels: a direct effect on admission ratios, and an indirect effect via choice of department. Female gender affects admission positively via the direct effect, which is favourable. However, the indirect effect is negative, since females choose the more difficult department in larger numbers. The numbers can be set up so that the negative indirect effect overwhelms the positive direct effect, creating Simpson’s Paradox. But this entire analysis depends on a particular causal structure, and different causal structures can lead to entirely different analyses for exactly the same set of numbers. This is my main point: the hidden and unobservable real-world causal structures MUST be considered for meaningful data analysis. Current econometrics and statistics do not pay attention to causality, and hence often lead to meaningless analysis.
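A minimal sketch of how the direct and indirect channels could be separated, assuming the causal diagram just described (gender affects department choice, and both affect admission) and no other unobserved influences. The admission rates are those of the first point; the application shares are again hypothetical. The split used here, which changes one channel at a time, is one common way to define the two effects and is offered as an illustration rather than as the calculation used in Bickel et al. (1975).

```python
# Admission rates by gender and department, as in the first point.
RATES = {
    "female": {"humanities": 0.40, "engineering": 0.80},
    "male": {"humanities": 0.20, "engineering": 0.60},
}

# ASSUMPTION (illustrative only): department choice by gender.
SHARES = {
    "female": {"humanities": 0.80, "engineering": 0.20},
    "male": {"humanities": 0.20, "engineering": 0.80},
}


def expected_rate(rates, shares):
    """Admission rate implied by a set of rates and a pattern of department choice."""
    return sum(rates[dept] * shares[dept] for dept in rates)


baseline = expected_rate(RATES["male"], SHARES["male"])

# Direct effect: apply female admission rates, keep the male choice pattern.
direct = expected_rate(RATES["female"], SHARES["male"]) - baseline

# Indirect effect: keep female admission rates, switch the choice pattern
# from the male one to the female one.
indirect = (expected_rate(RATES["female"], SHARES["female"])
            - expected_rate(RATES["female"], SHARES["male"]))

# Total effect of gender (female minus male) on the overall admission rate.
total = expected_rate(RATES["female"], SHARES["female"]) - baseline

print(f"direct effect:   {direct:+.2f}")
print(f"indirect effect: {indirect:+.2f}")
print(f"total effect:    {total:+.2f}")
```

With these assumed shares the direct effect is +0.20, the indirect effect is -0.24, and the two sum to a total effect of -0.04: the negative indirect channel outweighs the positive direct one.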
Simpson’s Paradox 3: We can consider alternative causal structures for Berkeley admissions which lead to conclusions radically different from Bickel’s original analysis. We first consider a case where gender affects department choice, while the admission ratio depends only on department and is completely gender neutral. If females choose more difficult departments, there will be a spurious correlation between admission ratios and gender, creating a misleading impression of discrimination against females. A second example is considered where admissions depend purely on SAT scores, and have no relationship to gender or to department. Nonetheless, if gender affects SAT scores and choice of department, we can replicate exactly the numbers of the original data, which would create the misleading impressions that departments discriminate by gender, and that some departments are more difficult to get into than others. In fact, the admissions policy is the same across departments, and depends only on SAT scores. The point of these analyses is that exactly the same observed data can correspond to radically different causal structures, and lead to radically different conclusions about discrimination with respect to gender.
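The first alternative structure can be illustrated with a small sketch. All numbers here are hypothetical: each department is assumed to apply a single admission rate to every applicant regardless of gender, and only the choice of department differs between men and women.

```python
# ASSUMPTION (hypothetical): gender-neutral admission rate per department.
DEPT_RATE = {"humanities": 0.30, "engineering": 0.70}

# ASSUMPTION (hypothetical): department choice by gender.
SHARES = {
    "female": {"humanities": 0.80, "engineering": 0.20},
    "male": {"humanities": 0.20, "engineering": 0.80},
}


def overall_rate(shares):
    """Overall admission rate for a group with the given department choices."""
    return sum(DEPT_RATE[dept] * shares[dept] for dept in DEPT_RATE)


female_overall = overall_rate(SHARES["female"])
male_overall = overall_rate(SHARES["male"])

# Within each department both genders face the same rate, so the gap is zero;
# the pooled gap comes entirely from department choice.
pooled_gap = female_overall - male_overall

print(f"female overall: {female_overall:.2f}")
print(f"male overall:   {male_overall:.2f}")
print(f"pooled gap:     {pooled_gap:+.2f}")
```

In this sketch the pooled rates are 38% for women and 62% for men, although no department treats the two genders differently.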
Simpson’s Paradox 4: Contrary to the perspective taken by conventional statistics texts, and some forms of econometric analysis (VAR models), we cannot do data analysis without understanding the causal structures of the real world which give rise to the data. The jobs of the field expert and the statistical consultant cannot be separated. To illustrate this point, we take the same data used for the Berkeley admissions and reinterpret it as the batting averages of two different batters against left-handed and right-handed pitchers. Then Simpson’s Paradox takes the following form. Frank and Tom both perform worse against left-handed pitchers. Frank has a higher batting average than Tom against left-handed pitchers, and he also has a higher batting average than Tom against right-handed pitchers. However, the overall batting average of Tom is higher than that of Frank, because opposing teams tend to use left-handed pitchers against Frank. Similarly, better surgeons can have worse operating results because they are given the more difficult cases. Consequently, data alone are not enough. Context matters.
Simpson’s Paradox 5: To further drive home the fact that data analysis cannot be confined to numbers and divorced from the real-world environment which generated the data, we consider a third interpretation of the same data set used for Berkeley admissions. In this interpretation, we look at the effect of a drug on recovery rates from a disease. Simpson’s Paradox takes the form that the drug decreases recovery rates in females, and also decreases recovery rates in males. So, it is bad for males and it is bad for females. But when we look at the population as a whole, if the gender ratio in the control group is different from that in the treated group, we may find that the drug improves the recovery rate. So, the drug appears to be good for the general population. A causal path diagram shows that gender must be exogenous: it cannot be affected by the drug. Thus gender is a confounding variable, and we must condition on it to get the right measure of the effect of the drug on recovery. We conclude that the drug is bad for everyone, and lowers the recovery rate for everyone, even though the overall data tell us otherwise. But now consider the same data set with gender replaced by blood pressure, and suppose that the drug affects blood pressure. Suppose low blood pressure is a positive factor in recovery, while the drug has a toxic effect, so that its direct impact is negative. However, the drug also lowers blood pressure, which creates a positive factor for recovery. The combined effect can be favourable, and this is what should be considered when administering the drug.
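The contrast between the pooled comparison and the gender-conditioned one can be sketched as follows. The recovery rates are the Berkeley numbers relabelled; which gender receives which pair of rates, the gender mix of each study arm, and the 50/50 population mix are all assumptions made for illustration. Conditioning is done by averaging the gender-specific rates over a common gender mix, in the spirit of the adjustment formula discussed in Pearl, Glymour, & Jewell (2016).

```python
# Recovery rates by treatment and gender (the admissions numbers relabelled).
# ASSUMPTION: the assignment of rates to "women" and "men" is arbitrary.
RECOVERY = {
    "drug": {"women": 0.20, "men": 0.60},
    "no_drug": {"women": 0.40, "men": 0.80},
}

# ASSUMPTION (illustrative only): gender mix within each study arm.
ARM_MIX = {
    "drug": {"women": 0.20, "men": 0.80},
    "no_drug": {"women": 0.80, "men": 0.20},
}

# ASSUMPTION (illustrative only): gender mix of the target population.
POPULATION_MIX = {"women": 0.50, "men": 0.50}


def weighted(rates, weights):
    """Average the group-specific rates using the given group weights."""
    return sum(rates[group] * weights[group] for group in rates)


# Pooled comparison: each arm is averaged over its own gender mix.
pooled_effect = (weighted(RECOVERY["drug"], ARM_MIX["drug"])
                 - weighted(RECOVERY["no_drug"], ARM_MIX["no_drug"]))

# Gender-conditioned comparison: both arms use the same population mix.
adjusted_effect = (weighted(RECOVERY["drug"], POPULATION_MIX)
                   - weighted(RECOVERY["no_drug"], POPULATION_MIX))

print(f"pooled effect of drug:   {pooled_effect:+.2f}")
print(f"adjusted effect of drug: {adjusted_effect:+.2f}")
```

Under these assumptions the pooled effect is +0.04 while the adjusted effect is -0.20. When the grouping variable is gender, which the drug cannot change, the adjusted figure is the relevant one. When it is blood pressure, which the drug itself changes, the pooled figure captures the combined effect described above, provided nothing else distinguishes the two arms.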
Bickel, PJ, Hammel, EA, O’Connell, JW: Sex Bias in Graduate Admissions: Data From Berkeley. Science. 187(4175), 398–404 (1975)
From: pp.2-3 of WEA Commentaries 10(1), February 2020