Monday, December 13, 2010

Simpson's Paradox



Suppose there are 300 students doing a Maths exam, 180 men and 120 women, and another 200 students doing an English exam, 60 men and 140 women. Suppose 30% of the Maths students and 20% of the English students get an A. If the performance of male and female students is exactly the same, then the results will be as shown in the table:


           Proportion of men getting an A   Proportion of women getting an A   Overall proportion getting an A
Maths      54/180 = 0.3                     36/120 = 0.3                       90/300 = 0.3
English    12/60 = 0.2                      28/140 = 0.2                       40/200 = 0.2
Combined   66/240 = 0.2750                  64/260 = 0.2462                    130/500 = 0.2600
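
The figures in the table can be checked with a few lines of Python. This is only a sketch of the arithmetic: the counts come from the example above, while the names `results` and `a_rate` are my own, chosen purely for illustration.

```python
from fractions import Fraction

# (number of students, number of A grades) for each exam and gender,
# using the counts from the worked example
results = {
    "Maths":   {"men": (180, 54), "women": (120, 36)},
    "English": {"men": (60, 12),  "women": (140, 28)},
}

def a_rate(gender, exams=results):
    """Proportion of the given gender getting an A, pooled over the given exams."""
    students = sum(exams[e][gender][0] for e in exams)
    a_grades = sum(exams[e][gender][1] for e in exams)
    return Fraction(a_grades, students)

# Within each exam the two rates are identical
for exam in results:
    one_exam = {exam: results[exam]}
    print(f"{exam}: men {float(a_rate('men', one_exam)):.4f}, "
          f"women {float(a_rate('women', one_exam)):.4f}")

# Pooled over both exams they differ: men 0.2750, women 0.2462
print(f"Combined: men {float(a_rate('men')):.4f}, "
      f"women {float(a_rate('women')):.4f}")
```

Using `Fraction` keeps the arithmetic exact, so the within-exam rates compare as exactly equal rather than merely close.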

So, although men and women performed identically on each individual exam, overall 27.5% of men got an A compared with only 24.62% of women. The gap arises solely because a larger share of the men (180 of 240) than of the women (120 of 260) sat the Maths exam, which had the higher proportion of A grades. This phenomenon, in which the trend in amalgamated data differs from the trends in the individual groups, is known as Simpson's Paradox.

A famous example is graduate admissions at UC Berkeley in 1973: 44% of the 8442 male applicants were admitted compared with 35% of the 4321 female applicants, a difference that was too large to be due to chance. However, when the data were disaggregated by decision-making unit a different picture emerged: few units showed a statistically significant difference between the rates of male and female admissions, and there were just as many units that appeared to favour women as to favour men. The reason for the discrepancy in the overall admission rates was that women were proportionately more likely to apply to units with low admission rates. In fact, when the data were pooled taking this into account there was a small but statistically significant bias in favour of women. [P.J. Bickel, E.A. Hammel and J.W. O'Connell, Science, vol 187, pp 398-404 (1975)].

The Berkeley example dates from over thirty-five years ago. Do we still see this type of error in the analysis of gender statistics? Yes, we do.

I speculate that the reasons for this include:
  • A preference for results that confirm our existing beliefs. If we believe that women are hard done by we are less likely to question results that support that belief.
  • A fear that questioning specific results will be perceived as questioning the principle of equality or will be interpreted as a reason for failing to support an otherwise useful initiative.
  • Lack of knowledge of statistics and its pitfalls.

Additional Sources: I first came across the Berkeley example in a video of 'Lies, Damned Lies and Statistics: The misapplication of statistics in everyday life', a lecture by Dr Talithia D. Williams in the Distinctive Voices series of the US National Academy of Sciences. It's well worth watching.
