Notes to Simpson's Paradox
1. Cohen & Nagel (1934) provide the data for the following tables:
Group | Population (NY) | Population (Richmond) | Deaths (NY) | Deaths (Richmond) | Death rate per 100,000 (NY) | Death rate per 100,000 (Richmond) |
Caucasians | 4,675,174 | 80,895 | 8,365 | 131 | 179 | 162 |
African Americans | 91,709 | 46,733 | 513 | 155 | 560 | 332 |
Totals | 4,766,833 | 127,268 | 8,881 | 286 | 187 | 226 |
The death rate for Caucasians and the death rate for African Americans are each higher in New York, while the combined rate is higher in Richmond.
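The reversal can be checked directly from the population and death counts. The following is a minimal sketch, not part of Cohen & Nagel's presentation: the names `DATA`, `rate_per_100k`, `group_rate`, and `combined_rate` are illustrative, and the combined rate is computed here by summing the two subgroup rows, so the figures it prints may differ by a point or two from the printed rates and totals.

```python
# Population and death counts for the two subgroups, as given in the table.
DATA = {
    "NY": {
        "Caucasians": (4_675_174, 8_365),
        "African Americans": (91_709, 513),
    },
    "Richmond": {
        "Caucasians": (80_895, 131),
        "African Americans": (46_733, 155),
    },
}


def rate_per_100k(population, deaths):
    """Deaths per 100,000 members of the population."""
    return 100_000 * deaths / population


def group_rate(city, group):
    """Death rate for one subgroup in one city."""
    return rate_per_100k(*DATA[city][group])


def combined_rate(city):
    """Death rate for a city with both subgroups pooled (assumption: sum of rows)."""
    population = sum(p for p, _ in DATA[city].values())
    deaths = sum(d for _, d in DATA[city].values())
    return rate_per_100k(population, deaths)


for group in ("Caucasians", "African Americans"):
    print(group, round(group_rate("NY", group)), round(group_rate("Richmond", group)))
print("Combined", round(combined_rate("NY")), round(combined_rate("Richmond")))
```

Each subgroup comparison favors Richmond, yet the pooled comparison favors New York, because African Americans make up a far larger share of Richmond's population than of New York's.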
2. Lindley & Novick (1981) discuss the problems that arise from this type of case, and they conclude that the associations in the subpopulations are apt guides to statistical inference concerning the effectiveness and safety of a treatment. Glymour & Meek (1994) and Pearl (2000) find nothing in the data that Lindley and Novick use, nor in the background situation that is surmised, to warrant the conclusion that Lindley and Novick reach.
3. This is illustrated by example in Malinas 1997 (352–3). Cartwright 1979 makes the point with great generality. She writes:
Consider two different partitions for the same space, \(K_1 ,\ldots ,K_n\) and \(I_1 ,\ldots ,I_s\), which cross grain each other—the \(K_i\) are mutually disjoint and exhaustive, and so are the \(I_j\). Then it is easy to produce a measure over the field \((\pm G, \pm C, \pm K_i, \pm I_j)\) such that \[ \sum_{i=1}^{n} P(G\mid C \& K_i)P(K_i) \ne \sum_{j=1}^{s} P(G\mid C \& I_j)P(I_j) \]
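One such measure can be sketched numerically. The example below is an illustration under stated assumptions, not Cartwright's own construction: both partitions have two cells, \(K\) and \(I\) are independent and uniform, and \(G\) and \(C\) each depend only on \(K\). All names and probabilities are hypothetical. It compares the sum of \(P(G\mid C \& K_i)P(K_i)\) over the cells of the first partition with the sum of \(P(G\mid C \& I_j)P(I_j)\) over the cells of the second.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical conditional probabilities: G and C each depend only on K.
P_G_GIVEN_K = {0: F(1, 5), 1: F(4, 5)}
P_C_GIVEN_K = {0: F(1, 10), 1: F(9, 10)}


def atom_prob(k, i, c, g):
    """Probability of one atom (k, i, c, g); every coordinate is 0 or 1."""
    p = F(1, 4)  # assumption: K and I are independent and uniform
    p *= P_C_GIVEN_K[k] if c else 1 - P_C_GIVEN_K[k]
    p *= P_G_GIVEN_K[k] if g else 1 - P_G_GIVEN_K[k]
    return p


ATOMS = list(product((0, 1), repeat=4))


def prob(pred):
    """Total probability of the atoms that satisfy the predicate."""
    return sum(atom_prob(*a) for a in ATOMS if pred(*a))


def adjusted(index):
    """Sum over cells x of P(G | C & cell x) * P(cell x).

    index 0 selects the K partition, index 1 selects the I partition.
    """
    total = F(0)
    for x in (0, 1):
        def in_cell(k, i, c, g, x=x):
            return (k, i)[index] == x

        p_cell = prob(in_cell)
        p_c_cell = prob(lambda k, i, c, g: in_cell(k, i, c, g) and c)
        p_g_c_cell = prob(lambda k, i, c, g: in_cell(k, i, c, g) and c and g)
        total += p_g_c_cell / p_c_cell * p_cell
    return total


print("adjusted by K:", adjusted(0))
print("adjusted by I:", adjusted(1))
```

With these numbers the \(K\)-weighted sum is \(1/2\) and the \(I\)-weighted sum is \(37/50\), so the two partitions disagree about the adjusted probability of \(G\) given \(C\).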
4. For the aggregated data below, the critical region is \(z\) greater than \(1.645\), and the observed value is \(z = 1.708\). When the data are partitioned by gender, the critical region for each table is \(z\) less than \(-1.645\); the observed value is \(z = -2.136\) for the \(M\)-table and \(z = -1.741\) for the \({\sim}M\)-table.
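The cutoff \(1.645\) is the 95th percentile of the standard normal distribution, which suggests one-tailed tests at the 5% level; the note does not state the level, so that is an assumption here. A minimal sketch using Python's standard library, with illustrative names, checks each reported \(z\) value against its critical region:

```python
from statistics import NormalDist

ALPHA = 0.05  # assumed significance level for a one-tailed test

# Upper-tail cutoff of the standard normal distribution, approximately 1.645.
CRITICAL = NormalDist().inv_cdf(1 - ALPHA)


def rejects_upper(z):
    """True when z falls in the upper-tail critical region."""
    return z > CRITICAL


def rejects_lower(z):
    """True when z falls in the lower-tail critical region."""
    return z < -CRITICAL


print(round(CRITICAL, 3))
print("aggregated:", rejects_upper(1.708))
print("M-table:", rejects_lower(-2.136))
print("~M-table:", rejects_lower(-1.741))
```

On these assumptions all three statistics fall in their critical regions: the aggregated data show a significant positive association, and each gender table shows a significant negative one.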
5. See Malinas 1997 for examples of sets of data that instantiate these percentages and an urn model that instantiates the associated probabilities.
6. This idea is introduced in Suppes 1970. He proposed that a prima facie cause must raise the probability of its effect occurring, but not that any event that raises the probability of a subsequent event is thereby a cause of it.
7. This and the following example are developed in Hesslow 1976.
8. Recent causal reductionist proposals based on Suppes's proposal that causes raise the chances of their effects are advanced by, e.g., Spohn (2001) and Thalos (2003).
9. Cf. Hardcastle 1991. Hardcastle proposes that ordinary language descriptions of causal relations have to be submitted to redescription in the vocabularies of scientific theories that mark out natural kinds. The laws of those theories will then determine which reference classes are causally homogeneous and what the causal connections are between events so described.