This is a file in the archives of the Stanford Encyclopedia of Philosophy.
Reichenbach's common cause principle needs to be modified. Consider, for instance, the following example. Harry normally takes the 8 a.m. train from New York to Washington. But he does not like full trains, so if the 8 a.m. train is full he sometimes takes the next train. He also likes trains that have diner cars, so if the 8 a.m. train does not have a diner car he sometimes takes the next train. If the 8 a.m. train is both full and has no diner car, he is very likely to take the next train. Johnny, an unrelated commuter, also normally takes the 8 a.m. train from New York to Washington. Johnny, it so happens, also does not like full trains, and he also likes diner cars. Whether or not Harry and Johnny take the 8 a.m. train will therefore be correlated. But, since the probability of Harry and Johnny taking the 8 a.m. train depends on the occurrence of two distinct events (the train being full, the train having a diner car) there is no single event C, such that conditional upon C and conditional upon ~C we have independence. Thus Reichenbach's common cause principle as stated above is violated. Yet this example clearly does not violate the spirit of Reichenbach's common cause principle, for there is a partition into four possibilities such that conditional upon each of these four possibilities the correlation disappears.
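The structure of the train example can be checked numerically. The sketch below uses made-up illustrative probabilities (none appear in the original text): the events F (the 8 a.m. train is full) and N (it has no diner car) are assumed independent, and Harry and Johnny are assumed to decide independently given the pair (F, N). Unconditionally their choices are correlated, yet conditional on each cell of the four-way partition the correlation vanishes by construction.

```python
from itertools import product

# Hypothetical probabilities, for illustration only.
P_full = 0.3      # P(F): the 8 a.m. train is full
P_nodiner = 0.4   # P(N): the 8 a.m. train has no diner car

# Probability that a commuter takes the 8 a.m. train, given (F, N).
# Harry and Johnny share these dispositions but decide independently.
p_take = {(False, False): 0.95, (True, False): 0.6,
          (False, True): 0.7,   (True, True): 0.2}

def cell_prob(f, n):
    """Probability of the cell (F = f, N = n), with F and N independent."""
    return (P_full if f else 1 - P_full) * (P_nodiner if n else 1 - P_nodiner)

cells = list(product([False, True], repeat=2))

# Unconditional probabilities, obtained by summing over the four cells.
p_h = sum(cell_prob(f, n) * p_take[(f, n)] for f, n in cells)        # P(Harry)
p_hj = sum(cell_prob(f, n) * p_take[(f, n)] ** 2 for f, n in cells)  # P(Harry & Johnny)

print(p_hj - p_h * p_h)  # positive: the two choices are correlated
# Conditional on each cell, independence holds by construction:
# P(H & J | f, n) = p_take[(f, n)]**2 = P(H | f, n) * P(J | f, n).
```

No single event C does the screening here; only the four-cell partition by (F, N) does, which is exactly the point of the example.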
More generally, we would like to have a common cause principle for cases in which the common causes and the effects are sets of quantities with continuous or discrete sets of values, rather than single events that occur or do not occur. A natural way to modify Reichenbach's common cause principle in order to deal with such types of cases is as follows. If simultaneous values of quantities A and B are correlated, then there are common causes C1, C2, …, Cn, such that conditional upon any combination of values of these quantities at an earlier time, the values of A and B are probabilistically independent. (For a fuller discussion of modifications like this, including cases in which there are correlations between more than two quantities, see Uffink (forthcoming).) I will continue to call this generalization Reichenbach's common cause principle, since, in spirit, it is very close to the principle that Reichenbach originally stated.
Now let me turn to two principles, the causal Markov condition and the law of conditional independence, that are closely related to Reichenbach's common cause principle.
Penrose and Percival then say that one can prevent any influence from acting on both A and B by fixing the state c throughout such a region C. They therefore claim that states a in A and b in B will be uncorrelated conditional upon any state c in C. To be precise, they suggest the law of conditional independence: "If A and B are two disjoint 4-regions, and C is any 4-region which divides the union of the pasts of A and B into two parts, one containing A and the other containing B, then A and B are conditionally independent given c. That is, Pr(a&b/c) = Pr(a/c) × Pr(b/c), for all a, b." (Penrose and Percival 1962, p. 611).
This is a time asymmetric principle which is clearly closely related to Reichenbach's common cause principle and the causal Markov condition. However one should not take states c in region C to be, or include, the common causes of the (unconditional) correlations that might exist between the states in regions A and B. It is merely a region such that influences from a past common source on both A and B must pass through it, assuming that such influences do not travel at speeds exceeding the speed of light. Note also that the region must stretch to the beginning of time. Thus, one cannot derive anything like Reichenbach's common cause principle or the causal Markov condition from the law of conditional independence, and one therefore would not inherit the richness of applications of these principles, especially the causal Markov condition, even if one were to accept the law of conditional independence.
Let us now turn to reasons that have been given for not believing any of these principles.
More generally, suppose that there is a quantity Q, which is a function f(q1, …, qn) of quantities qi. Suppose that some of the quantities qi develop indeterministically, but that quantity Q is conserved in such developments. There will then be correlations among the values of the quantities qi which have no prior screener off. The only way that common cause principles can hold when there are conserved global quantities is when the development of each of the quantities that jointly determine the value of the global quantity is deterministic. And then it holds in the trivial sense that the prior determinants make everything else irrelevant. The results of quantum mechanical measurements are not determined by the quantum mechanical state prior to those measurements. And often there are conserved quantities during such a measurement. For instance, the total spin of two particles in a quantum singlet state is 0. This quantity is conserved when one measures the spins of each of those two particles in the same direction: one will always find opposite spins during such a measurement, i.e. the spins that one finds will be perfectly anti-correlated. However, what spins one will find is not determined by the prior quantum state. Thus the prior quantum state does not screen off the anti-correlations. There is no quantum common cause of such correlations.
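The claim that only a deterministic screener off could account for a perfect anti-correlation can be checked by brute force. If some candidate common cause C screened off the two spin outcomes, independence would give P(up, up | C) = pq and P(down, down | C) = (1 − p)(1 − q), where p and q are the conditional probabilities of "up" for each particle; perfect anti-correlation forces both products to be zero. The following grid search (an illustration, not part of the original text) confirms that only the deterministic assignments survive:

```python
# Perfect anti-correlation requires P(up, up | C) = 0 and
# P(down, down | C) = 0.  If C screens off, independence gives
# P(up, up | C) = p * q and P(down, down | C) = (1 - p) * (1 - q),
# with p = P(A up | C) and q = P(B up | C).
solutions = set()
for i in range(11):
    for j in range(11):
        p, q = i / 10, j / 10
        if p * q == 0 and (1 - p) * (1 - q) == 0:
            solutions.add((p, q))

print(solutions)  # only the deterministic assignments remain
```

The only surviving pairs are (p, q) = (0, 1) and (1, 0): a screener off of a perfect anti-correlation must determine the outcomes, which is just what the quantum state fails to do.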
One might think that this violation of common cause principles is a reason to believe that there must then be more to the prior state of the particles than the quantum state; there must be hidden variables that screen off such correlations. (And we have seen above that such hidden variables must determine the results of the measurements if they are to screen off the correlations.) However, one can show, given some extremely plausible assumptions, that there cannot be any such hidden variables. (See, for instance, van Fraassen 1982 and Elby 1992 for more detail.)
One can also show that such correlations without a possible prior screener off are not confined to very special states, but occur generically in quantum mechanics and quantum field theory. (See Redhead (1995), Clifton, Feldman, Halvorson, Redhead & Wilce (1998), and Clifton & Ruetsche (forthcoming).)
More generally, any coexistence law, such as Newtonian gravitation or Pauli's exclusion principle, will imply correlations which have no prior common cause conditionally upon which they disappear. Therefore, contrary to what one might hope, there are relativistic coexistence laws which violate common cause principles.
There is a way of understanding common cause principles such that this example is not a counterexample to it. Suppose that in nature there are transition chances from values of quantities at earlier times to values of quantities at later times. (For more on this idea, see Arntzenius 1997.) One could then state a common cause principle as follows: conditional upon the values of all the quantities upon which the transition chances to quantities X and Y depend, X and Y will be probabilistically independent. In Sober's example, there are transition chances from earlier costs of bread to later costs of bread, and there are transition chances from earlier water levels to later water levels. Conditional upon earlier costs of bread, later costs of bread are independent of later water levels. A common cause principle formulated as above thus holds in this case. Of course, if one looks at a collection of (simultaneous) data for water levels and bread prices one will see a correlation due to similar laws of development (similar transition chances). But a common cause principle, understood in terms of transition chances, does not imply that there should be a common cause of this correlation. The data (which include these correlations) should be understood as evidence for what the transition chances in nature are, and it is those transition chances that could be demanded to satisfy a common cause principle.
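The difference between a correlation in pooled time-series data and a correlation in the transition chances can be illustrated with a small simulation. Two series with similar upward drift (stand-ins for bread prices and water levels; all numbers are invented for illustration) are generated by entirely independent transition chances, yet their pooled values are strongly correlated:

```python
import random

random.seed(0)

def drifting_walk(steps=200, drift=1.0, noise=1.0):
    """Independent transition chances: each value depends only on the
    previous value of the SAME series, plus independent noise."""
    xs = [0.0]
    for _ in range(steps):
        xs.append(xs[-1] + drift + random.gauss(0, noise))
    return xs

def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

bread = drifting_walk()  # stand-in for bread prices
water = drifting_walk()  # stand-in for water levels, generated independently

print(pearson(bread, water))  # close to 1, despite independent dynamics
```

The pooled series are correlated simply because both drift upward, not because any common cause links them; the transition chances themselves, which is what the reformulated principle constrains, are independent by construction.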
This does not imply a violation of the causal Markov condition. However, in order to be able to infer causal relations from statistical ones, Spirtes, Glymour and Scheines in effect assume that whenever (unconditionally correlated) quantities Qi and Qj are independent conditional upon some quantity Qk, then Qk is a cause of either Qi or Qj. To be more precise, they assume the Faithfulness condition, which states that there are no probabilistic independencies in nature other than the ones entailed by the causal Markov condition. Since the values of such quantities X at later times t′ surely are not direct causes of X at t, Faithfulness is violated, and with it goes our ability to infer causal relations from probabilistic relations, and much of the practical value of the causal Markov condition.[5]
A quantity like X, whose values at a later time t′ are deterministically related to the values of X at t, will in general correspond to a non-natural, non-local, and not directly observable quantity. Perhaps a common cause principle can hold of some natural class of quantities. However, the next two sections will show that two suggestions for such a natural class do not work.
Let us now consider another type of counterexample to the idea that a common cause principle can hold of macroscopic quantities, namely cases in which order arises out of chaos. When one lowers the temperature of certain materials, the spins of all the atoms of the material, which originally are not aligned, will line up in the same direction. Pick any two atoms in this structure. Their spins will be correlated. However, it is not the case that the one spin orientation caused the other spin orientation. Nor is there a simple or macroscopic common cause of each orientation of each spin. The lowering of the temperature determines that the orientations will be correlated, but not the direction in which they will line up. Indeed, typically, what determines the direction of alignment, in the absence of an external magnetic field, is a very complicated fact about the total microscopic prior state of the material and the microscopic influences upon the material. Thus, other than virtually the complete microscopic state of the material and its environment there is no prior screener off of the correlation between the spin alignments.
In general when chaotic developments result in ordered states there will be final correlations which have no prior screener off, other than virtually the full microscopic state of the system and its environment. (For more examples, see Prigogine 1980). In such cases the only screener off will be a horrendously complex microscopic quantity.
Next consider a flock of birds that flies, more or less, like a single unit in a rather varied trajectory through the sky. The correlation between the motions of each bird in the flock could have a rather straightforward common cause explanation: there could be a leader bird that every other bird follows. But it could also be that there is no leader bird, that each bird reacts to certain factors in the environment (presence of predator birds, insects, etc.), while at the same time limiting how far it will stray from its neighboring birds in the flock (as if tied to them by springs that pull harder the further away it gets from the other birds). In the latter case there will be a correlation of motions for which there is no local common cause. There will be an equilibrium correlation that is maintained in the face of external perturbations. In equilibrium the flock acts more or less as a unit, and reacts as a unit, possibly in a very complicated way, in response to its environment. The explanation of the correlation among the motions of its parts is not a common cause explanation, but the fact that in equilibrium the myriad connections between its parts make it act as a unit.
In general we have learned to divide the world into systems which we regard as single units, since their parts normally (in equilibrium) behave in a highly correlated manner. We routinely do not regard correlations among the motions and properties of the parts of these systems as demanding a common cause explanation.
But when should one expect such independence? P. Horwich (Horwich 1987) has suggested that such independence follows from initial microscopic chaos. (See also Papineau 1985 for a similar suggestion.) His idea is that if all the determinants outside S are microscopic, then they will all be uncorrelated since all microscopic factors will be uncorrelated when they are chaotically distributed. However, even if one has microscopic chaos (i.e. a uniform probability distribution in certain parts of state-space in a canonical coordinatization of the state-space), it is still not the case that all microscopic factors are uncorrelated. Let me give a generic counterexample.
Suppose that quantity C is a common cause of quantities A and B, that the system in question is deterministic, and that the quantities a and b which, in addition to C, determine the values of A and B are microscopic and independently distributed for each value of C. Then A and B will be uncorrelated conditional upon each value of C. Now define quantities D = A + B and E = A − B. ("+" and "−" here represent ordinary addition and subtraction of the values of quantities.) Then, generically, D and E will be correlated conditional upon each value of C. To illustrate why this is so let me give a very simple example. Suppose that for a given value of C quantities A and B are independently distributed, that A has value 1 with probability 1/2 and value -1 with probability 1/2, and that B has value 1 with probability 1/2 and value -1 with probability 1/2. Then the possible values of D are -2, 0 and 2, with probabilities 1/4, 1/2 and 1/4 respectively. The possible values of E are also -2, 0 and 2, with probabilities 1/4, 1/2 and 1/4 respectively. But note, for instance, that if the value of D is -2, then the value of E must be 0. In general a non-zero value for D implies value 0 for E and a non-zero value for E implies value 0 for D. Thus, the values of D and E are strongly correlated for the given value of C. And it is not too hard to show that, generically, if quantities A and B are uncorrelated, then D and E are correlated. Now, since D and E are correlated conditional upon any value of C, it follows that C is not a prior common cause which screens off the correlation between D and E. And since the factors a and b which, in addition to C, determine the values of A and B, and hence those of D and E, can be microscopic and horrendously complex, there will be no screener off of the correlations between D and E other than some incredibly complex and inaccessible microscopic determinant.
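The arithmetic of the simple example can be verified by enumerating the four equiprobable outcomes for (A, B), conditional on the given value of C:

```python
from fractions import Fraction
from itertools import product

quarter = Fraction(1, 4)

# The four equiprobable outcomes for (A, B), given the value of C.
dist_D, dist_E, joint = {}, {}, {}
for a, b in product([-1, 1], repeat=2):
    d, e = a + b, a - b          # D = A + B, E = A - B
    dist_D[d] = dist_D.get(d, 0) + quarter
    dist_E[e] = dist_E.get(e, 0) + quarter
    joint[(d, e)] = joint.get((d, e), 0) + quarter

# D takes -2, 0, 2 with probabilities 1/4, 1/2, 1/4, and likewise E.
print(dist_D)
print(dist_E)

# Independence would require P(D=0, E=0) = P(D=0) * P(E=0) = 1/4,
# but the outcome (D, E) = (0, 0) never occurs: D and E are correlated.
print(joint.get((0, 0), 0))
```

The joint distribution puts all its weight on outcomes where at least one of D, E is zero, so a non-zero value of either forces the other to be 0, exactly as the text states.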
Thus common cause principles fail if one uses quantities D and E rather than quantities A and B to characterize the later state of the system.
One might try to save common cause principles by suggesting that in addition to C being a cause of D and of E, D is also a cause of E, or E is also a cause of D. (See Glymour and Spirtes 1994, pp. 277–278, for such a suggestion.) This would explain why D and E are still correlated conditional upon C. Nonetheless, this does not seem a plausible suggestion. In the first place, D and E are simultaneous. In the second place, the situation sketched is symmetric with respect to D and E, so which is supposed to cause which? It seems far more plausible to admit that common cause principles fail if one uses quantities D and E.
One might next try to defend common cause principles by suggesting that D and E are not really independent quantities, given that each is defined in terms of A and B, and that one should only expect common cause principles to be true of good, honest, independent quantities. Although this argument is along the right lines, as it stands it is too quick and simple. One cannot say that D and E are not independent because of the way they are defined in terms of A and B. For similarly A = ½(D+E) and B = ½(D-E), and unless there are reasons independent of such equations to claim that A and B are bona fide independent quantities while D and E are not, one is stuck. For now let us therefore conclude that an attempt to prove the common cause principle by assuming that all microscopic factors are uncorrelated rests on a false premise.
Nonetheless such arguments are pretty close to being correct: microscopic chaos does imply that a very large and useful class of microscopic conditions are independently distributed. For instance, assuming a uniform distribution of microscopic states in macroscopic cells, it follows that the microscopic states of two spatially separated regions will be independently distributed, given any macroscopic states in the two regions. Thus microscopic chaos and spatial separation are sufficient to provide independence of microscopic factors. This in fact covers a very large and useful class of cases. For almost all correlations that we are interested in are between factors of systems that are not exactly in the same location. Consider, for instance, an example due to Reichenbach.
Suppose that two actors almost always eat the same food. Every now and then the food will be bad. Let us assume that whether or not each of the actors becomes sick depends on the quality of the food that they consume and on other local factors (properties of their body etc.) at the time of consumption (and perhaps also later), which previously have developed chaotically. The values of these local factors for one of the actors will then be independent of the values of these local factors for the other actor. It then follows that there will be a correlation between their states of health, and that this correlation will disappear conditional upon the quality of the food. In general when one has a process that physically splits into two separate processes which remain separated in space, then all the microscopic influences on those two processes will be independent from then on. Indeed there are very many cases in which two processes, whether spatially separated or not, will have a point after which microscopic influences on the processes are independent given microscopic chaos. In such cases common cause principles will be valid as long as one chooses as one's quantities the (relevant aspects of the) macroscopic states of the processes at the time of such separations (rather than the macroscopic states significantly prior to such separations) and some aspects of macroscopic states somewhere along each separate process (rather than some amalgam of quantities of the separate processes).
One should also not be interested in common cause principles which allow any conditions, no matter how microscopic, scattered and unnatural, to count as common causes. For, as we have seen, this would trivialize such principles in deterministic worlds, and would hide from view the remarkable fact that when one has a correlation among fairly natural localized quantities that are not related as cause and effect, almost always one can find a fairly natural, localized prior common cause that screens off the correlation. The explanation of this remarkable fact, which was suggested in the previous section, is that Reichenbach's common cause principle, and the causal Markov condition, must hold if the determinants, other than the causes, are independently distributed for each value of the causes. The fundamental assumptions of statistical mechanics imply that this independence will hold in a large class of cases given a judicious choice of quantities characterizing the causes and effects. In view of this, it is indeed more puzzling why common cause principles fail in cases like those described above, such as the coordinated flights of certain flocks of birds, equilibrium correlations, order arising out of chaos, etc. The answer is that in such cases the interactions between the parts of these systems are so complicated, and there are so many causes acting on the systems, that the only way one can get independence of further determinants is by specifying so many causes as to make this a practical impossibility. This, in any case, would amount to allowing just about any scattered and unnatural set of factors to count as common causes, thereby trivializing common cause principles. Thus, rather than do that, we regard such systems as single unified systems, and do not demand a common cause explanation for the correlated motions and properties of their parts. 
A fairly intuitive notion of what counts as a single system, after all, is a system that behaves in a unified manner, i.e. a system whose parts have a very strong correlation in their motions and/or other properties, no matter how complicated the set of influences acting on them. For instance a rigid physical object has parts whose motions are all correlated, and a biological organism has parts whose motions and properties are strongly correlated, no matter how complicated the influences acting on it. These systems therefore are naturally and usefully treated as single systems for almost any purpose. The core truth of common cause principles thus in part relies on our choice as to how to partition the world into unified and independent objects and quantities, and in part on the objective, temporally asymmetric, principles that lie at the foundation of statistical mechanics.
Frank Arntzenius <arntzeni@rci.rutgers.edu>