This is a file in the archives of the Stanford Encyclopedia of Philosophy. 
There exist in the literature several, closely related, common cause principles. In the next three subsections I describe three such common cause principles.
It seems that a correlation between events A and B indicates either that A causes B, or that B causes A, or that A and B have a common cause. It also seems that causes always occur before their effects and, thus, that common causes always occur before the correlated events. Reichenbach was the first to formalize this idea rather precisely. He suggested that when Pr(A&B) > Pr(A) × Pr(B) for simultaneous events A and B, there exists an earlier common cause C of A and B, such that Pr(A/C) > Pr(A/~C), Pr(B/C) > Pr(B/~C), Pr(A&B/C) = Pr(A/C) × Pr(B/C) and Pr(A&B/~C) = Pr(A/~C) × Pr(B/~C). (See Reichenbach 1956, pp. 158–159.) C is said to ‘screen off’ the correlation between A and B when A and B are uncorrelated conditional upon C. Thus Reichenbach's principle can also be formulated as follows: simultaneous correlated events have a prior common cause that screens off the correlation.^{[1]} ^{[2]}
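The four conditions above can be checked numerically. The following is a minimal sketch, with purely illustrative probabilities (they are assumptions, not values from Reichenbach), showing that when C screens off A and B and raises the probability of each, the unconditional correlation Pr(A&B) − Pr(A) × Pr(B) is positive. It also checks that this difference equals Pr(C) × Pr(~C) × [Pr(A/C) − Pr(A/~C)] × [Pr(B/C) − Pr(B/~C)], which follows from the two screening-off equalities.

```python
from fractions import Fraction as F


def joint_from_common_cause(p_c, p_a_c, p_a_nc, p_b_c, p_b_nc):
    """Return Pr(A), Pr(B), Pr(A&B) when A and B are independent
    conditional upon C and conditional upon ~C."""
    p_nc = 1 - p_c
    p_a = p_c * p_a_c + p_nc * p_a_nc
    p_b = p_c * p_b_c + p_nc * p_b_nc
    p_ab = p_c * p_a_c * p_b_c + p_nc * p_a_nc * p_b_nc
    return p_a, p_b, p_ab


# Illustrative (assumed) values: Pr(C), Pr(A/C), Pr(A/~C), Pr(B/C), Pr(B/~C).
p_c, p_a_c, p_a_nc, p_b_c, p_b_nc = F(1, 2), F(3, 4), F(1, 4), F(2, 3), F(1, 3)

p_a, p_b, p_ab = joint_from_common_cause(p_c, p_a_c, p_a_nc, p_b_c, p_b_nc)

# Unconditional correlation between A and B.
correlation = p_ab - p_a * p_b

# The same quantity expressed through the common cause.
via_common_cause = p_c * (1 - p_c) * (p_a_c - p_a_nc) * (p_b_c - p_b_nc)

print(correlation, via_common_cause)
```

Exact fractions are used so that the comparison is not affected by floating-point rounding.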
Reichenbach's common cause principle needs to be modified. Consider, for instance, the following example. Harry normally takes the 8 a.m. train from New York to Washington. But he does not like full trains, so if the 8 a.m. train is full he sometimes takes the next train. He also likes trains that have diner cars, so if the 8 a.m. train does not have a diner car he sometimes takes the next train. If the 8 a.m. train is both full and has no diner car, he is very likely to take the next train. Johnny, an unrelated commuter, also normally takes the 8 a.m. train from New York to Washington. Johnny, it so happens, also does not like full trains, and he also likes diner cars. Whether or not Harry and Johnny take the 8 a.m. train will therefore be correlated. But, since the probability of Harry and Johnny taking the 8 a.m. train depends on the occurrence of two distinct events (the train being full, the train having a diner car) there is no single event C, such that conditional upon C and conditional upon ~C we have independence. Thus Reichenbach's common cause principle as stated above is violated. Yet this example clearly does not violate the spirit of Reichenbach's common cause principle, for there is a partition into four possibilities such that conditional upon each of these four possibilities the correlation disappears.
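The train example can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that the train is full with probability 1/2, has a diner car with probability 1/2, that these two facts are independent, and that Harry and Johnny each decide independently of one another given those two facts; all the numerical probabilities are hypothetical. It then searches every event that can be built out of the two facts (every non-trivial set of the four cells) for a single C that screens off the correlation both given C and given ~C.

```python
from fractions import Fraction as F
from itertools import combinations

# The four cells of the partition: (train is full, train has a diner car).
CELLS = [(full, diner) for full in (False, True) for diner in (False, True)]
PR_CELL = {cell: F(1, 4) for cell in CELLS}  # assumed: independent, 1/2 each

# Hypothetical chances of taking the 8 a.m. train in each cell.
HARRY = {(False, True): F(9, 10), (True, True): F(6, 10),
         (False, False): F(6, 10), (True, False): F(1, 10)}
JOHNNY = {(False, True): F(8, 10), (True, True): F(5, 10),
          (False, False): F(6, 10), (True, False): F(2, 10)}


def correlation_given(cells):
    """Pr(H&J) - Pr(H) * Pr(J), conditional upon being in one of `cells`.
    Harry and Johnny are assumed independent within each single cell."""
    weight = sum(PR_CELL[c] for c in cells)
    pr_h = sum(PR_CELL[c] * HARRY[c] for c in cells) / weight
    pr_j = sum(PR_CELL[c] * JOHNNY[c] for c in cells) / weight
    pr_hj = sum(PR_CELL[c] * HARRY[c] * JOHNNY[c] for c in cells) / weight
    return pr_hj - pr_h * pr_j


unconditional = correlation_given(CELLS)
per_cell = [correlation_given([c]) for c in CELLS]

# Look for a single event C (a non-trivial set of cells) such that the
# correlation vanishes both given C and given ~C.
binary_screeners = []
for size in range(1, len(CELLS)):
    for subset in combinations(CELLS, size):
        rest = [c for c in CELLS if c not in subset]
        if correlation_given(list(subset)) == 0 and correlation_given(rest) == 0:
            binary_screeners.append(subset)

print(unconditional, per_cell, binary_screeners)
```

With these assumed numbers the correlation is positive unconditionally, vanishes in each of the four cells, and no single event built from the two facts screens it off. The search covers only events definable from fullness and the diner car, so it illustrates the point of the example rather than proving it for arbitrary events.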
More generally, we would like to have a common cause principle for cases in which the common causes and the effects are sets of quantities with continuous or discrete sets of values, rather than single events that occur or do not occur. A natural way to modify Reichenbach's common cause principle in order to deal with such types of cases is as follows. If simultaneous values of quantities A and B are correlated, then there are common causes C_{1}, C_{2},…, C_{n}, such that conditional upon any combination of values of these quantities at an earlier time, the values of A and B are probabilistically independent. (For a fuller discussion of modifications like this, including cases in which there are correlations between more than two quantities, see Uffink (1999)). I will continue to call this generalization ‘Reichenbach's common cause principle’, since, in spirit, it is very close to the principle that Reichenbach originally stated.
Now let me turn to two principles, the ‘causal Markov condition’ and the ‘law of conditional independence’, that are closely related to Reichenbach's common cause principle.
Penrose and Percival then say that one can prevent any influence from acting on both A and B by fixing the state c throughout such a region C. They therefore claim that states a in A and b in B will be uncorrelated conditional upon any state c in C. To be precise, they suggest the ‘law of conditional independence’: “If A and B are two disjoint 4-regions, and C is any 4-region which divides the union of the pasts of A and B into two parts, one containing A and the other containing B, then A and B are conditionally independent given c. That is, Pr(a&b/c) = Pr(a/c) × Pr(b/c), for all a,b.” (Penrose and Percival 1962, p. 611).
This is a time asymmetric principle which is clearly closely related to Reichenbach's common cause principle and the causal Markov condition. However one should not take states c in region C to be, or include, the common causes of the (unconditional) correlations that might exist between the states in regions A and B. It is merely a region such that influences from a past common source on both A and B must pass through it, assuming that such influences do not travel at speeds exceeding the speed of light. Note also that the region must stretch to the beginning of time. Thus, one cannot derive anything like Reichenbach's common cause principle or the causal Markov condition from the law of conditional independence, and one therefore would not inherit the richness of applications of these principles, especially the causal Markov condition, even if one were to accept the law of conditional independence.
There are, unfortunately, many counterexamples to the above common cause principles. The next five subsections describe some of the more significant counterexamples.
More generally, suppose that there is a quantity Q, which is a function f(q_{1},…,q_{n}) of quantities q_{i}. Suppose that some of the quantities q_{i} develop indeterministically, but that quantity Q is conserved in such developments. There will then be correlations among the values of the quantities q_{i} which have no prior screener off. The only way that common cause principles can hold when there are conserved global quantities is when the development of each of the quantities that jointly determine the value of the global quantity is deterministic. And then it holds in the trivial sense that the prior determinants make everything else irrelevant. The results of quantum mechanical measurements are not determined by the quantum mechanical state prior to those measurements. And often there are conserved quantities during such a measurement. For instance, the total spin of 2 particles in a quantum ‘singlet’ state is 0. This quantity is conserved when one measures the spins of each of those 2 particles in the same direction: one will always find opposite spins during such a measurement, i.e., the spins that one finds will be perfectly anticorrelated. However what spins one will find is not determined by the prior quantum state. Thus the prior quantum state does not screen off the anticorrelations. There is no quantum common cause of such correlations.
One might think that this violation of common cause principles is a reason to believe that there must then be more to the prior state of the particles than the quantum state; there must be ‘hidden variables’ that screen off such correlations. (And we have seen above that such hidden variables must determine the results of the measurements if they are to screen off the correlations.) However, one can show, given some extremely plausible assumptions, that there cannot be any such hidden variables. There do exist hidden variable theories which account for such correlations in terms of instantaneous non-local dependencies. Since such dependencies are instantaneous (in some frame of reference) they violate Reichenbach's common cause principle, which demands a prior common cause which screens off the correlations. (For more detail, see, for instance, van Fraassen 1982, Elby 1992, Grasshoff, Portmann & Wuethrich (2003) [in the Other Internet Resources section], and the entries on Bell's theorem and on Bohmian mechanics in this encyclopedia.)
One can also show that such correlations without a possible prior screener off are not confined to very special states, but occur generically in quantum mechanics and quantum field theory. (See Redhead (1995), Clifton, Feldman, Halvorson, Redhead & Wilce (1998), and Clifton & Ruetsche (1999).)
Maxwell's equations not only govern the development of electromagnetic fields, they also imply simultaneous (in all frames of reference) relations between charge distributions and electromagnetic fields. In particular they imply that the electric flux through a surface which encloses some region of space must equal the total charge in that region. Thus electromagnetism implies that there is a strict and simultaneous correlation between the state of the field on such a surface and the charge distribution in the region contained by that surface. And this correlation must hold even on the spacelike boundary at the beginning of the universe (if there be such). This violates all three common cause principles. (For more detail and subtlety, see Earman 1995, chapter 5).
More generally, any coexistence law, such as Newtonian gravitation, or Pauli's exclusion principle, will imply correlations which have no prior common cause conditionally upon which they disappear. Therefore, contrary to what one might hope, there are relativistic coexistence laws which violate common cause principles.
There is a way of understanding common cause principles such that this example is not a counterexample to them. Suppose that in nature there are transition chances from values of quantities at earlier times to values of quantities at later times. (For more on this idea, see Arntzenius 1997.) One could then state a common cause principle as follows: conditional upon the values of all the quantities upon which the transition chances to quantities X and Y depend, X and Y will be probabilistically independent. In Sober's example, there are transition chances from earlier costs of bread to later costs of bread, and there are transition chances from earlier water levels to later water levels. Conditional upon earlier costs of bread, later costs of bread are independent of later water levels. A common cause principle formulated as above thus holds in this case. Of course, if one looks at a collection of (simultaneous) data for water levels and bread prices one will see a correlation due to similar laws of development (similar transition chances). But a common cause principle, understood in terms of transition chances, does not imply that there should be a common cause of this correlation. The data (which include these correlations) should be understood as evidence for what the transition chances in nature are, and it is those transition chances that could be demanded to satisfy a common cause principle.
Suppose a particular type of object has 4 possible states: S_{1}, S_{2}, S_{3} and S_{4}. Suppose that if such an object is in state S_{i} at time t, and is not interfered with (isolated), then at time t+1 it has probability ½ of being in the same state S_{i}, and probability ½ of being in state S_{i+1}, where we define 4 + 1 = 1 (i.e., ‘+’ represents addition mod 4). Now suppose we put many such objects in state S_{1} at time t = 0. Then at time t = 1 approximately half of the systems will be in state S_{1}, and approximately half will be in state S_{2}. Let us define property A to be the property that obtains precisely when the system is either in state S_{2} or in state S_{3}, and let us define property B to be the property that obtains precisely when the system is either in state S_{2} or in state S_{4}. At time t = 1 half of the systems are in state S_{1}, and therefore have neither property A nor property B, and the other half are in state S_{2}, so that they have both property A and property B. Thus A and B are perfectly correlated at t = 1. Since these correlations remain conditional on the full prior state (S_{1}), there can be no quantity such that conditional upon a prior value of this quantity A and B are uncorrelated. Thus all three principles fail in this case. One can generalize this example to all generic state-space processes with indeterministic laws of development, namely Markov processes. At least, one can do this if one allows arbitrary partitions of state-space to count as quantities. (In particular, therefore, Markov processes generically do not satisfy the causal Markov condition. The similarity of names is thus a bit misleading. See Arntzenius 1993 for more detail.)
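The four-state example can be worked through exactly. The following minimal sketch encodes the transition rule stated above (stay in S_{i} with probability ½, move to S_{i+1} with probability ½, with addition mod 4), starts all objects in S_{1}, and computes the probabilities of properties A and B at t = 1. The function and variable names are illustrative choices, not part of the original example.

```python
from fractions import Fraction as F

STATES = (1, 2, 3, 4)


def step(dist):
    """Advance a distribution over states by one time step:
    probability 1/2 of staying in S_i, 1/2 of moving to S_(i+1), mod 4."""
    new = {s: F(0) for s in STATES}
    for s, p in dist.items():
        new[s] += p / 2
        new[s % 4 + 1] += p / 2
    return new


def pr(dist, event):
    """Probability that the state lies in the set `event`."""
    return sum(dist[s] for s in event)


A = {2, 3}  # property A: state is S_2 or S_3
B = {2, 4}  # property B: state is S_2 or S_4

at_t0 = {1: F(1)}   # every object starts in S_1
at_t1 = step(at_t0)

p_a = pr(at_t1, A)
p_b = pr(at_t1, B)
p_ab = pr(at_t1, A & B)
correlation = p_ab - p_a * p_b

print(p_a, p_b, p_ab, correlation)
```

Because the calculation already conditions on the complete prior state S_{1}, the non-zero difference between Pr(A&B) and Pr(A) × Pr(B) cannot be removed by conditioning on any quantity defined on the prior state.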
Suppose that the state of the world (or a system of interest) at any time determines the state of the world (that system) at any other time. It then follows that for any quantity X (of that system) at any time t, there will be at any other time t′, in particular any later time t′, a quantity X′ (to be precise: a partition of state-space) such that the value of X′ at t′ uniquely determines the value of X at t. Conditional upon the value of X′ at t′, the value of X at t will be independent of any value of any quantity at any time. (For more detail see Arntzenius 1993.) Reichenbach's common cause principle thus fails in deterministic contexts. The problem is not that there will not always be earlier events conditional upon which the correlations disappear. Conditional upon the deterministic causes all correlations disappear. The problem is that there will also always be later events that determine whether the earlier correlated events occur. Reichenbach's common cause principle thus fails in so far as it claims that typically there are no later events conditional upon which earlier correlated simultaneous events are uncorrelated.
This does not imply a violation of the causal Markov condition. However, in order to be able to infer causal relations from statistical ones, Spirtes, Glymour and Scheines in effect assume that whenever (unconditionally correlated) quantities Q_{i} and Q_{j} are independent conditional upon some quantity Q_{k}, then Q_{k} is a cause of either Q_{i} or Q_{j}. To be more precise they assume the ‘Faithfulness condition’, which states that there are no probabilistic independencies in nature other than the ones entailed by the causal Markov condition. Since the values of such quantities X′ at later times t′ surely are not direct causes of X at t, Faithfulness is violated, and with it goes our ability to infer causal relations from probabilistic relations, and much of the practical value of the causal Markov condition.^{[5]}
Now, of course, a quantity like X′ whose values at a later time t′ are deterministically related to the values of X at t, will in general correspond to a non-natural, non-local, and not directly observable quantity. So one might wish to claim that the existence of such a later quantity does not violate the spirit of common cause principles. Relatedly, note that in the deterministic case, for correlated events (or quantities) A and B one can always find earlier events (or quantities) C and D which occur iff A and B, respectively, occur. Thus the conjunction of C and D will screen off the correlation between A and B. Again, such a conjunction is not anything one would naturally call a common cause of the later correlated events, and therefore is not the kind of event that Reichenbach was intent on capturing with his common cause principle. Both of these cases suggest that the common cause principle should be limited to some natural subclass of quantities. Let's examine that idea more closely.
The following three subsections will examine some ways in which one could try to rescue common cause principles from the above counterexamples.
Let us now consider another type of counterexample to the idea that a common cause principle can hold of macroscopic quantities, namely cases in which order arises out of chaos. When one lowers the temperature of certain materials, the spins of all the atoms of the material, which originally are not aligned, will line up in the same direction. Pick any two atoms in this structure. Their spins will be correlated. However, it is not the case that the one spin orientation caused the other spin orientation. Nor is there a simple or macroscopic common cause of each orientation of each spin. The lowering of the temperature determines that the orientations will be correlated, but not the direction in which they will line up. Indeed, typically, what determines the direction of alignment, in the absence of an external magnetic field, is a very complicated fact about the total microscopic prior state of the material and the microscopic influences upon the material. Thus, other than virtually the complete microscopic state of the material and its environment there is no prior screener off of the correlation between the spin alignments.
In general when chaotic developments result in ordered states there will be final correlations which have no prior screener off, other than virtually the full microscopic state of the system and its environment. (For more examples, see Prigogine 1980). In such cases the only screener off will be a horrendously complex microscopic quantity.
Next consider a flock of birds that flies, more or less, like a single unit in a rather varied trajectory through the sky. The correlation between the motions of each bird in the flock could have a rather straightforward common cause explanation: there could be a leader bird that every other bird follows. But it could also be that there is no leader bird, that each bird reacts to certain factors in the environment (presence of predator birds, insects, etc.), while at the same time constraining the distance that it will remove itself from its neighboring birds in the flock (as if tied to them by springs that pull harder the further away it gets from the other birds). In the latter case there will be a correlation of motions for which there is no local common cause. There will be an ‘equilibrium’ correlation that is maintained in the face of external perturbations. In ‘equilibrium’ the flock acts more or less as a unit, and reacts as a unit, possibly in a very complicated way, in response to its environment. The explanation of the correlation among the motions of its parts is not a common cause explanation, but the fact that in ‘equilibrium’ the myriad connections between its parts make it act as a unit.
In general we have learned to divide the world into systems which we regard as single units, since their parts normally (in ‘equilibrium’) behave in a highly correlated manner. We routinely do not regard correlations among the motions and properties of the parts of these systems as demanding a common cause explanation.
Many authors have noted that there are circumstances in which the causal Markov condition, and the common cause principle that it implies, provably hold. Roughly speaking, this is the case when the world is deterministic, and the factors A and B which, in addition to the common cause C, determine whether effects D and E occur, are uncorrelated. Let me be more general and precise. Consider a deterministic world and a set of quantities S with certain causal relations holding between them. For any quantity Q, let us call the factors not in S which, when combined with the direct causes of Q that are in S, determine whether Q occurs, the ‘determinants of Q outside S’. Suppose now that the determinants outside S are all independent, i.e., that the joint distribution of all determinants outside S is a product of distributions for each such determinant outside S. One can then prove that the causal Markov condition holds in S.^{[6]}
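A toy deterministic model may help to show why independence of the outside determinants suffices. In the sketch below, the rule that D occurs exactly when both C and A occur, and E occurs exactly when both C and B occur, is a hypothetical choice made only for illustration, as are the probabilities assigned to C, A and B. The three are taken to be independently distributed, in line with the assumption stated above.

```python
from fractions import Fraction as F
from itertools import product

# Assumed probabilities that each factor occurs; C, A and B are independent.
PR_TRUE = {"C": F(1, 2), "A": F(1, 2), "B": F(1, 3)}


def weight(name, value):
    """Probability that factor `name` takes the truth value `value`."""
    return PR_TRUE[name] if value else 1 - PR_TRUE[name]


def pr(event, given=lambda c, d, e: True):
    """Pr(event / given), by enumerating all values of C, A and B.
    Effects are fixed deterministically (hypothetical rule):
    D = C and A, E = C and B."""
    num = den = F(0)
    for c, a, b in product((False, True), repeat=3):
        w = weight("C", c) * weight("A", a) * weight("B", b)
        d, e = c and a, c and b
        if given(c, d, e):
            den += w
            if event(c, d, e):
                num += w
    return num / den


def correlation(given=lambda c, d, e: True):
    """Pr(D&E) - Pr(D) * Pr(E), conditional upon `given`."""
    p_de = pr(lambda c, d, e: d and e, given)
    p_d = pr(lambda c, d, e: d, given)
    p_e = pr(lambda c, d, e: e, given)
    return p_de - p_d * p_e


unconditional = correlation()
given_c = correlation(lambda c, d, e: c)
given_not_c = correlation(lambda c, d, e: not c)

print(unconditional, given_c, given_not_c)
```

In this toy model D and E are correlated unconditionally, and the correlation vanishes conditional upon C and conditional upon ~C, as the result cited above leads one to expect when the outside determinants A and B are independent.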
But when should one expect such independence? P. Horwich (Horwich 1987) has suggested that such independence follows from initial microscopic chaos. (See also Papineau 1985 for a similar suggestion.) His idea is that if all the determinants outside S are microscopic, then they will all be uncorrelated since all microscopic factors will be uncorrelated when they are chaotically distributed. However, even if one has microscopic chaos (i.e., a uniform probability distribution in certain parts of state-space in a canonical coordinatization of the state-space), it is still not the case that all microscopic factors are uncorrelated. Let me give a generic counterexample.
Suppose that quantity C is a common cause of quantities A and B, that the system in question is deterministic, and that the quantities a and b which, in addition to C, determine the values of A and B are microscopic and independently distributed for each value of C. Then A and B will be uncorrelated conditional upon each value of C. Now define quantities D = A+B and E = A−B. (“+” and “−” here represent ordinary addition and subtraction of the values of quantities.) Then, generically, D and E will be correlated conditional upon each value of C. To illustrate why this is so let me give a very simple example. Suppose that for a given value of C quantities A and B are independently distributed, that A has value 1 with probability 1/2 and value −1 with probability 1/2, and that B has value 1 with probability 1/2 and value −1 with probability 1/2. Then the possible values of D are −2, 0 and 2, with probabilities 1/4, 1/2 and 1/4 respectively. The possible values of E are also −2, 0 and 2, with probabilities 1/4, 1/2 and 1/4 respectively. But note, for instance, that if the value of D is 2, then the value of E must be 0. In general a nonzero value for D implies value 0 for E and a nonzero value for E implies value 0 for D. Thus, the values of D and E are strongly correlated for the given value of C. And it is not too hard to show that, generically, if quantities A and B are uncorrelated, then D and E are correlated. Now, since D and E are correlated conditional upon any value of C, it follows that C is not a prior common cause which screens off the correlation between D and E. And since the factors a and b which, in addition to C, determine the values of A and B, and hence those of D and E, can be microscopic and horrendously complex, there will be no screener off of the correlations between D and E other than some incredibly complex and inaccessible microscopic determinant.
Thus common cause principles fail if one uses quantities D and E rather than quantities A and B to characterize the later state of the system.
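The simple example with values 1 and −1 can be enumerated directly. The sketch below builds the joint distribution of D = A+B and E = A−B for a fixed value of C, with A and B independent and each equally likely to be 1 or −1, and compares joint probabilities with products of marginals. The variable names are illustrative.

```python
from fractions import Fraction as F
from itertools import product

# Joint distribution of (D, E) for a fixed value of C.
# A and B are independent; each is 1 or -1 with probability 1/2.
joint = {}
for a, b in product((1, -1), repeat=2):
    d, e = a + b, a - b
    joint[(d, e)] = joint.get((d, e), F(0)) + F(1, 4)


def marginal(index, value):
    """Pr(D = value) for index 0, Pr(E = value) for index 1."""
    return sum(p for key, p in joint.items() if key[index] == value)


pr_d2 = marginal(0, 2)
pr_e0 = marginal(1, 0)
pr_e2 = marginal(1, 2)
pr_d2_e0 = joint.get((2, 0), F(0))
pr_d2_e2 = joint.get((2, 2), F(0))

# Linear covariance of D and E, for comparison.
mean_d = sum(p * d for (d, e), p in joint.items())
mean_e = sum(p * e for (d, e), p in joint.items())
covariance = sum(p * d * e for (d, e), p in joint.items()) - mean_d * mean_e

print(pr_d2_e0, pr_d2 * pr_e0)   # joint versus product of marginals
print(pr_d2_e2, pr_d2 * pr_e2)
print(covariance)
```

The joint probabilities differ from the products of the marginals, so D and E are probabilistically dependent given C, which is the sense of ‘correlated’ at issue here. In this symmetric illustration the linear covariance of D and E happens to be zero, so the dependence shows up in the failure of factorization rather than in the covariance.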
One might try to save common cause principles by suggesting that in addition to C being a cause of D and of E, D is also a cause of E, or E is also a cause of D. (See Glymour and Spirtes 1994, pp. 277–278 for such a suggestion). This would explain why D and E are still correlated conditional upon C. Nonetheless, this does not seem a plausible suggestion. In the first place, D and E are simultaneous. In the second place, the situation sketched is symmetric with respect to D and E, so which is supposed to cause which? It seems far more plausible to admit that common cause principles fail if one uses quantities D and E.
One might next try to defend common cause principles by suggesting that D and E are not really independent quantities, given that each is defined in terms of A and B, and that one should only expect common cause principles to be true of good, honest, independent quantities. Although this argument is along the right lines, as it stands it is too quick and simple. One cannot say that D and E are not independent because of the way they are defined in terms of A and B. For similarly A = ½(D+E) and B = ½(D−E), and unless there are reasons independent of such equations to claim that A and B are bona fide independent quantities while D and E are not, one is stuck. For now let us therefore conclude that an attempt to prove the common cause principle by assuming that all microscopic factors are uncorrelated rests on a false premise.
Nonetheless such arguments are pretty close to being correct: microscopic chaos does imply that a very large and useful class of microscopic conditions are independently distributed. For instance, assuming a uniform distribution of microscopic states in macroscopic cells, it follows that the microscopic states of two spatially separated regions will be independently distributed, given any macroscopic states in the two regions. Thus microscopic chaos and spatial separation are sufficient to provide independence of microscopic factors. This in fact covers a very large and useful class of cases. For almost all correlations that we are interested in are between factors of systems that are not exactly in the same location. Consider, for instance, an example due to Reichenbach.
Suppose that two actors almost always eat the same food. Every now and then the food will be bad. Let us assume that whether or not each of the actors becomes sick depends on the quality of the food that they consume and on other local factors (properties of their body etc.) at the time of consumption (and perhaps also later), which previously have developed chaotically. The values of these local factors for one of the actors will then be independent of the values of these local factors for the other actor. It then follows that there will be a correlation between their states of health, and that this correlation will disappear conditional upon the quality of the food. In general when one has a process that physically splits into two separate processes which remain separated in space, then all the ‘microscopic’ influences on those two processes will be independent from then on. Indeed there are very many cases in which two processes, whether spatially separated or not, will have a point after which microscopic influences on the processes are independent given microscopic chaos. In such cases common cause principles will be valid as long as one chooses as one's quantities the (relevant aspects of the) macroscopic states of the processes at the time of such separations (rather than the macroscopic states significantly prior to such separations) and some aspects of macroscopic states somewhere along each separate process (rather than some amalgam of quantities of the separate processes).
One should also not be interested in common cause principles which allow any conditions, no matter how microscopic, scattered and unnatural, to count as common causes. For, as we have seen, this would trivialize such principles in deterministic worlds, and would hide from view the remarkable fact that when one has a correlation among fairly natural localized quantities that are not related as cause and effect, almost always one can find a fairly natural, localized prior common cause that screens off the correlation. The explanation of this remarkable fact, which was suggested in the previous section, is that Reichenbach's common cause principle, and the causal Markov condition, must hold if the determinants, other than the causes, are independently distributed for each value of the causes. The fundamental assumptions of statistical mechanics imply that this independence will hold in a large class of cases given a judicious choice of quantities characterizing the causes and effects. In view of this, it is indeed more puzzling why common cause principles fail in cases like those described above, such as the coordinated flights of certain flocks of birds, equilibrium correlations, order arising out of chaos, etc. The answer is that in such cases the interactions between the parts of these systems are so complicated, and there are so many causes acting on the systems, that the only way one can get independence of further determinants is by specifying so many causes as to make this a practical impossibility. This, in any case, would amount to allowing just about any scattered and unnatural set of factors to count as common causes, thereby trivializing common cause principles. Thus, rather than do that, we regard such systems as single unified systems, and do not demand a common cause explanation for the correlated motions and properties of their parts. 
A fairly intuitive notion of what counts as a single system, after all, is a system that behaves in a unified manner, i.e., a system whose parts have a very strong correlation in their motions and/or other properties, no matter how complicated the set of influences acting on them. For instance a rigid physical object has parts whose motions are all correlated, and a biological organism has parts whose motions and properties are strongly correlated, no matter how complicated the influences acting on it. These systems therefore are naturally and usefully treated as single systems for almost any purpose. The core truth of common cause principles thus in part relies on our choice as to how to partition the world into unified and independent objects and quantities, and in part on the objective, temporally asymmetric, principles that lie at the foundation of statistical mechanics.
Frank Arntzenius arntzeni@rci.rutgers.edu 