Stanford Encyclopedia of Philosophy
Probabilistic Causation
"Probabilistic Causation" designates a group of philosophical theories
that aim to characterize the relationship between cause and
effect using the tools of probability
theory. A primary motivation for the development of such theories is
the desire for a theory of causation that does not presuppose physical
determinism. The central idea behind these
theories is that causes raise the probabilities of their effects, all
else being equal. As we shall see, a great deal of the work that has
been done in this area has been concerned with making the ceteris
paribus clause more precise. Issues within, and objections to,
probabilistic theories of causation will also be discussed.
According to David Hume, causes are sufficient
conditions for their effects: "We may define a cause to be an
object, followed by another, and where all the objects similar to the
first, are followed by objects similar to the second." (1748,
section VII.) Later writers refined Hume's theory, but still
characterized the causal relation in terms of necessary and sufficient
conditions. One of the best known approaches is Mackie's theory of
inus conditions. An inus condition for some effect is an
insufficient but non-redundant part of an unnecessary but sufficient
condition. Suppose, for example, that a lit match causes a forest
fire. The lighting of the match, by itself, is not sufficient; many
matches are lit without ensuing forest fires. The lit match is,
however, a part of some constellation of conditions that are jointly
sufficient for the fire. Moreover, given that this set of conditions
occurred, rather than some other set sufficient for fire, the lighting
of the match was necessary: without it, the fire would not have
occurred.
The necessity/sufficiency approach makes causation incompatible with
indeterminism: if an event is not determined to occur, then no event
can be a part of a sufficient condition for that event. (An analogous
point may be made about necessity.) The recent success of
quantum mechanics -- and to a lesser extent, other
theories employing probability -- has shaken our faith in
determinism. Thus it has struck many philosophers
as desirable to develop a theory of causation that does not presuppose
determinism. The central idea behind probabilistic theories of
causation is that causes raise the probability
of their effects; an effect may still occur in the
absence of a cause or fail to occur in its presence.
Suggested Readings: Hume (1748), especially section VII;
Mackie (1974), especially chapter 3.
Many philosophers find the idea of indeterministic causation
counterintuitive. Indeed, the word "causality" is sometimes used as a
synonym for determinism. Nonetheless, a strong case for
indeterministic causation can be made by considering the epistemic
warrant for causal claims. There is now very strong empirical
evidence that smoking causes lung cancer. Yet the question of whether
there is a deterministic relationship between smoking and lung cancer
is wide open. The formation of cancer cells depends upon mutation,
which is a strong candidate for being an indeterministic process.
Moreover, whether an individual smoker develops lung cancer or not
depends upon a host of additional factors, such as whether or not she
is hit by a bus before cancer cells begin to form. Thus the price of
preserving the intuition that causation presupposes determinism is
agnosticism about even our best supported causal claims.
Suggested Readings: Suppes (1970) makes the case for a
probabilistic theory of causation in his introduction. Humphreys
(1989) contains a sensitive treatment of issues involving
indeterminism and causation; see especially sections 10 and 11.
The central idea that causes raise the probability
of their effects can be expressed formally using the apparatus of
conditional probability. Let A, B, C,... represent entities that
potentially stand in causal relations. Depending upon the account,
these may be particular events, such as the
assassination of Archduke Ferdinand, or event types, such as exposure
to ultraviolet radiation. We will discuss this issue at greater
length in section 4.3 below. For now we will adopt
the generic word "factor" to describe the relevant entities. Let P be
a probability function, satisfying the normal rules of the
probability calculus, such that P(A) represents the
empirical probability that factor A occurs or is instantiated (and
likewise for the other factors). The issue of how empirical
probability is to be interpreted will not be addressed here. (See the
entry under probability). The probability of B,
given A, is represented as a conditional probability:
P(B|A) = P(A & B)/P(A).
One natural way of understanding the idea that A raises the
probability of B is that P(B|A) > P(B|not-A). Thus a first attempt at
a probabilistic theory of causation would be:
PR: A causes B if and only if P(B|A) > P(B|not-A).
This formulation is labeled PR for "Probability-Raising".
There are two central problems with this theory. The first is that
probability-raising is symmetric: if P(B|A) > P(B|not-A), then P(A|B)
> P(A|not-B). The causal relation, however, is asymmetric: if
A causes B, then typically B does not cause A. The problem of causal
asymmetry arises for virtually every theory of causation, and
probabilistic theories of causation are no exception.
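The symmetry claim can be checked in a few lines. The sketch below assumes that 0 < P(A) < 1 and 0 < P(B) < 1, so that every conditional probability involved is defined; it writes \neg A for not-A and A \wedge B for A & B.

```latex
\begin{align*}
P(B) &= P(B \mid A)\,P(A) + P(B \mid \neg A)\,\bigl(1 - P(A)\bigr) \\
P(B \mid A) - P(B) &= \bigl(1 - P(A)\bigr)\,\bigl[P(B \mid A) - P(B \mid \neg A)\bigr] \\
P(B \mid A) > P(B \mid \neg A) &\iff P(B \mid A) > P(B) \iff P(A \wedge B) > P(A)\,P(B)
\end{align*}
```

The final condition treats A and B alike, so running the same steps with the roles of A and B exchanged yields P(A|B) > P(A|not-B).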
The second problem concerns spurious correlations. If, for
example, A and B are both caused by some third event, say C, then it
may be that P(B|A) > P(B|not-A) even though A does not cause B. For
example, let A be an individual's having yellow-stained fingers, and B
that individual's having lung cancer. Then we would expect that
P(B|A) > P(B|not-A). The reason that those with yellow-stained
fingers are more likely to suffer from lung cancer is that smoking
tends to produce both effects. Because individuals with
yellow-stained fingers are more likely to be smokers, they are also
more likely to suffer from lung cancer. Intuitively, the way to
address this problem is to require that causes raise the probabilities
of their effects ceteris paribus. The history of probabilistic
causation is to a large extent a history of attempts to resolve these
two central problems.
Hans Reichenbach introduced the terminology of
"screening off" to apply to a particular type of probabilistic
relationship. If P(B|A & C) = P(B|C), then C is said to screen A off
from B. Intuitively, C renders A probabilistically irrelevant to B.
With this notion in hand, we can attempt to avoid the problem of
spurious correlations by adding a `no screening off' condition to the
basic probability-raising condition:
NSO: Factor A, occurring at time t, is a cause of
the later factor B if and only if:
1. P(B|A) > P(B|not-A)
2. There is no factor C, occurring earlier than or simultaneously with A, that screens A off from B.
We will call this the NSO, or `No Screening Off' formulation.
Suppose, as in our example above, that smoking (C) causes both
yellow-stained fingers (A) and lung cancer (B). Then smoking will
screen yellow-stained fingers off from lung cancer: given that an
individual smokes, his yellow-stained fingers have no impact upon his
probability of developing lung cancer.
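A small numerical sketch may make the screening-off pattern concrete. The probabilities below are hypothetical, chosen only for illustration, and the model simply assumes that A and B are independent given C (which is what a pure common-cause structure would deliver).

```python
from itertools import product

# Hypothetical numbers, for illustration only.
P_C = 0.3                               # P(C): individual smokes
P_A_GIVEN = {True: 0.8, False: 0.05}    # P(A | C), P(A | not-C): yellow-stained fingers
P_B_GIVEN = {True: 0.2, False: 0.02}    # P(B | C), P(B | not-C): lung cancer


def joint(a, b, c):
    """P(A=a, B=b, C=c), assuming A and B are independent given C."""
    pc = P_C if c else 1 - P_C
    pa = P_A_GIVEN[c] if a else 1 - P_A_GIVEN[c]
    pb = P_B_GIVEN[c] if b else 1 - P_B_GIVEN[c]
    return pc * pa * pb


def prob(event):
    """Probability of the set of outcomes (a, b, c) satisfying `event`."""
    outcomes = product([True, False], repeat=3)
    return sum(joint(a, b, c) for a, b, c in outcomes if event(a, b, c))


def cond(event, given):
    """Conditional probability P(event | given) = P(event & given) / P(given)."""
    return prob(lambda a, b, c: event(a, b, c) and given(a, b, c)) / prob(given)


# Condition 1 of NSO: A raises the probability of B.
p_b_given_a = cond(lambda a, b, c: b, lambda a, b, c: a)
p_b_given_not_a = cond(lambda a, b, c: b, lambda a, b, c: not a)

# Condition 2 of NSO fails: C screens A off from B.
p_b_given_a_c = cond(lambda a, b, c: b, lambda a, b, c: a and c)
p_b_given_c = cond(lambda a, b, c: b, lambda a, b, c: c)

print(p_b_given_a, p_b_given_not_a)   # A is positively relevant to B
print(p_b_given_a_c, p_b_given_c)     # equal, up to rounding
```

With these numbers P(B|A) comes out near 0.177 and P(B|not-A) near 0.035, while P(B|A & C) and P(B|C) both equal 0.2, so A satisfies the first condition of NSO but not the second.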
The second condition of NSO does not suffice to resolve the
problem of spurious correlations, however. This condition was added
to eliminate cases where spurious correlations give rise to factors
that raise the probability of other factors without causing them.
Spurious correlations can also give rise to cases where a cause does
not raise the probability of its effect. So genuine causes need not
satisfy the first condition of NSO. Suppose, for
example, that smoking is highly correlated with exercise: those who
smoke are much more likely to exercise as well. Smoking is a cause of
heart disease, but suppose that exercise is an even stronger
preventative of heart disease. Then it may be that smokers are, over
all, less likely to suffer from heart disease than non-smokers. That
is, letting A represent smoking, C exercise, and B heart disease,
P(B|A) < P(B|not-A). Note, however, that if we conditionalize on
whether one exercises or not, this inequality is reversed: P(B|A & C)
> P(B|not-A & C), and P(B|A & not-C) > P(B|not-A & not-C).
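The reversal described here can be reproduced with a few made-up numbers. In this sketch A is smoking, C is exercise, and B is heart disease; every probability is hypothetical and was picked only so that the pattern in the text appears.

```python
# Hypothetical numbers, for illustration only.
# P(C | A) and P(C | not-A): smokers are much more likely to exercise.
P_C_GIVEN_A = {True: 0.9, False: 0.1}

# P(B | A=a, C=c), keyed by (a, c): smoking raises the risk of heart
# disease both among exercisers and among non-exercisers.
P_B_GIVEN_AC = {
    (True, True): 0.10,
    (False, True): 0.05,
    (True, False): 0.40,
    (False, False): 0.30,
}


def p_b_given_a(a):
    """P(B | A=a), averaging over exercise by the law of total probability."""
    p_c = P_C_GIVEN_A[a]
    return p_c * P_B_GIVEN_AC[(a, True)] + (1 - p_c) * P_B_GIVEN_AC[(a, False)]


overall_smoker = p_b_given_a(True)       # P(B | A)
overall_nonsmoker = p_b_given_a(False)   # P(B | not-A)

print(overall_smoker, overall_nonsmoker)
```

Here P(B|A) is about 0.13 and P(B|not-A) about 0.275, so smoking lowers the overall probability of heart disease even though it raises that probability within each exercise group.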
The next step is to replace conditions 1 and 2 with the
requirement that causes must raise the probability of their effects in
test situations:
TS: A causes B if P(B|A & T) > P(B|not-A & T) for
every test situation T.
A test situation is a conjunction of factors. When such a conjunction
of factors is conditioned on, those factors are said to be "held
fixed". To specify what the test situations will be, then, we must
specify what factors are to be held fixed. In the previous example,
we saw that the true causal relevance of smoking for heart disease was
revealed when we held exercise fixed, either positively (conditioning
on C) or negatively (conditioning on not-C). This suggests that in
evaluating the causal relevance of A for B, we need to hold fixed
other causes of B, either positively or negatively. This suggestion
is not entirely correct, however. Let A and B be smoking and lung
cancer as above. Suppose C is a causal intermediary, say the presence
of tar (and other carcinogens) in the lungs. If A causes B
exclusively via C, then C will screen A off from B: given the presence
(absence) of carcinogens in the lungs, the probability of lung cancer
is not affected by whether those carcinogens got there by smoking (are
absent despite smoking). Thus we will not want to hold fixed any
causes of B that are themselves caused by A. Let us call the
set of all factors that are causes of B, but are not caused by A, the
set of independent causes of B. A test situation for A and B
will then be a maximal conjunction, each of whose conjuncts is either
an independent cause of B, or the negation of an independent cause of
B.
Note that the specification of factors that need to be held fixed
appeals to causal relations. This appears to rob the theory of its
status as a reductive analysis of causation. We will see in
section 4.4 below, however, that the issue is
substantially more complex than that. In any event, even if there is
no reduction of causation to probability, a theory detailing the
systematic connections between causation and probability would be of
great philosophical interest.
TS can be generalized in a number of ways. For example, one
could define a `negative cause' or `preventer' or `inhibitor' as a
factor that lowers the probability of its `effect' in all test
situations, and a `mixed' or `interacting' cause as one that affects
the probability of its `effect' in different ways in different test
situations. Or one could define causal relationships between
variables that are non-binary, such as caloric intake and blood
pressure. In principle, there are infinitely many ways in which one
variable might depend probabilistically on another, even holding fixed
some particular test situation, so this approach abandons any neat
classification of causal factors into causes and preventers. These
generalizations will also suggest revisions of the method for
constructing test situations, since they suggest different sorts of
factors to be held fixed.
An alternative approach to the problem of spurious correlations is
through counterfactuals. According to a
probabilistic counterfactual theory of causation (PC), A causes
B if both occur and the probability that B would occur, at the time of
A's occurrence, was much higher than it would have been at the
corresponding time if A had not occurred. This counterfactual is to
be understood in terms of possible worlds: it is true if, in the
nearest possible world(s) where A does not occur, the probability of B
is much lower than it was in the actual world. On this account, one
does not compare conditional probabilities, but unconditional
probabilities in different possible worlds. The test situation is not
some specified conjunction of factors, but the sum total of all that
remains unchanged in moving to the nearest possible world(s) where A
does not occur. Obviously a great deal hinges here upon the account
of what makes some worlds nearer than others; for more on this issue,
see the entry under "causation, counterfactual
theories."
Suggested Readings: This section more or less follows the main
developments in the history of probabilistic theories of causation.
Versions of the NSO theory are found in
Reichenbach (1956, section 23), and Suppes (1970,
chapter 2). Salmon (1980) is an influential critique of these
theories. The first version of TS was presented in Cartwright
(1979). Eells (1991, chapters 2, 3, and 4) and Hitchcock (1993) carry
out the two generalizations of TS described.
Lewis (1986) is the locus classicus for PC.
Good (1961, 1962) is an early essay on probabilistic causation that is
rich in insights, but has had surprisingly little influence on the
formulation of later theories.
The second major problem with the basic probability-raising idea was
that the relationship of probability-raising is symmetrical. One way
of cutting through the Gordian knot is to require that causes precede
their effects in time. This has several systematic
disadvantages. First, it rules out the possibility of backwards-in-time
causation a priori, whereas many believe that it is only a
contingent fact that causes precede their effects in
time. This is less of a worry if one is not
concerned to give a conceptual analysis of causation. Second, this
approach rules out the possibility of developing a causal theory of
temporal order (on pain of vicious circularity), a theory that has
seemed attractive to some philosophers. Note also that while
assigning temporal locations to particular events
is entirely coherent, it is not so clear what it means to say that one
property or event type occurs before another. For example, what does
it mean to say that smoking precedes lung cancer? There have been
many episodes of smoking, and many of lung cancer, and not all of the
former occurred prior to all of the latter. This will be a problem
for those who are interested in providing a probabilistic theory of
causal relations among properties or event types.
A more ambitious approach to the problem of causal asymmetry is to try
to characterize that asymmetry in terms of probability relations
alone. The best-known proposal of this sort is due to Hans
Reichenbach. Suppose that factors A and B are
positively correlated:
1. P(A & B) > P(A)P(B)
It is easy to see that this will hold exactly when A raises the
probability of B and vice versa. Suppose, moreover, that there is
some factor C having the following properties:
2. P(A & B|C) = P(A|C)P(B|C)
3. P(A & B|not-C) = P(A|not-C)P(B|not-C)
4. P(A|C) > P(A|not-C)
5. P(B|C) > P(B|not-C).
In this case, the trio ACB is said to form a conjunctive fork.
Conditions 2 and 3 stipulate that C and not-C screen off A from B. As
we have seen, this sometimes occurs when C is a common cause of A and
B. Conditions 2 through 5 entail 1, so in some sense C explains the
correlation between A and B. If C occurs earlier than A and B, and
there is no event satisfying 2 through 5 that occurs later than A and
B, then ACB is said to form a conjunctive fork open to the
future. Analogously, if there is a future factor satisfying 2
through 5, but no past factor, we have a conjunctive fork open to the
past. If a past factor C and a future factor D both satisfy 2 through
5, then ACBD forms a closed fork. Reichenbach's proposal was that the
direction from cause to effect is the direction in which open forks
predominate. In our world, there are many forks open to the future,
few or none open to the past.
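The claim that conditions 2 through 5 entail 1 can be verified directly. The sketch below introduces shorthand that is not in the text: c = P(C), a_1 = P(A|C), a_0 = P(A|not-C), b_1 = P(B|C), b_0 = P(B|not-C). It also assumes 0 < c < 1, so that all of the conditional probabilities are defined.

```latex
\begin{align*}
P(A) &= c\,a_1 + (1-c)\,a_0, \qquad P(B) = c\,b_1 + (1-c)\,b_0 \\
P(A \wedge B) &= c\,a_1 b_1 + (1-c)\,a_0 b_0
  && \text{(by conditions 2 and 3)} \\
P(A \wedge B) - P(A)\,P(B) &= c\,(1-c)\,(a_1 - a_0)\,(b_1 - b_0) > 0
  && \text{(by conditions 4 and 5)}
\end{align*}
```

The last line follows by expanding the product P(A)P(B) and collecting terms; each of the three factors c(1-c), (a_1 - a_0), and (b_1 - b_0) is positive, which gives condition 1.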
It is not clear, however, that this asymmetry between forks open to
the past and open to the future will be as pervasive as this proposal
seems to presuppose. In quantum mechanics, there are correlated
effects that are believed to have no common cause that screens them
off. Moreover, if ACB forms a conjunctive fork in which C precedes A
and B, but C has a deterministic effect D which occurs after A and B,
then ACBD will form a closed fork. A further difficulty with this
proposal is that since it provides a global ordering of causes and
effects, it seems to rule out a priori the possibility that some
effects might precede their causes. More complex attempts to derive
the direction of causation from probabilities have been offered; the
issues here intersect with the problem of reduction, discussed in
section 4.4 below.
Proponents of counterfactual theories of causation attempt to derive
the asymmetry of causation from a corresponding asymmetry in the truth
values of counterfactuals. For details see the entry for
"causation, counterfactual theories."
Suggested Readings: Suppes (1970, chapter 2) and Eells (1991, chapter 5)
define causal asymmetry in terms of temporal asymmetry.
Reichenbach's proposal is presented in his (1956,
chapter IV). Some difficulties with this proposal are discussed in
Arntzenius (1993). Papineau (1993) is a good overall discussion of
the problem of causal asymmetry within probabilistic theories.
According to TS, a cause must raise the probability of its
effect in every test situation. This has been called the
requirement of context-unanimity. This requirement is
vulnerable to the following sort of counterexample. Suppose that
there is a gene that has the following unusual effect: those that
possess the gene have their chances of contracting lung cancer
lowered when they smoke. This gene is very rare, let us
imagine -- indeed, it need not exist at all in the human population,
so long as humans have some non-zero probability of possessing this
gene (perhaps as a result of a very improbable mutation). In this
scenario, there would be test situations (those that hold fixed the
presence of the gene) in which smoking lowers the probability of lung
cancer: thus smoking would not be a cause of lung cancer according to
the context-unanimity requirement. Nonetheless, it seems unlikely
that the discovery of such a gene (or of the mere possibility of its
occurrence) would lead us to abandon the claim that smoking causes
lung cancer.
This line of objection is surely right about our ordinary use of
causal language. It is nonetheless open to the defender of
context-unanimity to respond that she is interested in supplying a
precise concept to replace the vague notion of causation that
corresponds to our everyday usage. In a population consisting of
individuals lacking the gene, smoking causes lung cancer. In a
population consisting entirely of individuals who possess the gene,
smoking prevents lung cancer. In contexts where one desires causal
information for purposes of deliberation (say concerning whether to
smoke), it is this more precise type of information that is desired.
Suggested Readings: Dupré (1984) presents this challenge
to the context-unanimity requirement, and offers an alternative.
Eells (1991, chapters 1 and 2) defends context-unanimity using the
idea that causal claims are made relative to a population.
Given the basic probability-raising idea, one would expect putative
counterexamples to probabilistic theories of causation to be of two
basic types: cases where causes fail to raise the probabilities of
their effects, and cases where non-causes raise the probabilities of
non-effects. The discussion in the literature has focused almost
entirely on the first sort of example. Consider the following
example, due to Deborah Rosen. A golfer badly slices a golf ball,
which heads toward the rough, but then bounces off a tree and into the
cup for a hole in one. The golfer's slice lowered the probability
that the ball would wind up in the cup, yet nonetheless caused this
result. One way of avoiding this problem is to attend to the
probabilities that are being compared. If we label the slice A, not-A
is a disjunction of several alternatives. One such alternative is a
clean shot -- compared to this alternative, the slice lowered the
probability of a hole-in-one. Another alternative is no shot at all,
relative to which the slice increases the probability of a
hole-in-one. By making the latter sort of comparison, we can recover
our original intuitions about the example.
For an example of the second type, suppose that two gunmen shoot at a
target. Each has a certain probability of hitting, and a certain
probability of missing. Assume that none of the probabilities are one
or zero. As a matter of fact, the first gunman hits, and the second
gunman misses. Nonetheless, the second gunman did fire, and by
firing, increased the probability that the target would be hit, which
it was. While it is obviously wrong to say that the second gunman's
shot caused the target to be hit, it would seem that a probabilistic
theory of causation is committed to this consequence. A natural
approach to this problem would be to try to strengthen the
probabilistic theory of causation with a requirement of spatiotemporal
connection between cause and effect (see the entry on
"causation, causal processes"), but to date, no
successful proposal along these lines has been proffered.
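To see why the second gunman's shot raises the probability of a hit, a simple worked case may help. It assumes, beyond what the example states, that the two shots hit or miss independently, with hit probabilities p_1 and p_2 strictly between zero and one; H stands for the target's being hit.

```latex
\begin{align*}
P(H \mid \text{only the first gunman fires}) &= p_1 \\
P(H \mid \text{both gunmen fire}) &= 1 - (1 - p_1)(1 - p_2) = p_1 + p_2\,(1 - p_1) \\
P(H \mid \text{both gunmen fire}) - P(H \mid \text{only the first gunman fires}) &= p_2\,(1 - p_1) > 0
\end{align*}
```

On these assumptions the second shot raises the probability of a hit by p_2(1 - p_1), whether or not it is the shot that actually strikes the target.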
Suggested Readings: Salmon (1980) presents several examples of
probability-lowering causes. Hitchcock (1995) presents a response.
Woodward (1990) describes the structure that is instantiated in the
example of the two gunmen. Humphreys (1989, section 14) responds.
Menzies (1989, 1996) discusses examples involving causal pre-emption
where non-causes raise the probabilities of non-effects.
We make at least two different kinds of causal claim. Singular
causal claims, such as "Jill's heavy smoking during the 1980s caused
her to develop lung cancer," relate particular
events that have spatiotemporal locations.
General causal claims, such as "smoking causes lung cancer," relate
event types or properties. With this distinction in mind, we may note
that the counterexamples mentioned above are both formulated in terms
of singular causation. The examples do not undermine the general
causal claims that a probabilistic theory of causation would
appear to license in these cases: slices prevent (are negative causes
of) holes-in-one; shooting at targets causes them to be hit. So one
possible reaction to the counterexamples of the previous section would
be to maintain that the probabilistic theory of causation whose
development was sketched in section 3 above is a theory of general
causation only, and that singular causation requires a distinct
philosophical theory. One consequence of this move is that there are
(at least) two distinct species of causal relation, each requiring its
own philosophical account--not an altogether happy predicament.
Suggested Readings: The need for distinct theories of singular
and general causation is defended in Good (1961, 1962), Sober (1985),
and Eells (1991, introduction and chapter 6). Eells (1991, chapter 6)
offers a distinct probabilistic theory of singular causation in terms
of the temporal evolution of probabilities. Carroll (1991) and
Hitchcock (1995) offer two quite different lines of response.
Returning to the theories outlined in section 3, recall that theory
NSO was an attempt at a reductive analysis of causation
in terms of probabilities (and perhaps also temporal order). By
contrast, TS defines causal relations in terms of probabilities
conditional upon specifications of test conditions, which are
themselves characterized in causal terms. Thus it appears that the
latter theories cannot be analyses of causation, since causation
appears in the analysans. Given that TS contains much needed
improvements over NSO, it looks as though there can be no
reduction of causation to probabilities. This may be giving up too
soon, however. In order to determine whether a probabilistic
reduction of causation is possible, the central issue is not whether
the word `cause' appears in both the analysandum and the analysans;
rather, the key question should be whether, given an assignment of
probabilities to a set of factors, there is a unique set of causal
relations among those factors compatible with the probability
assignment and the theory in question. Suppose that a set of factors,
and a system of causal relations among those factors is given: call
this the causal structure CS. Let T be a theory
connecting causal relations among factors with probabilistic relations
among factors. Then the causal structure CS will be
probabilistically distinguishable relative to T, if for
every assignment of probabilities to the factors in CS that is
compatible with CS and T, CS is the unique causal
structure compatible with T and those probabilities. (One
could formulate a weaker sense of distinguishability by requiring that
only some assignment of probabilities uniquely determines CS).
Intuitively, T allows you to infer that the causal structure is
in fact CS given the probability relations between factors.
Given a probabilistic theory of causation T, it is possible to
imagine many different properties it might have. Here are some
possibilities:
1. All causal structures are probabilistically distinguishable
relative to T
2. All causal structures having some interesting property are
probabilistically distinguishable relative to T
3. Any causal structure can be embedded in a causal structure that is
probabilistically distinguishable relative to T
4. The actual causal structure of the world (assuming there is such a
thing) is probabilistically distinguishable relative to T.
It is not obvious which type of distinguishability properties a theory
must have in order to constitute a reduction of causation to
probabilities. This sort of approach to the question of probabilistic
reduction is quite new, and currently an active area of
investigation.
Suggested Readings: The most detailed treatment of
probabilistic distinguishability is given in Spirtes, Glymour and
Scheines (1993); see especially chapter 4. Spirtes, Glymour and
Scheines prove (theorem 4.6) a result along the lines of 3 for a
theory that they propose. This work is very technical. An accessible
presentation is contained in Papineau (1993), which defends a position
along the lines of 4.
- Arntzenius, Frank (1993) "The Common Cause Principle," in Hull, Forbes, and Okruhlik (1993), pp. 227 - 237.
- Carroll, John (1991) "Property-level Causation?" Philosophical Studies 63: 245-70.
- Cartwright, Nancy (1979) "Causal Laws and Effective Strategies," Noûs 13: 419-437.
- Dupré, John (1984) "Probabilistic Causality Emancipated," in Peter French, Theodore Uehling, Jr., and Howard Wettstein, eds., (1984) Midwest Studies in Philosophy IX (Minneapolis: University of Minnesota Press), pp. 169 - 175.
- Eells, Ellery (1991) Probabilistic Causality. Cambridge, U.K.: Cambridge University Press.
- Good, I. J. (1961) "A Causal Calculus I," British Journal for the Philosophy of Science 11: 305-18.
- Good, I. J. (1962) "A Causal Calculus II," British Journal for the Philosophy of Science 12: 43-51.
- Hitchcock, Christopher (1993) "A Generalized Probabilistic Theory of Causal Relevance," Synthese 97: 335-364.
- Hitchcock, Christopher (1995) "The Mishap at Reichenbach Fall: Singular vs. General Causation," Philosophical Studies 78: 257 - 291.
- Hull, David, Mickey Forbes, and Kathleen Okruhlik, eds. (1993) PSA 1992, Volume Two (East Lansing: Philosophy of Science Association).
- Hume, David (1748) An Enquiry Concerning Human Understanding.
- Humphreys, Paul (1989) The Chances of Explanation: Causal Explanations in the Social, Medical, and Physical Sciences, Princeton: Princeton University Press.
- Lewis, David (1986) "Causation" and "Postscripts to `Causation'," in Philosophical Papers, Volume II, Oxford: Oxford University Press, pp. 172-213.
- Mackie, John (1974) The Cement of the Universe. Oxford: Clarendon Press.
- Menzies, Peter (1989) "Probabilistic Causation and Causal Processes: A Critique of Lewis," Philosophy of Science 56: 642-63.
- Menzies, Peter (1996) "Probabilistic Causation and the Pre-emption Problem," Mind 105: 85-117.
- Papineau, David (1993) "Can We Reduce Causal Direction to Probabilities?" in Hull, Forbes, and Okruhlik (1993), pp. 238-252.
- Reichenbach, Hans (1956) The Direction of Time. Berkeley and Los Angeles: University of California Press.
- Salmon, Wesley (1980) "Probabilistic Causality," Pacific Philosophical Quarterly 61: 50 - 74.
- Sober, Elliott (1985) "Two Concepts of Cause" in Peter Asquith and Philip Kitcher, eds., PSA 1984, Vol. II (East Lansing: Philosophy of Science Association), pp. 405-424.
- Spirtes, Peter, Clark Glymour, and Richard Scheines (1993) Causation, Prediction and Search. New York: Springer-Verlag.
- Suppes, Patrick (1970) A Probabilistic Theory of Causality. Amsterdam: North-Holland Publishing Company.
- Woodward, James (1990) "Supervenience and Singular Causal Claims," in Dudley Knowles, ed., Explanation and its Limits (Cambridge, U.K: Cambridge University Press), pp. 211 - 246.
causation: causal processes |
causation: counterfactual theories |
cause and effect |
counterfactuals |
determinism |
event |
Hume, David |
probability calculus |
probability calculus: interpretations of |
quantum mechanics |
Reichenbach's common cause principle |
time
Copyright © 1997 by
Christopher Hitchcock
cricky@hss.caltech.edu
First published: July 11, 1997
Content last modified: July 17, 1997