#### Supplement to Probabilistic Causation

## The Markov Condition

### 1. Factorization

When the probability distribution P over the variable set
**V** satisfies the MC, the joint distribution factorizes
in a very simple way. Let **V** =
{*X*_{1}, *X*_{2}, …, *X*_{n}}. Then

P(*X*_{1}, *X*_{2}, …, *X*_{n}) = Π_{i} P(*X*_{i} | **PA**(*X*_{i})),

where **PA**(*X*_{i}) is the set of parents of *X*_{i} in the graph.

This is easily seen in the following way. Since we are assuming that
the graph over **V** is acyclic, we may re-label the
subscripts on the variables so that they are ordered from
‘earlier’ to ‘later’, with only earlier
variables causing later ones. It follows from the probability calculus
(the chain rule) that

P(*X*_{1}, *X*_{2}, …, *X*_{n}) = P(*X*_{1}) × P(*X*_{2} | *X*_{1}) × … × P(*X*_{n} | *X*_{1}, *X*_{2}, …, *X*_{n−1}).
For each term
P(*X*_{i} | *X*_{1}, *X*_{2}, …, *X*_{i−1}), our
ordering ensures that all of the parents of *X*_{i} will
be included on the right hand side, and none of its descendants will.
The MC then tells us that we can eliminate all of the terms from the
right hand side except for the parents of *X*_{i}.
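The factorization can be illustrated numerically. The following is a minimal sketch, not drawn from the text: it assumes a three-variable chain graph *X*_{1} → *X*_{2} → *X*_{3} with binary variables, and the conditional probability tables are made-up numbers chosen purely for illustration.

```python
import itertools

# Hypothetical chain graph X1 -> X2 -> X3; all variables binary.
# The probability tables below are illustrative assumptions, not from the text.
p_x1 = {0: 0.6, 1: 0.4}                             # P(X1)
p_x2 = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}   # p_x2[x1][x2] = P(X2 | X1)
p_x3 = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.5, 1: 0.5}}   # p_x3[x2][x3] = P(X3 | X2)

def joint(x1, x2, x3):
    """Markov factorization for the chain: each variable is conditioned
    only on its parents, so P(X1, X2, X3) = P(X1) P(X2|X1) P(X3|X2)."""
    return p_x1[x1] * p_x2[x1][x2] * p_x3[x2][x3]

# The factors define a genuine joint distribution: the 8 cells sum to 1.
total = sum(joint(a, b, c)
            for a, b, c in itertools.product([0, 1], repeat=3))
```

Because each factor conditions only on parents, conditional independences such as *X*_{3} ⊥ *X*_{1} | *X*_{2} hold automatically in the resulting joint distribution.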

### 2. D-separation

The (MC) immediately implies that certain variables are conditionally
independent of others. Further conditional independence relations will
follow from the (MC) together with the probability calculus. In a
complex graph, it will not always be easy to determine which
conditional independence relations do and do not follow from the (MC). Geiger
(1987) and Verma and Pearl (1988) have developed a purely graphical
criterion that is both necessary and sufficient for conditional
independence to be a consequence of the (MC). (However, a probability
measure that violates the Faithfulness Condition—discussed in
Section 3.3—with respect to a given graph may include conditional
independence relations that are not consequences of the (MC).) Let
**G** be a directed acyclic graph over **V**.
A *path* in **G** is a sequence of variables in
**V**,
⟨*X*_{1}, …, *X*_{k}⟩,
such that for any two consecutive variables in the
sequence *X*_{i},
*X*_{i+1}, there is either an arrow from
*X*_{i} to *X*_{i+1} or an arrow from
*X*_{i+1} to *X*_{i}. Such a path is said
to be a path from *X*_{1} to
*X*_{k}. A variable *X*_{i} on this
path is said to be a *collider* just in case *i* ≠ 1,
*k* and there are arrows from both *X*_{i−1} and
*X*_{i+1} into *X*_{i}. Intuitively,
*X*_{i} is a collider just in case the arrows converge
on *X*_{i} in the path. For any two variables
*X*, *Y* in **V** and any subset
**Z** of **V**, we define the relation of
*d-separation* as follows:

(d-sep)

**Z** *d-separates* *X* and *Y* just in case every path ⟨*X* = *X*_{1}, …, *X*_{k} = *Y*⟩ from *X* to *Y* contains at least one variable *X*_{i} such that either:

- *X*_{i} is a collider, and no descendant of *X*_{i} (including *X*_{i} itself) is in **Z**; or
- *X*_{i} is not a collider, and *X*_{i} is in **Z**.

Then:

The (MC) entails that *X* and *Y* are probabilistically independent conditional upon **Z** just in case **Z** d-separates *X* and *Y* in **G**.
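The graphical criterion can be checked mechanically. The sketch below is an illustration, not from the text: it enumerates the undirected paths between two variables in a small DAG (given as a set of directed edges) and tests each path against the two blocking clauses of (d-sep). Path enumeration is exponential in general, so this is only suitable for small graphs.

```python
def all_paths(edges, x, y):
    """Enumerate simple paths from x to y, ignoring arrow direction.
    edges is a set of (parent, child) pairs."""
    nbrs = {}
    for a, b in edges:
        nbrs.setdefault(a, set()).add(b)
        nbrs.setdefault(b, set()).add(a)
    def extend(path):
        if path[-1] == y:
            yield path
            return
        for n in nbrs.get(path[-1], ()):
            if n not in path:
                yield from extend(path + [n])
    yield from extend([x])

def descendants(edges, v):
    """v together with every variable reachable from v along the arrows."""
    out, frontier = {v}, [v]
    while frontier:
        a = frontier.pop()
        for p, c in edges:
            if p == a and c not in out:
                out.add(c)
                frontier.append(c)
    return out

def d_separates(edges, z, x, y):
    """True iff Z d-separates X and Y: every path from x to y contains
    either a collider with no descendant in Z, or a non-collider in Z."""
    z = set(z)
    for path in all_paths(edges, x, y):
        blocked = False
        for i in range(1, len(path) - 1):
            a, b, c = path[i - 1], path[i], path[i + 1]
            collider = (a, b) in edges and (c, b) in edges
            if collider:
                if not (descendants(edges, b) & z):
                    blocked = True
                    break
            elif b in z:
                blocked = True
                break
        if not blocked:
            return False
    return True

# Example: in the collider graph X -> W <- Y, the empty set d-separates
# X and Y, but conditioning on the collider W unblocks the path.
collider_graph = {("X", "W"), ("Y", "W")}
```

Note how the collider clause inverts the usual behavior of conditioning: a collider blocks a path by default, and adding it (or a descendant) to **Z** *opens* the path, whereas a non-collider blocks the path only when it is in **Z**.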