#### Supplement to Chance versus Randomness

## A. Some Basic Principles About Chance

- A.1 The Principal Principle
- A.2 The Basic Chance Principle
- A.3 Frequencies, Reductionism, and the Stable Trial Principle

### A.1 The Principal Principle

The most prominent constraint has been the idea that chances, when known, should guide rational credence, at least when other things are equal. Reasonable people who know the chance of some outcome, and know nothing else of relevance, should set their personal confidence in the outcome eventuating to the same value as the chance. This commonsensical claim was made precise and elevated to the status of a principle in Lewis (1980), who called it the ‘Principal Principle’—‘principal’ because ‘it seems… to capture all we know about chance’ (Lewis 1980: 86). Lewis’ more precise formulation goes as follows. Let \(p\) be some proposition about the outcome of some chance process (say, a coin toss, or the decay of an atom of some radioactive element), \(C\) be a reasonable initial credence function, and \(E\) be the background evidence at \(t\), where \(E\) crucially doesn’t include information that pertains to the truth of \(p\) except through its chance (Lewis 1980: 92; see also Hoefer 2007: 553). It is standardly assumed that, at least, historical information prior to \(t\) and information about how possible histories and possible laws bear on possible chances is admissible. Then the Principal Principle is (assuming \(C(\ulcorner\Ch(p) = x\urcorner \wedge E) \gt 0)\):

**(PP)** \(C(p \mid \ulcorner\Ch(p) = x\urcorner \wedge E) = x\).

Assuming one updates credences by conditionalising, this form of the Principal Principle then entails that when rational agents come to know the chances, their credences are equal to the chances, provided they have no inadmissible information about the outcome. Chance thus plays the role of an ‘expert’ probability function, a norm for the credences of rational agents (Gaifman, 1988; Hall, 2004). The expertise of the chance function in the original Principal Principle is unconditional: rational initial credences should simply adopt values equal to the chances. When the evidence \(E\) is ordinary, this is unproblematic; but if there are ever cases where the evidence trumps the chances, a more nuanced principle is required, the New Principle, of which more below.

Lewis showed that, from the Principal Principle, much of what we know about chance follows. For example, if it is accepted, we needn’t add as a separate constraint that chances are probabilities. Suppose one knew all the propositions stating the chances at some particular time of all future outcomes, and had no inadmissible evidence. Suppose one began with rational credence before learning the chances. Then in accordance with the Principal Principle, one assigns the same value to conditional credence (conditional on this evidence about the chances) in each future proposition as the value of its chance. Since rational conditional credences are probabilities, so too must chances be (Lewis 1980: 98). Furthermore, the chance of past events is always 1. Suppose \(A\) has already happened; since historical information is admissible, the Principal Principle implies that the chance of \(A\), at \(t\), is 1. It does not imply that the chance of \(A\) was always 1 (as the evidence admissible at one time may well have been inadmissible at some earlier time); so chances change over time, in accordance with the common belief that only the future is chancy, and the past is fixed. And finally, since an agent’s credences guide their actions—that’s partly what makes them that agent’s credences—if the agent updates their credences rationally and in line with the Principal Principle, then their beliefs about chances will guide their credences. Chance, then, is the kind of probability which ‘is the very guide of life’ (Butler 1736: Introduction).

Of course someone who didn’t believe in credences wouldn’t accept the Principal Principle (Kyburg 1978), and there have been modifications and amendments proposed to respond to various problems some have perceived with the Principal Principle (see below). But the former group is vanishingly small in number, and even those who propose modifications agree that the Principal Principle is an extremely good approximation to the correct principle. Even if PP turns out to be not exactly right, the commonsense belief that it gives precise form to would still remain as a guiding constraint for any theory that could reasonably be considered a theory of chance:

A feature of Reality deserves the name of chance to the extent that it occupies the definitive role of chance; and occupying the role means obeying the old Principle [PP], applied as if information about present chances, and the complete theory of chance, were perfectly admissible. Because of undermining, nothing perfectly occupies the role, so nothing perfectly deserves the name. But near enough is good enough. If nature is kind to us, the chances ascribed by the probabilistic laws of the best system will obey the old Principle to a very good approximation in commonplace applications. They will thereby occupy the chance-role well enough to deserve the name. To deny that they are really chances would be just silly. (Lewis 1994: 489)

There is thus widespread agreement that the Principal Principle, or
something close to it, captures a basic truth about
chance.^{[1]}

As Lewis states, it is the problem of ‘undermining’
which has come to be seen as most problematic for the Principal
Principle. This is the problem that knowing the chances itself seems to
provide trumping evidence about the chances! According to reductionist
theories of chance, discussed further below
(§3),
the values of the chances are fixed by the total history
of occurring events. A simple view of this sort is the view that says
the chances are the occurring frequencies, rounded up to give simple
values. So if a coin is repeatedly tossed and lands heads
*about* half the time, round up the frequency to one-half
exactly—that is the chance of heads. But if the chance of heads
is now \(\frac{1}{2}\), the chance of one million consecutive heads is
\(1/2^{10^6} \gt 0\). Supposing the coin has only been
tossed a reasonably small number of times, a million further heads will
swamp the currently observed frequencies; in other words, if that very
surprising but nevertheless possible event were to occur, the chance of
heads would be 1, or very close to it—not \(\frac{1}{2}\). Such a
possible future, which has some chance of making the current chances
false, is called an *undermining* future. The problem arises
because the current chance of an undermining future is positive, but
if the undermining future came to pass, the current chances would
not be what they are: they would be different. So we know that if
the current chances are as we think, the undermining future will
*not* come to pass. So we can know a priori that if the chances
are as we think they are, the undermining future is impossible, and we
should assign no credence in the undermining future conditional on the
chances being as they are. But the PP entails that we should place some
positive credence in the undermining future conditional on the chances
being as they are. Contradiction (Lewis, 1994; Hall, 1994; Thau,
1994).
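The arithmetic behind the undermining example can be sketched in a few lines of Python. This is only an illustration of the toy "rounded frequency" view described above: the set of simple values (0, 1/2, 1), the function name `rounded_frequency_chance`, and the history of 6 heads in 10 tosses are all assumptions introduced here, not part of any account in the literature.

```python
from fractions import Fraction
import math

# Assumed set of "simple values" that the toy recipe rounds frequencies to.
SIMPLE_VALUES = (Fraction(0), Fraction(1, 2), Fraction(1))

def rounded_frequency_chance(heads, tosses):
    """Toy reductionist chance: the simple value nearest the actual frequency."""
    frequency = Fraction(heads, tosses)
    return min(SIMPLE_VALUES, key=lambda value: abs(value - frequency))

# Hypothetical history so far: 6 heads in 10 tosses.
heads_so_far, tosses_so_far = 6, 10
current_chance = rounded_frequency_chance(heads_so_far, tosses_so_far)

# Chance, by present lights, of a further run of one million heads.
run_length = 10**6
chance_of_run = current_chance ** run_length  # exactly 1 / 2**(10**6)
log10_chance_of_run = run_length * math.log10(float(current_chance))

# What the same recipe would say if that run actually occurred.
chance_after_run = rounded_frequency_chance(
    heads_so_far + run_length, tosses_so_far + run_length
)

print(current_chance)              # chance of heads on the actual history
print(chance_of_run > 0)           # the undermining future has positive chance
print(round(log10_chance_of_run))  # order of magnitude of that chance
print(chance_after_run)            # chance of heads had the run occurred
```

On this toy recipe the run of heads has a tiny but positive chance by present lights, yet the recipe applied to the history containing that run no longer returns one-half. That is the structure the undermining argument exploits.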

In response, Lewis, Hall and Thau advocate moving to another
principle. Hall diagnoses the problem as arising because the present
chances involve, on this reductionist picture, information which
interacts problematically with the chances assigned to undermining
futures—they aren’t independent of one another. So the
chances aren’t an unconditional expert. But the chances are still
expert for you, not in the sense that you should slavishly adopt the
chances as your credences, but in the sense that the chance, given your
current information, of some outcome is still a better guide to what
you should set your credence to be than any alternative. That is, Hall
(2004) argues, chance is an *analyst-expert*, which ‘earns
[its] epistemic status because [it] is particularly good at evaluating
the relevance of one proposition to another’—in this case,
evaluating the relevance of your evidence (even evidence about the
chances) to future outcomes. In that case, we should adopt only something like this
principle connecting chance and credence:

**(Chance-Analyst)**

\(C(p \mid \ulcorner\Ch(p\mid E) = x\urcorner \wedge E) = x\).

If we adopt this principle instead of PP (modulo some important qualifications that Hall (2004: pp. 102–5) discusses), we can avoid the problem of undermining futures. Consider an undermining future \(F\). It remains true that it has some chance of coming to pass, so that there is some chance that the chances are otherwise. Earlier we used the PP to show that rational agents should therefore have some positive credence in \(F\) conditional on the chances, even though they are a priori inconsistent. But the above principle only tells us that

**(3)** \(C(F \mid \ulcorner\Ch(F\mid E) = x\urcorner \wedge E) = x\).

If \(E\) includes facts about the present chances, it follows immediately that \(E\) and \(F\) are inconsistent. The chance of \(F\) conditional on \(E\) is then zero, and therefore \(x = 0\), so we cannot derive from the analyst-expert role of chance the problematic claim. This is just to say that if the present evidence suffices to establish the chances, it also suffices to rule out undermining futures which would falsify what \(E\) says about the present chances (though, since undermining futures are possible, evidence about chances is not perfectly admissible).

For various reasons, Chance-Analyst hasn’t been thought to
perfectly capture the full sense in which we should epistemically
defer to chance. For one thing, that principle involves credence being
conditionalised on a proposition about chance that the chance
function is itself *not* conditionalised upon; it seems perhaps
odd that one would count as failing to defer to chance if the chance
function did not assign chance 1 to the proposition about chance that
you are conditionally certain of, and yet that is a consequence of
Chance-Analyst (Reaber 2010, Other Internet Resources). In this sense,
an overall better formulation of the New Principle that Lewis and
others invoke to respond to the problem of undermining futures might
be the version proposed by Joyce (2007). This is a
‘global’ principle, because it involves conditioning on a
claim about an entire probability function at
once. Let *chance* be a non-rigid designator of the actual
chance function (varying from world to world), and
let \(\mathbf{P}\) rigidly designate some particular probability
function. Then, Joyce (2007: 198) proposes, this principle captures what it means
for someone to defer their credence to the chances:

**(NP)** Let \(C\) be the credence function for someone whose evidence is limited to the past and present. Then, if the chances are given by probability function \(\mathbf{P}\), then \(C(p\mid \ulcorner \chance = \mathbf{P}\urcorner) = \mathbf{P}(p \mid\ulcorner\chance = \mathbf{P}\urcorner)\).

Joyce notes that an aspect of his principle NP is a commitment to the earlier principle Chance-Analyst, so adopting NP as one’s conception of deference to chances allows the reductionist to block the problem of undermining in the way sketched above.

It is an issue for Humeans whether our evidence \(E\) really
does make the present chances part of our present evidence. If not, it
will be difficult to apply either of these principles, NP or
Chance-Analyst, in a way that generates unconditional credal
judgements. Lewis and others worry about how, and to what extent, we
do know about the present chances—the answer, that we can know
them to the extent that they are independent of possible undermining
futures, is not very helpful. The further observation that most future
events, even if they make some small contribution to the coming to
pass of an undermining future, can be treated as if they
are *independent* of the present chances without risk of
significant error, is more helpful. It shows that, for most
particular localised future events, we can treat chance as governed by
the PP (as what Hall calls a ‘database expert’), as
Chance-Analyst simply reduces to the original PP. Indeed reductionists
and non-reductionists alike can accept the New Principle NP, knowing
in the latter case that the chances are independent of the future and
so the original PP is fine, and in the former case that for all
practical purposes the original PP is
fine.^{[2]}
(More on Lewis’ own theory of
chance, and its connection with laws and his broader metaphysical
concerns, can be found in the entry on Lewis,
Weatherson 2010:
§5.1.)

Contrary to Lewis’ contention, however, the Principal
Principle (or its sophisticated variant) is not the only truth about
chance. As Arntzenius and Hall (2003) have pointed out (in connection
with the problems for reductionism about chance), some probability
functions which obey the Principal Principle perfectly are very
*unlike* chances. They conclude that we know more about chance
than is captured by the Principal Principle alone, because we know that
these functions, constructed simply to meet the Principal Principle but
with no independent claim to be classified as chances, are not chances.
This conclusion has been widely accepted. So while the Principal
Principle captures a lot of what we know about chance, there are other
truths about chance that help to narrow the field of probability
functions which could be chances still further. The problem with the
existence of additional principles is that perhaps nothing meets all of
them perfectly. Indeed, this problem seems to be real, for every
function which has been proposed to meet most of the platitudes about
chance has turned out to violate some others (Schaffer 2003). But most
have agreed that we may adapt Lewis’ remarks above, maintaining
that the function which is near enough to meeting all or most of what
we take ourselves to know about chance is near enough to chance to
deserve the name.

### A.2 The Basic Chance Principle

Prominent among these other principles is the *Basic Chance
Principle* (BCP), connecting chance and possibility. It was named
by Bigelow, Collins and Pargetter (1993), who give this informal
argument for the existence of such a connection:

In general, if the chance of \(A\) is positive there must be a possible future in which \(A\) is true. Let us say that any such possible future *grounds* the positive chance of \(A\). But what kinds of worlds can have futures that ground the fact that there is a positive present chance of \(A\) in the actual world? Not just any old worlds. … [T]he positive present chance of \(A\) in this world must be grounded by the future course of events in some \(A\)-world sharing the history of our world and in which the present chance of \(A\) has the same value as it has in our world. That is precisely the content of the BCP. (Bigelow *et al*. 1993: 459)

In other words, if the chance of \(A\) is non-zero in some world \(w\) at some time \(t\), then \(A\) will in fact happen at some possible world which shares its history and chances with \(w\): if not \(w\) itself, then a situation very like \(w\). If \(\Ch_{tw}\) is the chance distribution at \(t\) in world \(w\), their formulation of the BCP is this:

Suppose \(x \gt 0\) and \(\Ch_{tw}(A) = x\). Then \(A\) is true in at least one of those worlds \(w'\) that matches \(w\) up to time \(t\) and for which \(\Ch_t(A) = x\). (Bigelow *et al*. 1993: 459)

Again, in accepting the connection between chance and possibility
expressed by the BCP, we needn’t endorse this precise
formulation. Schaffer, for example, though he endorses a principle
stronger than the BCP, motivates it by this informal gloss: ‘if
there is a non-zero chance of \(p\), this should entail that
\(p\) is possible, and indeed that \(p\) is compossible with
the circumstances’ (Schaffer 2007:
124).^{[3]}
Mellor (2000) endorses this
basic connection between chance and possibility, arguing that there is
a ‘necessity condition’ on chance, ensuring that chances
behave just like modalities. In particular, on Mellor’s view,
chance one behaves like necessity, and chance zero like impossibility,
so \(p\)’s having an intermediate chance entails that it is
possible. Finally, Eagle (2011) agrees with Mellor on the formal
features of chance ascriptions, but argues that the connection is
better taken to hold between chance and the modal ‘can’ of
ability ascriptions. Evidence for this thesis comes from the widespread
endorsement by ordinary speakers that there is a non-zero chance that
\(p\) iff \(p\) can happen, where ‘can’ expresses
the dynamic, ability-attributing, modal. This is not the BCP, however,
because the best semantic accounts of ‘can’ are very unlike
the conditions on ‘possibly’ imposed in the BCP (Kratzer,
1977; Lewis, 1979b). So the details of the chance-possibility
connection could turn out very differently while still endorsing the
broad thrust of the BCP.

The BCP is not a trivial truth, and is not universally
accepted.^{[4]}
One
objection is that the BCP is inconsistent with the existence of
undermining futures, those futures which have a present chance of
coming to pass, but which if they did come to pass would entail that
the present chances (or laws) are otherwise than they are. As before,
the present chance of tossing a fair coin one million times and it
landing heads every time is small but non-zero. If this event were to
occur in some possibility, the chance of heads for that coin in that
world would not—or so the story goes—be \(\frac{1}{2}\); so there
is no world with the same chances and history as ours in
which this event occurs, contrary to the BCP. The key to the existence
of undermining futures is the broadly Humean (or reductionist)
principle that whatever the chances are, they should supervene on the
total arrangement or pattern of occurrent events (see
§A.3).
So if the counterfactual pattern of events can be so
as to undermine the actual chance, the BCP will fail. Thus commitment
to the BCP prima facie involves a commitment to a non-Humean conception
of chance, one on which undermining is impossible. On the other hand,
such a view avoids the problems that undermining generates for the
original PP, so defenders of BCP can also retain the original version
of that Principle.

This objection only has force if one accepts Humean reductionism
about chance, and while that has a strong pull for many broadly
empiricist metaphysicians and philosophers of science, it cannot be
thought to have as much direct intuitive support as the BCP itself
(indeed, Bigelow *et al*. are explicit in using BCP to argue
against Humeanism). Moreover, some versions of the BCP, like
Eagle’s, are not as obviously inconsistent with Humeanism about
chance as the original Bigelow *et al*. version. Finally, Schaffer
(2003: §4) has argued that, despite appearances, there is a
conception of chance according to which the BCP and a broadly Humean
account of chance are together consistent (and moreover consistent with
the PP).^{[5]}
So
this objection isn’t conclusive. But the fact that there is an
objection to be found in this direction at all relies on a further view
about chance, discussed in the next section.

### A.3 Frequencies, Reductionism, and the Stable Trial Principle

The third constraint is that the chance of some outcome should be approximately equal to the actual frequencies of similar outcomes in all similar circumstances. This vague constraint has metaphysical and epistemological readings. It may be understood as proposing that the value of chances be systematically related to the values of the frequencies, or merely as proposing that evidence about the values of the chances is provided by evidence about the value of the frequencies (and vice versa).

On the metaphysical side, it is easy to come to share the belief
that chance and frequency should be close. The attraction of the view
is obvious, for it proposes in effect to reduce chance to occurrent
categorical facts like the values of actual frequencies and other
Humean magnitudes. But it is difficult to formulate the reductionist
claim precisely. It could be formulated as a supervenience thesis, so
that no two worlds could differ in the chance they assign to some
outcome unless they also differed in the actual frequencies of similar
outcomes. But such a principle seems open to the objection that two
different but close chance functions could easily result in the same
pattern of outcomes, particularly if there were relatively few relevant
outcomes. (The converse supervenience thesis is subject to similar
worries, since two worlds could easily differ in their outcomes while
sharing their chances.) And this supervenience thesis is relatively
weak—any stronger connection between chances and frequencies,
such as that proposed by *frequentists* of various sorts (von
Mises, 1957; Reichenbach, 1949), is subject to these objections (among
others: Jeffrey 1977; Hájek 1997). Perhaps the most
promising reductionist view about chance is Lewis’ best system
analysis of probabilistic laws (Lewis, 1994; Loewer, 2004). Lewis
suggests that the axioms of the best systematisation of the actual
facts, including the frequency facts, deserve to be called the laws of
nature. In some cases, the best system will include probabilistic
axioms. In this case, Lewis (1994: 480) concludes: ‘So now we can
analyse chance: the chances are what the probabilistic laws of the best
system say they are.’ (A related account is offered by Hoefer
2007.) Yet on this view, the relationship between chance and frequency
is not at all straightforward. While Lewis does offer a suggestive and
persuasive vision for giving a reductive account of chance, it
nevertheless remains true that ‘no reductionist has in fact ever
provided an exact recipe that would show how categorical facts fix the
facts about objective chance’ (Hall 2004: 111).

But we may endorse the epistemological connection between chance and frequency without making any decision about the prospects of reductionism concerning the metaphysics of chance. For whether chances depend on frequencies or not, it is a fundamental principle of scientific and statistical inference that frequencies are good evidence for chances, and that chances are good evidence for frequencies. The latter claim follows quickly from the fact that chances, if they exist, are probabilities, and the fact that probabilities in many common cases with independent trials obey the Law of Large Numbers (Sinai 1992: 21). But the inference from frequencies to chances is more subtle. It is apparently required: If frequencies don’t constrain the chances, then any current opinion about the present chances may be rationally maintained in the light of any future evidence whatsoever, and yet this is something we would find quite irrational. A view about chance which does not include this connection will have a very difficult time explaining why the principles of direct and inverse inference (i.e., inference from actual statistics of outcomes to chance, and vice versa) in statistical hypothesis testing work at all (Hacking, 1965; Ismael, 1996; Eells, 1983). And yet they work very well, as the spectacular empirical success of statistically confirmed theories like quantum physics shows.
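The inference from chances to frequencies can be made concrete with a small exact calculation. The sketch below assumes independent trials with a fixed chance of success, so that the number of successes is binomially distributed; the function name `prob_frequency_within`, the chance of one-half, and the tolerance of one-tenth are illustrative choices made here.

```python
from fractions import Fraction
from math import comb

def prob_frequency_within(chance, trials, tolerance):
    """Exact probability that the relative frequency of an outcome in
    `trials` independent trials lies within `tolerance` of `chance`."""
    total = Fraction(0)
    for successes in range(trials + 1):
        if abs(Fraction(successes, trials) - chance) <= tolerance:
            # Binomial probability of exactly this many successes.
            total += (comb(trials, successes)
                      * chance**successes
                      * (1 - chance)**(trials - successes))
    return total

chance = Fraction(1, 2)      # hypothetical chance of heads
tolerance = Fraction(1, 10)  # how close the frequency must be to the chance
for trials in (10, 100, 1000):
    probability = prob_frequency_within(chance, trials, tolerance)
    print(trials, float(probability))
```

As the number of trials grows, the probability that the frequency falls near the chance approaches 1, which is the pattern the Law of Large Numbers describes. Nothing here shows that the frequency must match the chance; a mismatch merely becomes very improbable.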

The most promising way of implementing the epistemic constraint from frequency to chance is to ground it in the Principal Principle, as Lewis suggests (1980: 106–8)—see also Levi 1980: ch. 12 and Howson and Urbach 1993: 342–7. He argues that hypotheses about chances are as amenable to differential confirmation by evidence, including frequency evidence, as any other propositions about which we have credences. If the hypotheses about chance make predictions about the observed frequencies, in line with the law of large numbers, then the observed frequencies will constrain our posterior credence in the chance hypotheses, in line with the tenets of Bayesian confirmation theory (see the entry on Bayesian epistemology). If the PP has this fundamental role, justifying this apparently independent constraint, then it is particularly pressing for a metaphysical account of chance to make a connection between chance and frequency tenable. And here reductionists like Lewis, Hoefer, and Loewer make their stand:
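A minimal sketch of this kind of confirmation follows. It assumes a small finite set of rival chance hypotheses, tosses treated as independent given each hypothesis, and frequency evidence that is admissible, so that the Principal Principle fixes the likelihood of the evidence under each hypothesis. The function name, the three hypotheses, and the evidence of 7 heads in 10 tosses are hypothetical.

```python
from fractions import Fraction
from math import comb

def posterior_over_chance_hypotheses(prior, heads, tosses):
    """Update credence in hypotheses of the form 'Ch(heads) = x'.

    `prior` maps each hypothesised chance x to the prior credence in it.
    The likelihood of the evidence given 'Ch(heads) = x' is taken from
    the Principal Principle plus an assumption of independent tosses."""
    unnormalised = {
        x: credence * comb(tosses, heads) * x**heads * (1 - x)**(tosses - heads)
        for x, credence in prior.items()
    }
    total = sum(unnormalised.values())
    return {x: weight / total for x, weight in unnormalised.items()}

# Three rival chance hypotheses with equal prior credence.
prior = {
    Fraction(1, 4): Fraction(1, 3),
    Fraction(1, 2): Fraction(1, 3),
    Fraction(3, 4): Fraction(1, 3),
}
posterior = posterior_over_chance_hypotheses(prior, heads=7, tosses=10)
for x, credence in posterior.items():
    print(x, float(credence))
```

The hypothesis whose chance value lies closest to the observed frequency gains the most credence, which is the sense in which frequencies constrain opinions about chance on this approach.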

Be my guest—posit all the primitive unHumean whatnots you like. … But play fair in naming your whatnots. Don’t call any alleged feature of reality ‘chance’ unless you’ve already shown that you have something, knowledge of which could constrain rational credence. I think I see, dimly but well enough, how knowledge of frequencies and symmetries and best systems could constrain rational credence. I don’t begin to see, for instance, how knowledge that two universals stand in a certain special relation \(N^*\) could constrain rational credence about the future coinstantiation of those universals. (Lewis 1994:484)

But Hall (2004) argues that anti-reductionists, who propose that
chances are independent fundamental features of reality, can equally
well explain the PP by taking it to be an analytic
truth.^{[6]}

Reductionists and anti-reductionists about chance alike will admit
that, while frequencies can be evidence for chances, not all
frequencies are up to the job. Frequentists acknowledged this: only
collections of outcomes with similar generating conditions provide
frequencies which are useful for calculating the chance of a future
similar outcome. Frequencies would be worse than useless if ‘we
couldn’t distinguish natural from gerrymandered kinds; again, we
could get the analysis to yield almost any answer we liked. But we can
distinguish. (If we could not, puzzles about chance would be the least
of our worries.)’ (Lewis 1994: 477). Giving a precise answer to
the question, *which classes of events are appropriately natural and
non-gerrymandered?* is difficult—it is the famous
*reference class problem* for frequentism (see also the
discussion in the main entry at
§4.2).
But as Hájek (2007) has
argued, and the quote from Lewis implicitly makes clear, this problem
will face any view about chance whatever, and isn’t particularly
a problem for frequentism, or about chance, being rather the old
problem of the existence of natural classes. Early frequentists simply
assumed, as is now also widely believed, that it makes sense to invoke
such uniform classes of events. Von Mises is explicit:

In games of chance, in the problems of insurance, and in the molecular processes we find events repeating themselves again and again. They are mass phenomena or repetitive events. … The rational concept of probability, which is the only basis of probability calculus, applies only to problems in which either the same event repeats itself again and again, or a great number of uniform elements are involved at the same time. … It is essential for the theory of probability that experience has shown that in the game of dice, as in all the other mass phenomena which we have mentioned, the relative frequencies of certain attributes become more and more stable as the number of observations is increased. (von Mises 1957: 10–2)

Set aside the dubious contention that probability requires the
existence of ‘mass phenomena’ (this would rule out as
conceptually incoherent the perfectly legitimate idea that there might
be a chance of an event which just happens to possess no similar
counterpart events). The key insight is that usable (stable)
frequencies are only found in mass phenomena, where the mass phenomena
are explicitly defined so as to require ‘repetitive’
events, and hence cannot be an arbitrary and gerrymandered collection
of
outcomes.^{[7]}

This requirement on usable frequencies, that they come from repeated trials of the same experiment, points us towards another constraint on theories of chance, that chances should depend on the properties of the chance setup. The chance of a single outcome might well be measured by the frequencies in similar trials, but the connection between the trial of interest and the other trials is not merely incidental—what makes those trials evidentially relevant is the fact that they share underlying physical similarities.

One way of capturing this idea is to require, at the least, that duplicate trials, precisely similar in all respects, in the same world (and thus subject to the same laws of nature) should have the same outcome chances. This is roughly the ‘stable trial principle’ as defended by Schaffer (2003: 37) (later slightly varied as the ‘intrinsicness requirement’ in Schaffer 2007: 125). Pretty much all conceptions of chance, reductionist and non-reductionist alike, respect this constraint, it has been argued:

[any reductionist] recipe for how total history determines chances should be sensitive to basic symmetries of time and space—so that if, for example, two processes going on in different regions of spacetime are exactly alike, your recipe assigns to their outcomes the same single-case chances. (It is not that a non-reductionist will have no place for such a constraint: it is just that she will likely not view it as a substantive metaphysical thesis about chance, but as a substantive methodological thesis about how we should, in doing science, theorize about chance.) (Arntzenius and Hall 2003: 178)

Even quite exotic views about chance respect the stable trial principle. Consider the view that robust objective chances are unnecessary, and can be replaced entirely by credences with certain formal properties. This view is most closely associated with de Finetti’s famous proof (1964) that exchangeable credences (those invariant under permutations of order) about the outcomes of many individual trials will behave as if there were a real unknown chance guiding the agent’s overall credence in the pattern of those outcomes. This theory of ‘chance’, minimal as it is, nevertheless seems to respect something like the stable trial principle, because exchangeable credences demand the same credences in cases that are (credal) duplicates (Skyrms, 1980: 158–60). (Though see Howson and Urbach (1993: 349–51) for some doubts about whether de Finetti’s argument really makes genuine chance dispensable.) And certainly views of chance according to which chance is a real objective phenomenon, grounded in the chance setup of a mass phenomenon, will endorse the stable trial principle. Indeed, many such views will even endorse stronger principles, such as that the chances supervene on the physical properties of the trial device alone, or that the chance depends on some particular dispositional property of the chance setup (as in propensity theories). But the controversies that surround propensities (Eagle 2004), and envelop these stronger claims, don’t significantly undermine the original stable trial principle, and the original intuition that the underlying physical process that generates a chancy outcome is of primary importance for grounding the value of a chance, under a given set of laws.
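The behaviour of exchangeable credences can be illustrated with a small sketch. It shows only the easy direction of the connection: credences built as a weighted mixture of independent-toss hypotheses are exchangeable and respond to observed frequencies. De Finetti's representation theorem is the converse claim about exchangeable credences over indefinitely extendable sequences, and nothing below proves it. The function names and the two-hypothesis mixture are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import permutations

def mixture_credence(sequence, weights):
    """Credence in a particular sequence of 0/1 outcomes, for an agent whose
    credence is a weighted mixture of independent-toss hypotheses.
    `weights` maps each hypothesised bias towards 1 to its weight."""
    ones = sum(sequence)
    zeros = len(sequence) - ones
    return sum(w * bias**ones * (1 - bias)**zeros for bias, w in weights.items())

def predictive_credence(sequence, weights):
    """Credence that the next outcome is 1, conditional on `sequence`."""
    extended = tuple(sequence) + (1,)
    return mixture_credence(extended, weights) / mixture_credence(sequence, weights)

# Hypothetical mixture: two bias hypotheses, equally weighted.
weights = {Fraction(1, 4): Fraction(1, 2), Fraction(3, 4): Fraction(1, 2)}

# Exchangeability: every reordering of the same outcomes gets the same credence.
orderings = set(permutations((1, 1, 0)))
print({mixture_credence(order, weights) for order in orderings})

# The agent's credence in the next outcome tracks the observed pattern,
# as if an unknown chance were being learned.
print(predictive_credence((), weights))
print(predictive_credence((1, 1, 1, 1), weights))
```

Because the credence in a sequence depends only on how many outcomes of each kind it contains, trials that differ only in their position in the sequence are treated alike, which is the credal analogue of the stable trial principle mentioned above.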