# Epistemic Utility Arguments for Probabilism

*First published Fri Sep 23, 2011; substantive revision Thu Dec 17, 2015*

Our beliefs come in degrees; we believe some more strongly than
others. For instance, I believe that the sun will rise tomorrow more
strongly than I believe that it will rise every morning for the coming
week; and I believe both of these propositions much more strongly than
I believe that there will be an earthquake tomorrow in Bristol. We
call the strength or the degree of our belief in a proposition
our *credence* in that proposition. Suppose I know that a die
is to be rolled, and I believe that it will land on six more strongly
than I believe that it will land on an even number. In this case, we
would say that there is something wrong with my credences, for if it
lands on six, it lands on an even number, and I ought not to believe a
proposition more strongly than I believe any of its logical
consequences. This is a consequence of a popular doctrine in the
epistemology of credences called **Probabilism**, which
says that our credences at a given time ought to satisfy the axioms of
the probability calculus (given in detail below). Since this says
something about how our credences *ought to be* rather than how
they in fact *are*, we call this an epistemic norm.

In this entry, we explore a particular strategy that we might
deploy when we wish to establish an epistemic norm such
as **Probabilism**. It is called *epistemic utility
theory*, or sometimes *cognitive decision
theory*, *epistemic decision theory*, or
even *accuracy-first* or *accuracy-centered
epistemology*. I will use the first of these names. Epistemic utility theory is
inspired by traditional utility theory, so let’s begin with a
quick summary of that.

Traditional utility theory (also known as decision theory, see
entry on normative
theories of rational choice: expected utility) explores a
particular strategy for establishing the norms that govern which
actions it is rational for us to perform in a given situation. Given a
particular situation, the framework for the theory includes *states
of the world* that are relevant to the situation, *actions*
that are available to the agent in the situation, and the
agent’s *utility function*, which takes a state of the
world and an action and returns a measure of the extent to which she
values the outcome of performing that action at that world. We call
this measure the *utility* of the outcome at the world. For
example, there might be just two relevant states of the world: one in
which it rains and one in which it does not. And there might be just
two relevant actions from which to choose: take an umbrella when you
leave the house or don’t. Then your utility function will
measure how much you value the outcomes of each action at each state
of the world: that is, it will give the value of being in the rain
without an umbrella, being in the rain with an umbrella, being with an
umbrella when there is no rain, and being without an umbrella when
there is no rain. With this framework in hand, we can state certain
very general norms of action in terms of it. For instance, we might
say that an agent ought not to perform an action if there is some
other action that has greater utility than it at every possible state
of the world. This norm is called **Naive Dominance**. We
will have a lot to say about it in section
5.1 below.
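The umbrella example and **Naive Dominance** can be sketched in a few lines of Python. This is only an illustration: the utility numbers are invented, and the third action (`carry_broken_umbrella`) is a hypothetical addition, not part of the example above, included so that the norm has something to rule out.

```python
# States of the world and available actions, following the umbrella example.
STATES = ["rain", "no_rain"]
ACTIONS = ["take_umbrella", "leave_umbrella", "carry_broken_umbrella"]

# Hypothetical utilities (illustrative numbers only, not from the text).
UTILITY = {
    ("take_umbrella", "rain"): 5,
    ("take_umbrella", "no_rain"): 3,
    ("leave_umbrella", "rain"): 0,
    ("leave_umbrella", "no_rain"): 4,
    ("carry_broken_umbrella", "rain"): -1,
    ("carry_broken_umbrella", "no_rain"): 2,
}

def dominates(a, b):
    """True if action a has strictly greater utility than b at every state."""
    return all(UTILITY[(a, s)] > UTILITY[(b, s)] for s in STATES)

def naively_dominated(actions):
    """Actions ruled out by Naive Dominance: some alternative beats them at every state."""
    return [b for b in actions if any(dominates(a, b) for a in actions if a != b)]

print(naively_dominated(ACTIONS))  # ['carry_broken_umbrella']
```

With these numbers, neither taking nor leaving the umbrella dominates the other, so the norm is silent between them; it rules out only the option that is worse than some alternative at every state.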

In epistemic utility theory, the states of the world remain the
same, but the possible actions an agent might perform are replaced by
the possible *epistemic states* she might adopt, and the
utility function is replaced, for each agent, by an *epistemic
utility function*, which takes a state of the world and a possible
epistemic state and returns a measure of the purely epistemic value
that the agent attaches to being in that epistemic state at that state
of the world. So, in epistemic utility theory, we appeal to epistemic
utility to ask which of a range of possible epistemic states it is
rational to adopt, just as in traditional utility theory we appeal to
non-epistemic, pragmatic utility to ask which of a range of possible
actions it is rational to perform. In fact, we will often talk of
epistemic *dis*utility rather than epistemic utility in this
entry. But it is easy to translate between them. If \(\mathfrak{EU}\)
is an epistemic utility function, then \(-\mathfrak{EU}\) is an
epistemic disutility function, and *vice versa*.

Again, certain very general norms may be stated, such as the
obvious analogue of **Naive Dominance** from above. Thus,
before the die is rolled, we might ask whether I should adopt an
epistemic state in which I believe that the die will land on six more
strongly than I believe that it will land on an even number. And we
might be able to show that I shouldn’t because there is some
other epistemic state I could adopt instead that will have greater
epistemic utility however the world turns out. In this case, we appeal
to the epistemic version of **Naive Dominance** to show
what is wrong with my credences. This is an example of how epistemic
utility theory might come to justify **Probabilism**. As
we will see, arguments just like this have indeed been given. In this
entry, we explore these arguments.

- 1. Modelling Epistemic States
- 2. The Form of Arguments in Epistemic Utility Theory
- 3. The Epistemic Norm of Probabilism
- 4. Calibration Arguments
- 5. Accuracy Arguments
- 6. Epistemic disutility arguments
- 7. Related issues
- Bibliography
- Academic Tools
- Other Internet Resources
- Related Entries

## 1. Modelling Epistemic States

In formal epistemology, epistemic states are modelled in many different ways (see entry on formal representations of belief). Given an epistemic agent and a time \(t\), we might model her epistemic state at \(t\) using any of the following:

- the set of propositions she believes at \(t\);
- the set of propositions she believes at \(t\) together with an *entrenchment ordering*, which specifies the order in which she is prepared to abandon her beliefs in the light of conflicting evidence;
- her *credence function* at \(t\), which takes each proposition about which she has an opinion and returns her credence in that proposition at \(t\);
- a set of credence functions, each of which is a precisification of her otherwise vague or imprecise or indeterminate credences at \(t\);
- her upper and lower probability functions at \(t\);
- and so on.

Epistemic utility theory may be applied to any one of these ways of modelling epistemic states. Whichever we choose, we define an epistemic disutility function to be a function that takes an epistemic state modelled in this way, together with a state of the world, to a non-negative real number or the number \(\infty\), and we take this number to measure the epistemic disutility of having that epistemic state at that world.

However, the vast majority of work carried out so far in epistemic
utility theory has taken an agent’s epistemic state at time
\(t\) to be modelled by her credence function at \(t\). And, in any
case, the epistemic norm of **Probabilism** that
interests us here governs agents modelled in this way. Thus, we focus
on this case. In section 7, we will consider how
the argument strategy employed here to
justify **Probabilism** for agents with precise credences
might be employed to establish other norms either for agents also
represented as having precise credences or for agents represented in
other ways.

So, henceforth, we model an agent’s epistemic state at \(t\) by her credence function at \(t\). We now make more precise what this means. We assume that the set of propositions about which an agent has an opinion is finite and forms an algebra \(\mathcal{F}\). That is:

- It contains a contradictory proposition (\(\bot\)). This is a proposition that is false at all worlds.
- It contains a tautologous proposition (\(\top\)). This is a proposition that is true at all worlds.
- It is closed under disjunction, conjunction, and negation. That is, if \(A\) and \(B\) are in \(\mathcal{F}\), then \(A \vee B\), \(A\ \&\ B\), \(\neg A\), and \(\neg B\) are also in \(\mathcal{F}\).

We then assume that our agent’s credence in a proposition in
\(\mathcal{F}\) can be measured by a real number between 0 and 1
inclusive, where 0 represents minimal credence, and 1 represents
maximal credence. Then her credence function at \(t\) is a
function *c* from \(\mathcal{F}\) to the closed unit interval
\([0, 1]\). If \(A\) is in \(\mathcal{F}\), then \(c(A)\) is our
agent’s credence in \(A\) at \(t\). Throughout, we denote by
\(\mathcal{C_F}\) the set of possible credence functions defined on
\(\mathcal{F}\). There is no principled reason for restricting to the
case in which \(\mathcal{F}\) is finite. We do it here only because
the majority of work on this problem has been carried out under this
assumption. It is an interesting question how the results here might
be extended to the case in which \(\mathcal{F}\) is infinite, but we
will not explore it here (again, see section 7).

So, an epistemic utility function for credences takes a credence function, together with a way the world might be, and returns a measure of the epistemic utility of having that credence function if the world were that way.

## 2. The Form of Arguments in Epistemic Utility Theory

In epistemic utility theory, we attempt to justify an epistemic
norm **N** using the following two ingredients:

- **Q** A norm of standard utility theory (or decision theory), which is to be applied, using epistemic utility functions, to discover which credence functions it is rational for an agent to adopt in a given situation.
- **E** A set of conditions that a legitimate measure of epistemic utility must satisfy.

Typically, the inference from **Q**
and **E** to **N** appeals to a mathematical
theorem, which shows that, applied to any epistemic utility function
that satisfies the conditions **E**, the
norm **Q** entails the norm **N**.

Given that the existing arguments of epistemic utility theory share this common form, we might organize these arguments by the norms they attempt to justify, or by the norms of standard utility theory they employ, or by the set of constraints on epistemic utility functions they impose. We will take the last of these courses in this survey.

In sections 4 and 5, we
identify a specific epistemic goal and treat
epistemic *dis*utility functions as measures of the distance of
an epistemic state from that goal in a given situation; we lay down
conditions that it is claimed all such measures must
satisfy. In section 6, we take an alternative
route: we lay down putative general conditions on any
epistemic *dis*utility function, which it is claimed such a
function must satisfy regardless of whether or not it is a measure of
distance from a specified epistemic goal. In the next section, we
state **Probabilism** precisely, so that we can refer
back to it later.

## 3. The Epistemic Norm of Probabilism

**Probabilism** is often said to be a coherence
constraint on credence functions, which would mean that it governs how
an agent’s credences in some propositions should relate to her
credences in other, related propositions. It is often likened to the
consistency constraint on sets of full beliefs. In fact, this
isn’t quite right. Condition (ii) below is certainly a coherence
constraint, but condition (i) is not.

**Probabilism** A rational agent’s credence
function \(c\) at a given time is a probability function. That is:

- (i) \(c(\bot) = 0\) and \(c(\top) = 1\).
- (ii) \(c(A \vee B) = c(A) + c(B)\), for all mutually exclusive \(A\) and \(B\) in \(\mathcal{F}\).

Note that any agent who satisfies **Probabilism** must
be *logically omniscient*: that is, she must be certain of
every tautology. Some other consequences
of **Probabilism**:

- \(c(A) \leq c(A \vee B)\) for any \(A\), \(B\) in \(\mathcal{F}\).
- \(c(A\ \&\ B) \leq c(A)\) for any \(A\), \(B\) in \(\mathcal{F}\).
- \(c(A) = c(B)\) if \(A\) and \(B\) are logically equivalent.
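For a finite algebra, the two conditions of **Probabilism** can be checked mechanically. The following is a minimal sketch, assuming that propositions are modelled as sets of possible die outcomes and a credence function as a Python dictionary; the names `ALGEBRA`, `uniform`, and `incoherent` are introduced here only for illustration. It revisits the die example from the introduction.

```python
from fractions import Fraction
from itertools import combinations

WORLDS = range(1, 7)  # the six possible outcomes of one die roll

def powerset(worlds):
    """All subsets of the set of worlds; each subset plays the role of a proposition."""
    ws = list(worlds)
    return [frozenset(c) for r in range(len(ws) + 1) for c in combinations(ws, r)]

ALGEBRA = powerset(WORLDS)

def is_probability_function(c):
    """Check the two conditions of Probabilism for a credence dict c defined on ALGEBRA."""
    bottom, top = frozenset(), frozenset(WORLDS)
    if c[bottom] != 0 or c[top] != 1:
        return False
    # Additivity for every pair of mutually exclusive propositions.
    return all(
        c[a | b] == c[a] + c[b]
        for a in ALGEBRA
        for b in ALGEBRA
        if not (a & b)
    )

# A credence function that treats all six outcomes alike.
uniform = {a: Fraction(len(a), 6) for a in ALGEBRA}

# The problematic credences: more confident in "six" than in "even".
six, even = frozenset({6}), frozenset({2, 4, 6})
incoherent = dict(uniform)
incoherent[six] = Fraction(2, 3)

print(is_probability_function(uniform))     # True
print(is_probability_function(incoherent))  # False
print(incoherent[six] > incoherent[even])   # True
```

Exact fractions are used rather than floating-point numbers so that the additivity check is not disturbed by rounding.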

## 4. Calibration Arguments

In this section, we consider the conditions imposed on an epistemic
disutility function when we treat it as a measure of the distance of
an epistemic state from the goal of being *actually*
or *hypothetically calibrated* (van Fraassen 1983; Lange 1999;
Shimony 1988). We say that a credence function is actually calibrated
at a particular possible world if the credence it assigns to a
proposition matches the relative frequency with which propositions of
that kind are true at that world. Thus, credence 0.2 in proposition
\(A\) is actually calibrated if one-fifth of propositions like \(A\)
are actually true. And we say that a credence function is
hypothetically calibrated if the credence it assigns to a proposition
matches the limiting relative frequency with which propositions of
that kind *would* be true *were* there more propositions
of that kind. Thus, credence 0.2 in proposition \(A\) is
hypothetically calibrated if, as we move to worlds with more and more
propositions like \(A\), the proportion of such propositions that are
true approaches one-fifth in the limit. According to the calibration
arguments, matching the relative frequencies or limiting relative
frequencies is an epistemic goal. And they attempt to
justify **Probabilism** by appealing to this goal and
measures of distance from it.

### 4.1 Calibration measures

First, we must make precise what we mean by actual and hypothetical
calibration; then we can say which functions will count as measuring
distance from these putative goals. We treat actual calibration
first. Since we are talking of relative frequencies, we will need to
assign to each proposition in \(\mathcal{F}\) its *reference
class*: that is, the set of propositions that are relevantly
similar to it. Thus, we require an equivalence relation \(\sim\) on
\(\mathcal{F}\), where \(A \sim B\) iff \(A\) and \(B\) are relevantly
similar. For instance, if our algebra of propositions
contains *Heads on first toss of coin*, *Heads on second
toss of coin*, and *Six on first roll of die*, we might
plausibly say that the first two are relevantly similar, but neither
first nor second is relevantly similar to the third. Proponents of
calibration arguments do not claim to give an account of how the
equivalence relation is determined. Nor do they claim that there is a
single, objectively correct equivalence relation on a given algebra of
propositions: this is the notorious *problem of the reference
class* that haunts frequentist interpretations of objective
probability. Rather they treat the equivalence relation as a component
of the agent’s epistemic state, along with her credence
function. Indeed, for van Fraassen, it is determined entirely by the
credence function together with the form of the propositions in
\(\mathcal{F}\) (van Fraassen 1983: 299). However, they do impose some
rational constraints on \(\sim\) in order to establish their
conclusion. We will not discuss these conditions in any detail. Rather
we denote them \(C(\sim)\), and keep in mind that this is a
placeholder for a full account of conditions on \(\sim\). Detailed
accounts of these conditions have been given by van Fraassen (1983) and
Shimony (1988). We say that a credence function \(c\), together with
an equivalence relation \(\sim\), is perfectly calibrated or not
relative to a way the world might be. We are now ready to give our
first definitions; but we preface these with an example.

Suppose a coin is to be flipped 1000 times. And suppose that \(A\)
is the proposition *Heads on toss 1*. And suppose that the
propositions that are relevantly similar to \(A\) in algebra
\(\mathcal{F}\) are: *Heads on toss 1*, …*Heads on
toss 1000*. Finally, suppose that \(w\) is a possible world; a way
that the world might be. In fact, throughout this article, we need not
quantify over genuine possible worlds, which are maximally specific
ways the world might be; we need only quantify over ways the world
might be that are specific enough to assign truth values to each of
the propositions in the algebra \(\mathcal{F}\). Let’s call
these *possible worlds relative to \(\mathcal{F}\)* and let
\(\mathcal{W_F}\) be the set of them for a given algebra
\(\mathcal{F}\). Then the *relative frequency of \(A\) at
\(w\)* (written \(\mathrm{Freq}(\mathcal{F}, A, \sim, w)\)) is the
proportion of the propositions relevantly similar to \(A\) that are
true at \(w\): that is, the frequency of heads amongst the 1000 coin
tosses at that world. For instance, if every second toss lands heads
at \(w\), or if the first five hundred land heads and the rest land
tails at \(w\), then \(\mathrm{Freq}(\mathcal{F}, A, \sim, w) =
\frac{1}{2}\). If every third toss lands heads at \(w\) (tosses 3, 6, …, 999), then
\(\mathrm{Freq}(\mathcal{F}, A, \sim, w) = \frac{333}{1000}\), which is close to, but not exactly, \(\frac{1}{3}\). And so
on.

Now we give the definition in full generality. Suppose \(\sim\) is an equivalence relation on \(\mathcal{F}\), and \(w\) is a possible world relative to \(\mathcal{F}\). Then:

- For each \(A\) in \(\mathcal{F}\), the *relative frequency of truths amongst propositions like \(A\)* is defined as follows: \[\mathrm{Freq}(\mathcal{F}, A, \sim, w) := \frac{|\{ X \in \mathcal{F} : X \sim A\ \&\ v_w(X) = 1\}|}{|\{X \in \mathcal{F} : X \sim A\}|}\] where \(|X|\) is the cardinality of the set \(X\) and \(v_w\) is the standard numerical truth value assignment at that world, so that \(v_w(X) = 1\) if \(X\) is true at \(w\) and \(v_w(X) = 0\) if \(X\) is false at \(w\) (we call \(v_w\) the *omniscient credence function at \(w\)*). Thus, \(\mathrm{Freq}(\mathcal{F}, A, \sim, w)\) is the proportion of true propositions amongst all propositions in \(\mathcal{F}\) that are relevantly similar to the proposition \(A\).
- Relative to \(\sim\), the credence \(r\) in proposition \(A\) is *actually calibrated* at \(w\) if \(r = \mathrm{Freq}(\mathcal{F}, A, \sim, w)\).

The idea is that, if \(\sim\) satisfies constraints \(C(\sim)\), then the function \(\mathrm{Freq}(\mathcal{F}, \cdot, \sim, w)\) is always a probability function on \(\mathcal{F}\).

It is clear from this definition that the calibration arguments will work only for finite algebras \(\mathcal{F}\). For an infinite algebra, the definition just given will often make no sense, since the cardinalities of the two sets involved in the ratio will often be infinite.
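The definition of relative frequency can be illustrated with the coin example. In this sketch, which is only one way of modelling the situation, the reference class of *Heads on toss 1* is represented by the toss numbers 1 to 1000, and a world is represented by a function standing in for \(v_w\) on those propositions; the helper name `freq` and the three sample worlds are assumptions made for the example.

```python
from fractions import Fraction

def freq(reference_class, truth_value):
    """Proportion of propositions in the reference class that are true at a world.

    reference_class: labels of the propositions relevantly similar to A.
    truth_value: stands in for v_w, returning 1 for true and 0 for false.
    """
    true_count = sum(truth_value(x) for x in reference_class)
    return Fraction(true_count, len(reference_class))

tosses = list(range(1, 1001))  # "Heads on toss n" for n = 1, ..., 1000

def every_second(n):
    return 1 if n % 2 == 0 else 0

def first_half(n):
    return 1 if n <= 500 else 0

def only_first(n):
    return 1 if n == 1 else 0

print(freq(tosses, every_second))  # 1/2
print(freq(tosses, first_half))    # 1/2
print(freq(tosses, only_first))    # 1/1000
```

Because the result is a ratio of two finite counts, it is always a rational number, a point that matters for the arguments in section 4.2.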

Next, we treat hypothetical calibration. For this, we need the notion of the limiting relative frequency of truths amongst propositions of a certain sort. The idea is that, for each proposition \(A\) in \(\mathcal{F}\), there is not just a fact of the matter about what the frequency of truths amongst propositions like \(A\) actually is; there is also a fact of the matter about what the frequency of truths amongst propositions like \(A\) would be if there were more propositions like \(A\). For instance, there is not just a fact of the matter about how many actual tosses of a given coin will land heads; there is also a fact of the matter about the frequency of heads amongst hypothetical further tosses of the same coin. In general, suppose we have a possible world \(w\), an extension \(\mathcal{F}'\) of \(\mathcal{F}\) (containing new propositions like \(A\)), and an extension \(\sim'\) of \(\sim\) to cover the new propositions in \(\mathcal{F}'\). Then there is a single unique number \(\mathrm{Freq}(\mathcal{F}', A, \sim', w)\) that gives what the relative frequency of truths amongst propositions like \(A\) would be were there all the propositions in \(\mathcal{F}'\) and where the relation of similarity amongst them is given by \(\sim'\), where this counterfactual is evaluated at the world \(w\). Again, let us illustrate this using our example of the coin toss from above.

Suppose again that \(A\) is the proposition *Heads on toss
1* and that the propositions in \(\mathcal{F}\) that are
relevantly similar to \(A\) according to \(\sim\) are *Heads on
toss 1*, …, *Heads on toss 1000*. Now suppose that
\(\mathcal{F}_1\) extends \(\mathcal{F}\) by introducing a new
proposition about a further hypothetical toss of the coin (as well as
perhaps other propositions). That is, it introduces *Heads on toss
1001* (and closes out under negation, disjunction, and
conjunction). And suppose that \(\sim_1\) extends \(\sim\), so that
the new proposition *Heads on toss 1001* is considered
relevantly similar to each *Heads on toss 1*,
…, *Heads on toss 1000*. Then those who appeal to
hypothetical limiting frequencies must claim that there is a unique
number that gives what the frequency of heads would be, were the coin
tossed 1001 times. They denote this number
\(\mathrm{Freq}(\mathcal{F}_1, A, \sim_1, w)\). Now suppose that
\(\mathcal{F}_2\) extends \(\mathcal{F}_1\) by adding the new
proposition *Heads on toss 1002* and \(\sim_2\) extends
\(\sim_1\), so that the new proposition *Heads on toss 1002* is
considered relevantly similar to each *Heads on toss 1*,
…, *Heads on toss 1001*. And so on. Then the limiting
relative frequency of \(A\) at \(w\) (written
\(\mathrm{LimFreq}(\mathcal{F}, A, \sim, w)\)) is the number towards
which the following sequence tends:
\[\mathrm{Freq}(\mathcal{F}, A, \sim, w), \mathrm{Freq}(\mathcal{F}_1,
A, \sim_1, w), \mathrm{Freq}(\mathcal{F}_2, A, \sim_2, w),
\ldots\]

In general, for each algebra \(\mathcal{F}\) and equivalence relation \(\sim\), there is an infinite sequence \[(\mathcal{F}, \sim) = (\mathcal{F}_0, \sim_0), (\mathcal{F}_1, \sim_1), (\mathcal{F}_2, \sim_2), \ldots\] of pairs of algebras \(\mathcal{F}_i\) and equivalence relations \(\sim_i\) such that each \(\mathcal{F}_{i+1}\) is an extension of \(\mathcal{F}_i\) and each \(\sim_{i+1}\) is an extension of \(\sim_i\) and, for each \(i\), \(C(\sim_i)\). Using this, we can define the notion of limiting relative frequency and the associated notion of hypothetical calibration in full generality. Suppose \(\sim\) is an equivalence relation on \(\mathcal{F}\) and \(w\) is a possible world. And suppose \[(\mathcal{F}_0, \sim_0), (\mathcal{F}_1, \sim_1), (\mathcal{F}_2, \sim_2), \ldots\] is the sequence just mentioned. Then:

- For each \(A\) in \(\mathcal{F}\), the *limiting relative frequency of truths amongst propositions like \(A\)* is \[\mathrm{LimFreq}(\mathcal{F}, A, \sim, w) := \lim_{n \rightarrow \infty} \mathrm{Freq}(\mathcal{F}_n, A, \sim_n, w)\] That is, the limiting relative frequency of \(A\) is the number approached arbitrarily closely by the hypothetical relative frequencies of truths as we extend the algebra \(\mathcal{F}\) to include more and more propositions like \(A\).
- Relative to \(\sim\), the credence \(r\) in proposition \(A\) is *hypothetically calibrated* at \(w\) if \[r = \mathrm{LimFreq}(\mathcal{F}, A, \sim, w)\]

According to some calibration arguments, actual calibration is an
epistemic goal; according to others, hypothetical calibration is the
goal. Whichever it is, the epistemic disutility of a credence ought to
be given by its distance from this epistemic goal. We say that an
epistemic disutility function is *local* if it measures only
the epistemic disutility of an individual credence at a world; we say
that it is *global* if it measures the epistemic disutility of
an entire credence function at a world. In this section, we will be
concerned only with local epistemic disutility functions. In
sections 5 and 6, we
will be concerned instead with global epistemic disutility
functions.

The goals of actual calibration and hypothetical calibration give rise to the following definitions of two sorts of local epistemic disutility function:

- An *actual calibration measure* is a function of the form \[\mathfrak{c}(r, A, \mathcal{F}, \sim, w) = f(|\mathrm{Freq}(\mathcal{F}, A, \sim, w) - r|)\] where \(f : [0, 1] \rightarrow \mathbb{R}\) is a strictly increasing continuous function with \(f(0) = 0\). Let **Actual Calibration** be the claim that \(\mathfrak{c}\) is the measure of epistemic disutility.
- A *hypothetical calibration measure* is a function of the form \[\mathfrak{hc}(r, A, \mathcal{F}, \sim, w) = f(|\mathrm{LimFreq}(\mathcal{F}, A, \sim, w) - r|)\] where again \(f : [0, 1] \rightarrow \mathbb{R}\) is a strictly increasing continuous function with \(f(0) = 0\). Let **Hypothetical Calibration** be the claim that \(\mathfrak{hc}\) is the measure of epistemic disutility.
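As a concrete instance, here is a small sketch of an actual calibration measure. The definition leaves \(f\) open; squaring is used below purely as an illustrative choice, and the relative frequency is passed in directly rather than computed from an algebra and a world.

```python
from fractions import Fraction

def actual_calibration_measure(r, frequency, f=lambda d: d ** 2):
    """Return f(|Freq - r|), the local disutility of credence r.

    frequency: the relative frequency of truths in the reference class at the world.
    f: defaults to squaring, an assumption made for this example; any strictly
       increasing continuous f on [0, 1] with f(0) = 0 fits the definition.
    """
    return f(abs(frequency - r))

freq_heads = Fraction(1, 5)  # one-fifth of the relevantly similar propositions are true

print(actual_calibration_measure(Fraction(1, 5), freq_heads))  # 0
print(actual_calibration_measure(Fraction(1, 2), freq_heads))  # 9/100
```

A hypothetical calibration measure has the same shape, with the limiting relative frequency supplied in place of the actual one.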

Our next task is to identify the norms of standard decision
theory/utility theory that are deployed in conjunction with this
characterization to derive **Probabilism**.

### 4.2 Calibration arguments for Probabilism

In this section, we consider the two accounts of epistemic
disutility for credences given in the previous section and we combine
them with decision-theoretic norms to derive epistemic norms. When we
state the decision-theoretic norms in question, we state them in full
generality. In practical decision theory, we evaluate acts: it is acts
that have practical disutilities at worlds. In epistemic decision
theory, on the other hand, we evaluate credence functions: it is
credence functions that have epistemic disutilities at worlds. And in
another context still, we might wish to use decision theory to
evaluate some other sort of thing, such as a scientific theory (Maher
1993). So we want to state the decision-theoretic norms in a way that
is neutral between these. We will talk of *options* as the
things that are being evaluated and that have utilities at
worlds. Options can thus be acts or credence functions or scientific
theories or some other sort of thing.

Here’s our first putative norm of standard decision theory (van Fraassen 1983: 297):

**Possibility of Vindication** A rational agent will
not adopt an option that has no possibility of attaining minimal
disutility, when such a minimum exists.

Here it is a little more formally: Suppose \(\mathcal{O}\) is a set of options, \(\mathcal{W}\) is the set of possible worlds, and \(\mathfrak{U}\) is a disutility function. Then, if \(o^*\) is an option, and there is no \(w^*\) in \(\mathcal{W}\) such that \[\mathfrak{U}(o^*, w^*) = \min \{\mathfrak{U}(o, w) : o \in \mathcal{O}\ \&\ w \in \mathcal{W}\}\] (when this minimum exists), then \(o^*\) is irrational.
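For finite sets of options and worlds, the formal statement translates directly into code. The sketch below is a toy setup and not drawn from the literature: the options are four candidate credences in a single proposition, the reference class has four coin tosses, each world is summarised by its number of heads, and disutility is distance from the actual relative frequency.

```python
from fractions import Fraction

def vindicable(option, options, worlds, disutility):
    """True if `option` attains the overall minimum disutility at some world."""
    overall_min = min(disutility(o, w) for o in options for w in worlds)
    return any(disutility(option, w) == overall_min for w in worlds)

# Toy setup (an assumption for illustration): 4 coin tosses, with each
# "world" summarised by its number of heads, 0 through 4.
worlds = range(5)
options = [Fraction(0), Fraction(1, 4), Fraction(1, 3), Fraction(1, 2)]

def disutility(credence, heads):
    # Actual calibration measure with f taken to be the identity.
    return abs(Fraction(heads, 4) - credence)

ruled_out = [r for r in options if not vindicable(r, options, worlds, disutility)]
print(ruled_out)  # [Fraction(1, 3)]
```

Credence 1/3 is ruled out because no number of heads among four tosses yields a relative frequency of exactly one-third, whereas each of the other credences matches the frequency at some world.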

It can be shown that, together with **Actual
Calibration** from the previous section and suitable
constraints \(C(\sim)\) on the equivalence relation \(\sim\), this
norm entails something stronger than **Probabilism**. It
entails:

**Rational-valued Probabilism** At any time \(t\), a
rational agent’s credence function \(c\) is a probability
function *that takes only values in* \(\mathbb{Q}\) (where
\(\mathbb{Q}\) is the set of rational numbers).

This is a consequence of the following theorem:

**Theorem 1** Suppose \(\mathfrak{c}\) is an actual
calibration measure and suppose \(C(\sim)\). Then the following are
equivalent:

- \(c\) is a probability function on \(\mathcal{F}\) that takes only values in \(\mathbb{Q}\);
- There is a world at which \(c\) is actually calibrated. That is, there is a world \(w\) in \(\mathcal{W}\) such that, for all \(A\) in \(\mathcal{F}\), \(\mathfrak{c}(c(A), A, \mathcal{F}, \sim, w) = 0\).

Different versions of this theorem result from different
constraints \(C(\sim)\) on the equivalence relation \(\sim\) (van Fraassen
1983; Shimony 1988), but the result is not surprising. An agent will
satisfy **Possibility of Vindication** just in case her
credences match the relative frequencies at some world. And those
relative frequencies will satisfy the probability axioms if
\(C(\sim)\) and if we have specified that condition correctly. That
they will be rational numbers follows from the definition of the
relative frequency of a proposition at a world.

Thus, we have the following argument:

**Actual Calibration argument for Rational-valued
Probabilism**

- \((1)\) Actual Calibration
- \((2)\) Possibility of Vindication
- \((3)\) Theorem 1
- Therefore,
- \((4)\) Rational-valued Probabilism

Most proponents of the calibration argument are reluctant to accept
a norm that rules out every credence given by an irrational number. To
establish the weaker norm of **Probabilism**, there are
two strategies they might adopt. The first is to appeal to the
epistemic goal of hypothetical calibration instead of actual
calibration. This, together with **Possibility of
Vindication** gives us **Probabilism** via the
following theorem:

**Theorem 2** Suppose \(C(\sim)\). Then the following
are equivalent:

- \(c\) is a probability function on \(\mathcal{F}\).
- There is a world at which \(c\) is hypothetically calibrated. That is, there is a world \(w\) in \(\mathcal{W}\) such that, for all \(A\) in \(\mathcal{F}\), \(\mathfrak{hc}(c(A), A, \mathcal{F}, \sim, w) = 0\).

The reason is that, while relative frequencies are always rational numbers, the limit of an infinite sequence of rational numbers may be an irrational number. And, in fact, for any irrational number, there is a sequence of rational numbers that approaches it in the limit (indeed, there are infinitely many such sequences).
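The point about limits can be made concrete. The sketch below constructs a hypothetical pattern of coin tosses whose relative frequency of heads after \(n\) tosses is always a rational number but approaches \(1/\sqrt{2}\) as \(n\) grows. The construction (toss \(n\) lands heads just in case \(\lfloor np \rfloor\) exceeds \(\lfloor (n-1)p \rfloor\)) is a standard trick introduced here for illustration, and the target is held as a floating-point approximation of the irrational number.

```python
from fractions import Fraction
from math import floor, sqrt

TARGET = 1 / sqrt(2)  # floating-point stand-in for an irrational credence

def lands_heads(n):
    """Hypothetical toss pattern: toss n is heads iff floor(n*p) > floor((n-1)*p)."""
    return floor(n * TARGET) > floor((n - 1) * TARGET)

def freq_after(n):
    """Relative frequency of heads amongst the first n tosses, as an exact fraction."""
    heads = sum(lands_heads(k) for k in range(1, n + 1))
    return Fraction(heads, n)

for n in (10, 100, 1000, 10000):
    f = freq_after(n)
    # The gap between the rational frequency and the target stays below 1/n.
    print(n, f, abs(float(f) - TARGET))
```

The number of heads among the first \(n\) tosses is \(\lfloor np \rfloor\), so each frequency differs from the target by less than \(1/n\), and the sequence of rational frequencies converges to the target.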

Thus, we have the following argument:

**Hypothetical Calibration argument for
Probabilism**

- \((1)\) Hypothetical Calibration
- \((2)\) Possibility of Vindication
- \((3)\) Theorem 2
- Therefore,
- \((4)\) Probabilism

An alternative route to **Probabilism** changes the
decision-theoretic norm to which we appeal, rather than the sort of
calibration from which we wish our epistemic disutility function to
measure distance. The alternative norm is:

**Possibility of Arbitrary Closeness to Vindication.**
An agent ought not to adopt an option unless there are worlds at which
it is arbitrarily close to achieving minimal disutility.

That is: Suppose \(\mathcal{O}\) is a set of options, \(\mathcal{W}\) is the set of possible worlds, and \(\mathfrak{U}\) is a disutility function. Then, if \(o^*\) is an option, and if it is not the case that, for every \(\varepsilon > 0\), there is a possible world \(w^*_\varepsilon\) in \(\mathcal{W}\) such that \[| \mathfrak{U}(o^*, w^*_\varepsilon) - \min \{\mathfrak{U}(o, w) : o \in \mathcal{O}\ \&\ w \in \mathcal{W}\}| < \varepsilon\] (when this minimum exists), then \(o^*\) is irrational.

Together with the characterization of calibration measures given
above, suitable constraints \(C(\sim)\) on the equivalence relation
\(\sim\), and two extra assumptions, this norm does
establish **Probabilism**. The extra assumptions are
these: First, if our agent has a credence function \(c\) in
\(\mathcal{C_F}\), the possible worlds that we are considering include
not only all (consistent) truth assignments to \(\mathcal{F}\), but
also any (consistent) truth assignments to any (finite) algebra
\(\mathcal{F}'\) that extends \(\mathcal{F}\). And, second, given any
such \(\mathcal{F}'\), the equivalence relation \(\sim\) can be
extended in any possible way, providing the extension \(\sim'\) of
\(\sim\) satisfies \(C(\sim')\).

**Theorem 3** Suppose \(C(\sim)\). Then the following
are equivalent:

- \(c\) is a probability function on \(\mathcal{F}\).
- For all \(\varepsilon > 0\), there is a finite extension \(\mathcal{F}'\) of \(\mathcal{F}\) and an extension \(\sim'\) of \(\sim\) that satisfies \(C(\sim')\), and a possible world \(w'\) in \(\mathcal{W}\) such that, for all \(A\) in \(\mathcal{F}\), \(\mathfrak{c}(c(A), A, \mathcal{F}', \sim', w') < \varepsilon\).

Thus, if our agent satisfies **Probabilism**, then
however close she would like to be to actual calibration, there is
some possible world at which she is that close. And conversely.
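A heavily simplified sketch may help to convey why arbitrary closeness is achievable. It considers only a single credence in a single proposition (Theorem 3 itself concerns all propositions in \(\mathcal{F}\) at once), takes \(f\) to be the identity, and assumes that the reference class may be extended to any finite size. Given a tolerance \(\varepsilon\), it picks a reference class of size \(n\) and a number \(k\) of truths so that \(k/n\) lies within \(\varepsilon\) of the credence. The function name and the base size of 1000 are illustrative assumptions.

```python
from fractions import Fraction
from math import ceil, sqrt

def near_calibrating_world(credence, epsilon, base_size=1000):
    """Return (n, k): a reference-class size and a count of truths with |k/n - credence| < epsilon.

    Rounding credence * n to the nearest integer k gives |k/n - credence| <= 1/(2n),
    so any n of at least 1/epsilon suffices.
    """
    n = max(base_size, ceil(1 / epsilon))
    k = round(credence * n)
    return n, k

credence = 1 / sqrt(2)  # floating-point stand-in for an irrational credence
for eps in (1e-2, 1e-4, 1e-6):
    n, k = near_calibrating_world(credence, eps)
    gap = abs(float(Fraction(k, n)) - credence)
    print(eps, n, k, gap < eps)
```

No finite \(n\) brings the gap to exactly zero when the credence is irrational, which is why the weaker norm permits credences that **Possibility of Vindication** together with **Actual Calibration** would rule out.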

Thus, we have the following argument:

**Actual Calibration argument for Probabilism**

- \((1)\) Actual Calibration
- \((2)\) Possibility of Arbitrary Closeness to Vindication
- \((3)\) Theorem 3
- Therefore,
- \((4)\) Probabilism

These are the calibration arguments
for **Probabilism**. In the next section, we consider
objections that may be raised against them.

### 4.3 Objections to calibration arguments for Probabilism

Objection 1: *Calibration is not an epistemic goal.* It may
be objected that neither actual nor hypothetical calibration measures
are *truth-directed* epistemic disutility functions, where this
is taken to be a necessary condition on such a function (Joyce 1998:
595; Seidenfeld 1985). We say that a local epistemic disutility
function—that is, recall, an epistemic disutility function
defined for individual credences—is truth-directed if the
disutility that it assigns to a credence in a true proposition
increases as the credence decreases, and the disutility it assigns to
a credence in a false proposition increases as the credence
increases. Calibration measures do not have this property. To see
this, let us return to our toy example: the propositions *Heads on
toss 1*, …, *Heads on toss 1000* are in
\(\mathcal{F}\) and they are all relevantly similar according to
\(\sim\). Now suppose that the first coin toss lands heads, but all
the others land tails. Then credence 0.001 in *Heads on toss 1*
is actually calibrated, since exactly one out of one-thousand
relevantly similar propositions is true; so it has epistemic
disutility 0. Credence 0.993, on the other hand, is not, and thus
receives a positive epistemic disutility. However, it is a higher
credence in a true proposition, and thus should be assigned a lower
epistemic disutility, according to the requirement of
truth-directedness. One natural response to this objection is that it
is question-begging. Proponents of the calibration argument will
simply reject the claim that an epistemic disutility function must be
truth-directed. Credences, unlike beliefs, they might say, are not in
the business of getting close to the truth; they are in the business
of getting close to being calibrated.

Objection 2: *Limiting relative frequencies are not
well-defined.* To define the limiting relative frequency of \(A\)
at a world \(w\), we require that there is a unique sequence of
extensions of the algebra each of which contains more propositions
that are relevantly similar to \(A\) than the previous extension, and
a corresponding sequence of relative frequencies of truths amongst the
propositions like \(A\) in the corresponding algebra. But the
assumption of such a unique sequence is extremely controversial and
the problems to which it gives rise have haunted hypothetical
frequentism about objective probability (Hájek 2009).

Objection 3: *Neither Possibility of
Vindication nor Possibility of Arbitrary Closeness
to Vindication is a norm.* It might be that the only
actions that give rise to the possibility of vindication or of
arbitrary closeness to vindication also give rise to the possibility
of maximal distance from vindication. And it might be that there are
actions that do not give rise to the possibility of vindication or of
arbitrary closeness to vindication, but do limit the distance from
vindication that is risked by choosing that action. In such cases, it
is not at all clear that it is rationally required of an agent that
she ought to risk maximal distance from vindication in order to leave
open the possibility of vindication or of arbitrary closeness to
vindication. Compare: I have two options. If I choose option 1, I
will receive £0 or £100, but I don’t know which; if I choose
option 2, I will receive £99 for sure. Even before they know the
objective chances of the two possibilities that the first option
creates, many people will opt for the second. However, by doing so,
they rule out the possibility that they will receive the maximum
possible utility, which is obtained by option 1 if it pays out £100. It
seems that ruling out such a possibility is not irrational. To put it
another way: **Possibility of Vindication** and **Possibility of
Arbitrary Closeness to Vindication** are extreme risk-seeking norms.
That is, they suggest that we make our decisions by trying to maximise
the utility we obtain in our best-case scenario. But while it might be
rationally permissible to be so risk-seeking, it is certainly not
mandatory (Easwaran & Fitelson forthcoming: Section 8).

Objection 4: *The constraints on \(\sim\) are
ill-motivated.* This objection will vary with the constraints
\(C(\sim)\) that are imposed on \(\sim\). One uncontroversial
constraint is this: If \(A \sim B\), then \(c(A) = c(B)\). The further
constraints imposed by Shimony (1988) and van Fraassen (1983) are more
controversial (Joyce 1998: 594–6). Moreover, they limit the
application of the result, since they involve assumptions about the
form of the propositions in \(\mathcal{F}\). Thus, the calibration
arguments do not show in general, of any finite algebra
\(\mathcal{F}\), that a credence function on \(\mathcal{F}\) ought to
be a probability function, since not every such algebra will contain
propositions with the form required by the constraints
\(C(\sim)\).

## 5. Accuracy Arguments

In this section, we move from calibration arguments to accuracy
arguments for **Probabilism**. These arguments have the
same structure as the calibration arguments. They consist of a
mathematically-precise account of epistemic disutility and a
decision-theoretic norm. And they derive, from that norm together with
that account of disutility, an epistemic norm. In particular, they
derive **Probabilism**. And that derivation goes via a
mathematical theorem. However, they will use different accounts of
epistemic disutility and different decision-theoretic norms.

In this section, we will begin with the original accuracy-based
argument for **Probabilism** due to James M. Joyce (1998;
see also Rosenkrantz 1981). Then we’ll consider its various
components in turn, and explore the objections they have elicited and
the adjustments that have been made to them.

### 5.1 Joyce’s accuracy argument for Probabilism

Joyce’s argument consists of an account of the epistemic disutility of credences and a decision-theoretic norm. Let’s consider each in turn.

Joyce’s account of the epistemic disutility of credences itself consists of two components. The first identifies epistemic disutility with gradational inaccuracy; the second gives a mathematically-precise account of gradational inaccuracy.

In more detail: The first component of
Joyce’s account of epistemic disutility for credences is the
claim—which we will call **Credal Veritism**,
partly following Goldman (2002: 58)—that the only source of
value for credences that is relevant to their epistemic status is
their *gradational accuracy*, where the gradational accuracy of
a credence in a true proposition is higher when the credence is closer
to 1, which we might think of as the ideal or vindicated credence in a
true proposition, while the gradational accuracy of a credence in a
false proposition is higher when the credence is closer to 0, which we might
think of as the ideal or vindicated credence in a false
proposition. Thus, the only source of *dis*value for credences
is their gradational *in*accuracy.

The second component of Joyce’s
account of epistemic disutility for credences is a set of
mathematically-precise conditions that a measure of the gradational
inaccuracy of a credence function at a given possible world must
satisfy. A putative inaccuracy measure for credence functions over an
algebra \(\mathcal{F}\) is a mathematical function \(\mathfrak{I}\)
that takes a credence function \(c\) in \(\mathcal{C_F}\) and a
possible world \(w\) in \(\mathcal{W_F}\) and returns a number
\(\mathfrak{I}(c, w)\) in \([0, \infty]\) that measures the inaccuracy
of \(c\) at \(w\). (The set \([0, \infty]\) contains all non-negative
real numbers together with \(\infty\).) Here is an example, called
the *Brier score*:
\[\mathfrak{B}(c, w) := \sum_{X \in \mathcal{F}} |v_w(X) - c(X)|^2\]
Thus, the Brier score measures the inaccuracy of a credence function
at a world as follows: it takes each proposition to which the credence
function assigns credences; it takes the difference between the
credence that the credence function assigns to that proposition and
the ideal or vindicated credence in that proposition at that world; it
squares this difference; and it sums up the results. I will not give
all of Joyce’s conditions here, but I will note that the Brier
score just defined satisfies them all. Let us say that any putative
inaccuracy measure \(\mathfrak{I}\) that satisfies these conditions is
a *Joycean inaccuracy measure*. And let **Joycean
Inaccuracy** be the claim that all legitimate inaccuracy
measures are Joycean inaccuracy measures.
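The recipe just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `brier_score`, the dictionary encoding of credence functions and worlds, and the example numbers are all hypothetical and are not part of Joyce’s presentation.

```python
def brier_score(credences, truths):
    """Sum, over every proposition, of the squared gap between the credence
    and the vindicated credence (1 if the proposition is true, 0 if false)."""
    return sum((float(truths[X]) - credences[X]) ** 2 for X in credences)

# Hypothetical credence function over a proposition and its negation.
c = {"Heads": 0.8, "Tails": 0.2}

# The world at which Heads is true and Tails is false.
w_heads = {"Heads": True, "Tails": False}

# (1 - 0.8)^2 + (0 - 0.2)^2, which is 0.08 up to floating-point rounding.
print(brier_score(c, w_heads))
```

Note that the score is 0 exactly when every credence matches the vindicated credence, and grows as credences move away from it.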

Combining **Credal Veritism** and **Joycean
Inaccuracy**, we have the claim that the epistemic disutility
of a credence function at a world is given by its inaccuracy at that
world as measured by a Joycean inaccuracy measure.

Let us turn now to the decision-theoretic norm to which Joyce
appeals. We have met it already above in the introduction to this
article: it is the norm of **Naive Dominance**. We will
state it here precisely:

**Naive Dominance** A rational agent will not adopt an
option when there is another option that has lower disutility at all
worlds.

That is: Suppose \(\mathcal{O}\) is a set of options, \(\mathcal{W}\) is the set of possible worlds, and \(\mathfrak{U}\) is a disutility function. Then, if \(o^*\) is an option, and if there is another option \(o'\) such that \(\mathfrak{U}(o', w) < \mathfrak{U}(o^*, w)\) for all worlds \(w\) in \(\mathcal{W}\), then \(o^*\) is irrational. (In this situation, we say that \(o'\) \(\mathfrak{U}\)-dominates \(o^*\).)

The idea behind **Naive Dominance** is this: If there
is one option that is guaranteed to have lower disutility than another
option, then the latter is guaranteed to be worse than the former; so
the agent can know *a priori* that the latter is worse than the
former. And surely it is irrational to adopt an option if there is
another that you know *a priori* to be better.
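The dominance test that this norm relies on is easy to state computationally. The following is a minimal sketch, assuming a finite set of options and worlds; the function names and the toy disutility table are hypothetical and serve only to illustrate the definition.

```python
def dominates(o_prime, o_star, worlds, disutility):
    """True if o_prime has strictly lower disutility than o_star at every world."""
    return all(disutility(o_prime, w) < disutility(o_star, w) for w in worlds)

def irrational_by_naive_dominance(o_star, options, worlds, disutility):
    """True if some other option dominates o_star in the above sense."""
    return any(
        dominates(o, o_star, worlds, disutility) for o in options if o != o_star
    )

# Hypothetical disutility table: option -> world -> disutility.
table = {
    "a": {"w1": 1.0, "w2": 2.0},
    "b": {"w1": 3.0, "w2": 2.5},
    "c": {"w1": 0.0, "w2": 5.0},
}
options, worlds = list(table), ["w1", "w2"]
U = lambda o, w: table[o][w]

for o in options:
    print(o, irrational_by_naive_dominance(o, options, worlds, U))
```

In this toy table, option `b` is worse than `a` at both worlds, so the norm rules it out; `a` and `c` each do better than the other at some world, so the norm is silent about them.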

Thus, we have the substantial components of Joyce’s
argument: **Credal Veritism**, **Joycean
Inaccuracy**, and **Naive Dominance**. From these,
we can derive **Probabilism** via the following
mathematical theorem (Joyce 1998: 597–600):

**Theorem 4 (Joyce’s Main Theorem)** Suppose
\(\mathcal{F}\) is an algebra and \(\mathfrak{I} : \mathcal{C_F}
\times \mathcal{W_F} \rightarrow [0, \infty]\) is a Joycean inaccuracy
measure for the credence functions on \(\mathcal{F}\). Now suppose
that \(c^*\) is a credence function in \(\mathcal{C_F}\) that
violates **Probabilism**. Then there is a credence
function \(c'\) in \(\mathcal{C_F}\) such that \(\mathfrak{I}(c', w)
< \mathfrak{I}(c^*, w)\) for all \(w\) in \(\mathcal{W_F}\). (In
this situation, we say that *\(c'\) accuracy dominates \(c^*\)
relative to \(\mathfrak{I}\)*.)

Figure 1 illustrates this result in the
particular very simple case in which \(\mathcal{F}\) contains just a
proposition, *Heads*, and its negation, *Tails*, and
inaccuracy is measured using the Brier score.

Thus, we have the following argument:

**Joyce’s accuracy argument for
Probabilism**

- \((1)\) Credal Veritism + Joycean Inaccuracy
- \((2)\) Naive Dominance
- \((3)\) Theorem 4
- Therefore,
- \((4)\) Probabilism

Figure 1: In this figure, we plot the
various possible credence functions defined on a
proposition *Heads* and its negation *Tails* in the unit
square. Thus, we plot the credence in *Heads* along the
horizontal axis and the credence in *Tails* up the vertical
axis. We also plot the vindicated credence functions \(v_{w_1}\) and
\(v_{w_2}\) for the two worlds \(w_1\) (at which *Tails* is
true and *Heads* is false) and \(w_2\) (at which *Heads*
is true and *Tails* is false). The diagonal line between them
contains all and only the credence functions on these two propositions
that are probability functions and thus
satisfy **Probabilism**. \(c^*\) (which assigns 0.7
to *Heads* and 0.6 to *Tails*)
violates **Probabilism**. The lower right-hand arc
contains all the credence functions that are exactly as inaccurate as
\(c^*\) at world \(w_2\), where that inaccuracy is measured using the
Brier score. To see this, note that the Brier score of \(c^*\) at
\(w_2\) is the square of the Euclidean distance of the point \(c^*\)
from the point \(v_{w_2}\). Thus, the credence functions that have
exactly the same Brier score as \(c^*\) at \(w_2\) are those that lie
equally far from \(v_{w_2}\). For the same reason, the upper left-hand
arc contains all the credence functions that are exactly as inaccurate
as \(c^*\) at world \(w_1\). Every credence function that lies between
the two arcs is more accurate than \(c^*\) at both worlds. These are
the ones whose squared Euclidean distance from \(v_{w_2}\) is less
than the squared Euclidean distance of \(c^*\) from \(v_{w_2}\), and
similarly for \(v_{w_1}\). \(c'\) is such a credence function. It
assigns 0.55 to *Heads* and 0.45 to *Tails*, so it also
satisfies **Probabilism**.
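The numbers in Figure 1 can be checked directly. The sketch below, in which the tuple encoding (credence in *Heads*, credence in *Tails*) and the function name `brier` are my own illustrative choices, computes the Brier score of \(c^*\) and \(c'\) at each of the two worlds.

```python
def brier(c, v):
    """Squared Euclidean distance between credence point c and vindicated point v."""
    return sum((vi - ci) ** 2 for ci, vi in zip(c, v))

# Points are (credence in Heads, credence in Tails).
v_w1 = (0.0, 1.0)        # world w1: Tails true, Heads false
v_w2 = (1.0, 0.0)        # world w2: Heads true, Tails false
c_star = (0.7, 0.6)      # violates Probabilism: credences sum to 1.3
c_prime = (0.55, 0.45)   # a probability function: credences sum to 1

for name, v in (("w1", v_w1), ("w2", v_w2)):
    print(name, round(brier(c_star, v), 3), round(brier(c_prime, v), 3))
```

By hand: at \(w_1\), \(c^*\) scores \(0.49 + 0.16 = 0.65\) while \(c'\) scores \(0.3025 + 0.3025 = 0.605\); at \(w_2\), \(c^*\) scores \(0.09 + 0.36 = 0.45\) while \(c'\) scores \(0.2025 + 0.2025 = 0.405\). So \(c'\) is strictly less inaccurate than \(c^*\) at both worlds, as the figure depicts.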

### 5.2 The source(s) of epistemic disutility

Let us start by considering the first of the two components that
comprise Joyce’s account of epistemic disutility for credences,
namely, **Credal
Veritism**. This says that the sole fundamental source of
epistemic disutility for a credence is its gradational inaccuracy. Any
other vice that the credence has, it is claimed, must derive from this
vice (Goldman 2002: 52).

First, let’s note why it is important to make this
assumption. Would it not be sufficient to say merely that one of the
sources of disutility for a credence is its inaccuracy, and then to
point out that any credence function that isn’t a probability
function is accuracy dominated? If it could always be guaranteed that
one of the credence functions that does the accuracy dominating does
not have some other epistemic vice to a greater degree than does the
credence function it accuracy dominates, then this would be
sufficient. But if it were possible that all of the accuracy
dominating credence functions, while guaranteed to be better along the
dimension of inaccuracy, were worse along some other dimension of
epistemic disutility, then being accuracy dominated would not rule out
a credence function as irrational. Thus, we must claim,
with **Credal
Veritism**, that inaccuracy is the only source of epistemic
disutility for credences.

How are we to establish this? How can we be sure there aren’t
other sources of disutility? For instance, perhaps it is a virtue of a
credence function if the credences it assigns cohere with one another
in a particular way, and a vice if they do not. This is
a *coherentist* claim of the sort endorsed for full beliefs,
rather than credences, by the likes of BonJour (1985) and Harman
(1973). Or perhaps it is a virtue of a credence in a particular
proposition if it matches the degree of support given to that
proposition by the agent’s current total evidence. This claim is
dubbed *evidential proportionalism* by Goldman (2002:
55–7). Recent proponents might include Williamson (2000) and
White (2009). Both of these seem plausible. How is the credal veritist
to answer the objection that there are sources of epistemic
disutility, such as these, that go beyond inaccuracy? Of course, it is
notoriously difficult to prove a negative existential claim, such as
the credal veritist claim that there are no other epistemic vices
beyond inaccuracy. But here is a natural strategy: for each proposed
candidate epistemic vice besides inaccuracy, the credal veritist should
provide an account of how its badness derives from the badness of
inaccuracy.

In the case of the coherentist described above, who proposes that
it is a vice to have credences that fail to cohere in a particular
way, there is a very natural instance of this strategy. The coherence
that we demand of credences is precisely that they relate to one
another in the way that **Probabilism** demands, so that,
for instance, no disjunction is assigned lower credence than is
assigned to either of the disjuncts, no proposition is assigned very
high credence at the same time that its negation is also assigned very
high credence, and so on. If that is correct, then of course
Joyce’s accuracy argument for **Probabilism**
detailed above provides an argument that this vice derives its badness
from the badness of inaccuracy: after all, if a credence function
lacks the coherence that the coherentist considers virtuous, it will
be accuracy dominated.

What of the evidential proportionalist? Here it is a little more
difficult. There are principles that the evidential proportionalist
will take to govern evidential support that go beyond
merely **Probabilism**, which is a relatively weak and
undemanding principle. So it is not sufficient to point to the
accuracy argument for that principle in the way we did in response to
the coherentist. However, here is an attempt at an answer. It comes
from collecting together a series of accuracy arguments for other
principles of rationality that we take to govern our credences. For
instance, Greaves & Wallace (2006) give an accuracy argument for
the principle of conditionalization, which says that, if an agent is
rational, her credence function at a later time will be obtained from
her credence function at an earlier time by conditionalizing on the
total evidence she obtains between those two times; Easwaran (2013)
and Huttegger (2013) extend the argument, and Schoenfield (ms.)
clarifies the norm that it establishes. Moreover, Pettigrew (2013a)
gives an accuracy argument for the Principal Principle, which says
that, if an agent is rational, her credences in propositions
concerning the objective chances will relate in a particular way to
her credences in the propositions to which those chances attach.
Pettigrew (2014b) and Konek
(ms.) give rather different accuracy-based arguments for the Principle
of Indifference, which says how a rational agent with no evidence will
distribute their credences. Moss (2011), Lam (2013), and Levinstein
(2015) describe principles that rational agents will obey in the
presence of peer disagreement and provide accuracy-based arguments in
their favour. And finally Horowitz (2014) uses accuracy-based
arguments to evaluate various species of permissivism. The point is
that, piece by piece, the principles that are taken to govern the
degree of support provided to a proposition by a body of evidence are
being shown to follow from accuracy considerations alone. This, it
seems, constitutes a response to the concerns of the evidential
proportionalist.

However, both the response to the coherentist and the response to
the evidential proportionalist leave the accuracy argument
for **Probabilism** in a strange position. The argument
for, or defence of, one component of its first premise,
namely, **Credal
Veritism**, appeals to the argument of which it is a
premise! In fact, this isn’t problematic. The credal veritist
and her opponent might agree that the argument at least establishes a
conditional: *if* credal veritism is true, *then*
probabilism is true. You need not accept credal veritism to accept
that conditional. And it is that conditional to which the credal
veritist appeals in defending her position against the coherentist and
the evidential proportionalist. Having successfully defended credal
veritism in this way, she can then appeal to its truth to
derive **Probabilism**.

### 5.3 Measures of inaccuracy

#### 5.3.1 Joyce on Convexity

So much, then, for the first component of the first premise of the
accuracy argument for **Probabilism**,
namely, **Credal
Veritism**. In this section, we turn to the second
component, namely, **Joycean Inaccuracy**. I will
focus on a particular condition that Joyce places on measures of
inaccuracy, namely, **Strong Convexity**. (Joyce calls it
Weak Convexity, but I change the name in this presentation because, as
Patrick Maher (2002) points out, it is considerably stronger than
Joyce imagines.)

**Strong Convexity** Suppose \(\mathfrak{I}\) is a
legitimate inaccuracy measure. Then if \(c \neq c'\) and
\(\mathfrak{I}(c, w) = \mathfrak{I}(c', w)\), then
\[\mathfrak{I}\left(\frac{1}{2}c + \frac{1}{2}c', w\right) <
\mathfrak{I}(c, w) = \mathfrak{I}(c', w)\] (Given two credence
functions, \(c\) and \(c'\), we define a third credence function
\(\frac{1}{2}c + \frac{1}{2} c'\) as follows: the credence that
\(\frac{1}{2}c + \frac{1}{2}c'\) assigns to a proposition is the
straight average of the credences that \(c\) and \(c'\) assign to
it. Thus, \((\frac{1}{2}c + \frac{1}{2}c')(X) = \frac{1}{2}c(X) +
\frac{1}{2}c'(X).\) We call this the *equal mixture of \(c\) and
\(c'\)*.)

This says that, for any two distinct credence functions that are equally inaccurate at a given world, the third credence function obtained by “splitting the difference” between them and taking an equal mixture of the two is less inaccurate than either of them. Here is Joyce’s justification of this condition:

[Strong] Convexity is motivated by the intuition that extremism in the pursuit of accuracy is no virtue. It says that if a certain change in a person’s degrees of belief does not improve accuracy then a more radical change in the same direction and of the same magnitude should not improve accuracy either. Indeed, this is just what the principle says. (Joyce 1998: 596)

Joyce’s point is this: Suppose we have three credence functions, \(c\), \(m\), and \(c'\). And suppose that, to move from \(m\) to \(c'\) is just to move in the same direction and by the same amount as to move from \(c\) to \(m\), which is exactly what will be true if \(m\) is the equal mixture of \(c\) and \(c'\). Now suppose that \(m\) is at least as inaccurate as \(c\)—that is, the change from \(c\) to \(m\) does not “improve accuracy”. Then, Joyce claims, \(c'\) must be at least as inaccurate as \(m\)—that is, the change from \(m\) to \(c'\) also does not “improve accuracy”.

Objection: *The justification given doesn’t
justify Strong Convexity.* The problem with this
justification is that it establishes only a principle weaker
than **Strong Convexity**. This was first pointed out by Patrick Maher (2002), who noted that Joyce’s justification in fact motivates the following weaker principle:

**Weak Convexity** Suppose \(\mathfrak{I}\) is a
legitimate inaccuracy measure. Then if \(c \neq c'\) and
\(\mathfrak{I}(c, w) = \mathfrak{I}(c', w)\), then
\[\mathfrak{I}\left(\frac{1}{2}c + \frac{1}{2}c', w\right) \leq
\mathfrak{I}(c, w) = \mathfrak{I}(c', w)\]

That is, Joyce’s motivation rules out situations in which
inaccuracy *increases* from \(c\) to \(m\) and
then *decreases* from \(m\) to \(c'\). And this is
what **Weak Convexity** also rules
out. But **Strong Convexity** furthermore rules out
situations in which inaccuracy *remains the same* from \(c\) to
\(m\) and then from \(m\) to \(c'\). And Joyce has given no reason to
think that such changes are problematic. What’s more, as Maher
proves, the stronger convexity condition is crucial for Joyce’s
proof. With only the weaker condition, the theorem is false.
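The gap between the two convexity conditions can be seen in a small numerical example. The sketch below is an illustration only: the function names, the choice of credence points, and the use of a sum-of-absolute-differences measure as the contrast case are my own assumptions and are not drawn from Joyce or Maher. It compares, at a single world, two equally inaccurate credence functions with their equal mixture.

```python
from fractions import Fraction as F

def brier(c, v):
    """Sum of squared differences from the vindicated credences v."""
    return sum((vi - ci) ** 2 for ci, vi in zip(c, v))

def absolute(c, v):
    """Sum of absolute differences from the vindicated credences v."""
    return sum(abs(vi - ci) for ci, vi in zip(c, v))

def mix(c, d):
    """Equal mixture: the straight average of the two credence functions."""
    return tuple(F(1, 2) * a + F(1, 2) * b for a, b in zip(c, d))

v = (F(1), F(0))  # a world where the first proposition is true, the second false

# Two distinct credence functions with the same Brier score (1/4) at v.
c1, c2 = (F(7, 10), F(4, 10)), (F(1, 2), F(0))
print(brier(c1, v), brier(c2, v), brier(mix(c1, c2), v))

# Two distinct credence functions with the same absolute score (2/5) at v.
d1, d2 = (F(4, 5), F(1, 5)), (F(3, 5), F(0))
print(absolute(d1, v), absolute(d2, v), absolute(mix(d1, d2), v))
```

In the first case the mixture is strictly less inaccurate (1/5 against 1/4), as the strict inequality requires. In the second case the mixture is exactly as inaccurate as the two originals (2/5), which the non-strict inequality permits but the strict one forbids. This is the kind of case on which the two conditions come apart.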

#### 5.3.2 Local and global inaccuracy

In this section, we consider alternative sets
of conditions on inaccuracy measures that are presented by Leitgeb
& Pettigrew (2010a). They propose that we replace the
claim **Joycean
Inaccuracy** in Joyce’s accuracy argument
for **Probabilism** with an alternative claim that says
that the legitimate inaccuracy measures are (amongst) those that
satisfy Leitgeb and Pettigrew’s alternative conditions. Unlike
Joyce’s conditions, these are sufficient to narrow the field of
legitimate inaccuracy measures to just a single one, namely, the Brier
score \(\mathfrak{B}\) that we met in section 5.1 above. Let us say
that **Brier Inaccuracy** is the claim that the Brier
score is the only legitimate measure of inaccuracy. And note that, if
we replace **Joycean Inaccuracy**
with **Brier Inaccuracy** in Joyce’s argument
for **Probabilism**, we retain our argument for that
epistemic norm:

**Brier accuracy-based argument for Probabilism:
I**

- \((1)\) Credal Veritism + Brier Inaccuracy
- \((2)\) Naive Dominance
- \((3)\) Theorem 4
- Therefore,
- \((4)\) Probabilism

So far, in this section, we have been concerned only with what we
might call *global* measures of inaccuracy—that is,
measures of the inaccuracy of entire credence functions. Leitgeb and
Pettigrew are certainly interested in those. But they are also
interested in what we might call *local* measures of
inaccuracy—that is, measures of the inaccuracy of individual
credences. Indeed, they are interested in how these two sorts of
inaccuracy measure interact. They lay down constraints on each of the
inaccuracy measures individually, and then they lay down constraints
on how they combine. The guiding idea in each case is that any feature
of the inaccuracy of credences that is determined from the point of
view of local inaccuracy measures—such as their total
inaccuracy, or the urgency with which an agent with inaccurate
credences should change them—should match that same feature when
it is determined from the point of view of global inaccuracy
measures. If this doesn’t happen, then the agent will face a
rational dilemma when choosing which of the two ways she should use to
determine that feature. Here, I will focus only on one of the most
powerful of Leitgeb and Pettigrew’s conditions, which also turns
out to be the most problematic. Here it is:

**Global Normality and Dominance** If \(\mathfrak{I}\)
is a legitimate global inaccuracy measure, there is a strictly
increasing \(f:[0, \infty) \rightarrow [0, \infty)\) such that
\[\mathfrak{I}(c, w) = f(||v_w - c||_2),\] where, for any two credence
functions \(c\), \(c'\) defined on \(\mathcal{F}\),
\[||c - c'||_2 := \sqrt{\sum_{X \in \mathcal{F}} |c(X) - c'(X)|^2}\]
and we call \(||c - c'||_2\) the *Euclidean distance between \(c\)
and \(c'\)*; and, recall, \(v_w\) is the omniscient credence
function at \(w\), so that \(v_w(X) = 1\) if \(X\) is true at \(w\)
and \(v_w(X) = 0\) if \(X\) is false at \(w\).

Thus, **Global Normality and Dominance** says that the
inaccuracy of a credence function at a world should supervene in a
certain way upon the Euclidean distance between that credence function
and the omniscient credence function at that world. Indeed, it should
be a strictly increasing function of that distance between them.
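As a quick check, the Brier score of section 5.1 has the required form. Comparing its definition with the definition of Euclidean distance gives
\[\mathfrak{B}(c, w) = \sum_{X \in \mathcal{F}} |v_w(X) - c(X)|^2 = \left(||v_w - c||_2\right)^2,\]
so the condition is met with \(f(x) = x^2\), which is strictly increasing on \([0, \infty)\). This choice of \(f\) is simply read off from the two definitions; it is offered as an illustration rather than as part of Leitgeb and Pettigrew’s own presentation.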

Objection 1: *There is no motivation for the appeal to Euclidean
distance.* Leitgeb and Pettigrew show that the only inaccuracy
measure that satisfies **Global Normality and Dominance**,
together with their other conditions on inaccuracy measures, is the
Brier score, which we defined above. That is, imposing these
conditions entails **Brier Inaccuracy**. The problem
with this characterization, however, is that it depends crucially on
the appeal to the Euclidean distance made
in **Global Normality
and Dominance**, and no reason is given for appealing to
the Euclidean distance measure in particular, rather than some other
measure of distance between credence functions. Suppose we replace
that condition with one that says that a legitimate global inaccuracy
measure must be a strictly increasing function of the
so-called *Manhattan* or *city block* distance measure,
where the distance between two credence functions measured in this way
is defined as follows:
\[||c - c'||_1 := \sum_{X \in \mathcal{F}} |c(X) - c'(X)|\] That is,
the Manhattan distance between two credence functions is obtained by
summing the absolute differences between the credences they each assign to the
various propositions on which they are defined. Together with the
other constraints that Leitgeb and Pettigrew place on inaccuracy
measures, this alternative constraint entails that the only legitimate
inaccuracy measure is the so-called *absolute value score*,
which is defined as follows:
\[\mathfrak{A}(c, w) := \sum_{X \in \mathcal{F}} |v_w(X) - c(X)|\]

Now, it turns out that the absolute value score cannot ground an
accuracy argument for **Probabilism**. In fact, there are
situations in which non-probabilistic credence functions accuracy
dominate probabilistic credence functions when inaccuracy is measured
using the absolute value score. Let \(\mathcal{F} = \{X_1, X_2,
X_3\}\), where \(X_1\), \(X_2\), and \(X_3\) are mutually exclusive
and exhaustive propositions. And consider the following two credence
functions: \(c(X_i) = \frac{1}{3}\) for each \(i = 1, 2, 3\);
\(c'(X_i) = 0\) for each \(i = 1, 2, 3\). The former, \(c\), is
probabilistic; the latter, \(c'\), is not. But, if we measure
inaccuracy using the absolute value score, the inaccuracy of \(c\) at each
of the three possible worlds is \(\frac{4}{3}\), whereas the
inaccuracy of \(c'\) at each of the three possible worlds is
\(1\). The upshot of this observation is that it is crucial, if our
accuracy argument for **Probabilism** is to succeed, to
rule out the absolute value score. The problem with the Leitgeb and
Pettigrew characterization is that it rules out this measure
essentially by fiat. It rules it out by demanding that the inaccuracy
of a credence function at a world supervenes on the Euclidean distance
between the credence function and the omniscient credence function at
that world. But it gives no reason for favouring this measure of
distance over another, such as Manhattan distance.
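The counterexample can be verified with exact arithmetic. In the sketch below, the tuple encoding of credence functions and worlds and the function names are illustrative assumptions; the Brier comparison is added only for contrast.

```python
from fractions import Fraction

def absolute_value_score(c, v):
    """Sum of absolute differences between credences and vindicated credences."""
    return sum(abs(vi - ci) for ci, vi in zip(c, v))

def brier_score(c, v):
    """Sum of squared differences between credences and vindicated credences."""
    return sum((vi - ci) ** 2 for ci, vi in zip(c, v))

# Three mutually exclusive and exhaustive propositions: exactly one is true.
worlds = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]

c = (Fraction(1, 3),) * 3    # probabilistic: credences sum to 1
c_prime = (Fraction(0),) * 3  # not probabilistic: credences sum to 0

for v in worlds:
    print(absolute_value_score(c, v), absolute_value_score(c_prime, v),
          brier_score(c, v), brier_score(c_prime, v))
```

At every world the absolute value score gives \(\frac{4}{3}\) for \(c\) and \(1\) for \(c'\), so \(c'\) dominates. Under the Brier score the ordering is reversed at every world: \(c\) scores \(\frac{2}{3}\) and \(c'\) scores \(1\).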

Objection 2: *Using the Brier score to measure inaccuracy has
unintuitive consequences.* A further objection to Leitgeb and
Pettigrew’s characterization of inaccuracy measures is given by
Levinstein (2012). In the sequel to the paper in which they give this
characterization, Leitgeb and Pettigrew use it to argue in favour of
an updating rule for credences that applies in the same situations as
so-called Jeffrey Conditionalization (or Probability Kinematics) but
offers different advice (Jeffrey 1965; Leitgeb & Pettigrew
2010b). Levinstein objects to the use of the Brier score to measure
inaccuracy on the grounds that this alternative updating rule gives
deeply unintuitive results.

#### 5.3.3 Calibration and accuracy

The final characterization of inaccuracy measures that I will consider here is due to Pettigrew (forthcoming-a). Again, I won’t enumerate all of the conditions here. Instead, I’ll describe the most contentious and mathematically powerful of the conditions: the one that in some sense does the main mathematical “heavy lifting” when it comes to showing what putative inaccuracy measures these conditions permit.

So far in this entry, we have presented calibration accounts of
epistemic utility and accuracy accounts as separate and
incompatible. The condition on inaccuracy measures that Pettigrew
proposes and that we consider in this section denies that. Rather, it
claims that closeness to calibration in fact plays a role in
determining the accuracy of a credence function; the difference
between this approach and the calibration arguments of
section 4 is that Pettigrew does not think that
closeness to calibration is the whole story. Let \(\mathfrak{D}\) be a
putative measure of the distance between two credence functions. That
is, \(\mathfrak{D} : \mathcal{C_F} \times \mathcal{C_F} \rightarrow
[0, \infty]\), and we’ll assume that \(\mathfrak{D}(c, c') = 0\)
iff \(c = c'\). Now first we use this measure of distance to define a
measure of the distance that a credence function lies from being
perfectly calibrated at a world. Then, following a point already made
above in our treatment of calibration arguments
for **Probabilism**, we note that this, on its own,
cannot define a measure of inaccuracy because it lacks a crucial
feature that we demand of any such measure: it is not
truth-directed. However, we then note how to supplement the measure of
distance from calibration in order to give an inaccuracy measure that
does have the crucial feature. And we claim that all inaccuracy
measures are produced by supplementing a measure of distance from
calibration in this way.

As in section 4.1, we let \(\sim\) be an
equivalence relation on the set \(\mathcal{F}\) of propositions to
which our agent assigns opinions. It is the relation of relevant
similarity between two propositions. In section
4.1, we said that we would impose conditions \(C(\sim)\) on this
equivalence relation, but we said no more to identify those
conditions. In this section, we in fact define this equivalence
relation. We take it to be relative to a credence function \(c\), so
we write it \(\sim_c\), and we define it as follows: \(A \sim_c B\)
iff \(c(A) = c(B)\). That is, two propositions are relevantly similar
for our agent with credence function \(c\) if \(c\) assigns them the
same credence. Thus, given a possible world \(w\), we say that a
credence function \(c\) is *perfectly calibrated at \(w\)* if,
for each \(A\) in \(\mathcal{F}\),
\[c(A) = \mathrm{Freq}(\mathcal{F}, A, \sim_c, w)\]

Next, given a credence function \(c\) and a world \(w\),
the *perfectly calibrated counterpart of \(c\) at \(w\)* is a
credence function also defined on \(\mathcal{F}\) that is defined as
follows: for each \(A\) in \(\mathcal{F}\)
\[c^w(A) = \mathrm{Freq}(\mathcal{F}, A, \sim_c, w)\]
That is, the perfectly calibrated counterpart of \(c\) at \(w\)
assigns to each proposition \(A\) the frequency of truths at \(w\)
amongst all propositions to which \(c\) assigns the same credence that
it assigns to \(A\). Note that \(c^w\) is perfectly calibrated at
\(w\). And if \(c\) is perfectly calibrated at \(w\), then \(c^w =
c\). Now, we define the distance that a credence function \(c\) lies
from calibration at a world \(w\) to be the distance,
\(\mathfrak{D}(c^w, c)\), from \(c^w\) to \(c\). Now, as we saw in
Objection 1 from section 4.3 above,
this measure does not itself give a measure of epistemic
disutility. The problem is that an agent can move closer to
calibration at a world \(w\) while moving uniformly further from the
omniscient credence function at that world: that is, the measure of
epistemic disutility provided by the distance of the credence function
from its perfectly calibrated counterpart is not truth-directed. Thus,
if an agent’s distance from her perfectly calibrated counterpart
is to contribute to a measure of her inaccuracy, it must be
supplemented by something that ensures that the resulting measure
avoids this consequence. The idea that Pettigrew proposes is this: the
inaccuracy of \(c\) at \(w\) is given by the distance of \(c\) from
the omniscient credence function \(v_w\) at \(w\); and that is given
by adding the distance of \(c\) from its perfectly calibrated
counterpart \(c^w\) to the distance of \(c^w\) from \(v_w\). Thus,
while moving to a credence function that is closer to its perfectly
calibrated counterpart may move you further from the omniscient
credence function, this can only be because the perfectly calibrated
counterpart of your new credence function is further from the
omniscient credence function than the perfectly calibrated counterpart
of your current credence function. If their perfectly calibrated
counterparts are the same, or if they are different but equally close
to the omniscient credence function, then moving closer to them will
move you closer to the omniscient credence function. Thus, Pettigrew
imposes the following constraint:

**Decomposition** Suppose \(\mathfrak{I}\) is a
legitimate inaccuracy measure and \(\mathfrak{D}\) is a distance
measure such that \(\mathfrak{I}(c, w) = \mathfrak{D}(v_w, c)\). Then
\[\mathfrak{I}(c, w) = \mathfrak{D}(v_w, c) = \mathfrak{D}(c^w, c) + \mathfrak{D}(v_w, c^w)\]

Together with the other conditions that Pettigrew
imposes, **Decomposition** narrows down the class of
legitimate inaccuracy measures to a single one, namely, the Brier
score. That is, imposing these conditions
entails **Brier
Inaccuracy**.
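
To see **Decomposition** at work in a small case, the following Python sketch computes the perfectly calibrated counterpart \(c^w\) of a credence function and checks the identity exactly. It is only an illustration under stated assumptions: \(\mathfrak{D}\) is taken to be squared Euclidean distance (the distance behind the Brier score, here without any normalization), \(\mathrm{Freq}(\mathcal{F}, A, \sim_c, w)\) is read as the proportion of truths at \(w\) among the propositions to which \(c\) assigns the same credence as \(A\), and the propositions, credences and function names are hypothetical.

```python
from fractions import Fraction

def calibrated_counterpart(c, truths):
    """Return c^w: each proposition A gets the frequency of truths at w
    among the propositions to which c assigns the same credence as A."""
    groups = {}
    for prop, cred in c.items():
        groups.setdefault(cred, []).append(prop)
    cw = {}
    for props in groups.values():
        freq = Fraction(sum(p in truths for p in props), len(props))
        for p in props:
            cw[p] = freq
    return cw

def omniscient(c, truths):
    """Return v_w: credence 1 in each truth at w, 0 in each falsehood."""
    return {p: Fraction(int(p in truths)) for p in c}

def sq_dist(c1, c2):
    """Squared Euclidean distance (an assumed stand-in for D)."""
    return sum((c1[p] - c2[p]) ** 2 for p in c1)

# Hypothetical agent: credence 1/2 in A, B, C and 1/5 in D; only A is true at w.
c = {"A": Fraction(1, 2), "B": Fraction(1, 2), "C": Fraction(1, 2), "D": Fraction(1, 5)}
truths = {"A"}

cw = calibrated_counterpart(c, truths)
vw = omniscient(c, truths)

lhs = sq_dist(vw, c)                     # D(v_w, c)
rhs = sq_dist(cw, c) + sq_dist(vw, cw)   # D(c^w, c) + D(v_w, c^w)
print(lhs, rhs, lhs == rhs)
```

Under these assumptions the two sides agree, as they must for squared Euclidean distance: within each group of propositions that share a credence, the deviations of the truth values from their group frequency sum to zero, so the cross term vanishes.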

Objection 1: *Appeal to summation is arbitrary.* One concern
about **Decomposition** is this: it is crucial for the
proof that the Brier score and only the Brier score satisfies all of
Pettigrew’s conditions that in **Decomposition** we
combine the distance between \(c\) and \(c^w\) with the distance
between \(c^w\) and \(v_w\) by summing them together. But we could
have combined those quantities in other ways: we might have multiplied
them together, for instance; or, we might have summed them and then
taken a strictly increasing function of that sum. It might be
mathematically natural simply to add them together: but that
doesn’t privilege that means of combining them for philosophical
purposes. However, if we combine them in any of these alternative
ways, Pettigrew’s conditions will no longer hold of the Brier
score.

Objection 2: *Proximity to calibration is not a good.*
Another concern is that, while proximity to being perfectly calibrated
seems epistemically good in the standard cases that are used to
motivate calibrationist accounts, it seems less compelling in other
cases. For instance, suppose you have opinions only about three
propositions: *First coin toss lands heads*, *Second coin
toss lands heads*, *Third coin toss lands heads*. And
suppose you assign to each of them the same credence,
\(\frac{1}{3}\). Then, in that situation, it seems plausible that you
are doing better if one out of the three tosses comes up heads. Now
suppose that I have opinions only about three
propositions: *Djibouti is the capital of Ghana*, *Serena
Williams is a badminton player*, *Doris Lessing wrote The
Golden Notebook*. And suppose I assign to each of them the
same credence, \(\frac{1}{3}\). Then, in that situation, do we really
retain the intuition that I do best if one out of the three turns out
true?

### 5.4 Dominance principles

So far, we have considered the two components of the first premise
of Joyce’s accuracy argument
for **Probabilism**: **Credal Veritism**
and **Joycean
Inaccuracy**. We have left the former intact, but we have
seen concerns with the latter, and we have considered arguments for a
stronger claim, **Brier Inaccuracy**, though these
also face difficulties. In this section, we move from the account of
epistemic disutility on which the argument is based to the
decision-theoretic principle to which we appeal in order to
derive **Probabilism** from this account. Let’s
recall the version of the principle to which Joyce appeals in his
original paper:

**Naive
Dominance** A rational agent will not adopt an option when
there is another option that has lower disutility at all worlds.

That is: Suppose \(\mathcal{O}\) is a set of options, \(\mathcal{W}\) is the set of possible worlds, and \(\mathfrak{U}\) is a disutility function. Then, if \(o^*\) is an option, and if there is another option \(o'\) such that \(\mathfrak{U}(o', w) < \mathfrak{U}(o^*, w)\) for all worlds \(w\) in \(\mathcal{W}\), then \(o^*\) is irrational.

Thus, according to Joyce, a credence function is irrational if it is accuracy dominated.

In this section, we’ll consider four objections that have
been raised against **Naive Dominance** in the context of
the accuracy argument for **Probabilism**.

#### 5.4.1 The Bronfman objection

The first objection to the application
of **Naive
Dominance** in the context of the accuracy argument
for **Probabilism** was first stated in an unpublished
manuscript by Aaron Bronfman entitled “A Gap in Joyce’s
Proof of Probabilism”; it has been discussed by Hájek
(2008) and Pettigrew (2010, 2013b). The starting point for the
objection is the observation that **Credal Veritism**
and **Joycean
Inaccuracy** do not together narrow down the class of
legitimate measures of epistemic disutility to a single function; they
characterize a family of such measures. But, for all
that Theorem 4
(Joyce’s Main Theorem) tells us, it may well be that, for a
given non-probabilistic credence function \(c^*\), different measures
in this family of legitimate inaccuracy measures give different sets
of credence functions that accuracy dominate \(c^*\). Thus, an agent
with a non-probabilistic credence function \(c^*\) might be faced with
a range of credence functions, each of which accuracy dominates
\(c^*\) relative to a different legitimate inaccuracy
measure. Moreover, it may be that any credence function that accuracy
dominates \(c^*\) relative to Joycean inaccuracy measure
\(\mathfrak{I}\) does not accuracy dominate \(c^*\) relative to the
alternative Joycean measure \(\mathfrak{I}'\); indeed, it may be that
any credence function that dominates \(c^*\) relative to
\(\mathfrak{I}\) risks very high inaccuracy at some world relative to
\(\mathfrak{I}'\), and *vice versa*. In this situation, it is
plausible that the agent is rationally permitted to stick with her
non-probabilistic credence function \(c^*\).

There are two replies to this objection. According to the first, the objection relies on a false meta-normative claim; according to the second, it misunderstands the purpose of Joyce’s conditions.

Reply 1: *No requirement to give advice.* The meta-normative
claim on which the objection seems to rely is the following: For a
norm to hold, there must be specific advice available to those who
violate that norm concerning how to improve their
behaviour. Bronfman’s objection begins with the observation
that, for any specific advice that one might give to a
non-probabilistic agent concerning which credence function she should
adopt in favour of her own, there will be inaccuracy measures that
satisfy Joyce’s conditions, but don’t sanction this
advice; indeed, there will be inaccuracy measures relative to which
that advice is very bad. Thus, Joyce’s accuracy argument
violates the meta-normative constraint. But, the reply submits, the
meta-normative claim is false: for a norm to hold, it is sufficient
that there is a serious defect suffered by those who violate the norm
that is not shared by those who satisfy the norm; it is not also
required that there should be advice on which specific action an agent
should perform to improve her behaviour. And Joyce’s argument
satisfies this sufficient condition. An agent ought to
satisfy **Probabilism** because non-probabilistic
credence functions suffer from a serious epistemic defect (namely,
being accuracy dominated) that does not beset probabilistic ones. And
this fact is “supertrue”, so to speak: that is, it is true
on any precisification of the notion of accuracy that obeys
Joyce’s conditions on an inaccuracy measure.

Reply 2: *Each agent uses a single inaccuracy measure.* The
second reply to this objection does not take issue with the
meta-normative claim mentioned above; indeed, on the understanding of
the accuracy argument for **Probabilism** that it
proposes, the argument satisfies the necessary condition imposed by
that claim. That is, according to this reply, the accuracy argument,
properly understood, does in fact provide specific advice to
non-probabilistic agents. The idea is this: There are (at least) three
ways to understand the purpose of Joyce’s conditions on
inaccuracy measures. First, we might think that the notion of
inaccuracy is vague; and we might say that any inaccuracy measure that
satisfies the conditions is a legitimate precisification of it. This
is a *supervaluationist* approach. On this approach, there is
no specific advice available to non-probabilistic agents that is
sanctioned by all precisifications. Second, we might think that the
notion of inaccuracy is precise, but that we have only limited
knowledge about it, and that the sum total of our knowledge is
embodied in the conditions. This is an *epistemicist*
approach. On this approach, there is specific advice, but it is not
available to us. Third, we might think that there is no objectively
correct inaccuracy measure; rather, any inaccuracy measure that
satisfies the conditions is rationally permissible. But nonetheless,
any particular agent has exactly one such measure. This is
a *subjectivist* approach. On this understanding, there is
specific advice for any non-probabilistic agent. Any such agent uses
an inaccuracy measure that satisfies Joyce’s conditions. And
this gives, for any non-probabilistic credence function, a
probabilistic credence function that strongly dominates it. So the
specific advice is this: adopt one of the probabilistic credence
functions that strongly dominates your non-probabilistic credence
function relative to your favoured measure of inaccuracy. This gives
us **Probabilism** and does so without violating the
meta-normative claim on which Bronfman’s objection relies.

However, this response isn’t without its own problems. For
instance, it assumes that each agent values inaccuracy in a
sufficiently specific way that they narrow down the class of
inaccuracy measures to a single measure that they can then use to
obtain this advice. But, at least for those who think
that **Joycean
Inaccuracy** is the strongest condition we can place on the
inaccuracy measures, this seems too strong. How can we assume that
each rational agent will have a unique inaccuracy measure in mind when
we don’t think that there are conditions that demand that we
narrow down the class of legitimate inaccuracy measures this far?

#### 5.4.2 Undominated dominance

The second objection to **Naive Dominance** comes from
Pettigrew (2014a). Here, Pettigrew observes that there are decisions
in which **Naive
Dominance** does not seem to hold because the irrationality
of being dominated depends on the status of the dominating options in
some way. Here’s Pettigrew’s central example:

**Name Your Fortune\(^*\)** You have a choice: play a
game with God or don’t. If you don’t, you receive 2 utiles
for sure. If you do, you then pick an integer. If you pick \(k\), God
will then do one of two things: (i) give you \(2^{k-1}\) utiles; or
(ii) give you \(2 - \frac{1}{2^{k-1}}\) utiles. (Pettigrew 2014a:
587)

In this example, the only option that isn’t dominated is the
option in which you do not play the game with God. If you choose that
option, you get 2 utiles for sure. If, on the other hand, you choose
to play the game and pick integer \(k\), then choosing integer \(k+1\)
will be guaranteed to get you more utility: either \(2^{k}\) utiles
compared with \(2^{k-1}\) or \(2 - \frac{1}{2^k}\) utiles compared with
\(2 - \frac{1}{2^{k-1}}\). However, the option in which you get 2
utiles for sure seems a lousy option given the other possibilities
available. One way to see this is as follows: Take a probability
distribution over the two possibilities (i) and (ii) between which God
will choose if you choose to play; then, providing it doesn’t
assign all probability to God choosing (ii), there will be some option
you can take if you play the game that has greater expected utility
than the option of not playing the
game. If **Naive
Dominance** is correct, however, not playing the game is
the only rational option. This seems to tell
against **Naive
Dominance**.
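
The structure of the example can be checked with a short Python sketch. It uses the payoffs as stated in **Name Your Fortune\(^*\)**; the range of integers sampled, the probability \(q\) assigned to possibility (i), and the function names are assumptions made only for illustration.

```python
from fractions import Fraction

def payoffs(k):
    """Utilities for picking integer k: (if God does (i), if God does (ii))."""
    return (Fraction(2) ** (k - 1), 2 - Fraction(1, 2) ** (k - 1))

def expected_utility(k, q):
    """Expected utility of picking k when possibility (i) has probability q."""
    u_i, u_ii = payoffs(k)
    return q * u_i + (1 - q) * u_ii

NOT_PLAYING = Fraction(2)
SAMPLE = range(-20, 21)  # finite sample of integers, for illustration only

# Every pick k is strongly dominated by the pick k + 1.
chain_holds = all(
    payoffs(k + 1)[0] > payoffs(k)[0] and payoffs(k + 1)[1] > payoffs(k)[1]
    for k in SAMPLE
)

# Not playing is not dominated by any pick: in case (ii) every pick pays less than 2.
not_playing_undominated = all(payoffs(k)[1] < NOT_PLAYING for k in SAMPLE)

# With even a small probability q of (i), some pick beats not playing in expectation.
q = Fraction(1, 1000)
first_better_pick = next(k for k in range(1, 200) if expected_utility(k, q) > NOT_PLAYING)

print(chain_holds, not_playing_undominated, first_better_pick)
```

On this sketch, not playing is undominated and every pick is dominated by the next one, yet a small positive probability for (i) already suffices for some pick to have expected utility greater than 2.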

The moral that Pettigrew draws from this example is the
following. Not all dominated options are irrational. Whether or not a
dominated option is irrational depends on the status of the options
that dominate it. If all of the options that dominate a given option
are themselves dominated, then being dominated does not rule out the
given option as irrational. Thus, in **Name Your Fortune\(^*\)**, none of the
options are ruled irrational because they are dominated; after all,
all of the dominated options are only dominated by other options that
are also themselves dominated. Thus, Pettigrew instead suggests a
decision-theoretic principle to replace **Naive Dominance**. To state it, we
must distinguish between two notions of dominance: a strong notion and
a weak notion. Suppose \(o^*\) and \(o'\) are options. We say
that *\(o^*\) strongly dominates \(o'\)* if \(o^*\) has greater
utility than \(o'\) at all worlds. We say that *\(o^*\) weakly
dominates \(o'\)* if \(o^*\) has at least as great utility as
\(o'\) at all worlds and greater utility at some world.

**Undominated Dominance** A rational agent will not
adopt an option that is strongly dominated by an option that is not
itself even weakly dominated.

Now, it turns out that, if we accept **Brier Inaccuracy**, we can still
derive **Probabilism** using only **Undominated
Dominance**. This is a consequence of the following
theorem:

**Theorem 5 (de Finetti)** Suppose that \(c^*\) is a
credence function in \(\mathcal{C_F}\) that
violates **Probabilism**. Then there is a credence
function \(c'\) in \(\mathcal{C_F}\) such that (i) \(\mathfrak{B}(c',
w) < \mathfrak{B}(c^*, w)\) for all \(w\) in \(\mathcal{W_F}\), and
(ii) there is no credence function \(c\) such that \(\mathfrak{B}(c,
w) \leq \mathfrak{B}(c', w)\) for all \(w\) in \(\mathcal{W_F}\) and
\(\mathfrak{B}(c, w) < \mathfrak{B}(c', w)\) for some \(w\) in
\(\mathcal{W_F}\).

Thus, we have the following argument:

**Brier-based accuracy argument for Probabilism:
II**

- (1) Credal Veritism + Brier Inaccuracy
- (2) Undominated Dominance
- (3) Theorem 5
- Therefore,
- (4) Probabilism
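
The two clauses of Theorem 5 can be illustrated numerically. The sketch below is not a proof: it takes a hypothetical agent with credences in just \(A\) and \(\neg A\), uses the unnormalized Brier score, and checks clause (ii) only against a finite grid of alternative credence functions, whereas the theorem quantifies over all of \(\mathcal{C_F}\). The particular credences and the grid step are assumptions.

```python
from fractions import Fraction
from itertools import product

WORLDS = ((1, 0), (0, 1))  # truth values of (A, not-A) at the two worlds

def brier(c, w):
    """Unnormalized Brier inaccuracy of the credence pair c at world w."""
    return sum((wi - ci) ** 2 for ci, wi in zip(c, w))

def strongly_dominates(c1, c2):
    """c1 is less inaccurate than c2 at every world."""
    return all(brier(c1, w) < brier(c2, w) for w in WORLDS)

def weakly_dominates(c1, c2):
    """c1 is never more inaccurate than c2 and is less inaccurate at some world."""
    return (all(brier(c1, w) <= brier(c2, w) for w in WORLDS)
            and any(brier(c1, w) < brier(c2, w) for w in WORLDS))

# Hypothetical non-probabilistic credences in A and not-A (they sum to 6/5).
c_star = (Fraction(3, 5), Fraction(3, 5))
# Candidate dominator: the closest probabilistic pair in Euclidean distance.
c_prime = (Fraction(1, 2), Fraction(1, 2))

# Finite grid of alternatives with step 1/20.
steps = [Fraction(i, 20) for i in range(21)]
grid = list(product(steps, steps))

clause_i = strongly_dominates(c_prime, c_star)
clause_ii = not any(weakly_dominates(c, c_prime) for c in grid)
print(clause_i, clause_ii)
```

Here `c_prime` strongly dominates `c_star`, and no grid point even weakly dominates `c_prime`, so on this grid **Undominated Dominance** rules out `c_star`.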

#### 5.4.3 Evidence and Accuracy

The next objection to **Naive Dominance** is similar to the
objection raised in the previous section. In the previous section, the
moral we drew from **Name Your Fortune\(^*\)** is that a
dominated option is only ruled irrational in virtue of being dominated
if at least one of the options that dominate it is not itself
dominated. But there may be other features that a credence function
might have besides itself being dominated such that being dominated by
that credence function does not entail irrationality. Easwaran &
Fitelson (2012) suggest the following feature. Suppose that your
credence function is non-probabilistic, but it matches the evidence
that you have: that is, the credence it assigns to a proposition
matches the extent to which your evidence supports that
proposition. And suppose that none of the credence functions that
accuracy dominate your credence function have that feature. Then, we
might say, the fact that your credence function is accuracy dominated
does not rule it irrational. After all, it is dominated only by
credence functions that violate the constraints that your evidence
imposes on your credences. Thus, Easwaran and Fitelson suggest the
following decision-theoretic principle, which applies only when the
options in question are credence functions:

**Evidential Dominance** A rational agent will not
adopt a credence function that is strongly dominated by an alternative
credence function that is not itself even weakly dominated and which
matches the agent’s evidence if the dominated credence function
does.

Easwaran and Fitelson then object that there are situations in
which **Evidential Dominance** does not
entail **Probabilism**. For instance, suppose that a
trick coin is about to be tossed. Your evidence tells you that the
chance of it landing heads is 0.7. Your credence that it will lands
heads is 0.7 and your credence that it will land tails is 0.6. Then
you might think that your credences match your evidence, because you
have evidence only about it landing heads and your credence that it
will land heads equals the known chance that it will land
heads. However, it turns out that all of the credence functions that
accuracy dominate your credence function (when accuracy is measured by
the Brier score) fail to match this evidence: that is, they assign
credence other than 0.7 to *Heads*. Thus, **Evidential
Dominance** does not entail that your credence function is
irrational. Figure 2 illustrates this
result. Pettigrew (2014a) responds to this objection on behalf of the
accuracy argument for **Probabilism**.

Figure 2: In this figure, as
in Figure 1, we plot the various possible credence
functions defined on a proposition *Heads* and its
negation *Tails* in the unit square. The diagonal line contains
all and only the probability functions. Let \(c^*\) be your credence
function: that is, it assigns 0.7 to *Heads* and 0.6
to *Tails*. So it violates **Probabilism**. The
credence functions that lie between the two arcs are all and only the
credence functions that accuracy dominate \(c^*\). The credence
functions on the dashed line are all and only the credence functions
that match your evidence that the chance of *Heads* is
0.7. Notice that the dashed line does not overlap with the set of
credence functions that accuracy dominate yours at any point. This is
the crucial fact on which Easwaran and Fitelson’s objection
rests.
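
The fact on which the objection rests can be checked with a grid search. This Python sketch assumes the unnormalized Brier score and a grid of step 0.01 over the unit square, so it illustrates rather than proves the claim; the function names are hypothetical.

```python
from fractions import Fraction

WORLDS = {"Heads": (1, 0), "Tails": (0, 1)}  # truth values of (Heads, Tails)

def brier(c, w):
    """Unnormalized Brier inaccuracy of the credence pair c at world w."""
    return sum((wi - ci) ** 2 for ci, wi in zip(c, w))

def accuracy_dominates(c1, c2):
    """c1 is less inaccurate than c2 whether the coin lands heads or tails."""
    return all(brier(c1, w) < brier(c2, w) for w in WORLDS.values())

# Credence 0.7 in Heads and 0.6 in Tails, as in the example.
c_star = (Fraction(7, 10), Fraction(3, 5))

steps = [Fraction(i, 100) for i in range(101)]
dominators = [(h, t) for h in steps for t in steps
              if accuracy_dominates((h, t), c_star)]

# Some grid points accuracy dominate c_star ...
print(len(dominators) > 0)
# ... but none of them assigns credence 0.7 to Heads.
print(any(h == Fraction(7, 10) for h, _ in dominators))
```

On this grid every dominating credence function assigns less than 0.7 to *Heads*, which is why none of them matches the evidence about the chance.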

#### 5.4.4 Dominance and Act-State Dependence

The final objection to **Naive Dominance** comes from Hilary
Greaves (2013) and Michael Caie (2013), who point out that, in
practical decision theory, only a restricted version of that principle
is accepted (see also Jenkins 2007; Berker 2013a,b; Carr ms.). To see
why such a restriction is needed, consider the following case:

**Driving Test** My driving test is in a week’s
time. I can choose now whether or not I will practise for it. Other
things being equal, I prefer not to practise. But I also want to pass
the test, and I know that I won’t pass if I don’t
practise, and I will pass if I do. Here is my decision table:

| | Pass | Fail |
|---|---|---|
| Practise | 10 | 2 |
| Don’t Practise | 15 | 7 |

According to **Naive Dominance**, it is irrational
to practise. After all, whether or not I pass or fail, I obtain higher
utility if I don’t practise, so not practising strongly
dominates practising. But this is clearly the wrong result. The reason
is that I should not compare practising at the world at which I pass
with not practising at that world, and practising at the world at
which I fail with not practising at that world. For if I practise, I
will pass; and if I don’t, I will fail. Moreover, I know all
this. So I should compare practising at the world at which I pass with
not practising at the world at which I fail. And then my utility is
higher if I practise.
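
The contrast between the two comparisons can be put in a few lines of Python. The utilities are those of the decision table; the dictionary encoding and the function names are assumptions for illustration.

```python
# Utilities from the decision table: utility[act][state].
utility = {
    "Practise":       {"Pass": 10, "Fail": 2},
    "Don't Practise": {"Pass": 15, "Fail": 7},
}

def strongly_dominates(a, b):
    """Naive Dominance comparison: act a beats act b state by state."""
    return all(utility[a][s] > utility[b][s] for s in utility[a])

# Act-state dependence stipulated in the example: practising guarantees a pass,
# and not practising guarantees a fail.
resulting_state = {"Practise": "Pass", "Don't Practise": "Fail"}

def realised_utility(act):
    """Utility of an act at the state that the act itself brings about."""
    return utility[act][resulting_state[act]]

print(strongly_dominates("Don't Practise", "Practise"))
print(realised_utility("Practise"), realised_utility("Don't Practise"))
```

The state-by-state comparison favours not practising, while the comparison of each act at the state it brings about favours practising (10 against 7).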

The moral of this example is that **Naive Dominance** should be
restricted so that it applies only in situations in which the options
between which the agent is choosing will not influence the way the
world is if they are adopted. Such situations are sometimes called
situations of *act-state independence*. In situations in which
the acts (options) influence the states (of the
world), **Naive
Dominance** does not apply. To see how this affects the
accuracy argument for **Probabilism**, consider the
following example, which borrows from Caie’s and Greaves’
examples:

**Thwarted Accuracy** Suppose I can read your
mind. You have opinions only about two propositions, \(A\) and \(\neg
A\). And suppose that I have control over the truth of \(A\) and
\(\neg A\). I decide to do the following. First, define the
non-probabilistic credence function \(c^\dag(A) = 0.99\) and
\(c^\dag(\neg A) = 0.005\). Then:

- If your credence function is \(c^\dag\), I will make \(A\) true (and thereby make your credence function very accurate);
- If your credence function is not \(c^\dag\) and your credence in \(A\) is greater than 0.5, I will make \(A\) false (and thereby make your credence function rather inaccurate);
- If your credence function is not \(c^\dag\) and your credence in \(A\) is at most 0.5, I will make \(A\) true (and thereby make your credence function rather inaccurate).

In this case, since the credence function \(c^\dag\) is not a
probability function, it is accuracy dominated by Joyce’s
theorem and thus it is ruled out as irrational
by **Naive
Dominance**, just as the option of practising is ruled out
as irrational in Driving Test. However, this is a situation in which
adopting an option influences the way the world is in such a way that
it affects the utility of the option, just as choosing whether or not
to practise does in Driving Test. If you were to have credence function
\(c^\dag\), you would be more accurate than you would be were you to have
any other credence function. Thus, it seems that, just as we said that
practising is in fact the only option that shouldn’t be ruled
irrational in Driving Test, so now we must say that credence function
\(c^\dag\) is the only option that shouldn’t be ruled irrational
in Thwarted Accuracy. But of course, it then follows
that **Probabilism** is false, for there are situations
such as this one in which it is irrational to do anything other than
have a non-probabilistic credence function.
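
A short Python sketch makes the tension explicit. It assumes that accuracy is measured by the unnormalized Brier score, which the example itself does not specify, and it compares \(c^\dag\) with alternatives drawn from a grid of step 0.01; `c_prime`, the grid, and the function names are hypothetical choices.

```python
from fractions import Fraction

WORLDS = ((1, 0), (0, 1))  # truth values of (A, not-A)
C_DAGGER = (Fraction(99, 100), Fraction(1, 200))  # credences 0.99 and 0.005

def brier(c, w):
    """Unnormalized Brier inaccuracy of the credence pair c at world w."""
    return sum((wi - ci) ** 2 for ci, wi in zip(c, w))

def world_given(c):
    """World that the mind reader brings about in response to credences c."""
    if c == C_DAGGER:
        return (1, 0)   # A is made true
    if c[0] > Fraction(1, 2):
        return (0, 1)   # A is made false
    return (1, 0)       # A is made true

def realised_inaccuracy(c):
    """Inaccuracy of c at the world that adopting c brings about."""
    return brier(c, world_given(c))

# World by world, c-dagger is dominated, for instance by this probabilistic pair.
c_prime = (Fraction(397, 400), Fraction(3, 400))
naively_dominated = all(brier(c_prime, w) < brier(C_DAGGER, w) for w in WORLDS)

# Yet every alternative on the grid ends up far less accurate than c-dagger.
steps = [Fraction(i, 100) for i in range(101)]
best_alternative = min(realised_inaccuracy((a, b)) for a in steps for b in steps)

print(naively_dominated, realised_inaccuracy(C_DAGGER), best_alternative)
```

So \(c^\dag\) is accuracy dominated when each world is held fixed, but once the dependence of the world on the credence function is taken into account, its realised inaccuracy is far lower than that of any alternative on the grid.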

There are three responses available here: the first is to bite the
bullet, accept the restriction on **Naive Dominance**, and therefore
accept a restriction on the cases in
which **Probabilism** holds; the second is to argue that
the practical case and the epistemic case are different, with
different decision-theoretic principles applying to each; the third,
of course, is to abandon the accuracy argument
for **Probabilism**. Joyce (forthcoming) and Pettigrew (forthcoming-b) argue for the first response. They advocate different
decision-theoretic principles to replace **Naive Dominance** in the epistemic
case: Joyce advocates standard causal decision theory together with a
Ratifiability condition (Jeffrey 1983); Pettigrew omits the
ratifiability condition. But they both agree that these principles
will agree with **Naive Dominance** in cases of
act-state independence; and they agree with the verdict that
\(c^\dag\) is the only credence function that isn’t ruled out as
irrational in Thwarted Accuracy. Konek & Levinstein (ms) argue for
the second response, claiming that, since doxastic states and actions
have different directions of fit, different decision-theoretic
principles will govern them. They hold
that **Naive
Dominance** (or, perhaps, **Undominated
Dominance**) is the correct principle when the options are
credence functions, even though it is not the correct principle when
the options are actions. Caie (2013) and Berker (2013b), on the other
hand, argue for the third option.

## 6. Epistemic disutility arguments

So far, we have considered calibration arguments and accuracy
arguments for **Probabilism**. In each of these cases, we
identify a particular feature of a credence function (the
proximity of its credences to being calibrated, or their proximity to
the omniscient credences), we claim that it is the source of all
epistemic utility, and we attempt to characterize the mathematical
functions that legitimately measure the extent to which the credence
function has that feature. In this section, we consider an argument,
due again to Joyce, that attempts to characterize epistemic disutility
functions directly (Joyce 2009). Here, I focus only on the central
condition:

**Coherent Admissibility** Suppose \(\mathfrak{D}\) is
a measure of epistemic disutility. Then, if \(c^*\) is a probabilistic
credence function, then \(c^*\) is not weakly dominated relative to
\(\mathfrak{D}\). That is, for any probabilistic credence function
\(c^*\), there is no credence function \(c'\) such that (i)
\(\mathfrak{D}(c', w) \leq \mathfrak{D}(c^*, w)\) for all \(w\); and
(ii) \(\mathfrak{D}(c', w) < \mathfrak{D}(c^*, w)\) for some
\(w\).

Together with **Undominated
Dominance**, Joyce’s new set of conditions on an
epistemic disutility function
entails **Probabilism**. Let’s say
that **Joycean Disutility** is the claim that all
legitimate measures of epistemic disutility satisfy **Coherent
Admissibility** along with the other new conditions that Joyce
imposes. Then we have:

**Theorem 5 (Joyce 2009)** Suppose \(\mathcal{F}\) is
an algebra and \(\mathfrak{D} : \mathcal{C_F} \times \mathcal{W_F}
\rightarrow [0, \infty]\) is a Joycean epistemic disutility function
for the credence functions on \(\mathcal{F}\). Now suppose that
\(c^*\) is a credence function in \(\mathcal{C_F}\) that
violates **Probabilism**. Then there is a credence
function \(c'\) in \(\mathcal{C_F}\) such that (i) \(\mathfrak{D}(c',
w) < \mathfrak{D}(c^*, w)\) for all \(w\) in \(\mathcal{W_F}\), and
(ii) there is no credence function \(c\) such that \(\mathfrak{D}(c,
w) \leq \mathfrak{D}(c', w)\) for all \(w\) in \(\mathcal{W_F}\) and
\(\mathfrak{D}(c, w) < \mathfrak{D}(c', w)\) for some \(w\) in
\(\mathcal{W_F}\).

Thus, we have the following argument:

**Joycean epistemic disutility argument for
Probabilism**

- \((1)\) Joycean Disutility
- \((2)\) Undominated Dominance
- \((3)\) Theorem 5
- Therefore,
- \((4)\) Probabilism

Joyce argues for **Coherent Admissibility** as
follows.

- \((1)\) For each probabilistic credence function \(c\), there is a possible world at which \(c\) is the objective chance function.
- \((2)\) If an agent learns with certainty that \(c\) is the objective chance function, and nothing more, then the unique rational response to her evidence is to set her credence function to \(c\). (This is close to David Lewis’ Principal Principle (Lewis 1980).)
- \((3)\) Thus, by (1) and (2): for each probabilistic credence function \(c\), there is an evidential situation in which an agent might find herself such that \(c\) is the unique rational response to that evidential situation.
- \((4)\) Thus, by (3): Let \(c^*\) be a probabilistic credence function. Then there is an evidential situation in which \(c^*\) is the unique rational response.
- \((5)\) If \(c'\) weakly dominates \(c^*\) relative to a legitimate measure of epistemic disutility, and \(c^*\) is rationally permitted, then \(c'\) is also rationally permitted.
- \((6)\) Thus, by (4) and (5): if \(c^*\) is weakly dominated, there is no evidential situation in which \(c^*\) is the unique rational response.
- Therefore,
- \((7)\) \(c^*\) is not weakly dominated relative to any legitimate measure of epistemic disutility.

Alan Hájek (2008) has raised two objections to this argument.

Objection 1: *Not all probabilistic credence functions could be
chance functions.* The first objection denies (1). As Hájek
notes, if \(c\) is defined on propositions concerning ethical matters,
or mathematical matters, or aesthetic matters, or facts about the
current time or the agent’s current location, it is not clear
that it could possibly be the chance function of any world, since
chances cannot attach to these sorts of proposition. Pettigrew (2014b:
5.2.1) replies on Joyce’s behalf.

Objection 2: *The argument overgenerates.* The second
objection claims that, in the absence of **Probabilism**,
which is supposed to be the conclusion of the argument of
which **Coherent Admissibility** is a crucial part, this
argument overgenerates. Consider, for instance, the following
claim:

- (\(2'\)) If an agent learns with certainty that \(c\) is the credence function that constitutes the unique rational response to her evidence at that time, and nothing more, then the unique rational response is to set her credence function to \(c\).

Now, suppose \(c^\dag\) is a non-probabilistic credence function
and apply the version of Joyce’s argument that results from
replacing (2) with (2’). That is, we assume that it is possible
that the agent learns with certainty that \(c^\dag\) is the unique
rational response to her evidence, even if in fact it is not. We might
assume, for instance, that a mischievous God whispers in the
agent’s ear that this is the case. Then we must conclude that
\(c^\dag\) is not weakly dominated relative to any legitimate measure
of epistemic disutility. But now we have that no credence function is
weakly dominated, whether it is probabilistic or not. And, combined
with Joyce’s other considerations, this is impossible. If no
probabilistic credence functions are weakly dominated relative to an
epistemic disutility function, then all of the non-probabilistic
credence functions are: that’s the lesson of Theorem 5 above. Of course, the natural
response to this objection is to note that (2’) only holds when
\(c\) is a probabilistic credence function. But such a restriction is
unmotivated until we have
established **Probabilism**.

## 7. Related issues

That completes our survey of the existing literature on the epistemic utility arguments for Probabilism. We have considered three families of argument: calibration arguments, accuracy arguments, and epistemic disutility arguments. In this final section, we briefly consider ways in which the argument strategy employed here (and described in section 2) might be generalised.

### 7.1 Infinite probability spaces

We have assumed throughout that the set of propositions on which an agent’s credence function is defined is finite. What happens when we lift this restriction? Can we justify Countable Additivity, for instance? Some work has been done in this area, but there is great scope for further investigation (Easwaran 2013; Huttegger 2013; Konek ms.).

### 7.2 Other principles of rationality for credences

We have focussed here on the synchronic coherence principle
of **Probabilism**. But there are many other principles
that are thought to govern rational credence. It is natural to ask
whether we can give similar arguments for those. As we saw above
in section 5.2, a number of epistemic norms
have been explored in this framework, but of course there are many
more still to consider.

### 7.3 Other doxastic states

In this entry, we have considered agents represented as having precise credence functions. But there are, of course, many other models of doxastic states that are considered in current epistemology. As mentioned at the outset, we might represent an agent by the set of propositions that they believe; or we might represent them using a set of precise probability functions; or a comparative confidence ordering; or a precise primitive conditional probability function. And, when agents are modelled in these ways, there are principles of rationality that apply to them. Are there accuracy arguments in favour of those principles? See Easwaran 2015 and Easwaran & Fitelson forthcoming for some work on this question for the case of full beliefs. And see Seidenfeld, Schervish, & Kadane 2012, Schoenfield 2015, and Mayo-Wilson & Wheeler forthcoming for results that suggest that it may be difficult to extend the framework to the case of imprecise credences.

## Bibliography

- Ahlstrom-Vij, K. & J. Dunn (eds.), forthcoming, *Epistemic Consequentialism*, Oxford: Oxford University Press.
- Berker, S., 2013a, “Epistemic Teleology and the Separateness of Propositions”, *Philosophical Review*, 122(3): 337–393.
- –––, 2013b, “The Rejection of Epistemic Consequentialism”, *Philosophical Issues (Supp. Noûs)*, 23(1): 363–387.
- BonJour, L., 1985, *The Structure of Empirical Knowledge*, Cambridge, MA: Harvard University Press.
- Caie, M., 2013, “Rational Probabilistic Incoherence”, *Philosophical Review*, 122(4): 527–575.
- Carr, J., ms., *Epistemic Utility Theory and the Aim of Belief*.
- Easwaran, K., 2013, “Expected Accuracy Supports Conditionalization—and Conglomerability and Reflection”, *Philosophy of Science*, 80(1): 119–142.
- –––, 2015, “Dr Truthlove, Or: How I Learned to Stop Worrying and Love Bayesian Probabilities”, *Noûs*, doi:10.1111/nous.12099.
- Easwaran, K. & B. Fitelson, 2012, “An ‘evidentialist’ worry about Joyce’s argument for Probabilism”, *Dialectica*, 66(3): 425–433.
- –––, forthcoming, “Accuracy, Coherence, and Evidence”, *Oxford Studies in Epistemology*, 5. [preprint of Easwaran & Fitelson forthcoming]
- Fraassen, B.C. van, 1983, “Calibration: Frequency Justification for Personal Probability”, in R.S. Cohen & L. Laudan (eds.), *Physics, Philosophy, and Psychoanalysis*, Dordrecht: Springer.
- Goldman, A.I., 2002, *Pathways to Knowledge: Private and Public*, New York: Oxford University Press.
- Greaves, H., 2013, “Epistemic Decision Theory”, *Mind*, 122(488): 915–952.
- Greaves, H. & D. Wallace, 2006, “Justifying Conditionalization: Conditionalization Maximizes Expected Epistemic Utility”, *Mind*, 115(459): 607–632.
- Harman, G., 1973, *Thought*, Princeton, NJ: Princeton University Press.
- Hájek, A., 2008, “Arguments For—Or Against—Probabilism?”, *The British Journal for the Philosophy of Science*, 59(4): 793–819.
- –––, 2009, “Fifteen Arguments against Hypothetical Frequentism”, *Erkenntnis*, 70: 211–235.
- Horowitz, S., 2014, “Immoderately rational”, *Philosophical Studies*, 167: 41–56.
- Huttegger, S.M., 2013, “In Defense of Reflection”, *Philosophy of Science*, 80(3): 413–433.
- Jeffrey, R., 1965, *The Logic of Decision*, New York: McGraw-Hill.
- Jeffrey, R., 1983, *The Logic of Decision* (2nd ed.), Chicago; London: University of Chicago Press.
- Jenkins, C.S., 2007, “Entitlement and Rationality”, *Synthese*, 157: 25–45.
- Joyce, J.M., 1998, “A Nonpragmatic Vindication of Probabilism”, *Philosophy of Science*, 65(4): 575–603.
- –––, 2009, “Accuracy and Coherence: Prospects for an Alethic Epistemology of Partial Belief”, in F. Huber & C. Schmidt-Petri (eds.), *Degrees of Belief*, Springer.
- –––, forthcoming, “The True Consequences of Epistemic Consequentialism”, in Ahlstrom-Vij & Dunn forthcoming.
- Konek, J., ms., “Probabilistic Knowledge and Cognitive Ability”.
- Konek, J. & B.A. Levinstein, ms., *The Foundations of Epistemic Decision Theory*.
- Lam, B., 2013, “Calibrated Probabilities and the Epistemology of Disagreement”, *Synthese*, 190(6): 1079–1098.
- Lange, M., 1999, “Calibration and the Epistemological Role of Bayesian Conditionalization”, *The Journal of Philosophy*, 96(6): 294–324.
- Leitgeb, H. & R. Pettigrew, 2010a, “An Objective Justification of Bayesianism I: Measuring Inaccuracy”, *Philosophy of Science*, 77: 201–235.
- –––, 2010b, “An Objective Justification of Bayesianism II: The Consequences of Minimizing Inaccuracy”, *Philosophy of Science*, 77: 236–272.
- Levinstein, B.A., 2012, “Leitgeb and Pettigrew on Accuracy and Updating”, *Philosophy of Science*, 79(3): 413–424.
- –––, 2015, “With All Due Respect: The Macro-Epistemology of Disagreement”, *Philosophers’ Imprint*, 15(3): 1–20.
- Lewis, D., 1980, “A Subjectivist’s Guide to Objective Chance”, in R.C. Jeffrey (ed.), *Studies in Inductive Logic and Probability* (Vol. II), Berkeley: University of California Press.
- Maher, P., 1993, *Betting on Theories*, Cambridge: Cambridge University Press.
- –––, 2002, “Joyce’s Argument for Probabilism”, *Philosophy of Science*, 69(1): 73–81.
- Mayo-Wilson, C. & G. Wheeler, forthcoming, “Scoring Imprecise Credences: A Mildly Immodest Proposal”, *Philosophy and Phenomenological Research*. [preprint of Mayo-Wilson & Wheeler forthcoming]
- Moss, S., 2011, “Scoring Rules and Epistemic Compromise”, *Mind*, 120(480): 1053–1069.
- Pettigrew, R., 2010, “Modelling uncertainty”, *Grazer Philosophische Studien*, 80.
- –––, 2013a, “A New Epistemic Utility Argument for the Principal Principle”, *Episteme*, 10(1): 19–35.
- –––, 2013b, “Epistemic Utility and Norms for Credence”, *Philosophy Compass*, 8(10): 897–908.
- –––, 2014a, “Accuracy and Evidence”, *Dialectica*.
- –––, 2014b, “Accuracy, Risk, and the Principle of Indifference”, *Philosophy and Phenomenological Research*.
- –––, forthcoming-a, *Accuracy and the Laws of Credence*, Oxford: Oxford University Press.
- –––, forthcoming-b, “Making Things Right: the true consequences of decision theory in epistemology”, in Ahlstrom-Vij & Dunn forthcoming. [draft of Pettigrew forthcoming-b]
- Rosenkrantz, R.D., 1981, *Foundations and Applications of Inductive Probability*, Atascadero, CA: Ridgeview Press.
- Schoenfield, M., ms., “Conditionalization does not (in general) Maximize Expected Accuracy”.
- –––, 2015, “The Accuracy and Rationality of Imprecise Credences”, *Noûs*, doi:10.1111/nous.12105.
- Seidenfeld, T., 1985, “Calibration, Coherence, and Scoring Rules”, *Philosophy of Science*, 52(2): 274–294.
- Seidenfeld, T., M.J. Schervish, & J.B. Kadane, 2012, “Forecasting with imprecise probabilities”, *International Journal of Approximate Reasoning*, 53: 1248–1261.
- Shimony, A., 1988, “An Adamite Derivation of the Calculus of Probability”, in J. Fetzer (ed.), 1988, *Probability and Causality: Essays in Honor of Wesley C. Salmon*, Dordrecht: Reidel.
- White, R., 2009, “Evidential Symmetry and Mushy Credence”, *Oxford Studies in Epistemology*, 3: 161–186.
- Williamson, T., 2000, *Knowledge and its Limits*, Oxford: Oxford University Press.

## Other Internet Resources

- Buchak, Lara and Branden Fitelson, “Separability Assumptions in Scoring-Rule-Based Arguments for Probabilism”, slides from a talk presented at the Second Formal Epistemology Festival: Causal Decision Theory and Scoring Rules, University of Michigan, Ann Arbor, May 29–31, 2009.
- J.R.G. Williams, “Gradational Accuracy and Non-Classical Semantics”, unpublished manuscript.
- Richard Pettigrew, “Accuracy, Chance, and the Principal Principle”, slides from a talk presented in Bristol, October 2010.
- James M. Joyce, “The Accuracy of Partial Beliefs, I and II”, slides from a talk presented at the Formal Epistemology Workshop 2004, May 21–23, 2004.