# Bayesian Epistemology

*First published Thu Jul 12, 2001; substantive revision Wed Mar 26, 2008*

‘Bayesian epistemology’ became an epistemological movement
in the 20th century, though its two main features can be
traced back to the eponymous Reverend Thomas Bayes (c. 1701–61). Those
two features are: (1) the introduction of a *formal apparatus*
for inductive logic; (2) the introduction of a *pragmatic
self-defeat test* (as illustrated by Dutch Book Arguments) for
*epistemic* rationality as a way of extending the justification
of the laws of deductive logic to include a justification for the laws
of inductive logic. The formal apparatus itself has two main elements:
the use of the laws of probability as coherence constraints on
rational degrees of belief (or degrees of confidence) and the
introduction of a rule of probabilistic inference, a rule or principle
of *conditionalization*.

Bayesian epistemology did not emerge as a philosophical program
until the first formal axiomatizations of probability theory in the
first half of the 20th century. One important
application of Bayesian epistemology has been to the analysis of
scientific practice in *Bayesian Confirmation Theory*. In
addition, a major branch of statistics, *Bayesian
statistics*, is based on Bayesian principles. In psychology,
an important branch of learning theory, *Bayesian learning
theory*, is also based on Bayesian principles. Finally, the
idea of analyzing rational degrees of belief in terms of rational
betting behavior led to the 20th-century development of a
new kind of decision theory, *Bayesian decision theory*, which is
now the dominant theoretical model for both the descriptive and
normative analysis of decisions. The combination of its precise
formal apparatus and its novel pragmatic self-defeat test for
justification makes Bayesian epistemology one of the most important
developments in epistemology in the 20th century, and one of
the most promising avenues for further progress in epistemology in the
21st century.

- 1. Deductive and Probabilistic Coherence and Deductive and Probabilistic Rules of Inference
- 2. A Simple Principle of Conditionalization
- 3. Dutch Book Arguments
- 4. Bayes' Theorem and Bayesian Confirmation Theory
- 5. Bayesian Social Epistemology
- 6. Potential Problems
- 7. Other Principles of Bayesian Epistemology
- Bibliography
- Academic Tools
- Other Internet Resources
- Related Entries

## 1. Deductive and Probabilistic Coherence and Deductive and Probabilistic Rules of Inference

There are two ways that the laws of deductive logic have been thought
to provide rational constraints on belief: (1) Synchronically, the
laws of deductive logic can be used to define the notion of deductive
consistency and inconsistency. Deductive inconsistency so defined
determines one kind of incoherence in belief, which I refer to as
*deductive incoherence*. (2) Diachronically, the laws of
deductive logic can constrain admissible changes in belief by
providing the *deductive rules of inference*. For example,
*modus ponens* is a deductive rule of inference that requires
that one infer *Q* from premises *P* and *P*
→ *Q*.

Bayesians propose additional standards of synchronic coherence —
standards of *probabilistic coherence* — and additional rules
of inference — *probabilistic rules of inference* — in both
cases, to apply not to beliefs, but degrees of belief (degrees of
confidence). For Bayesians, the most important standards of
probabilistic coherence are the laws of probability. For more on the
laws of probability, see the following supplementary article:

Supplement on Probability Laws

For Bayesians, the most important probabilistic rule of inference is
given by a *principle of conditionalization*.

## 2. A Simple Principle of Conditionalization

If unconditional probabilities (e.g. *P*(*S*)) are taken
as primitive, the conditional probability of *S* on *T*
can be defined as follows:

Conditional Probability:

*P*(*S*/*T*) = *P*(*S*&*T*)/*P*(*T*).

By itself, the definition of conditional probability is of little epistemological significance. It acquires epistemological significance only in conjunction with a further epistemological assumption:

Simple Principle of Conditionalization:

If one begins with initial or *prior* probabilities *P*_{i}, and one acquires new evidence which can be represented as becoming certain of an evidentiary statement *E* (assumed to state the totality of one's new evidence and to have initial probability greater than zero), then rationality requires that one systematically transform one's initial probabilities to generate final or *posterior* probabilities *P*_{f} by conditionalizing on *E*; that is, where *S* is any statement, *P*_{f}(*S*) = *P*_{i}(*S*/*E*).^{[1]}

In epistemological terms, this Simple Principle of Conditionalization
requires that the effects of evidence on rational degrees of belief be analyzed
in two stages: The first is non-inferential. It is the change in the
probability of the evidence statement *E* from
*P*_{i}(*E*), assumed to be greater
than zero and less than one, to
*P*_{f}(*E*) = 1. The second is a
probabilistic inference of conditionalizing on *E* from initial
probabilities (e.g., *P*_{i}(*S*)) to
final probabilities (e.g., *P*_{f}(*S*)
= *P*_{i}(*S*/*E*)).
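The two stages can be illustrated with a minimal sketch, assuming a finite set of possible worlds and representing each statement as the set of worlds in which it is true. The function names `conditionalize` and `prob`, and the die example, are illustrative assumptions rather than anything taken from the literature.

```python
from fractions import Fraction

def prob(dist, statement):
    """Probability of a statement, given as the set of worlds where it holds."""
    return sum(p for world, p in dist.items() if world in statement)

def conditionalize(prior, evidence):
    """Return the final probabilities obtained by conditionalizing on `evidence`.

    prior: dict mapping each possible world to its initial probability P_i.
    evidence: set of worlds in which the evidence statement E is true.
    """
    p_e = prob(prior, evidence)
    if p_e == 0:
        raise ValueError("E must have initial probability greater than zero")
    # Worlds outside E drop to zero; worlds inside E are rescaled by 1 / P_i(E).
    return {
        world: (p / p_e if world in evidence else Fraction(0))
        for world, p in prior.items()
    }

# Hypothetical example: one roll of a fair six-sided die.
prior = {world: Fraction(1, 6) for world in range(1, 7)}
E = {2, 4, 6}  # evidence: the roll is even
S = {4, 5, 6}  # statement: the roll is greater than three

posterior = conditionalize(prior, E)
p_f_E = prob(posterior, E)  # first stage: E has become certain
p_f_S = prob(posterior, S)  # second stage: P_f(S) = P_i(S/E)
p_i_S_given_E = prob(prior, S & E) / prob(prior, E)
```

In this toy case the final probability of *S* agrees with the initial conditional probability computed directly from the definition, which is all the Simple Principle requires.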

Problems with the Simple Principle (to be discussed below) have led many Bayesians to qualify the Simple Principle by limiting its scope. In addition, some Bayesians follow Jeffrey in generalizing the Simple Principle to apply to cases in which one's new evidence is less than certain (also discussed below). What unifies Bayesian epistemology is a conviction that conditionalizing (perhaps of a generalized sort) is rationally required in some important contexts — that is, that some sort of conditionalization principle is an important principle governing rational changes in degrees of belief.

## 3. Dutch Book Arguments

Many arguments have been given for regarding the probability laws as
coherence conditions on degrees of belief and for taking some
principle of conditionalization to be a rule of probabilistic
inference. The most distinctively Bayesian are those referred to as
*Dutch Book Arguments*. Dutch Book Arguments represent the
possibility of a new kind of justification for epistemological
principles.

A Dutch Book Argument relies on some descriptive or normative
assumptions to connect degrees of belief with willingness to wager —
for example, a person with degree of belief *p* in sentence
*S* is assumed to be willing to pay up to and including
$*p* for a unit wager on *S* (i.e., a wager that pays $1
if *S* is true) and is willing to sell such a wager for any
price equal to or greater than $*p* (one is assumed to be equally
willing to buy or sell such a wager when the price is exactly
$*p*).^{[2]}
A *Dutch Book* is a combination of
wagers which, on the basis of deductive logic alone, can be shown to
entail a sure loss. A *synchronic Dutch Book* is a Dutch Book
combination of wagers that one would accept all at the same time. A
*diachronic Dutch Book* is a Dutch Book combination of wagers
that one will be motivated to enter into at different times.
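A synchronic Dutch Book can be sketched numerically under the betting assumption just described. The example below assumes a hypothetical agent whose degrees of belief in *S* and in ~*S* sum to more than one; the function name `net_payoff` and the specific numbers are illustrative assumptions.

```python
from fractions import Fraction

def net_payoff(price_s, price_not_s, s_is_true):
    """Net gain for an agent who buys a unit wager on S and a unit wager on not-S.

    Each unit wager pays $1 if the statement it is on turns out true, and
    nothing otherwise, so exactly one of the two wagers pays off.
    """
    winnings = (1 if s_is_true else 0) + (0 if s_is_true else 1)
    return winnings - (price_s + price_not_s)

# Hypothetical incoherent agent: degree of belief 3/5 in S and 3/5 in not-S.
# By the betting assumption, the agent is willing to pay those prices.
incoherent = {
    truth: net_payoff(Fraction(3, 5), Fraction(3, 5), truth)
    for truth in (True, False)
}

# Hypothetical coherent agent: degrees of belief 3/5 and 2/5, summing to one.
coherent = {
    truth: net_payoff(Fraction(3, 5), Fraction(2, 5), truth)
    for truth in (True, False)
}
```

The incoherent agent loses the same amount whether *S* is true or false, which is what makes the combination a Dutch Book; the coherent agent breaks even on the same pair of wagers.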

Ramsey and de Finetti first employed synchronic Dutch Book Arguments
in support of the probability laws as standards of synchronic
coherence for degrees of belief. The first diachronic Dutch Book
Argument in support of a principle of conditionalization was reported
by Teller, who credited David Lewis. The Lewis/Teller argument depends
on a further descriptive or normative assumption about conditional
probabilities due to de Finetti: An agent with conditional probability
*P*(*S*/*T*) = *p* is assumed to be
willing to pay any price up to and including $*p* for a unit
wager on *S* conditional on *T*. (A unit wager on
*S* conditional on *T* is one that is called off, with
the purchase price returned to the purchaser, if *T* is not
true. If *T* is true, the wager is not called off and the wager
pays $1 if *S* is also true.) On this interpretation of
conditional probabilities, Lewis, as reported by Teller, was able to
show how to construct a diachronic Dutch Book against anyone who, on
learning only that *T*, would predictably change his/her degree
of belief in *S* to *P*_{f}(*S*)
> *P*_{i}(*S*/*T*); and how
to construct a diachronic Dutch Book against anyone who, on learning
only that *T*, would predictably change his/her degree of
belief in *S* to *P*_{f}(*S*)
< *P*_{i}(*S*/*T*). For
illustrations of the strategy of the Ramsey/de Finetti and the
Lewis/Teller arguments, see the following supplementary article:

Supplement on Dutch Book Arguments

There has been much discussion of exactly what it is that Dutch Book
Arguments are supposed to show. On the *literal-minded
interpretation*, their significance is that they show that those
whose degrees of belief violate the probability laws or those whose
probabilistic inferences predictably violate a principle of
conditionalization are liable to enter into wagers on which they are
sure to lose. There is very little to be said for the literal-minded
interpretation, because there is no basis for claiming that
rationality requires that one be willing to wager in accordance with
the behavioral assumptions described above. An agent could simply
refuse to accept Dutch Book combinations of wagers.

One of the main motivations for Jeffrey's new approach to the
foundations of decision theory in *Logic of Decision* was his
dissatisfaction with the identification of subjective probability with
betting ratios. For example, no matter what one's degree of belief in
the proposition that all human life will be destroyed within the next
ten years, it would not be rational to offer to buy a bet on its
truth. Williamson extends de Finetti's Dutch Book Argument for a
finite additivity constraint on rational degrees of belief to produce
an argument for a countable additivity constraint on degrees of
belief, but the argument is better interpreted as a reductio of the
literal-minded interpretation of Dutch Book Arguments than as an
argument for the rationality of a countable additivity constraint.
The rational response to offers to bet on the proposition that all
life will be destroyed within the next ten years or to bet on a single
possible outcome in a countably infinite set of equiprobable possible
outcomes is simply not to.

A more plausible interpretation of Dutch Book Arguments is that they
are to be understood hypothetically, as symptomatic of what has been
termed *pragmatic self-defeat*. On this interpretation, Dutch
Book Arguments are a kind of heuristic for determining when
one's degrees of belief have the potential to be
*pragmatically self-defeating*. The problem is not that one
who violates the Bayesian constraints is likely to enter into a
combination of wagers that constitute a Dutch Book, but that, on any
reasonable way of translating one's degrees of belief into
action, there is a potential for one's degrees of belief to
motivate one to act in ways that make things worse than they might
have been, when, as a matter of logic alone, it can be determined
that alternative actions would have made things better (on one's
own evaluations of better and worse).

Another way of understanding the problem of susceptibility to a Dutch
Book is due to Ramsey: Someone who is susceptible to a Dutch Book
evaluates identical bets differently based on how they are described.
Putting it this way makes susceptibility to Dutch Books sound
irrational. But this standard of rationality would make it
irrational not to recognize all the logical consequences of what one
believes. This is the *assumption of logical omniscience*
(discussed below).

If successful, Dutch Book Arguments would reduce the justification of the principles of Bayesian epistemology to two elements: (1) an account of the appropriate relationship between degrees of belief and choice; and (2) the laws of deductive logic. Because it would seem that the truth about the appropriate relationship between the degrees of belief and choice is independent of epistemology, Dutch Book Arguments hold out the potential of justifying the principles of Bayesian epistemology in a way that requires no other epistemological resources than the laws of deductive logic. For this reason, it makes sense to think of Dutch Book Arguments as indirect, pragmatic arguments for according the principles of Bayesian epistemology much the same epistemological status as the laws of deductive logic. Dutch Book Arguments are a truly distinctive contribution made by Bayesians to the methodology of epistemology.

It should also be mentioned that some Bayesians have defended their
principles more directly, with non-pragmatic arguments. In addition
to reporting Lewis's Dutch Book Argument, Teller offers a
non-pragmatic defense of Conditionalization. There have been many
proposed non-pragmatic defenses of the probability laws (e.g., van
Fraassen; Shimony). The most compelling is due to Joyce. All such
defenses, whether pragmatic or non-pragmatic, produce a puzzle for
Bayesian epistemology: The principles of Bayesian epistemology are
typically proposed as principles of *inductive* reasoning. But
if the principles of Bayesian epistemology depend ultimately for their
justification solely on the laws of deductive logic, what reason is
there to think that they have any *inductive* content? That is
to say, what reason is there to believe that they do anything more
than extend the laws of deductive logic from beliefs to degrees of
belief? It should be mentioned, however, that even if Bayesian
epistemology only extended the laws of deductive logic to degrees of
belief, that alone would represent an extremely important advance in
epistemology.

## 4. Bayes' Theorem and Bayesian Confirmation Theory

This section reviews some of the most important results in the
Bayesian analysis of scientific practice — *Bayesian Confirmation
Theory*. It is assumed that all statements to be evaluated have
prior probability greater than zero and less than one.

### 4.1 Bayes' Theorem and a Corollary

Bayes' Theorem is a straightforward consequence of the probability axioms and the definition of conditional probability:

Bayes' Theorem:

*P*(*S*/*T*) = *P*(*T*/*S*) × *P*(*S*)/*P*(*T*) [where *P*(*T*) is assumed to be greater than zero]

The epistemological significance of Bayes' Theorem is that it
provides a straightforward corollary to the Simple Principle of
Conditionalization. Where the final probability of a hypothesis
*H* is generated by conditionalizing on evidence *E*,
Bayes' Theorem provides a formula for the final probability of
*H* in terms of the prior or initial *likelihood* of
*H* on *E*
(*P*_{i}(*E*/*H*)) and the prior
or initial probabilities of *H* and *E*:

Corollary of the Simple Principle of Conditionalization:

*P*_{f}(*H*) = *P*_{i}(*H*/*E*) = *P*_{i}(*E*/*H*) × *P*_{i}(*H*)/*P*_{i}(*E*).

Due to the influence of Bayesianism, *likelihood* is now a
technical term of art in confirmation theory. As used in this
technical sense, likelihoods can be very useful. Often, when the
conditional probability of *H* on *E* is in doubt, the
likelihood of *H* on *E* can be computed from the
theoretical assumptions of *H*.
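As a worked illustration of the corollary, the sketch below computes a final probability from a prior and two likelihoods. It assumes that *P*_{i}(*E*) is obtained from the law of total probability over *H* and ~*H*, a standard step that is not spelled out above; the function name `posterior` and the numbers are hypothetical.

```python
from fractions import Fraction

def posterior(prior_h, like_e_given_h, like_e_given_not_h):
    """Final probability of H after conditionalizing on E.

    Assumes P_i(E) = P_i(E/H) * P_i(H) + P_i(E/~H) * P_i(~H),
    the law of total probability over H and its negation.
    """
    p_e = like_e_given_h * prior_h + like_e_given_not_h * (1 - prior_h)
    return like_e_given_h * prior_h / p_e

# Hypothetical numbers: H starts out improbable, but E is much more
# probable if H is true than if H is false.
prior_h = Fraction(1, 100)
p_f_h = posterior(prior_h, Fraction(9, 10), Fraction(1, 10))
```

With these inputs the final probability of *H* is 1/12: higher than the prior of 1/100, yet still low, because the low prior offsets the favorable likelihoods.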

### 4.2 Bayesian Confirmation Theory

**A. Confirmation and disconfirmation.** In Bayesian
Confirmation Theory, it is said that evidence *E* confirms (or would
confirm) hypothesis *H* (to at least some degree) just in case
the prior probability of *H* conditional on *E* is
greater than the prior unconditional probability of *H*:
*P*_{i}(*H*/*E*) >
*P*_{i}(*H*). *E* disconfirms
(or would disconfirm) *H* if the prior probability of
*H* conditional on *E* is less than the prior
unconditional probability of *H*.

This is a qualitative conception of confirmation. There is no general
agreement in the literature on a quantitative measure of degree of
confirmation or degree of evidential support. Earman (chap. 5) and
Fitelson both provide a good overview of the various proposals. It
might be thought that *the degree to which evidence E supports (or
would support) hypothesis H* could be defined as
*P*_{i}(*H*/*E*) −
*P*_{i}(*H*). One potential problem
with this proposal is that it has the consequence that no evidence can
provide much evidential support to a hypothesis that is antecedently
very probable, because as the probability of *H* approaches
one, the difference goes to zero. Eells and Fitelson have argued that
this apparently counterintuitive consequence can be avoided by
distinguishing the historical question of how much a piece of evidence
*E* actually contributed to the confirmation of *H*
(which, of course, would have to be small if *H* were antecedently
highly probable) from the question of the degree of evidential support
*E* provides for *H*, the answer to which, they propose,
is relative to the background information. So even if *H* is
very probable at the time that evidence *E* is acquired, we can
ask how much evidential support *E* would provide for
*H* if we had no other evidence supporting *H*. Eells
and Fitelson have also provided a useful framework for evaluating the
various proposals in the literature, a framework within which most of
them are found to be wanting.
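The ceiling on the difference measure mentioned above can be made explicit. Since no conditional probability exceeds one, the following bound holds (a one-line sketch, using the notation of this section):

```latex
P_i(H/E) - P_i(H) \;\le\; 1 - P_i(H) \;\longrightarrow\; 0
\quad \text{as } P_i(H) \to 1
```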

**B. Confirmation and disconfirmation by entailment.**
Whenever a hypothesis *H* logically entails evidence
*E*, *E* confirms *H*. This follows from the fact
that to determine the truth of *E* is to rule out a possibility
assumed to have non-zero prior probability that is incompatible with
*H* — the possibility that ~*E*. A corollary is
that, where *H* entails *E*, ~*E* would
disconfirm *H*, by reducing its probability to zero. The most
influential model of explanation in science is the
hypothetico-deductive model (e.g., Hempel). Thus, one of the most
important sources of support for Bayesian Confirmation Theory is that
it can explain the role of hypothetico-deductive explanation in
confirmation.
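Both claims about entailment can be checked against Bayes' Theorem. The derivation below is a sketch that relies on the section's standing assumption that all statements under evaluation have prior probability strictly between zero and one:

```latex
\begin{aligned}
H \text{ entails } E \;&\Rightarrow\; P_i(E/H) = 1 \\
P_i(H/E) &= \frac{P_i(E/H)\, P_i(H)}{P_i(E)} = \frac{P_i(H)}{P_i(E)} > P_i(H)
  \quad \text{since } 0 < P_i(E) < 1 \\
P_i(H/{\sim}E) &= \frac{P_i({\sim}E/H)\, P_i(H)}{P_i({\sim}E)}
  = \frac{0 \times P_i(H)}{P_i({\sim}E)} = 0
\end{aligned}
```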

**C. Confirmation of logical equivalents.** If two
hypotheses *H*_{1} and *H*_{2} are logically equivalent, then evidence
*E* will confirm both equally. This follows from the fact that
logically equivalent statements always are assigned the same
probability.

**D. The confirmatory effect of surprising or diverse
evidence.** From the corollary above, it follows that whether
*E* confirms (or disconfirms) *H* depends on whether
*E* is more probable (or less probable) conditional on
*H* than it is unconditionally — that is, on whether:

(b1) *P*(*E*/*H*)/*P*(*E*) > 1.

An intuitive way of understanding (b1) is to say that it states that
*E* would be more expected (or less surprising) if it were
known that *H* were true. So if *E* is surprising, but
would not be surprising if we knew *H* were true, then
*E* will significantly confirm *H*. Thus, Bayesians
explain the tendency of surprising evidence to confirm hypotheses on
which the evidence would be expected.

Similarly, because it is reasonable to think that evidence
*E*_{1} makes other evidence of the same kind much more
probable, after *E*_{1} has been determined to be true,
other evidence of the same kind *E*_{2} will generally
not confirm hypothesis *H* as much as other diverse evidence
*E*_{3}, even if *H* is equally likely on both
*E*_{2} and *E*_{3}. The explanation is
that where *E*_{1} makes *E*_{2} much
more probable than *E*_{3}
(*P*_{i}(*E*_{2}/*E*_{1})
>>
*P*_{i}(*E*_{3}/*E*_{1})),
there is less potential for the discovery that *E*_{2}
is true to raise the probability of *H* than there is for the
discovery that *E*_{3} is true to do so.

**E. Relative confirmation and likelihood ratios.** Often
it is important to be able to compare the effect of evidence
*E* on two competing hypotheses,
*H*_{j} and *H*_{k}, without
having also to consider its effect on other hypotheses that may not be
so easy to formulate or to compare with
*H*_{j} and *H*_{k}.
From the first corollary above, the ratio of the final probabilities
of *H*_{j} and *H*_{k}
would be given by:

Ratio Formula:

*P*_{f}(*H*_{j})/*P*_{f}(*H*_{k}) = [*P*_{i}(*E*/*H*_{j}) × *P*_{i}(*H*_{j})]/[*P*_{i}(*E*/*H*_{k}) × *P*_{i}(*H*_{k})]

If the *odds of* *H*_{j} *relative to*
*H*_{k} are defined as the ratio of their probabilities, then from
the Ratio Formula it follows that, in a case in which change in
degrees of belief results from conditionalizing on *E*, the
final odds
(*P*_{f}(*H*_{j})/*P*_{f}(*H*_{k}))
result from multiplying the initial odds
(*P*_{i}(*H*_{j})/*P*_{i}(*H*_{k}))
by the *likelihood ratio*
(*P*_{i}(*E*/*H*_{j})/*P*_{i}(*E*/*H*_{k})). Thus,
in pairwise comparisons of the odds of hypotheses, the likelihood
ratio is the crucial determinant of the effect of the evidence on the
odds.
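The odds form of the update can be sketched in a few lines. The function name `final_odds` and the numbers are hypothetical; the point of the example is that *P*_{i}(*E*) never needs to be computed, so rival hypotheses other than *H*_{j} and *H*_{k} can be left unformulated.

```python
from fractions import Fraction

def final_odds(prior_j, prior_k, like_j, like_k):
    """Final odds of H_j relative to H_k after conditionalizing on E.

    prior_j, prior_k: initial probabilities P_i(H_j) and P_i(H_k).
    like_j, like_k: likelihoods P_i(E/H_j) and P_i(E/H_k).
    """
    initial_odds = prior_j / prior_k
    likelihood_ratio = like_j / like_k
    return initial_odds * likelihood_ratio

# Hypothetical numbers: H_k starts out three times as probable as H_j,
# but E is four times as probable on H_j as on H_k.
odds = final_odds(Fraction(1, 10), Fraction(3, 10), Fraction(4, 5), Fraction(1, 5))
```

Here initial odds of 1/3 multiplied by a likelihood ratio of 4 give final odds of 4/3, so the evidence reverses the ranking of the two hypotheses.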

**F. Subjective and Objective Bayesianism.** Are there
constraints on prior probabilities other than the probability laws?
Consider a situation in which you are to draw a ball from an urn
filled with red and black balls. Suppose you have no other
information about the urn. What is the prior probability (before
drawing a ball) that, given that a ball is drawn from the urn,
the drawn ball will be black? The question divides Bayesians into two
camps:

(a) *Subjective Bayesians* emphasize the relative lack of
rational constraints on prior probabilities. In the urn example, they
would allow that any prior probability between 0 and 1 might be
rational (though some Subjective Bayesians (e.g., Jeffrey) would rule
out the two extreme values, 0 and 1). The most extreme Subjective
Bayesians (e.g., de Finetti) hold that the only rational constraint on
prior probabilities is probabilistic coherence. Others (e.g.,
Jeffrey) classify themselves as subjectivists even though they allow
for some relatively small number of additional rational constraints on
prior probabilities. Since subjectivists can disagree about
particular constraints, what unites them is that their constraints
rule out very little. For Subjective Bayesians, our actual prior
probability assignments are largely the result of non-rational
factors—for example, our own unconstrained, free choice or
evolution or socialization.

(b) *Objective Bayesians* (e.g., Jaynes and Rosenkrantz)
emphasize the extent to which prior probabilities are rationally
constrained. In the above example, they would hold that rationality
requires assigning a prior probability of 1/2 to drawing a black ball
from the urn. They would argue that any other probability would fail
the following test: Since you have no information at all about which
balls are red and which balls are black, you must choose prior
probabilities that are invariant with a change in label (“red” or
“black”). But the only prior probability assignment that is invariant
in this way is the assignment of prior probability of 1/2 to each of
the two possibilities (i.e., that the ball drawn is black or that it
is red).
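The invariance test in the urn example amounts to a short calculation. As a sketch, assuming the only two possibilities are that the drawn ball is black or that it is red:

```latex
\begin{aligned}
P_i(\text{black}) + P_i(\text{red}) &= 1 \\
P_i(\text{black}) &= P_i(\text{red}) \quad \text{(invariance under swapping the labels)} \\
\Rightarrow\; P_i(\text{black}) &= P_i(\text{red}) = \tfrac{1}{2}
\end{aligned}
```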

In the limit, an Objective Bayesian would hold that rational
constraints uniquely determine prior probabilities in every
circumstance. This would make the prior probabilities *logical
probabilities* determinable purely *a priori*. None of
those who identify themselves as Objective Bayesians holds this
extreme form of the view. Nor do they all agree on precisely what the
rational constraints on degrees of belief are. For example,
Williamson does not accept Conditionalization in any form as a
rational constraint on degrees of belief. What unites all of the
Objective Bayesians is their conviction that in many circumstances,
symmetry considerations uniquely determine the relevant prior
probabilities and that even when they don't uniquely determine the
relevant prior probabilities, they often so constrain the range of
rationally admissible prior probabilities, as to assure convergence on
the relevant posterior probabilities. Jaynes identifies four general
principles that constrain prior probabilities (group invariance,
maximum entropy, marginalization, and coding theory), but he does not
consider the list exhaustive. He expects additional principles to be
added in the future. However, no Objective Bayesian claims that there
are principles that uniquely determine rational prior probabilities in
all cases.

By introducing symmetry constraints on prior probabilities, the Objective Bayesians inherit the difficulties of the classical Principle of Indifference, so-named by Keynes, but usually attributed to Laplace. The simple example of the urn illustrates how invariance considerations can be used to give content to the Principle of Indifference. There the objectivist is able to uniquely determine the prior probabilities from the requirement that the rational prior probabilities should be invariant under switching the labels used to classify the balls in the urn.

However, it is generally agreed by both objectivists and subjectivists that ignorance alone cannot be the basis for assigning prior probabilities. The reason is that in any particular case there must be some information to pick out which parameters or which transformations are the ones among which one is to be indifferent. Without such information, indifference considerations lead to paradox. Objective Bayesians have been quite creative in finding ways to resolve many of the paradoxes (e.g., Jeffreys' solution to Bertrand's Paradox, Jaynes's solution to Buffon's Needle Paradox, or Mikkelson's solution to van Mises' Paradox). But there are always more paradoxes. Charles, Höcker, Lacker, Le Diberder, and T'Jampens (Other Internet Resources) provide an actual example from physics where maximum entropy yields conflicting results depending on parameterization and where a frequentist approach seems to be superior to any Objective Bayesian approach that employs any form of Conditionalization.

**G. The typical differential effect of positive evidence and
negative evidence.** Hempel first pointed out that we typically
expect the hypothesis that all ravens are black to be confirmed to
some degree by the observation of a black raven, but not by the
observation of a non-black, non-raven. Let *H* be the
hypothesis that all ravens are black. Let *E*_{1}
describe the observation of a non-black, non-raven. Let
*E*_{2} describe the observation of a black
raven. Bayesian Confirmation Theory actually holds that both
*E*_{1} and *E*_{2} may provide some
confirmation for *H*. Recall that *E*_{1}
supports *H* just in case
*P*_{i}(*E*_{1}/*H*)/*P*_{i}(*E*_{1})
> 1. It is plausible to think that this ratio is ever so slightly
greater than one. On the other hand, *E*_{2} would seem to
provide much greater confirmation to *H*, because, in this example, it
would be expected that
*P*_{i}(*E*_{2}/*H*)/*P*_{i}(*E*_{2})
>>
*P*_{i}(*E*_{1}/*H*)/*P*_{i}(*E*_{1}).

These are only a sample of the results that have provided support
for Bayesian Confirmation Theory as a theory of rational inference for
science. For further examples, see Howson and Urbach. It
should also be mentioned that an important branch of statistics,
*Bayesian statistics*, is based on the principles of Bayesian
epistemology.

## 5. Bayesian Social Epistemology

One of the important developments in Bayesian epistemology has been the exploration of the social dimension to inquiry. The obvious example is scientific inquiry, because it is the community of scientists, rather than any individual scientist, who determine what is or is not accepted in the discipline. In addition, scientists typically work in research groups and even those who work alone rely on the reports of other scientists to be able to design and carry out their own work. Other important examples of the social dimension to knowledge include the use of juries to make factual determinations in the legal system and the decentralization of knowledge over the Internet.

There are two ways that Bayesian epistemology can be applied to social inquiry:

(1) Bayesian epistemology of testimony (understood generally, to include not only personal testimony but all media sources of information). Goldman has developed a Bayesian epistemology of testimony and applied it to social entities such as science and the legal system. In any such approach, a crucial issue is how to evaluate the reliability of the reports one receives. Goldman's approach is to focus on institutional design to motivate the production of reliable reports. Bovens and Hartmann instead try to model how, when there are reports from multiple sources, a Bayesian agent can use probabilistic reasoning to judge the reliability of the reports, and thus, how much credence to place in them. The idea that in evaluating the probability of a report we are implicitly evaluating the reliability of the reporter is developed by Barnes as a potential explanation of the prediction/accommodation asymmetry, discussed in the next section.

(2) Aggregate Bayesianism. If scientific knowledge or jury deliberations produce a group product, it is natural to consider whether the group's knowledge can be represented in aggregate form. In Bayesian terms, the question is whether the individuals' probability assignments can be usefully aggregated into a single probability assignment that reflects the group's knowledge. Although Seidenfeld, Kadane, and Schervish have shown that there is generally no way to define an aggregate Bayesian expected utility maximizer to represent the Pareto preferences of a group of two or more individual Bayesian expected utility maximizers, there is no impossibility result precluding the aggregation of individual probability assignments into a group probability assignment. However, there is no generally agreed upon rule for doing so. If a group of Bayesian individuals all had begun from the same initial probabilities, then simply sharing their evidence would lead them all to the same final probabilities. It may seem unfortunate that unanimity in science and other social endeavors cannot be achieved so easily, but Kitcher has argued that this is a mistake, because cognitive diversity plays an important role in scientific progress.

The fruitfulness of Bayesian social epistemology may ultimately depend on whether or not the idealizations of Bayesian theory are too unrealistic. For example, if one of the important effects of jury deliberations is that they tend to provide a way for the group to correct for the irrationality of individual members, then no model of jurors as ideal Bayesians is likely to be able to explain that feature of the jury system.

## 6. Potential Problems

This section reviews some of the most important potential problems for Bayesian Confirmation Theory and for Bayesian epistemology generally. No attempt is made to evaluate their seriousness here, though there is no generally agreed upon Bayesian solution to any of them.

### 6.1 Objections to the Probability Laws as Standards of Synchronic Coherence

**A. The assumption of logical omniscience.** The
assumption that degrees of belief satisfy the probability laws implies
omniscience about deductive logic, because the probability laws
require that all deductive logical truths have probability one, all
deductive inconsistencies have probability zero, and the probability
of any conjunction of sentences be no greater than that of *any* of its
deductive consequences. This seems to be an unrealistic standard for
human beings. Hacking and Garber have made proposals to relax the
assumption of logical omniscience. Because relaxing that assumption
would block the derivation of almost all the important results in
Bayesian epistemology, most Bayesians maintain the assumption of
logical omniscience and treat it as an ideal to which human beings can
only more or less approximate.

**B. The special epistemological status of the laws of
classical logic.** Even if the assumption of logical
omniscience is not too much of an idealization to provide a useful
model for human reasoning, it has another potentially troubling
consequence. It commits Bayesian epistemology to some sort of a
priori/a posteriori distinction, because there could be no Bayesian
account of how empirical evidence might make it rational to adopt a
theory with a non-classical logic. In this respect, Bayesian
epistemology carries over the presumption from traditional
epistemology that the laws of logic are immune to revision on the
basis of empirical evidence.

It is open to the Bayesian to try to downplay the significance of this consequence by articulating an a priori/a posteriori distinction that aims to be pragmatic rather than metaphysical (e.g., Carnap's analytic/synthetic distinction). However, any such account must address Quine's well-known holistic challenge to the analytic/synthetic distinction.

### 6.2 Objections to the Simple Principle of Conditionalization as a Rule of Inference and Other Objections to Bayesian Confirmation Theory

**A. The problem of uncertain evidence.** The Simple
Principle of Conditionalization requires that the acquisition of
evidence be representable as changing one's degree of belief in a
statement *E* to one — that is, to certainty. But many
philosophers would object to assigning a probability of one to any
contingent statement, even an evidential statement, because, for
example, it is well known that scientists sometimes give up previously
accepted evidence. Jeffrey has proposed a generalization of the
Principle of Conditionalization that yields that principle as a
special case. Jeffrey's idea is that what is crucial about
observation is not that it yields certainty, but that it generates a
non-inferential change in the probability of an evidential statement
*E* and its negation ~*E* (assumed to be the locus of
all the non-inferential changes in probability) from initial
probabilities between zero and one to
*P*_{f}(*E*) and
*P*_{f}(~*E*) = [1 −
*P*_{f}(*E*)]. Then on
Jeffrey's account, after the observation, the rational degree of
belief to place in an hypothesis *H* would be given by the
following principle:

Principle of Jeffrey Conditionalization:

*P*_{f}(*H*) = *P*_{i}(*H*/*E*) × *P*_{f}(*E*) + *P*_{i}(*H*/~*E*) × *P*_{f}(~*E*) [where *E* and *H* are both assumed to have prior probabilities between zero and one]

Counting in favor of Jeffrey's Principle are its theoretical elegance and the Dutch Book defense that Skyrms has given for it. Counting against it is the practical problem that it requires one to specify completely the direct non-inferential effects of an observation, something it is doubtful that anyone has ever done.
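As a minimal illustration of how Jeffrey's Principle operates, the following Python sketch computes *P*_{f}(*H*) from the initial conditional probabilities and the new probability of *E*. The function name `jeffrey_update` and the numerical values are hypothetical, chosen only for illustration. Setting *P*_{f}(*E*) to one recovers the Simple Principle of Conditionalization as a special case.

```python
def jeffrey_update(p_h_given_e, p_h_given_not_e, p_f_e):
    """Return P_f(H) = P_i(H/E) * P_f(E) + P_i(H/~E) * P_f(~E)."""
    if not 0.0 <= p_f_e <= 1.0:
        raise ValueError("p_f_e must be a probability")
    return p_h_given_e * p_f_e + p_h_given_not_e * (1.0 - p_f_e)


# Hypothetical initial conditional probabilities (illustrative values only).
p_h_given_e = 0.8
p_h_given_not_e = 0.3

# The observation raises the probability of E to 0.9 without making it certain.
uncertain = jeffrey_update(p_h_given_e, p_h_given_not_e, 0.9)

# With P_f(E) = 1 the update reduces to simple conditionalization on E.
certain = jeffrey_update(p_h_given_e, p_h_given_not_e, 1.0)

print(uncertain, certain)
```

With these values, the uncertain observation yields a final probability for *H* that lies between the two initial conditional probabilities, weighted toward *P*_{i}(*H*/*E*).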

**B. The problem of old evidence.** On a Bayesian
account, the effect of evidence *E* in confirming (or
disconfirming) a hypothesis is solely a function of the increase in
probability that accrues to *E* when it is first determined to
be true. This raises the following puzzle for Bayesian Confirmation
Theory discussed extensively by Glymour: Suppose that *E* is an
evidentiary statement that has been known for some time — that
is, that it is *old evidence*; and suppose that *H* is a
scientific theory that has been under consideration for some time. One
day it is discovered that *H* implies *E*. In scientific
practice, the discovery that *H* implies *E* would
typically be taken to provide some degree of confirmatory support for
*H*. But Bayesian Confirmation Theory seems unable to explain
how a previously known evidentiary statement *E* could provide
any new support for *H*. For conditionalization to come into play, there
must be a change in the probability of the evidence statement
*E*. Where *E* is old evidence, there is no change in
its probability. Some Bayesians (e.g., Garber) have tried to solve
this problem by weakening the logical omniscience
assumption to allow for the possibility of discovering logical
relations (e.g., that *H* and suitable auxiliary assumptions
imply *E*). As mentioned above, relaxing the logical
omniscience assumption threatens to block the derivation of almost all
of the important results in Bayesian epistemology. Other Bayesians
(e.g., Lange) employ the Bayesian formalism as a tool in the
*rational reconstruction* of the evidentiary support for a
scientific hypothesis, where it is irrelevant to the rational
reconstruction whether the evidence was discovered before or after the
theory was initially formulated. Joyce and Christensen agree that
discovering new logical relations between previously accepted evidence
and a theory cannot raise the probability of the theory. However,
they suggest that using
*P*_{i}(*H*/*E*) −
*P*_{i}(*H*/~*E*) as a measure
of support can at least explain how evidence that has probability one
could still support a theory. Eells and Fitelson have criticized this
proposal and argued that the problem is better addressed by
distinguishing two measures, the historical measure of the degree to
which a piece of evidence *E* actually confirmed an hypothesis
*H* and the ahistorical measure of how much a piece of evidence
*E* would support an hypothesis *H*, on given background
information *B*. The second measure enables us to ask the
ahistorical question of how much *E* would support *H*
if we had no other evidence supporting *H*.
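The formal core of the puzzle can be seen in a small numerical sketch. The Python code below represents a degree-of-belief function as a joint distribution over the truth values of *H* and *E*. The function names and all of the numbers are hypothetical. When *E* already has probability one, conditionalizing on *E* leaves the probability of *H* exactly where it was; when *E* is genuinely new, the same operation can raise it.

```python
def prob_h(joint):
    """Unconditional probability of H in a joint over (H, E) truth values."""
    return sum(p for (h, e), p in joint.items() if h)


def conditionalize_on_e(joint):
    """Return P(H/E) by the usual ratio definition."""
    p_e = sum(p for (h, e), p in joint.items() if e)
    if p_e == 0:
        raise ZeroDivisionError("cannot conditionalize on a probability-zero statement")
    return sum(p for (h, e), p in joint.items() if h and e) / p_e


# Old evidence: every world in which E is false already has probability zero.
old_evidence = {
    (True, True): 0.4,
    (False, True): 0.6,
    (True, False): 0.0,
    (False, False): 0.0,
}

# New evidence: E starts with probability 0.5 (illustrative values only).
new_evidence = {
    (True, True): 0.3,
    (False, True): 0.2,
    (True, False): 0.1,
    (False, False): 0.4,
}

print(prob_h(old_evidence), conditionalize_on_e(old_evidence))  # no change
print(prob_h(new_evidence), conditionalize_on_e(new_evidence))  # H is confirmed
```

The sketch uses the ratio definition of conditional probability, on which *P*(*H*/~*E*) is undefined once *E* has probability one; it therefore illustrates the puzzle itself rather than any of the proposed solutions.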

**C. The problem of rigid conditional probabilities.**
When one conditionalizes, one applies the initial conditional
probabilities to determine final unconditional probabilities.
Throughout, the conditional probabilities themselves do not change;
they remain rigid. Examples of the Problem of Old Evidence are but
one of a variety of cases in which it seems that it can be rational
to change one's initial conditional probabilities. Thus, many
Bayesians reject the Simple Principle of Conditionalization in favor
of a qualified principle, limited to situations in which one does not
change one's initial conditional probabilities. There is no
generally accepted account of when it is rational to maintain rigid
initial conditional probabilities and when it is not.
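The sense in which the conditional probabilities remain rigid can be checked numerically. The following Python sketch, with hypothetical function names and illustrative numbers, shifts the probability of *E* in a joint distribution over *H* and *E* in the manner of Jeffrey Conditionalization and then confirms that the probabilities of *H* conditional on *E* and on ~*E* are the same before and after the shift.

```python
def prob_e(joint):
    """Probability of E in a joint over (H, E) truth values."""
    return sum(p for (h, e), p in joint.items() if e)


def prob_h_given(joint, e_value):
    """Probability of H conditional on E having the truth value e_value."""
    p_cond = sum(p for (h, e), p in joint.items() if e == e_value)
    return sum(p for (h, e), p in joint.items() if h and e == e_value) / p_cond


def jeffrey_shift(joint, p_f_e):
    """Rescale the E-worlds and the ~E-worlds so that E receives probability p_f_e."""
    p_i_e = prob_e(joint)
    if not 0.0 < p_i_e < 1.0:
        raise ValueError("the initial probability of E must lie strictly between 0 and 1")
    scale = {True: p_f_e / p_i_e, False: (1.0 - p_f_e) / (1.0 - p_i_e)}
    return {(h, e): p * scale[e] for (h, e), p in joint.items()}


# Illustrative initial degrees of belief.
initial = {
    (True, True): 0.3,
    (False, True): 0.2,
    (True, False): 0.1,
    (False, False): 0.4,
}
final = jeffrey_shift(initial, 0.9)

for e_value in (True, False):
    print(e_value, prob_h_given(initial, e_value), prob_h_given(final, e_value))
```

A case in which it is rational to revise *P*(*H*/*E*) itself would fall outside what an update of this form can represent.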

**D. The problem of prediction vs. accommodation.**
Related to the problem of Old Evidence is the following potential
problem: Consider two different scenarios. In the first, theory
*H* was developed in part to *accommodate* (i.e., to
imply) some previously known evidence *E*. In the second, theory
*H* was developed at a time when *E* was not known. It
was because *E* was derived as a *prediction* from
*H* that a test was performed and *E* was found to be
true. It seems that *E*'s being true would provide a greater degree of
confirmation for *H* if the truth of *E* had been
*predicted* by *H* than if *H* had been developed
to *accommodate* the truth of *E*. There is no general
agreement among Bayesians about how to resolve this problem. Some
(e.g., Horwich) argue that Bayesianism implies that there is no
important difference between prediction and accommodation, and try to
defend that implication. Others (e.g., Maher) argue that there is a
way to understand Bayesianism so as to explain why there is an
important difference between prediction and accommodation.

**E. The problem of new theories.** Suppose that there is
one theory *H*_{1} that is generally regarded as highly
confirmed by the available evidence *E*. It is possible that
the mere introduction of an alternative theory
*H*_{2} can lead to an erosion of
*H*_{1}'s support. It is plausible to think that
Copernicus' introduction of the heliocentric hypothesis had this
effect on the previously unchallenged Ptolemaic earth-centered
astronomy. This sort of change cannot be explained by
conditionalization. It is for this reason that many Bayesians prefer
to focus on probability ratios of hypotheses (see the Ratio Formula
above), rather than their absolute probability; but it is clear that
the introduction of a new theory could also alter the probability
ratio of two hypotheses — for example, if it implied one of them
as a special case.

**F. The problem of the priors.** Are there constraints
on prior probabilities other than the probability laws? This is the
issue that divides the Subjective from the Objective Bayesians, as
discussed above. Consider Goodman's “new riddle of
induction”: In the past all observed emeralds have been
green. Do those observations provide any more support for the
generalization that all emeralds are green than they do for the
generalization that all emeralds are grue (green if observed before
now; blue if observed later); or do they provide any more support for
the prediction that the next emerald observed will be green than for
the prediction that the next emerald observed will be grue (i.e.,
blue)? Almost everyone agrees that it would be irrational to have
prior probabilities that were indifferent between green and grue, and
thus made predictions of greenness no more probable than predictions of
grueness. But there is no generally agreed upon explanation of this
constraint.
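One way to see why the probability laws alone do not settle the matter is the following Python sketch. It rests on a simplifying assumption: both generalizations are taken to entail every past observation of a green emerald, so each assigns those observations a likelihood of one. The function name and the prior values are hypothetical. On that assumption, the ratio of the posterior probabilities of the two generalizations equals the ratio of their priors, however many green emeralds have been observed.

```python
def posterior_odds(prior_green, prior_grue, n_observed_green):
    """Posterior odds of 'all emeralds are green' against 'all emeralds are grue'."""
    # Assumption: each hypothesis entails every past observation of a green
    # emerald, so both likelihoods equal 1 regardless of the number observed.
    likelihood_green = 1.0 ** n_observed_green
    likelihood_grue = 1.0 ** n_observed_green
    return (prior_green * likelihood_green) / (prior_grue * likelihood_grue)


# Indifferent priors stay indifferent after any number of observations.
indifferent = posterior_odds(0.5, 0.5, 1000)

# Priors that favor green keep exactly the same advantage.
favoring_green = posterior_odds(0.9, 0.1, 1000)

print(indifferent, favoring_green)
```

Any preference for green over grue must therefore already be present in the prior probabilities, which is why the riddle bears on the question of what constrains them.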

The problem of the priors identifies an important issue between the Subjective and Objective Bayesians. If the constraints on rational inference are so weak as to permit any or almost any probabilistically coherent prior probabilities, then there would be nothing to make inferences in the sciences any more rational than inferences in astrology or phrenology or in the conspiracy reasoning of a paranoid schizophrenic, because all of them can be reconstructed as inferences from probabilistically coherent prior probabilities. Some Subjective Bayesians believe that their position is not objectionably subjective, because of results (e.g., Doob or Gaifman and Snir) proving that even subjects beginning with very different prior probabilities will tend to converge in their final probabilities, given a suitably long series of shared observations. These convergence results are not completely reassuring, however, because they only apply to agents who already have significant agreement in their priors and they do not assure convergence in any reasonable amount of time. Also, they typically only guarantee convergence on the probability of predictions, not on the probability of theoretical hypotheses. For example, Carnap favored prior probabilities that would never raise above zero the probability of a generalization over a potentially infinite number of instances (e.g., that all crows are black), no matter how many observations of positive instances (e.g., black crows) one might make without finding any negative instances (i.e., non-black crows). In addition, the convergence results depend on the assumption that the only changes in probabilities that occur are those that are the non-inferential results of observation on evidential statements and those that result from conditionalization on such evidential statements. But almost all subjectivists allow that it can sometimes be rational to change one's prior probability assignments.
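The flavor of the convergence results, and of one of their limits, can be conveyed by a toy model that is much simpler than the theorems of Doob or of Gaifman and Snir. In the Python sketch below, two agents hold different Beta priors over the bias of a coin and update on the same observations. The choice of Beta priors, the function names, and all of the numbers are assumptions made for illustration. Their predictions for the next toss draw together as shared evidence accumulates, while an agent who assigns prior probability zero to a hypothesis never raises it.

```python
def predictive_heads(alpha, beta, heads, tails):
    """Probability that the next toss lands heads, given a Beta(alpha, beta) prior."""
    return (alpha + heads) / (alpha + beta + heads + tails)


def disagreement(n_tosses, heads_rate=0.7):
    """Gap between two agents' predictions after n_tosses shared observations."""
    heads = round(n_tosses * heads_rate)
    tails = n_tosses - heads
    optimist = predictive_heads(9, 1, heads, tails)   # prior mean 0.9
    pessimist = predictive_heads(1, 9, heads, tails)  # prior mean 0.1
    return abs(optimist - pessimist)


def posterior_over_biases(prior, heads, tails):
    """Conditionalize a discrete prior over candidate coin biases."""
    weights = {b: p * b**heads * (1 - b)**tails for b, p in prior.items()}
    total = sum(weights.values())
    return {b: w / total for b, w in weights.items()}


for n in (0, 10, 1000):
    print(n, disagreement(n))

# An agent whose prior rules out a bias of 0.7 cannot be moved toward it.
dogmatic = posterior_over_biases({0.3: 1.0, 0.7: 0.0}, heads=70, tails=30)
print(dogmatic)
```

In this model the gap between the two predictions shrinks only in proportion to the number of observations, which gives a concrete sense to the worry that convergence need not occur in any reasonable amount of time.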

Because there is no generally agreed upon solution to the Problem of the Priors, it is an open question whether Bayesian Confirmation Theory has inductive content, or whether it merely translates the framework for rational belief provided by deductive logic into a corresponding framework for rational degrees of belief.

## 7. Other Principles of Bayesian Epistemology

Other principles of Bayesian epistemology have been proposed, but none has garnered anywhere near a majority of support among Bayesians. The most important proposals are merely mentioned here. It is beyond the scope of this entry to discuss them in any detail.

**A. Other principles of synchronic coherence.** Are the
probability laws the only standards of synchronic coherence for
degrees of belief? Van Fraassen has proposed an additional principle
(Reflection or Special Reflection), which he now regards as a special
case of an even more general principle (General
Reflection).^{[3]}

**B. Other probabilistic rules of inference.** There seem
to be at least two different concepts of probability: the probability
that is involved in degrees of belief (epistemic or subjective
probability) and the probability that is involved in random events,
such as the tossing of a coin (chance). De Finetti regarded this
distinction as a mistake and held that there is only one kind of
probability, subjective probability. For Bayesians who believe in both kinds of probability,
an important question is: What is (or should be) the relation between
them? The answer can be found in the various proposals for principles
of direct inference in the literature. Typically, principles of
direct inference are proposed as principles for inferring subjective
or epistemic probabilities from beliefs about objective chance (e.g.,
Pollock). Lewis reverses the direction of inference, and proposes to
infer beliefs about objective chance from subjective or epistemic
probabilities, via his (Reformulated) Principal
Principle.^{[4]}
Strevens argues that it is Lewis's Principal Principle that gives
Bayesianism its inductive content.

**C. Principles of rational acceptance.** What is the
relation between beliefs and degrees of belief? Jeffrey proposes to
give up the notion of belief (at least for empirical statements) and
make do with only degrees of belief. Other authors (e.g., Levi,
Maher, Kaplan) propose principles of rational acceptance as part of
accounts of when it is rational to accept a statement as true, not
merely to regard it as probable.

## Bibliography

- Barnes, Eric Christian, 2005, “Predictivism for Pluralists”, *British Journal for the Philosophy of Science*, 56: 421–450.
- Bayes, Thomas, 1764, “An Essay Towards Solving a Problem in the Doctrine of Chances”, *Philosophical Transactions of the Royal Society of London*, 53: 370–418, reprinted in E.S. Pearson and M.G. Kendall, eds., *Studies in the History of Statistics and Probability* (London: Charles Griffin, 1970).
- Bovens, Luc, and Stephan Hartmann, 2003, *Bayesian Epistemology*, Oxford: Clarendon Press.
- Carnap, Rudolf, 1950, *Logical Foundations of Probability*, Chicago: University of Chicago Press.
- –––, 1952, *The Continuum of Inductive Methods*, Chicago: University of Chicago Press.
- –––, 1956, “Meaning Postulates”, in *Meaning and Necessity*, Chicago: Phoenix Books, 222–229.
- Christensen, David, 1999, “Measuring Confirmation”, *Journal of Philosophy*, 96: 437–461.
- –––, 2004, *Putting Logic in its Place: Formal Constraints on Rational Belief*, Oxford: Clarendon Press.
- de Finetti, Bruno, 1937, “La Prévision: ses lois logiques, ses sources subjectives”, *Annales de l'Institut Henri Poincaré*, 7: 1–68; translated into English and reprinted in Kyburg and Smokler, *Studies in Subjective Probability*, Huntington, NY: Krieger, 1980.
- Doob, J.L., 1971, “What is a Martingale?”, *American Mathematical Monthly*, 78: 451–462.
- Earman, John, 1991, *Bayes or Bust? A Critical Examination of Bayesian Confirmation Theory*, Cambridge, MA: MIT Press.
- Eells, Ellery, and Branden Fitelson, 2000, “Measuring Confirmation and Evidence”, *Journal of Philosophy*, 97: 663–672.
- –––, 2002, “Symmetries and Asymmetries in Evidential Support”, *Philosophical Studies*, 107: 129–142.
- Fitelson, Branden, 1999, “The Plurality of Bayesian Measures of Confirmation and the Problem of Measure Sensitivity”, *Philosophy of Science* (Proceedings Supplement), 66: S362–378.
- –––, 2003, “Review of James Joyce, *The Foundations of Causal Decision Theory*”, *Mind*, 112: 545–551.
- Gaifman, H., and Snir, M., 1982, “Probabilities over Rich Languages”, *Journal of Symbolic Logic*, 47: 495–548.
- Garber, Daniel, 1983, “Old Evidence and Logical Omniscience in Bayesian Confirmation Theory”, in J. Earman, ed., *Testing Scientific Theories* (Midwest Studies in the Philosophy of Science, Vol. X), Minneapolis: University of Minnesota Press, 99–131.
- Glymour, Clark, 1980, *Theory and Evidence*, Princeton: Princeton University Press.
- Goldman, Alvin I., 1999, *Knowledge in a Social World*, Oxford: Clarendon Press.
- Goodman, Nelson, 1983, *Fact, Fiction, and Forecast*, Cambridge: Harvard University Press.
- Hacking, Ian, 1967, “Slightly More Realistic Personal Probability”, *Philosophy of Science*, 34: 311–325.
- Hempel, Carl G., 1965, *Aspects of Scientific Explanation*, New York: Free Press.
- Horwich, Paul, 1982, *Probability and Evidence*, Cambridge: Cambridge University Press.
- Howson, Colin, and Peter Urbach, 1993, *Scientific Reasoning: The Bayesian Approach*, 2nd ed., Chicago: Open Court.
- Jaynes, E.T., 1968, “Prior Probabilities”, *Institute of Electrical and Electronic Engineers Transactions on Systems Science and Cybernetics*, SSC-4: 227–241.
- –––, 2003, *Probability Theory: The Logic of Science*, G. Larry Bretthorst (ed.), Cambridge: Cambridge University Press.
- Jeffrey, Richard, 1983, *The Logic of Decision*, 2nd ed., Chicago: University of Chicago Press.
- –––, 1992, *Probability and the Art of Judgment*, Cambridge: Cambridge University Press.
- Jeffreys, Harold, 1948 [1961], *Theory of Probability*, 3rd ed., Oxford: Clarendon Press.
- Joyce, James M., 1998, “A Nonpragmatic Vindication of Probabilism”, *Philosophy of Science*, 65: 575–603.
- –––, 1999, *The Foundations of Causal Decision Theory*, Cambridge: Cambridge University Press.
- Kaplan, Mark, 1996, *Decision Theory as Philosophy*, Cambridge: Cambridge University Press.
- Keynes, John Maynard, 1921, *A Treatise on Probability*, London: Macmillan.
- Kitcher, Philip, 1990, “The Division of Cognitive Labor”, *Journal of Philosophy*, 87: 5–22.
- Lange, Marc, 1999, “Calibration and the Epistemological Role of Bayesian Conditionalization”, *Journal of Philosophy*, 96: 294–324.
- Laplace, P. S. Marquis de, 1820 [1886], *Théorie Analytique des Probabilités*, 3rd ed., Paris: Gauthier-Villars.
- Levi, Isaac, 1980, *The Enterprise of Knowledge*, Cambridge, Mass.: MIT Press.
- –––, 1991, *The Fixation Of Belief And Its Undoing*, Cambridge: Cambridge University Press.
- Lewis, David, 1980, “A Subjectivist's Guide to Objective Chance”, in Richard C. Jeffrey (ed.), *Studies in Inductive Logic and Probability* (Vol. 2), Berkeley: University of California Press, 263–293.
- Maher, Patrick, 1988, “Prediction, Accommodation, and the Logic of Discovery”, *PSA*, 1: 273–285.
- –––, 1993, *Betting on Theories*, Cambridge: Cambridge University Press.
- Mikkelson, Jeffrey M., 2004, “Dissolving the Wine/Water Paradox”, *British Journal for the Philosophy of Science*, 55: 137–145.
- Pollock, John L., 1990, *Nomic Probability and the Foundations of Induction*, Oxford: Oxford University Press.
- Popper, Karl, 1968, *The Logic of Scientific Discovery*, 3rd ed., London: Hutchinson.
- Quine, W.V.O., 1966, “Carnap on Logical Truth”, in *The Ways of Paradox*, New York: Random House, 100–125.
- Ramsey, Frank P., 1926, “Truth and Probability”, in Richard B. Braithwaite (ed.), *Foundations of Mathematics and Other Logical Essays*, London: Routledge and Kegan Paul, 1931, 156–198.
- Rényi, A., 1955, “On a New Axiomatic Theory of Probability”, *Acta Mathematica Academiae Scientiarum Hungaricae*, 6: 285–385.
- Rosenkrantz, R.D., 1981, *Foundations and Applications of Inductive Probability*, Atascadero, CA: Ridgeview Publishing.
- Savage, Leonard, 1972, *The Foundations of Statistics*, 2nd ed., New York: Dover.
- Seidenfeld, Teddy, Joseph B. Kadane, and Mark J. Schervish, 1989, “On the Shared Preferences of Two Bayesian Decision Makers”, *Journal of Philosophy*, 86: 225–244.
- Shimony, Abner, 1988, “An Adamite Derivation of the Calculus of Probability”, in J.H. Fetzer (ed.), *Probability and Causality*, Dordrecht: Reidel.
- Skyrms, Brian, 1984, *Pragmatics and Empiricism*, New Haven: Yale University Press.
- –––, 1990, *The Dynamics of Rational Deliberation*, Cambridge, Mass.: Harvard University Press.
- Sober, Elliott, 2002, “Bayesianism—Its Scope and Limits”, in Richard Swinburne (ed.), *Bayes's Theorem*, Oxford: Oxford University Press, 21–38.
- Strevens, Michael, 2004, “Bayesian Confirmation Theory: Inductive Logic, or Mere Inductive Framework?”, *Synthese*, 141: 365–379.
- Teller, Paul, 1976, “Conditionalization, Observation, and Change of Preference”, in W. Harper and C.A. Hooker (eds.), *Foundations of Probability Theory, Statistical Inference, and Statistical Theories of Science*, Dordrecht: D. Reidel.
- Van Fraassen, Bas C., 1983, “Calibration: A Frequency Justification for Personal Probability”, in R.S. Cohen and L. Laudan (eds.), *Physics, Philosophy, and Psychoanalysis: Essays in Honor of Adolf Grünbaum*, Dordrecht: Reidel.
- –––, 1984, “Belief and the Will”, *Journal of Philosophy*, 81: 235–256.
- –––, 1995, “Belief and the Problem of Ulysses and the Sirens”, *Philosophical Studies*, 77: 7–37.
- Williamson, Jon, 1999, “Countable Additivity and Subjective Probability”, *British Journal for the Philosophy of Science*, 50: 401–416.
- –––, 2007, “Motivating Objective Bayesianism: From Empirical Constraints to Objective Probabilities”, in W. E. Harper and G. R. Wheeler (eds.), *Probability and Inference: Essays in Honour of Henry E. Kyburg, Jr.*, Amsterdam: Elsevier.
- Zynda, Lyle, 1995, “Old Evidence and New Theories”, *Philosophical Studies*, 77: 67–95.

## Other Internet Resources

- Charles, J., Hocker, A., Lacker, H., Le Diberder, F.R., T'Jampens, S., 2006, “Bayesian Statistics at Work: the Troublesome Extraction of the CKM Phase alpha,” preprint at ArXiv.org.

### Acknowledgments

In the preparation of this article, I have benefited from comments from Marc Lange, Stephen Glaister, Laurence BonJour, and James Joyce.