# Self-Reference

*First published Tue Jul 15, 2008*

In the context of language, *self-reference* is used to denote
a statement that refers to itself or its own referent. The most famous
example of a self-referential sentence is the *liar sentence*:
"This sentence is not true." Self-reference is often used in a broader
context as well. For instance, a picture could be considered
self-referential if it contains a copy of itself (see the animated
image above); and a piece of literature could be considered
self-referential if it includes a reference to the work itself. In
philosophy, self-reference is primarily studied in the context of
language. Self-reference within language is not only a subject of
philosophy, but also a field of individual interest in mathematics and
computer science, in particular in relation to the foundations of
these sciences.

The philosophical interest in self-reference is to a large extent
centered around the paradoxes. A *paradox* is a seemingly sound
piece of reasoning based on apparently true assumptions that leads to a
contradiction. The liar sentence considered above leads to a contradiction
when we try to determine whether it is true or not. If we assume the
sentence to be true, then what it states must be the case, that
is, it cannot be true. If, on the other hand, we assume it not to be
true, then what it states is actually the case, and thus it must be
true. In either case we are led to a contradiction. Since the
contradiction was obtained by a seemingly sound piece of reasoning
based on apparently true assumptions, it qualifies as a paradox. It is
known as the *liar paradox*.

Most paradoxes of self-reference may be categorised as either
*semantic*, *set-theoretic* or *epistemic*. The
semantic paradoxes, like the liar paradox, are primarily relevant to
theories of truth. The set-theoretic paradoxes are relevant to the
foundations of mathematics, and the epistemic paradoxes are relevant
to epistemology. Even though these paradoxes are different in the
subject matter they relate to, they share the same underlying
structure, and may often be tackled using the same mathematical
means.

In the present entry, we will first introduce a number of the most well-known paradoxes of self-reference, and discuss their common underlying structure. Subsequently, we will discuss the profound consequences that these paradoxes have on a number of different areas: theories of truth, set theory, epistemology, foundations of mathematics, computability. Finally, we will present the most prominent approaches to solving the paradoxes.

- 1. Paradoxes of Self-Reference
- 2. Why Paradoxes Matter
- 3. Solving the paradoxes
- Bibliography
- Other Internet Resources
- Related Entries

## 1. Paradoxes of Self-Reference

### 1.1 Semantic Paradoxes

Paradoxes of self-reference have been known since antiquity. The
discovery of the liar paradox is often credited to Eubulides the
Megarian who lived in the 4th century BC. The liar paradox belongs to
the category of *semantic* paradoxes, since it is based on the
semantic notion of truth. Other well-known semantic paradoxes include
Grelling's paradox, Berry's paradox, and Richard's paradox.

*Grelling's paradox* involves a predicate defined as
follows. Say a predicate is *heterological* if it is not true
of itself, that is, if it does not itself have the property it
expresses. Thus the predicate "German" is heterological, since it is
not itself a German word, but the predicate "deutsch" is not
heterological. The question that leads to the paradox is now:

Is "heterological" heterological?

It is easy to see that we obtain a contradiction independently of
whether we answer "yes" or "no" to this question (the argument runs
more or less like in the liar paradox). Grelling's paradox is
self-referential, since the definition of the predicate heterological
refers to *all* predicates, including the predicate
heterological itself. Definitions such as this which depends on a set
of entities, at least one of which is the entity being defined, are
called *impredicative*.

*Berry's paradox* is another paradox based on an impredicative
definition, or rather, an impredicative description. Some phrases of
the English language are descriptions of natural
numbers, for example, "the sum of five and seven" is a description of
the number 12. Berry's paradox arises when trying to determine the
denotation of the following description:

the least number that cannot be referred to by a description containing less than 100 symbols.

The contradiction is that this description containing 93 symbols
denotes a number which, by definition, cannot be denoted by any
description containing less than 100 symbols. The description is of
course impredicative, since it implicitly refers to *all*
descriptions, including itself.

*Richard's paradox* considers phrases of the English language
defining real numbers rather than natural numbers. For
example, "the ratio between the circumference and diameter of a
circle" is a phrase defining the number π. Assume an
enumeration of all such phrases is given (e.g. by putting them into
lexicographical order). Now consider the phrase:

the real number whosenth decimal place is 1 whenever thenth decimal place of thenth phrase is 0; otherwise 0.

This phrase defines a real number, so it must be among the enumerated
phrases, say number *k* in this enumeration. But, at the same
time, by definition, it differs from the number denoted by the
*k*th phrase in the *k*th decimal place. Thus we have a
contradiction. The defining phrase is obviously impredicative. The
particular construction employed in this paradox is called
*diagonalisation*. Diagonalisation is a general construction
and proof method originally invented by Georg Cantor (1891) to prove
the uncountability of the power set of the natural numbers. It was
also used as a basis for Cantor's paradox, one of the set-theoretic
paradoxes to be considered next.

### 1.2 Set-Theoretic Paradoxes

The best-known set-theoretic paradoxes are Russell's paradox and
Cantor's paradox. *Russell's paradox* arises from considering
the *Russell set* *R* of all sets that are not members
of themselves, that is, the set defined defined by *R* = {
*x* | *x*
∉
*x* }. The contradiction is then derived by asking whether
*R* is a member of itself, that is, whether *R* ∈
*R* holds. If *R* ∈ *R* then *R* is a
member of itself, and thus *R*
∉
*R*, by definition of *R*. If, on the other hand,
*R*
∉
*R* then *R* is not a member of itself, and thus
*R* ∈ *R*, again by definition of *R*.

*Cantor's paradox* is based on an application of Cantor's
theorem. *Cantor's theorem* states that given any finite or
infinite set *S*, the power set of *S* has larger
cardinality (greater size) than *S*. The theorem is proved by a
form of diagonalisation, the same idea as underlying Berry's
paradox. Cantor's paradox considers the set of all sets. Let us call
this set the *universal set* and denote it by *U*. The
power set of *U* is denoted
℘(*U*).
Since *U* contains *all* sets it will in particular
contain all elements of
℘(*U*).
Thus
℘(*U*)
must be a subset of *U* and must thus have a cardinality which
is less than or equal to the cardinality of *U*. However, this
immediately contradicts Cantor's theorem.

The *Hypergame paradox* is a more recent addition to the list
of set-theoretic paradoxes, invented by Zwicker (1987). Let us call a
two-player game *well-founded* if it is bound to terminate in a
finite number of moves. Tournament chess is an example of a
well-founded game. We now define *hypergame* to be the game in
which player 1 in the first move chooses a well-founded game to be
played, and player 2 subsequently makes the first move in the chosen
game. All remaining moves are then moves of the chosen game. Hypergame
must be a well-founded game, since any play will last exactly one move
more than some well-founded game. However, if hypergame is
well-founded then it must be one of the games that can be chosen in
the first move of hypergame, that is, player 1 can choose hypergame in
the first move. This allows player 2 to choose hypergame in the
subsequent move, and thus the two players can continue choosing
hypergame *ad infinitum*. Thus hypergame cannot be
well-founded, contradicting our previous conclusion.

### 1.3 Epistemic paradoxes

The most well-know epistemic paradox is the *paradox of the
knower*. This paradox has many equivalent formulations, one of
them based on the sentence "This sentence is not known by anyone." Let
us call this sentence the *knower sentence*, abbreviated
*KS*. *KS* is obviously quite similar to the liar
sentence, except the central concept involved is knowledge rather than
truth. The reasoning leading to a contradiction from *KS* is a
bit more complex than in the liar paradox. First *KS* is shown
to be true by the following piece of reasoning:

Assume to obtain a contradiction thatKSis not true. Then whatKSexpresses cannot be the case, that is,KSmust be known by someone. Since everything known is true (this is part of the definition of the concept of knowledge),KSis true, contradicting our assumption. This concludes the proof that theKSis true.

The piece of reasoning just carried out to prove the truth of
*KS* should be available to any agent (person) with sufficient
reasoning capabilities. That is, an agent should be able to prove the
truth of *KS*, and thus come to know that *KS*
holds. However, if *KS* is known by someone, then what it
expresses is not the case, and thus it cannot be true. This is a
contradiction, and thus we have a paradox. The role of self-reference
in this paradox is obvious, as it is based on a sentence *KS*
referring directly to itself.

The paradox of the knower is just one of many epistemic paradoxes involving self-reference. See the entry on epistemic paradoxes for further information on the class of epistemic paradoxes. For a detailed discussion and history of the paradoxes of self-reference in general, see the entry on paradoxes and contemporary logic.

### 1.4 Common Structures in the Paradoxes

The paradoxes above are all quite similar in structure. In the case of
the paradoxes of Grelling and Russell, this can be seen as
follows. Define the *extension* of a predicate to be the set of
objects it is true of. For a predicate *P* we denote its
extension by ext(*P*). Grelling's paradox involves the
predicate heterological, which is true of all those predicates that
are not true of themselves. Thus the extension of the predicate
heterological is the set { *P* | *P*
∉
ext(*P*) }. Compare this to the Russell set *R* given by
{ *x* | *x*
∉
*x* }. The only significant difference between these two sets
is that the first is defined on predicates whereas the second is
defined on sets. The contradictions obtained from analysing these two
sets are also similar. Both contradictions can be presented as
sequences of equivalences, and both sequences share the same structure,
as seen below (where "het" abbreviates "heterological"):

• Grelling's paradox: het ∈ ext(het) ⇔ het ∈ { P|P∉ ext(P) }⇔ het ∉ ext(het). • Russell's paradox: R∈R⇔ R∈ {x|x∉x}⇔ R∉R.

Thus we have two paradoxes of an almost identical structure belonging to two different classes of paradoxes: one is semantic and the other set-theoretic. What this teaches us is that even if paradoxes seem different by involving different subject matters, they might be almost identical in their underlying structure. Thus in many cases it makes most sense to study the paradoxes of self-reference under one, rather than study, say, the semantic and set-theoretic paradoxes separately.

Russell's paradox and Cantor's paradox are also more similar than it
might at first seem. Cantor's paradox is based on an application of
Cantor's theorem to the universal set *U*. Below we give the
proof of Cantor's theorem for an arbitrary set *S*.

We need to prove that ℘(

S) has greater cardinality thanS. Assume to obtain a contradiction that this is not the case. Then there must exist a mapffromSonto ℘(S). Now consider the setC= {x∈S|x∉f(x) }. Sincefis onto ℘(S), there must exist a setc∈Ssuch thatf(c)=C. However, we now obtain a contradiction, since the following holds:

c∈f(c)⇔ c∈ {x∈S|x∉f(x) }⇔ c∉f(c).

Note the similarity between this sequence of equivalences and the
corresponding sequences of equivalences derived for Russell's and
Grelling's paradoxes above. Now consider the special case of Cantor's
theorem where *S* is the universal set. Then we can simply
choose *f* to be the identity function, since
℘(*U*)
must necessarily be a subset of the universal set *U*. But
then *C* becomes the Russell set! Thus Cantor's paradox is
nothing more than a slight variant of Russell's paradox; the core
argument leading to the contradiction is the same in both.

Priest (1994) gives even firmer evidence to the similarity between the
paradoxes of self-reference by showing that they all fit into what he
calls *Russell's schema*. The idea behind it goes back to
Russell himself (1905) who also considered the paradoxes of
self-reference to have a common underlying structure. Given two
predicates predicates P and Q, and a possibly partial function
δ, Russell's schema consists of the following two conditions:

*w*= {*x*|*P*(*x*) } exists and*Q*(*w*) holds;- if
*x*is a subset of*w*such that*Q*(*x*) holds then:- δ(
*x*) ∉*x*, - δ(
*x*) ∈*w*.

- δ(

If these conditions are satisfied we have the following contradiction:
Since *w* is trivially a subset of *w* and since
*Q*(*w*) holds by condition 1, we have both
δ(*w*)
∉
*w* and δ(*w*) ∈ *w*, by 2a and 2b,
respectively. Thus any triple (*P*,*Q*,δ)
satisfying the Russell schema will produce a paradox. Priest shows how
most of the well-known paradoxes of self-reference fit into Russell's
schema. Below we will consider only a few of these paradoxes, starting
with Russell's paradox. In this case we define the triple
(*P*,*Q*,δ) as follows:

*P*(*x*) is the predicate "*x*∉*x*".*Q*(*x*) is the universal predicate true of any object.- δ is the identity function.

Then *w* in Russell's schema becomes the Russell set and the
contradiction obtained from the schema becomes Russell's paradox.

In the case of Richard's paradox we define the triple by:

*P*(*x*) is the predicate "*x*is a real definable by a phrase in English."*Q*(*x*) is the predicate "*x*is definable by a phrase in English."- δ is the function that maps any infinite set
*x*of well-ordered reals to the real*y*whose*n*th decimal place is 1 whenever the*n*th decimal of the*n*th real in*x*is 0; otherwise 0.

Here *w* = { *x* | *P*(*x*) } becomes the
set of all reals definable by phrases in English. For any well-ordered
subset *x* of *w*, δ(*x*) is a real that by
construction will differ from all reals in *x* (it differs from
the *n*th real in *x* on the *n*th decimal
place). Letting *x* equal *w* we thus get δ(*w*)
∉
*w*. However, at the same time δ(*w*) is
definable by a phrase in English, so δ(*w*) ∈
*w*, and we have a contradiction. This contradiction is
Richard's paradox.

The liar paradox also fits Russell's schema, albeit in a slightly less direct way:

*P*(*x*) is the predicate "*x*is true."*Q*(*x*) is the predicate "*x*is definable."- δ(
*x*) is the sentence "this sentence does not belong to the set*x*."

Here *w* = { *x* | *P*(*x*) } becomes the set of
true sentences, and δ(*w*) becomes a version of the liar
sentence: "this sentence does not belong to the set of true
sentences".

From the above it can be concluded that all, or at least most,
paradoxes of self-reference share a common underlying
structure. Priest (1994) argues that they should then also share a
common solution. Priest calls this the *principle of uniform
solution*: "same kind of paradox, same kind of solution."

### 1.5 A Paradoxes without Self-Reference

In 1985, Yablo succeeded in constructing a semantic paradox that does
not involve self-reference in the strict sense. Instead, it consists
of an infinite chain of sentences, each sentence expressing the
untruth of all the subsequent ones. More precisely, for each natural
number *i* we define *S*_{i} to be the
sentence "for all *j*>*i*,
*S*_{j} is not true". We can then derive a
contradiction as follows:

First we prove that none of the sentences

S_{i}can be true. Assume to obtain a contradiction thatS_{i}is true for somei. Then it is true that "for allj>i,S_{j}is not true". Thus none of the sentencesS_{j}forj>iare true. In particular,S_{i+1}is not true.S_{i+1}is the sentence "for allj>i+1,S_{j}is not true". Since this sentence is not true, there must be somek>i+1 for whichS_{k}is true. However, this contradicts that none of the sentencesS_{j}withj>iare true.We have now proved that none of the sentences

S_{i}are true. Then, in particular, we have that for allj>0, S_{j}is not true. This is exactly what is expressed by S_{0}, so S_{0}must be true. This is again a contradiction.

Yablo calls this paradox the *ω-liar*, but others usually
refer to it as *Yablo's paradox*. Note that none of the
sentences *S*_{i} refer to themselves (not
even indirectly), but only to the ones that occur later in the
sequence. Yablo's paradox is semantic, but as shown by Yablo (2006),
similar set-theoretic paradoxes involving no self-reference can be
formulated in certain set theories.

Yablo's paradox demonstrates that we can have logical paradoxes
without self-reference – only a certain kind of
non-wellfoundedness is needed to obtain a contradiction. There are
obviously structural differences between the ordinary paradoxes of
self-reference and Yablo's paradox: the ordinary paradoxes of
self-reference involve a cyclic structure of reference, whereas
Yablo's paradox involve an acyclic, but non-wellfounded, structure of
reference. More precisely, the referential structure in
self-referential paradoxes such as the liar is a reflexive relation on
a singleton set, whereas the referential structure in Yablo's paradox
is isomorphic to the usual less-than ordering on the natural numbers,
which is an irreflexive relation. Even though there is this
difference, Yablo's paradox still share most properties with the
ordinary paradoxes of self-reference. When solving paradoxes we might
thus choose to consider them all under one, and refer to them as
*paradoxes of non-wellfoundedness*. In the following we will
however stick to the term *paradoxes of self-reference*, even
though most of what we say will apply to Yablo's paradox and related
paradoxes of non-wellfoundedness as well.

## 2. Why the Paradoxes Matter

After having presented a number of paradoxes of self-reference and
discussed some of their underlying similarities, we will now turn to a
discussion of their *significance*. The significance of a
paradox is its indication of a flaw or deficiency in our understanding
of the central concepts involved in it. In case of the semantic
paradoxes, it seems that it is our understanding of fundamental
semantic concepts such as *truth* (in the liar paradox and
Grelling's paradox) and *definability* (in Berry's and
Richard's paradoxes) that are deficient. In case of the set-theoretic
paradoxes, it is our understanding of the concept of a
*set*. If we fully understood these concepts, we should to be
able to deal with them without being led to contradictions. To
illustrate this, consider the case of Zeno's classical paradox on
*Achilles and the Tortoise* (see the entry
Zeno's paradoxes
for details). In this paradox we seem able to prove that the tortoise
can win a race against the 10 times faster Achilles if given an
arbitrarily small head start. Zeno used this paradox as an argument
against the possibility of motion. It has later turned out that the
paradox rests on an inadequate understanding of infinity. More
precisely, it rests on an implicit assumption that any infinite series
of positive reals must have an infinite sum. The later developments of
the mathematics of infinite series has shown that this assumption is
invalid, and thus the paradox dissolves. The original acceptance of
Zeno's argument as a paradox was a symptom of the fact that the
concept of infinity was not sufficiently well understood at the
time. In analogy, it seems reasonable to expect that the existence of
semantic and set-theoretic paradoxes is a symptom of the fact that the
involved semantic and set-theoretic concepts are not yet sufficiently
well understood.

The paradoxes present serious foundational problems in semantics and set theory: no claim can be made to a solid foundation for these subjects until a satisfactory solution to the paradoxes have been provided. Problems surface when it comes to formalising semantics (the concept of truth) and set theory. If formalising the intuitive, "naive" understanding of these subjects inconsistent systems linger as the paradoxes will be formalisable in these systems.

### 2.1 Consequences of the Semantic Paradoxes

The liar paradox is a significant barrier to the construction of formal theories of truth as it produces inconsistencies in these potential theories. A substantial amount of research in self-reference concentrates on formal theories of truth and ways to circumvent the liar paradox. There are two articles that have influenced the work on formal theories of truth and the liar paradox more than any other: Alfred Tarski's "The Concept of Truth in Formalised Languages" (1935) and Saul Kripke's "Outline of a Theory of Truth" (1975). Below we first introduce some of the ideas and results of Tarski's article. Kripke's article is discussed in Section 3.2.

Tarski gives a number of conditions that, as he puts it, any
*adequate definition of truth* must satisfy. The central of
these conditions is what is now most often referred to as
*Schema T* (or the *T-schema* or *Convention T* or the
*Tarski biconditionals*):

Schemaφ ↔T:T<φ>, for all sentences φ.

Here *T* is the predicate intended to express truth and
<φ> is a *name* for the sentence φ. Applying the
predicate *T* to the name <φ> gives the
expression *T*<φ> intended to represent the phrase
"φ is true". Schema *T* thus expresses that for every
sentence φ, φ holds if and only if the sentence "φ is
true" holds. The *T*-schema is usually regarded as a set of
sentences within a formal theory. It is customary to use first-order
arithmetic, that is, first-order predicate logic extended with a set
of standard axioms for arithmetic like PA (Peano Arithmetic) or
Robinson's Q. What is being said in the following will apply to any
such first-order formalisation of arithmetic. In this setting,
<φ> above denotes the *Gödel code* of
*φ*, and *T*<φ> is short for
*T*(<φ>). The reader not familiar with Gödel
codings (also known as Gödel numberings) can just think of the
mapping <·> as a naming device or quotation mechanism for
formulae — just like quotation marks in natural language. Often
used notational variant for <φ> are
^{⌈}φ^{⌉}
and
‘φ’.

Tarski showed that the liar paradox is formalisable in any formal
theory containing his schema T, and thus any such theory must be
inconsistent. This result is often referred to as *Tarski's theorem
on the undefinability of truth*. The result is basically a
formalisation of the liar paradox within first-order arithmetic
extended with the *T*-schema. In order to construct such a
formalisation it is necessary to be able to formulate self-referential
sentences (like the liar sentence) within first-order arithmetic. This
ability is provided by the diagonal lemma.

Diagonal lemma.

LetSbe a theory extending first-order arithmetic. For every formula φ(x) there is a sentence ψ such thatS⊢ ψ ↔ φ<ψ>.

Here the notation
*S*⊢α
means that α is provable in the theory S, and
φ<ψ> is short for φ(<ψ>). Assume a formula
φ(*x*) is given that is intended to express some property
of sentences – truth, for instance. Then the diagonal lemma
gives the existence of a sentence ψ satisfying the biimplication
ψ ↔ φ<ψ>.
The sentence φ<ψ> can be thought of as
expressing that the sentence ψ has the property expressed by
φ(*x*). The biimplication thus expresses that ψ is equivalent to
the sentence expressing that ψ has property φ. One can
therefore think of ψ as a sentence *expressing of itself*
that it has property φ. In the case of truth, it would be a
sentence expressing of itself that it is true. The sentence ψ is
of course not self-referential in a strict sense, but mathematically
it behaves like one. It is therefore possible to use sentences
generated by the diagonal lemma to formalise paradoxes based on
self-referential sentences, like the liar. The diagonal lemma is
sometimes called the *fixed-point lemma*, since the equivalence
ψ ↔ φ<ψ> can be seen as expressing that ψ is a
fixed-point of φ(*x*).

A theory in first-order predicate logic is called
*inconsistent* if a logical contradiction is provable in
it. Tarski's theorem (on the undefinability of truth) may now be
stated and proved.

Tarski's theorem.

Any theory extending first-order arithmetic and containing schemaTis inconsistent.

Proof.Assume the existence of a consistent formal theorySextending first-order arithmetic and containing schemaT. We need to show that this assumption leads to a contradiction. The proof mimics the liar paradox. Apply the diagonal lemma to obtain a sentence λ satisfying λ ↔ ¬T<λ> in S. The sentence λ expresses of itself that it is not true, so λ corresponds to the liar sentence. Instantiating schemaTwith the sentence λ gives us λ ↔T<λ>. We now have that both λ ↔ ¬T<λ> and λ ↔T<λ> hold inS(are provable inS), and thusT<λ> ↔ ¬T<λ> must hold inS. This contradictsSbeing consistent. □

Note that the contradiction *T*<λ> ↔ ¬
*T*<λ> above expresses: The liar sentence is true
if and only if it is not. Compare this to the informal liar presented
in the beginning of the article. Tarski's theorem shows that, in the
setting of first-order arithmetic, it is not possible to give what
Tarski considers to be an ‘adequate theory of truth’. The
central question then becomes: How may the formal setting or the
requirements for an adequate theory of truth be modified to regain
consistency — that is, to prevent the liar paradox from
trivialising the system? There are many different answers to this
question, as there are many different ways to regain consistency. In
Section 3 we will review the most influential approaches to this
problem.

### 2.2 Consequences of the Set-Theoretic Paradoxes

The set-theoretic paradoxes constitute a significant challenge to the
foundations of mathematics. They show that it is impossible to have a
concept of set satisfying the *unrestricted comprehension
principle* (also called *full comprehension* or
*unrestricted abstraction*):

Unrestricted comprehension:

∀u(u∈ {x| φ(x) } ↔ φ(u)), for all formulae φ(x).

In an informal setting, the formulae φ(*x*) could be
allowed to be arbitrary predicates. In a more formal setting they
would be formulae of e.g. a suitable first-order language. The
unrestricted comprehension principles says that for any property
(expressed by φ) there is the *set* of those entities that
satisfy the property. This sounds as a very reasonable principle, and
it more or less captures the intuitive concept of a
set. Unfortunately, the principle is not sound, as it gives rise to
Russell's paradox. Consider the property of non-self-membership. This
can be expressed by the formula *x*
∉
*x*. If we let φ(*x*) be the formula *x*
∉
*x* then the set { *x* | φ(*x*) } becomes
the Russell set *R*, and we obtain the following instance of
the unrestricted comprehension principle:

∀u(u∈R↔u∉ u).

Analogous to the argument in Russell's paradox a contradiction is now
obtained by instantiating *u* with *R*:

R∈R↔R∉R.

This contradiction expresses that the Russell set is a member of itself if and only if it is not. What has hereby been proven is the following.

Theorem (Inconsistency of Naive Set Theory).

Any theory containing the unrestricted comprehension principle is inconsistent.

Compare this theorem with Tarski's theorem. Tarski's theorem expresses that if we formalise the intuitively most obvious principle concerning truth we end up with an inconsistent theory. The theorem above expresses that the same thing happens when formalising the intuitively most obvious principle concerning set membership.

Given the inconsistency of unrestricted comprehension, the objective becomes to find a way to restrict it or the underlying logical principles to regain a consistent theory — that is, a set theory that will not be trivialised by Russell's paradox. Many alternative set theories excluding the unrestricted comprehension principle have been developed during the last century, among them the type theory of Russell and Whitehead, Simple Type theory (ST), Gödel-Bernays set theory (GB), Zermelo-Fraenkel set theory (ZF), and Quine's New Foundations (NF). These are all believed to be consistent, although no simple proofs of their consistency are known. However, no counter-examples to their consistency are known either. At least they all escape the known paradoxes of self-reference. We will return to a discussion of this in Section 3.

### 2.3 Consequences of the Epistemic Paradoxes

The epistemic paradoxes constitute a threat to the construction of
formal theories of knowledge, as the paradoxes become formalisable in
many such theories. Suppose we wish to construct a formal theory of
*knowability* within an extension of first-order
arithmetic. The reason for choosing to formalise knowability rather
than knowledge is that knowledge is always relative to a certain agent
at a certain point in time, whereas knowability is a universal concept
like truth. We could have chosen to work directly with knowledge
instead, but it would require more work and make the presentation
unnecessarily complicated. To formalise knowability we introduce a
special predicate *K*, and use sentences of the
form *K*<φ> to express that φ is
knowable. Analogous to the cases of truth and set membership, there
must be certain logical principles that *K* needs to satisfy in
order for our formal theory to qualify as an adequate theory of
knowability. First of all, all knowable sentences must be true. This
property can be formalised by the following logical principle:

A1.K<φ> → φ, for all sentences φ.

Of course this principle must itself be knowable, that is, we get the following logical principle:

A2.K<K<φ> → φ>, for all sentences φ.

In addition, all theorems of first-order arithmetic ought to be knowable:

A3.K<φ>, for all sentences φ of first-order arithmetic.

Furthermore, knowability must be closed under logical consequences:

A4.K<φ → ψ> → (K<φ> →K<ψ>), for all sentences φ.

Now, the principles A1–A4 is all it takes to formalise the paradox of the knower. More precisely, we have the following theorem due to Montague (1963).

Montague's theorem.

Any formal theory extending first-order arithmetic and containing axiom schemas A1–A4 is inconsistent.

Proof.Assume the existence of a consistent formal theorySextending first-order arithmetic and containing axiom schemas A1–A4. We need to show that this assumption leads to a contradiction. The proof mimics the paradox of the knower. Apply the diagonal lemma to obtain a sentence λ satisfying λ ↔ ¬K<λ> inS. The sentence λ expresses of itself that it is not knowable, so λ roughly corresponds to the knower sentence,KS. The first piece of argumentation used in the paradox of the knower led to the conclusion thatKSis indeed true. This piece of argumentation is mimicked by the following piece of formal reasoning withinS:

1. λ → ¬ K<λ>by choice of λ 2. ¬ K<λ> → λby choice of λ 3. K<λ> → λaxiom A1 4. ( K<λ> → λ) →

((λ → ¬K<λ>) → ¬K<λ>)propositional tautology 5. (λ → ¬ K<λ>) → ¬K<λ>modus ponens on 4,3 6. ¬ K<λ>modus ponens on 5,1 7. λ modus ponens on 2,6 This proof shows that λ — our formal version of

KS— is provable inS. The proof corresponds to the informal argument thatKSis true. As argued in the paradox of the knower, any agent with sufficient reasoning capabilities will be able to prove the truth ofKS, and thus come to know thatKSholds. ThusKSmust be knowable. What this means in the present formal framework is that we can also prove the knowability of λ inS:

8. K<λ → ¬K<λ>>by A3 and choice of λ 9. K<¬K<λ> → λ>by A3 and choice of λ 10. K<K<λ> → λ>axiom A2 11. K<(K<λ> → λ) →

((λ → ¬K<λ>) → ¬K<λ>)>axiom A3 on propositional tautology 12. K<(λ → ¬K<λ>) → ¬K<λ>>axiom A4 on 11,10 13. K<¬K<λ>>axiom A4 on 12,8 14. K<λ>axiom A4 on 9,13 This completes the proof of the knowability of λ, corresponding to the informal argument that

KSis known by some agent. Note the similarity between the two pieces of proof in lines 1–7 and 8–14. The only difference is that in the latter all formulae are preceded by an extra K. This is because lines 8–14 express the same line of reasoning as lines 1–7 with the only difference that the latter is a proof of theknowabilityof λ rather than thetruthof λ. Having concluded that λ is both true and knowable, we now immediately obtain a contradiction as in the paradox of the knower:

15. ¬ K<λ>modus ponens on 1,7 16. K<λ> ∧ ¬K<λ>conjunction of 14 and 15 This completes the proof of Montague's theorem. □

Montague's theorem shows that in the setting of first-order arithmetic
we cannot have a theory of knowledge or knowability satisfying even
the basic principles A1–A4. Montague's theorem is a
generalisation of Tarski's theorem. If a predicate symbol *K* satisfies
Tarski's schema *T* then it is easy to see that it will also
satisfy axiom schemas A1–A4. Thus axiom schemas A1–A4
constitute a weakening of the *T*-schema, and Montague's theorem shows
that even this much weaker version of the *T*-schema is sufficient to
produce inconsistency.

Formalising knowledge as a predicate in a first-order logic is
referred to as the *syntactic treatment* of
knowledge. Alternatively, one can choose to formalise knowledge as an
operator in a suitable modal logic. This is referred to as the
*semantic treatment* of knowledge. In the semantic treatment of
knowledge one generally avoids problems of self-reference, and thus
inconsistency, but it is at the expense of the expressive power of the
formalism.

### 2.4 Consequences Concerning Provability and Computability

The central argument given in the proof of Tarski's theorem is closely related to the central argument in Gödel's first incompleteness theorem (Gödel, 1931). Gödel's theorem can be given the following formulation.

Gödel's first incompleteness theorem.

If first-order arithmetic is ω-consistent then it is incomplete.

A theory is called ω-*consistent* if the following holds
for every formula φ(*x*) containing *x* as its only free
variable: If
⊢¬φ(*n*) for every natural
number *n* then
⊬∃*x*φ(*x*).
ω-consistency is a stronger condition than ordinary
consistency, so any ω-consistent theory will also be
consistent. A theory is *incomplete* if it contains a formula
which can neither be proved nor disproved.

Proof sketch.Assume first-order arithmetic is both ω-consistent and complete. We need to show that this leads to a contradiction. Gödel constructs a formulaBew(for "Beweis") in formal arithmetic satisfying, for all φ and alln,

(1) ⊢ Bew(n, <φ>) iffnis the Gödel code of a proof of φ.Assuming the theory to be ω-consistent and complete we can prove that for all sentences φ,

(2) ⊢∃ xBew(x, <φ>) iff ⊢ φ.The proof of (2) runs like this. First we prove the implication from left to right. If ⊢∃

xBew(x, <φ>) then there is somensuch that ⊬¬Bew(n, <φ>), by ω-consistency. By completeness we get ⊢Bew(n, <φ>) for thisn. By (1) above we get thatndenotes a proof of φ. That is, φ is provable, so we have ⊢ φ. To prove the implication from right to left, note that if ⊢ φ then there must be annsuch that ⊢Bew(n, <φ>), by (1). From this we get ⊢∃xBew(x, <φ>), as required. This concludes the proof of (2). Now, when in a complete theory we have (2) we must also have:

(3) ⊢∃ xBew(x, <φ>) ↔ φ, for all sentence φ.If we let ∃

xBew(x, <φ>) be abbreviatedT(<φ>) then (3) becomes:⊢T(<φ>) ↔ φ, for all sentence φ.This is the

T-schema! Thus if we assume first-order arithmetic to be ω-consistent and complete then schemaTturns out to be interpretable in it. Now, Tarski's theorem above shows that there exists no such consistent theory, and we thus have a contradiction. □

In the proof above we reduced Gödel's incompleteness theorem to an application of Tarski's theorem in order to show the close link between the two. Gödel was well aware of this link, and indeed it seems that Gödel even proved Tarski's theorem before Tarski himself did (Feferman, 1984). Gödel's theorem can be interpreted as demonstrating a limitation in what can be achieved by purely formal procedures. It says that if first-order arithmetic is ω-consistent (which it is believed to be), then there must be arithmetical sentences that can neither be proved nor disproved by the formal procedures of first-order arithmetic. One might at first expect this limitation to be resolvable by the inclusion of additional axioms, but Gödel showed that the incompleteness result still holds when first-order arithmetic is extended with an arbitrary finite set of axiom schemas (or, more generally, an arbitrary recursive set of axioms). Thus we obtain a general limitation result saying that there cannot exist a formal proof procedure by which any given arithmetical sentence can be proved to hold or not to hold. For more details on Gödel's incompleteness theorem, see the entry on Kurt Gödel.

The limitation result of Gödel's theorem is closely related to
another limitation result known as the *undecidability of the
halting problem*. This is a result stating that there are
limitations to what can be computed. We will present this result in
the following. The result is based on the notion of a *Turing
machine*, which is a generic model of a computer program running
on a computer with an unlimited memory. Thus any program running on
any computer can be thought of as a Turing machine (see the entry on
Turing machines
for more details). When running a Turing machine, it will either
terminate after a finite number of computation steps, or will continue
running forever. In case it terminates after a finite number of
computation steps we say that it *halts*. The *halting
problem* is the problem of finding a Turing machine that can
decide whether other Turing machines halt or not. We say that a Turing
machine *H* *decides the halting problem* if the
following holds:

Htakes as input a pair (<A>,x) consisting of the Gödel code <A> of a Turing machineAand an arbitrary stringx.Hreturns the answer "yes" if the Turing machineAhalts when given inputx, and "no" otherwise.

Thus if a Turing machine *H* decides the halting problem, we can use it
to determine for an arbitrary Turing machine *A* and arbitrary
input *x* whether *A* will halt on input *x* or
not. The *undecidability of the halting problem* is the
following result, due to Turing (1937), stating that no such machine
can exist:

Theorem (Undecidability of the Halting Problem).

There exists no Turing machine deciding the halting problem.

Proof.Assume the existence of a Turing machineHdeciding the halting problem. We need to show that this assumption leads to a contradiction. The proof mimics Grelling's paradox. We call a Turing machineAheterologicalifAdoesn't halt on input <A>, that is, ifAdoesn't halt when given its own Gödel code as input. UsingH, we can construct a Turing machineGthat halts if and only if it is given the Gödel code of a heterological Turing machine as input.Gworks as follows:Gtakes as input the Gödel code of a Turing machineA. Then it runsHon input (<A>,<A>). IfHon input (<A>,<A>) returns "no" we know thatAis heterological, andGis halted. If, on the other hand,Hon input (<A>,<A>) returns "yes" thenGis forced into an infinite loop (that is, never halts).In analogy to Grelling's paradox we can now ask whether

Gis a heterological Turing machine or not. This leads to the following sequence of equivalences:

Gis heterological⇔ Gdoesn't halt on input <G>⇔ Hreturns "no" on input (<G>,<G>)⇔ Ghalts on input <G>⇔ Gis not heterological.This gives the required contradiction. □

From the two theorems above we see that in the areas of provability and computability the paradoxes of self-reference turn into limitation results: there are limits to what can be proven and what can be computed. This is actually quite similar to what happened in the areas of semantics, set theory and epistemology: The paradoxes of self-reference turned into theorems showing that there are limits to which properties we can consistently assume a truth predicate to have (Tarski's theorem), a set theory to have (inconsistency of the unrestricted comprehension principle), and a knowledge predicate to have (Montague's theorem). It is hard to accept these limitation results, because most of them conflict with our intuitions and expectations. The central role played by self-reference in all of them makes them even harder to accept, and definitely more puzzling. However, we are forced to accept them, and forced to accept the fact that in these areas we cannot have all we might (otherwise) reasonably ask for.

## 3. Solving the paradoxes

The present section takes a look at how to solve — or rather, circumvent — the paradoxes. To solve or circumvent a paradox one has to weaken some of the assumptions leading to the contradiction. It is very difficult to choose which assumptions to weaken, since each of the explicitly stated assumptions underlying a paradox appears to be "obviously true" — otherwise it would not be called a paradox. Below we will take a look at the most influential approaches to solving the paradoxes.

So far the presentation has been structured according to type of paradox, that is, the semantic, set-theoretic and epistemic paradoxes have been dealt with separately. However, it has also been demonstrated that these three types of paradoxes are similar in underlying structure, and it has been argued that a solution to one should be a solutions to all (the principle of uniform solution). Therefore, in the following the presentation will be structured not according to type of paradox but according to type of solution. Each type of solution considered in the following can be applied to any of the paradoxes of self-reference, although in most cases the constructions involved were originally developed with only one type of paradox in mind.

### 3.1 Building Explicit Hierarchies

Building hierarchies is a method to circumvent both the set-theoretic,
semantic and epistemic paradoxes. Russell's original solution to his
paradox — as well as Tarski's original solution to
his *undefinability of truth* problem — was to build
hierarchies. In Russell's case, this led to *type theory*. In
Tarski's case, it led to what is now known as *Tarski's hierarchy
of languages*. In both cases, the idea is to stratify the universe
of discourse (sets, sentences) into levels. In type theory, these
levels are called *types*. The fundamental idea of type theory
is to introduce the constraint that any set of a given type may only
contain elements of lower types (that is, may only contain sets which
are located lower in the stratification). This effectively blocks
Russell's paradox, since no set can then be a member of itself.

In Tarski's case, the stratification is obtained in the following
way. Assume one wants to equip a language *L*_{0} with
a truth predicate. From Tarski's theorem (Section 2.1) it is known
that this truth predicate cannot be part of the
language *L*_{0} itself — at least not as long as
we want the truth predicate to satisfy schema *T*. Instead one
builds a hierarchy of languages, *L*_{0},
*L*_{1}, *L*_{2},…, where each
language *L*_{i+1} has a truth predicate
*T*_{i+1} that only applies to the sentences
of *L*_{j}, *j*≤*i*. In this
hierarchy, *L*_{0} is called the *object
language* and, for
all *i*, *L*_{i+1} is called
the *meta-language* of *L*_{i}. The
hierarchy effectively blocks the liar paradox, since now a sentence
can only express the truth or untruth of sentences at lower levels,
and thus a sentence such as the liar that expresses its *own*
untruth cannot be formed.

Russell's type theory can be regarded as a solution to Russell's
paradox, since type theory demonstrates how to "repair" set theory
such that the paradox disappears. Similarly, Tarski's hierarchy can be
regarded as a solution to the liar paradox. It is the same
stratification idea that underlies both of Russell's and Tarski's
solutions. The point to note is that Russell's paradox and the
liar paradox depend essentially on circular notions
(*self*-membership and *self*-reference). By making a
stratification in which an object may only contain or refer to
objects at lower levels, circularity disappears. In the case of the
epistemic paradoxes, a similar stratification could be obtained by
making an explicit distinction between first-order knowledge
(knowledge about the external world), second-order knowledge
(knowledge about first-order knowledge), third-order knowledge
(knowledge about second-order knowledge), and so on.

Building explicit hierarchies is sufficient to avoid circularity, and
thus sufficient to block the standard paradoxes of
self-reference. However, there also exist paradoxes such as Yablo's
that do not rely on circularity and self-reference. Such paradoxes can
also be blocked by a hierarchy approach, but it is necessary to
further require the hierarchy to be well-founded, that is, to have a
lowest level. Otherwise, the paradoxes of non-wellfoundedness can
still be formulated. For instance, Yablo's paradox may be formalised
in a *descending* hierarchy of languages. A descending
hierarchy of languages consists of languages *L*_{0},
*L*_{−1}, *L*_{−2},…
where each language *L*_{−i} has a truth
predicate that only applies to the sentences of the languages
*L*_{−j}, *j*>*i*. Similarly,
a set-theoretic paradox of non-wellfoundedness may be formulated in a
type theory allowing negative types. The conclusion drawn is that a
stratification of the universe is not itself sufficient to avoid all
paradoxes — the stratification also has to be well-founded.

Building an explicit (well-founded) hierarchy to solve the paradoxes
is today by most considered an overly drastic and heavy-handed
approach. The hierarchies introduce a number of complicating
technicalities not present in a "flat universe", and even though the
paradoxes do indeed disappear, so do all non-paradoxical occurrences
of self-reference. Kripke (1975) gives the following illustrative
example taken from ordinary discourse. Let *N* be the following
statement, made by Nixon,

(N) All of Jones' utterances about Watergate are true,

and let *J* be the following statement, made by Jones,

(J) Most of Nixon's utterances about Watergate are false.

In a Tarskian language hierarchy, the sentence *N* would have
to be on a higher level than all of Jones' utterances, and,
conversely, the sentence *J* would have to be on a higher level
than all of Nixon's utterances. Since *N* is an utterance of
Nixon, and *J* is an utterance of Jones, *N* would have
to be on a higher level than *J*, and *J* on a higher
lever than *N*. This is obviously not possible, so in a
hierarchy like the Tarskian, these sentences cannot even be
formulated. The sentences *N* and *J* are indeed both
indirectly self-referential, since *N* makes reference to a
totality including *J*, and *J* makes reference to a
totality including *N*. Nevertheless, in most cases *N*
and *J* are harmless, and do not produce a paradox. A paradox
is only produced in the special cases where all of Jones' utterances
except possibly *J* are true, and exactly half of Nixon's
utterances are false, disregarding *N*. Kripke uses the fact
that *N* and *J* are only problematic in certain special cases
as an argument against an approach that altogether excludes the
possibility of formulating *N* and *J*.

Another argument against the hierarchy approach is that explicit
stratification is not part of ordinary discourse, and thus it might be
considered somewhat *ad hoc* to introduce it into formal
settings with the sole purpose of circumventing the paradoxes. Not
having an explicit stratification in ordinary discourse obviously
doesn't imply the non-existence of an underlying, implicit,
stratification, but at least it's not explicitly represented in our
syntax.

The arguments given above are among the reasons why Russell's and Tarski's
works have not been considered to furnish the final solutions to the
paradoxes. Many alternative solutions have been proposed. One might
for instance try to look for *implicit* hierarchies rather than
*explicit* hierarchies. An implicit hierarchy is a hierarchy
not explicitly reflected in the syntax of the language. In the
following section we will consider some of the solutions to the
paradoxes obtained by such implicit stratifications.

### 3.2 Building Implicit Hierarchies

Tarski's hierarchy approach to the semantic paradoxes dominated the field until 1975, when Kripke published his famous and highly influential paper, "Outline of a Theory of Truth". This paper has greatly shaped most later approaches to theories of truth and the semantic paradoxes. It should be noted, however, that ideas quite similar to Kripke's were developed simultaneously and independently by Martin and Woodruff (1975), and that a parallel approach in a set-theoretic setting was developed independently by Gilmore (1974).

#### 3.2.1 Kripke's Theory of Truth

Kripke's ideas are based on an analysis of the problems involved in Tarski's hierarchy approach. Kripke lists a number of arguments against having a language hierarchy in which each sentence lives at a fixed level, determined by its syntactic form. He proposes an alternative solution which still uses the idea of having levels, but where the levels are not becoming an explicit part of the syntax. Rather, the levels become stages in an iterative construction of a truth predicate. To explain Kripke's construction, some additional technical machinery is required.

At each stage in Kripke's construction, the truth predicate is only
*partially defined*, that is, it only applies to some of the
sentences of the language. To deal with such partially defined
predicates, a *three-valued logic* is employed — that is,
a logic which operates with a third value, *undefined*, in
addition to the truth values *true* and *false*. Often
the third value is denoted "*u*" or "⊥" (bottom). A
partially defined predicate only receives one of
the classical truth values, *true* or
*false*, when it is applied to one of the terms for which the
predicate has been defined, and otherwise it receives the value
*undefined*. There are several different three-valued logics
available, differing in how they treat the third value.
Here only one of them is briefly described, called
*Kleene's strong three-valued logic*. More detailed information
on this and related logics can be found in the entry
on
many-valued logic.

In Kleene's strong three-valued logic, the value
⊥ (*undefined*) can be interpreted as "not yet defined".
So one could think of formulae with the value ⊥ as formulae
that actually have a classical truth value (*true* or
*false*), but which has just not been determined yet. This
interpretation of *undefined* is reflected in the truth
tables for the logic, given below. The upper truth table is for
disjunction, the lower for negation:

∨ true false ⊥ true true true true false true false ⊥ ⊥ true ⊥ ⊥

¬ true false false true ⊥ ⊥

These truth tables define the three-valued logic completely, as ∨ and ¬ are taken to form an adequate set of connectives and existential and universal quantification are treated as infinite disjunction and conjunction, respectively.

To handle partially defined truth predicates, it is necessary to
introduce the notion of partial models. In a *partial model*
for a first-order language, each *n*-place predicate
symbol *P* is interpreted by a pair (*U*,*V*) of
disjoint *n*-place relations on the domain of the
model. *U* is called the *extension* of *P*
and *V* its *anti-extension*. In the model, *P*
is true of the objects in *U*, false of the objects
in *V*, and undefined otherwise. In this way, any atomic
sentence receives one of the truth
values *true*, *false* or *undefined* in the
model. Non-atomic formulae are given truth values in the model by
using Kleene's strong three-valued logic for evaluating the
connectives.

Given the definition of a partial model, a *partially interpreted
language* is a pair (*L*,*M*) where *L* is a
first-order language and *M* is a partial model
of *L*. Kripke recursively defines a sequence of partially
interpreted languages *L*_{0}, *L*_{1},
*L*_{2},…, only differing in their
interpretation of the truth predicate *T*. The first
language, *L*_{0}, is taken to be an arbitrary language
in which both the extension and anti-extension of *T* are the
empty set. Thus in *L*_{0}, the truth predicate is
completely undefined. For any α, the language
*L*_{α+1} is like *L*_{α}
except that *T* is interpreted by the extension/anti-extension
pair (*U*,*V*) given by:

*U*is the set of Gödel codes <φ> of sentences φ true in*L*_{α}.*V*is the set of Gödel codes <φ> of sentences φ false in*L*_{α}.

This definition immediately gives that for all α,

(4) φ is true (false) in L_{α}⇔T(<φ>) is true (false) inL_{α+1}.

What has been constructed is a sequence *L*_{0}, *L*_{1},
*L*_{2},… of partially interpreted languages
such that *T* is interpreted in *L*_{α+1}
as the truth predicate for
*L*_{α}. This is just like Tarski's hierarchy of
languages, except that here there is no syntactic distinction between
the different languages and their truth predicates.

The
sequence *L*_{0}, *L*_{1}, *L*_{2},…
has an important property: For each α, the interpretation
of *T* in
*L*_{α+1} extends
the interpretation of *T* in *L*_{α}, that is, both the
extension and anti-extension of *T* increase when moving from
*L*_{α} to *L*_{α+1}. This means that one can
define a new partially interpreted language *L*_{ω} by
letting the extension of *T* be the union of all the extensions of *T* in
*L*_{0}, *L*_{1}, *L*_{2},…;
and similarly for the anti-extension. Thus
in *L*_{ω}, the interpretation of *T*
extends the interpretation that *T* gets in all previous
languages. This furnishes a strategy for continuing the iterative
construction of a truth predicate into the transfinite: For each
successor ordinal
*α+1*, define *L*_{α+1} from
*L*_{α} exactly as in the finite case above; and for each
limit ordinal σ, define
*L*_{σ} from the preceding languages
(*L*_{i})_{i<σ} in the same
way as *L*_{ω} was defined (for a detailed
explanation of the ordinal numbers and their use in this context, see the entry on
the revision theory of truth).
A simple cardinality consideration now shows that this transfinite sequence of
languages will eventually *stabilise*: There is an ordinal
γ such that for all ordinals β, *L*_{γ} =
*L*_{γ+β}. Since, in particular, *L*_{γ} =
*L*_{γ+1}, the following instance of (4) is
obtained:

(5) φ is true (false) in L_{γ}⇔T(<φ>) is true (false) inL_{γ}.

This shows that *L*_{γ} is actually a language containing its
own truth predicate: Any sentence φ is true
(false) if and only if the sentence expressing its truth,
*T*(<φ>), is true (false). The
equivalence (5) is nothing more than a semantic
analogue of Tarski's schema *T* in a three-valued setting.
The construction of the language *L*_{γ} was one of the
major contributions of Kripke (1975). It shows that in a three-valued
logical setting it is actually possible for a language to contain
its own truth predicate. It is easy to see that the third value,
*undefined,* is essential to making things work: If
*L*_{γ} had been a totally interpreted language (that is, a
language with no undefined sentences), then *L*_{γ}
would satisfy schema T, by (5) above. However, it
immediately contradicts Tarski's theorem that such a totally
interpreted language can exist.

Among the sentences that receive the value *undefined* in
*L*_{γ} is the liar sentence. The solution to the liar
paradox implicit in Kripke's theory is this: Since both assuming that
the liar sentence is true and that it is false leads to a
contradiction it must be neither; it is *undefined*. The liar
sentence is said to suffer from a *truth-value
gap*. The idea of avoiding the liar paradox
by allowing truth-value gaps did in fact appear in several places
before Kripke's paper, but Kripke was among the first to make it
an integral part of a genuine theory.

As with the hierarchy solution to the liar paradox, the
truth-value gap solution is by many considered to be problematic.
The main criticism is that by using a three-valued semantics, one
gets an interpreted language which is expressively weak. One
cannot, for instance, have a *non-truth* predicate in one of
Kripke's languages (a sentence is *non-true* when it is
either false or undefined). This is in fact noted
by Kripke himself. If a partially interpreted language contained
such a predicate, the following version of the liar paradox within
the language could be made: "This sentence is non-true".

Kripke's theory circumvents the liar paradox by assigning it the value
*undefined*. An alternative way to circumvent the liar paradox
would be to assign it the value *both true and false* in a
suitable paraconsistent logic. One of the simplest paraconsistent
logics is LP, which is a three-valued logic with the same truth tables
as Kleene's strong three-valued logic presented above — the only
difference is that the third truth value is interpreted as *both
true and false* rather than *undefined*. A reason for
preferring a paraconsistent logic over a partial logic is that
paradoxical sentences such as the liar can then be
modelled as *true contradictions* rather than truth-value
gaps. The view that there exists true contradictions is called
dialetheism. See the entries on
dialetheism and
paraconsistent logic
for more information.

#### 3.2.2 Implicit Hierarchies in Set Theories

Building implicit rather than explicit hierarchies is also an idea that has been employed in set theory. New Foundations (NF) by Quine (1937) is a modification of simple type theory where the stratification into syntactic types has been replaced by a stratification on the comprehension principle:

NF comprehension:

∀u(u∈ {x| φ(x) } ↔ φ(u)), for allstratifiedformulae φ(x).

A formula φ is *stratified* if there exists a mapping
σ (a *stratification*) from the variables of φ to the
natural numbers such that if *u* ∈ *v* is a
subformula of φ then σ(*v*) = σ(*u*)+1
and if *u* = *v* is a subformula of φ then
σ(*v*) = σ(*u*). Obviously the
formula *x* ∉ *x* is not stratified, and thus the
NF comprehension principle cannot be used to formulate Russell's
paradox in the theory. Quine's New Foundations is essentially obtained
from type theory by hiding the types from the syntax. Thus, the theory
still makes use of a hierarchy approach to avoid the paradoxes, but
the hierarchy is made implicit by not representing it in the syntax of
formulae.

Zermelo-Fraenkel set theory (ZF) is another theory that builds on the
idea of an implicit hierarchy to circumvent the paradoxes. However, it
does so in a much less direct way than NF. In ZF, sets are built
bottom-up, starting with the empty set and iterating a construction of
bigger and bigger sets using the operations of union and power
set. This produces a hierarchy with the empty set on the lowest level,
level 0, and with the power set operation producing a set of level
α+1 from a set of level α. Analogous to Kripke's iterative
construction, the procedure is continued into the transfinite using
the union operator at the limit ordinal levels. The hierarchy obtained
is called the *cumulative hierarchy*. One of the axioms of ZF,
the axiom of foundation, states that every set of ZF lives on a
certain level in this cumulative hierarchy. In order words, the axiom
of foundation states that there are no sets in ZF besides the ones
that can be constructed bottom-up by the iterative procedure just
described. Since in a cumulative hierarchy, there can be no sets
containing themselves, no universal set, and no non-wellfounded sets,
none of the known paradoxes can immediately be formulated in the
theory. This does obviously not in itself ensure the consistency of
ZF, but at least it illustrates how the idea of a set hierarchy plays
a significant role in ZF as well. ZF has a privileged status among set
theories, as it is today the most widely acknowledged candidate for a formal foundation of
mathematics.

### 3.3 General Fixed Point Approaches

Kripke's iterative construction of a truth predicate presented above
may be viewed as an instance of a more general *fixed point
approach* towards building formal theories of truth. Fixed point
approaches have become central to contemporary formal theories of
truth. The main idea is to have a
*truth revision operator* and then look for fixed points of
this operator. At heart of such fixed point approaches is some
kind of *fixed point theorem* that guarantees the existence
of fixed points for certain kinds of operators. There are several
different fixed point theorems available. Consider now one of the
simpler ones.

Fixed point theorem.

Let τ be a monotone operator on a chain complete partial order (henceforth ccpo). Then τ has a least fixed point, that is, there is a least f such that τ(f) =f.

A *ccpo* is a partial order (*D*,<) in which every
totally ordered subset of *D* has a least upper bound. The
totally ordered subsets of *D* are called *chains* in
*D*. A *monotone operator* on (*D*,<) is a
mapping τ: *D* → *D* satisfying:

d_{1}≤ d_{2}⇒ τ(d_{1}) ≤ τ(d_{2}), for alld_{1},d_{2}∈D.

Kripke's construction fits into the fixed point theorem above in the
following way. First note that the set of partially interpreted
languages that only differ on the interpretation of *T* forms a
ccpo: Simply define the ordering on these languages
by *L*_{1} ≤ *L*_{2} iff the
interpretation of *T* in *L*_{2} extends the
interpretation of *T* in *L*_{1} (that is, the
extension and anti-extension of *T* in *L*_{1}
are included in those of *L*_{2}). Then define a truth
revision operator τ on these languages by:

(6) τ( L) =L′, whereT(<φ>) is true (false) inL′ iff φ is true (false) inL.

Note that if *L*_{α} is one of the languages in
Kripke's construction, then *L*_{α+1} =
τ(*L*_{α}). The idea of this truth revision
operator τ is that if τ(*L*)=*L*′
then *L*′ will be a language in which *T* is
interpreted as the truth predicate for *L*. If therefore
τ(*L*)=*L* for some *L*, that is,
if *L* is a fixed point of τ, then *L* will be a
language containing its own truth predicate. This motivates the search
for fixed points of τ. Since τ is easily seen to be monotone,
by the fixed point theorem it has a least fixed point. It is not hard
to see that this fixed point is exactly the
language *L*_{γ} constructed in Kripke's theory
of truth. Kripke's construction is thus recaptured in the setting of
fixed points for monotone operators.

The point of introducing the additional machinery was not just to
rediscover the language *L*_{γ}. The point is rather that a
much more general and abstract framework which may lead to new
theories of truth and give further insights into the semantic
paradoxes is provided. It turns out that the truth revision operator
τ defined above has many interesting fixed-points in addition to
*L*_{γ}. It is also possible to obtain new
interesting theories of truth by considering alternative ways of
making the set of interpreted languages into a ccpo. One may for
instance add an additional truth value and consider the situation in a
four-valued logic, as considered by Fitting (1997); or one
may remove the third truth value *undefined* and construct a
ccpo in a completely classical setting. In the classical setting, attention is
restricted to the totally interpreted languages (languages in which
every sentence is either true or false), and an ordering on them is
defined by: *L*_{1} ≤ *L*_{2} holds
iff the extension of the truth predicate in *L*_{1} is
included in the extension of the truth predicate
in *L*_{2}, that is, iff *L*_{2} points
out at least as many sentences as true as
*L*_{1}. This gives a ccpo. By using the fixed point theorem in
this setting on a suitably defined revision operator, it is fairly
easy to prove the existence of a totally interpreted language
containing a *positive definition of truth*. By this is meant
that the interpreted language has a predicate *T* satisfying the
following restricted version of the *T*-schema:

(7) φ ↔ T(<φ>), for allpositivesentences φ,

where the positive sentences are those built using only the
quantifiers ∃ and ∀ and the connectives ∧ and ∨
(that is, positive sentences have no occurrences of the
negation sign). Since (7) is satisfiable in a
totally interpreted language, the first-order theory containing the
sentences of (7) as axioms must be consistent. This should be
contrasted with Tarski's theorem stating that the
*unrestricted* *T*-schema is inconsistent. If the
unrestricted comprehension principle is similarly restricted to the
positive formulae, we also get a consistent theory. This was
originally shown by Gilmore (1974).

The fixed point approach is also the point of departure of the
*revision theory of truth* developed by Belnap and Gupta
(1993). The revision theory of truth is without doubt the most
influential theory of truth and the semantic paradoxes that has been
developed since the theory of Kripke. The revision theory considers
the truth revision operator τ defined by (6) as an operator on the
*totally* interpreted languages. On these languages τ
doesn't have a fixed point: If it had such a fixed point *L*
then *L* would be a totally interpreted language satisfying the
full schema *T*, directly contradicting Tarski's theorem. Since
τ doesn't have a fixed point on the totally interpreted languages,
the revision theory instead considers transfinite sequences
*L*_{1}, *L*_{2}, … ,
*L*_{ω}, *L*_{ω+1}, …
of totally interpreted languages satisfying:

- For any successor ordinal α+1,
*L*_{α+1}= τ(*L*_{α}). - For any limit ordinal σ and any sentence φ, if φ
stabilises on the value true (false) in the sequence
(
*L*_{α})_{α < σ}then φ is true (false) in*L*_{σ}.

In such a sequence, each sentence φ will either eventually
stabilise on a classical truth value (true or false), or it will never
stabilise. An example of a sentence that will never stabilise is the
liar sentence: If the liar sentence is true in one of the languages
*L*_{α} it will be false
in *L*_{α+1}, and vice versa. The revision theory
thus gives an account of truth that correctly *models* the
behaviour of the liar sentence as one that never stabilises on a truth
value. In the revision theory it is argued that this gives a more
correct account of truth and self-reference than Kripke's theory in
which the liar sentence is simply assigned the value undefined. A full
account of Gupta and Belnap's theory can be found in the entry on
the revision theory of truth.

## Bibliography

- Bartlett, S.J. and Peter Suber (eds.), 1987,
*Self-Reference — Reflections on Reflexivity*, Dordrecht: Martinus Nijhoff Publishers. - Bartlett, S.J. (ed.), 1992,
*Reflexivity — A Source-Book in Self-Reference*, Amsterdam: North-Holland Publishing Co. - Barwise, J. and J. Etchemendy, 1987,
*The Liar — An Essay on Truth and Circularity*, New York: Oxford University Press. - Barwise, J. and L. Moss, 1996,
*Vicious Circles — On the Mathematics of Non-Wellfounded Phenomena*, Stanford: CSLI Publications. - Bolander, T. and V.F. Hendricks and S.A. Pedersen (eds.), 2006,
*Self-Reference*, Stanford: CSLI Publications. - Boolos, G., 1993,
*The Logic of Provability*, Cambridge: Cambridge University Press. - Cantini, A., 1996,
*Logical Frameworks for Truth and Abstraction — An Axiomatic Study*, Amsterdam: North-Holland Publishing Co. - Cantor, G., 1891, "Über eine Elementare Frage der
Mannigfaltigkeitslehre,"
*Jahresbericht der Deutschen Mathematiker-Vereinigung*, 1: 75–78. - Chapuis, A. and A. Gupta, 2000,
*Circularity, Definition and Truth*, New Delhi: Indian Council of Philosophical Research. - Erickson, G. W. and J. A. Fossa, 1998,
*Dictionary of Paradox*, Lanham: University Press of America. - Feferman, S., 1984, "Kurt Gödel: conviction and caution",
*Philosophia Naturalis*, 21: 546–562. - Feferman, S., 1984a, "Toward useful type-free theories I",
*The Journal of Symbolic Logic*, 49: 75–111. - Feferman, S., 1991, "Reflecting on incompleteness",
*The Journal of Symbolic Logic*, 56: 1–49. - Fitting, M., 1986, "Notes on the mathematical aspects of Kripke's
theory of truth",
*Notre Dame Journal of Formal Logic*, 27(1): 75–88. - Fitting, M., 1997, "A theory of truth that prefers falsehood",
*The Journal of Philosophical Logic*, 26(5): 477–500. - Forster, T. E., 1995,
*Set Theory with a Universal Set*, New York: The Clarendon Press. - Gilmore, P. C., 1974, "The consistency of partial set theory
without extensionality",
*Axiomatic set theory (Proc. Sympos. Pure Math., Vol. XIII, Part II)*, pp. 147–153, Providence RI: Amer. Math. Soc. - Gödel, K., 1931, "Über formal unentscheidbare Satze der
Principia Mathematica und verwandter Systeme I",
*Monatshefte für Mathematik und Physik*, 38: 173–198. - Gupta, A. and N. Belnap, 1993,
*The Revision Theory of Truth*, Cambridge MA: MIT Press. - van Heijenort, J., 1967,
*From Frege to Gödel. A source book in mathematical logic, 1879–1931*, Cambridge (MA): Harvard University Press. - Kripke, S., 1975, "Outline of a Theory of Truth",
*The Journal of Philosophy*, 72: 690–716. - Martin, R. L. and P. W. Woodruff, 1975, "On representing
‘True-in-L’ in L",
*Philosophia*, 5: 213–217. - Martin, R. L., 1984,
*Recent Essays on Truth and the Liar Paradox*, Oxford: Oxford University Press. - McGee, V., 1990,
*Truth, Vagueness, and Paradox — An Essay on the Logic of Truth*, Indianapolis: Hackett Publishing Co. - McGee, V., 1992, "Maximal consistent sets of instances of
Tarski's Schema (T)",
*The Journal of Philosophical Logic*, 21(3): 235–241. - Montague, R., 1963, "Syntactical treatment of modality, with
corollaries on reflection principles and finite
axiomatizability",
*Acta Philosophica Fennica*, 16: 153–167. - Perlis, D. and V. S. Subrahmanian, 1994, "Meta-languages,
Reflection Principles and Self-Reference",
*Handbook of Logic in Artificial Intelligence and Logic Programming*, 2: 323–358. - Priest, G., 1987,
*In Contradiction — A Study of the Transconsistent*, Dordrecht: Martinus Nijhoff Publishers. - Priest, G., 1994, "The Structure of the Paradoxes of
Self-Reference",
*Mind*, 103: 25–34. - Priest, G., 1995,
*Beyond the Limits of Thought*, Cambridge: Cambridge University Press. - Quine, W.V., 1937, "New foundations for mathematical
logic",
*American Mathematical Monthly*, 44: 70–80. - Rosado Haddock, G. E., 2001, "Recent truth theories: A case
study",
*Axiomathes*, 12(1): 87–115. - Russell, B., 1905, "On some difficulties in the theory of
transfinite numbers and order types",
*Proceedings of the London Mathematical Society*, 4: 29–53. - Sheard, M., 1994, "A guide to truth predicates in the modern era",
*The Journal of Symbolic Logic*, 59(3): 1032–1054. - Simmons, K., 1993,
*Universality and the Liar — An Essay on Truth and the Diagonal Argument*, Cambridge: Cambridge University Press. - Smullyan, R. M., 1992,
*Gödel's Incompleteness Theorems*, Oxford: Oxford University Press. - Smullyan, R. M., 1994,
*Diagonalization and self-reference*, Oxford: Oxford University Press. - Tarski, A., 1935, "Der Wahrheitsbegriff in den formalisierten
Sprachen",
*Studia Philosophica*, 1: 261–405. - Tarski, A., 1968,
*Undecidable Theories*, Amsterdam: North-Holland Publishing Co. - Thomason, R., 1980, "A note of syntactical treatments of
modality",
*Synthese*, 44: 391–395. - Turing, A.M., 1936, "On Computable Numbers, With an Application
to the Entscheidungsproblem",
*Proceedings of the London Mathematical Society*, 42(2): 230–265; correction*ibid*., 43: 544–546 (1937). - Visser, A., 1989, "Semantics and the liar paradox",
*Handbook of Philosophical Logic*, vol.IV, Dordrecht: Kluwer, pp. 617–706. - Yablo, Stephen, 1985, Yablo, "Truth and reflection",
*Journal of Philosophical Logic*, 14(3): 297–349. - Yablo, Stephen, 1993, "Paradox without
self-reference",
*Analysis*, 53: 251–252. - Yablo, Stephen, 2006, "Circularity and Paradox", in Bolander,
Hendricks, Pedersen (eds.),
*Self-reference*, Stanford: CSLI Publications. - Zwicker, W.S., 1987, "Playing Games with Games: The Hypergame
Paradox,"
*The American Mathematical Monthly*, 94(6): 507–514.

## Other Internet Resources

- Logical Paradoxes,
by B. Hartley Slater, in the
*Internet Encyclopedia of Philosophy*. - Truth,
by Bradley Dowden and Norman Swartz, in the
*Internet Encyclopedia of Philosophy*.

## Related Entries

computability and complexity |
Curry's paradox |
dialetheism [dialethism] |
epistemic paradoxes |
Gödel, Kurt |
insolubles [= *insolubilia*] |
logic: epistemic |
logic: many-valued |
logic: paraconsistent |
mathematics, philosophy of |
paradoxes: and contemporary logic |
Quine, Willard van Orman: New Foundations |
Russell's paradox |
set theory |
set theory: alternative axiomatic theories |
set theory: early development |
Tarski, Alfred: truth definitions |
truth |
truth: axiomatic theories of |
truth: revision theory of |
Turing machines |
type theory