This is a file in the archives of the Stanford Encyclopedia of Philosophy.

Self-Reference

First published Tue Jul 15, 2008

In the context of language, self-reference is used to denote a statement that refers to itself or its own referent. The most famous example of a self-referential sentence is the liar sentence: "This sentence is not true." Self-reference is often used in a broader context as well. For instance, a picture could be considered self-referential if it contains a copy of itself (see the animated image above); and a piece of literature could be considered self-referential if it includes a reference to the work itself. In philosophy, self-reference is primarily studied in the context of language. Self-reference within language is not only a subject of philosophy, but also a field of individual interest in mathematics and computer science, in particular in relation to the foundations of these sciences.

The philosophical interest in self-reference is to a large extent centered around the paradoxes. A paradox is a seemingly sound piece of reasoning based on apparently true assumptions that leads to a contradiction. The liar sentence considered above leads to a contradiction when we try to determine whether it is true or not. If we assume the sentence to be true, then what it states must be the case, that is, it cannot be true. If, on the other hand, we assume it not to be true, then what it states is actually the case, and thus it must be true. In either case we are led to a contradiction. Since the contradiction was obtained by a seemingly sound piece of reasoning based on apparently true assumptions, it qualifies as a paradox. It is known as the liar paradox.

Most paradoxes of self-reference may be categorised as either semantic, set-theoretic or epistemic. The semantic paradoxes, like the liar paradox, are primarily relevant to theories of truth. The set-theoretic paradoxes are relevant to the foundations of mathematics, and the epistemic paradoxes are relevant to epistemology. Even though these paradoxes are different in the subject matter they relate to, they share the same underlying structure, and may often be tackled using the same mathematical means.

In the present entry, we will first introduce a number of the most well-known paradoxes of self-reference, and discuss their common underlying structure. Subsequently, we will discuss the profound consequences that these paradoxes have on a number of different areas: theories of truth, set theory, epistemology, foundations of mathematics, computability. Finally, we will present the most prominent approaches to solving the paradoxes.

1. Paradoxes of Self-Reference
2. Why Paradoxes Matter
3. Solving the paradoxes
Bibliography
Other Internet Resources
Related Entries

1. Paradoxes of Self-Reference

1.1 Semantic Paradoxes

Paradoxes of self-reference have been known since antiquity. The discovery of the liar paradox is often credited to Eubulides the Megarian who lived in the 4th century BC. The liar paradox belongs to the category of semantic paradoxes, since it is based on the semantic notion of truth. Other well-known semantic paradoxes include Grelling's paradox, Berry's paradox, and Richard's paradox.

Grelling's paradox involves a predicate defined as follows. Say a predicate is heterological if it is not true of itself, that is, if it does not itself have the property it expresses. Thus the predicate "German" is heterological, since it is not itself a German word, but the predicate "deutsch" is not heterological. The question that leads to the paradox is now:

Is "heterological" heterological?

It is easy to see that we obtain a contradiction independently of whether we answer "yes" or "no" to this question (the argument runs more or less like in the liar paradox). Grelling's paradox is self-referential, since the definition of the predicate heterological refers to all predicates, including the predicate heterological itself. Definitions such as this which depends on a set of entities, at least one of which is the entity being defined, are called impredicative.

Berry's paradox is another paradox based on an impredicative definition, or rather, an impredicative description. Some phrases of the English language are descriptions of natural numbers, for example, "the sum of five and seven" is a description of the number 12. Berry's paradox arises when trying to determine the denotation of the following description:

the least number that cannot be referred to by a description containing less than 100 symbols.

The contradiction is that this description containing 93 symbols denotes a number which, by definition, cannot be denoted by any description containing less than 100 symbols. The description is of course impredicative, since it implicitly refers to all descriptions, including itself.

Richard's paradox considers phrases of the English language defining real numbers rather than natural numbers. For example, "the ratio between the circumference and diameter of a circle" is a phrase defining the number π. Assume an enumeration of all such phrases is given (e.g. by putting them into lexicographical order). Now consider the phrase:

the real number whose nth decimal place is 1 whenever the nth decimal place of the nth phrase is 0; otherwise 0.

This phrase defines a real number, so it must be among the enumerated phrases, say number k in this enumeration. But, at the same time, by definition, it differs from the number denoted by the kth phrase in the kth decimal place. Thus we have a contradiction. The defining phrase is obviously impredicative. The particular construction employed in this paradox is called diagonalisation. Diagonalisation is a general construction and proof method originally invented by Georg Cantor (1891) to prove the uncountability of the power set of the natural numbers. It was also used as a basis for Cantor's paradox, one of the set-theoretic paradoxes to be considered next.

1.2 Set-Theoretic Paradoxes

The best-known set-theoretic paradoxes are Russell's paradox and Cantor's paradox. Russell's paradox arises from considering the Russell set R of all sets that are not members of themselves, that is, the set defined defined by R = { x | x ∉ x }. The contradiction is then derived by asking whether R is a member of itself, that is, whether R ∈ R holds. If R ∈ R then R is a member of itself, and thus R ∉ R, by definition of R. If, on the other hand, R ∉ R then R is not a member of itself, and thus R ∈ R, again by definition of R.

Cantor's paradox is based on an application of Cantor's theorem. Cantor's theorem states that given any finite or infinite set S, the power set of S has larger cardinality (greater size) than S. The theorem is proved by a form of diagonalisation, the same idea as underlying Berry's paradox. Cantor's paradox considers the set of all sets. Let us call this set the universal set and denote it by U. The power set of U is denoted ℘(U). Since U contains all sets it will in particular contain all elements of ℘(U). Thus ℘(U) must be a subset of U and must thus have a cardinality which is less than or equal to the cardinality of U. However, this immediately contradicts Cantor's theorem.

The Hypergame paradox is a more recent addition to the list of set-theoretic paradoxes, invented by Zwicker (1987). Let us call a two-player game well-founded if it is bound to terminate in a finite number of moves. Tournament chess is an example of a well-founded game. We now define hypergame to be the game in which player 1 in the first move chooses a well-founded game to be played, and player 2 subsequently makes the first move in the chosen game. All remaining moves are then moves of the chosen game. Hypergame must be a well-founded game, since any play will last exactly one move more than some well-founded game. However, if hypergame is well-founded then it must be one of the games that can be chosen in the first move of hypergame, that is, player 1 can choose hypergame in the first move. This allows player 2 to choose hypergame in the subsequent move, and thus the two players can continue choosing hypergame ad infinitum. Thus hypergame cannot be well-founded, contradicting our previous conclusion.

1.3 Epistemic paradoxes

The most well-know epistemic paradox is the paradox of the knower. This paradox has many equivalent formulations, one of them based on the sentence "This sentence is not known by anyone." Let us call this sentence the knower sentence, abbreviated KS. KS is obviously quite similar to the liar sentence, except the central concept involved is knowledge rather than truth. The reasoning leading to a contradiction from KS is a bit more complex than in the liar paradox. First KS is shown to be true by the following piece of reasoning:

Assume to obtain a contradiction that KS is not true. Then what KS expresses cannot be the case, that is, KS must be known by someone. Since everything known is true (this is part of the definition of the concept of knowledge), KS is true, contradicting our assumption. This concludes the proof that the KS is true.

The piece of reasoning just carried out to prove the truth of KS should be available to any agent (person) with sufficient reasoning capabilities. That is, an agent should be able to prove the truth of KS, and thus come to know that KS holds. However, if KS is known by someone, then what it expresses is not the case, and thus it cannot be true. This is a contradiction, and thus we have a paradox. The role of self-reference in this paradox is obvious, as it is based on a sentence KS referring directly to itself.

The paradox of the knower is just one of many epistemic paradoxes involving self-reference. See the entry on epistemic paradoxes for further information on the class of epistemic paradoxes. For a detailed discussion and history of the paradoxes of self-reference in general, see the entry on paradoxes and contemporary logic.

1.4 Common Structures in the Paradoxes

The paradoxes above are all quite similar in structure. In the case of the paradoxes of Grelling and Russell, this can be seen as follows. Define the extension of a predicate to be the set of objects it is true of. For a predicate P we denote its extension by ext(P). Grelling's paradox involves the predicate heterological, which is true of all those predicates that are not true of themselves. Thus the extension of the predicate heterological is the set { P | P ∉ ext(P) }. Compare this to the Russell set R given by { x | x ∉ x }. The only significant difference between these two sets is that the first is defined on predicates whereas the second is defined on sets. The contradictions obtained from analysing these two sets are also similar. Both contradictions can be presented as sequences of equivalences, and both sequences share the same structure, as seen below (where "het" abbreviates "heterological"):

• Grelling's paradox: het ∈ ext(het) ⇔ het ∈ { P | P ∉ ext(P) }

⇔ het ∉ ext(het).

• Russell's paradox: R ∈ R ⇔ R ∈ { x | x ∉ x}

⇔ R ∉ R.

Thus we have two paradoxes of an almost identical structure belonging to two different classes of paradoxes: one is semantic and the other set-theoretic. What this teaches us is that even if paradoxes seem different by involving different subject matters, they might be almost identical in their underlying structure. Thus in many cases it makes most sense to study the paradoxes of self-reference under one, rather than study, say, the semantic and set-theoretic paradoxes separately.

Russell's paradox and Cantor's paradox are also more similar than it might at first seem. Cantor's paradox is based on an application of Cantor's theorem to the universal set U. Below we give the proof of Cantor's theorem for an arbitrary set S.

We need to prove that ℘(S) has greater cardinality than S. Assume to obtain a contradiction that this is not the case. Then there must exist a map f from S onto ℘(S). Now consider the set C = { x ∈ S | x ∉ f(x) }. Since f is onto ℘(S), there must exist a set c ∈ S such that f(c)=C. However, we now obtain a contradiction, since the following holds:

c ∈ f(c) ⇔ c ∈ { x ∈ S | x ∉ f(x) }

⇔ c ∉ f(c).

Note the similarity between this sequence of equivalences and the corresponding sequences of equivalences derived for Russell's and Grelling's paradoxes above. Now consider the special case of Cantor's theorem where S is the universal set. Then we can simply choose f to be the identity function, since ℘(U) must necessarily be a subset of the universal set U. But then C becomes the Russell set! Thus Cantor's paradox is nothing more than a slight variant of Russell's paradox; the core argument leading to the contradiction is the same in both.

Priest (1994) gives even firmer evidence to the similarity between the paradoxes of self-reference by showing that they all fit into what he calls Russell's schema. The idea behind it goes back to Russell himself (1905) who also considered the paradoxes of self-reference to have a common underlying structure. Given two predicates predicates P and Q, and a possibly partial function δ, Russell's schema consists of the following two conditions:

w = { x | P(x) } exists and Q(w) holds;
if x is a subset of w such that Q(x) holds then:
1. δ(x) ∉ x,
2. δ(x) ∈ w.

If these conditions are satisfied we have the following contradiction: Since w is trivially a subset of w and since Q(w) holds by condition 1, we have both δ(w) ∉ w and δ(w) ∈ w, by 2a and 2b, respectively. Thus any triple (P,Q,δ) satisfying the Russell schema will produce a paradox. Priest shows how most of the well-known paradoxes of self-reference fit into Russell's schema. Below we will consider only a few of these paradoxes, starting with Russell's paradox. In this case we define the triple (P,Q,δ) as follows:

P(x) is the predicate "x ∉ x".
Q(x) is the universal predicate true of any object.
δ is the identity function.

Then w in Russell's schema becomes the Russell set and the contradiction obtained from the schema becomes Russell's paradox.

In the case of Richard's paradox we define the triple by:

P(x) is the predicate "x is a real definable by a phrase in English."
Q(x) is the predicate "x is definable by a phrase in English."
δ is the function that maps any infinite set x of well-ordered reals to the real y whose nth decimal place is 1 whenever the nth decimal of the nth real in x is 0; otherwise 0.

Here w = { x | P(x) } becomes the set of all reals definable by phrases in English. For any well-ordered subset x of w, δ(x) is a real that by construction will differ from all reals in x (it differs from the nth real in x on the nth decimal place). Letting x equal w we thus get δ(w) ∉ w. However, at the same time δ(w) is definable by a phrase in English, so δ(w) ∈ w, and we have a contradiction. This contradiction is Richard's paradox.

The liar paradox also fits Russell's schema, albeit in a slightly less direct way:

P(x) is the predicate "x is true."
Q(x) is the predicate "x is definable."
δ(x) is the sentence "this sentence does not belong to the set x."

Here w = { x | P(x) } becomes the set of true sentences, and δ(w) becomes a version of the liar sentence: "this sentence does not belong to the set of true sentences".

From the above it can be concluded that all, or at least most, paradoxes of self-reference share a common underlying structure. Priest (1994) argues that they should then also share a common solution. Priest calls this the principle of uniform solution: "same kind of paradox, same kind of solution."

1.5 A Paradoxes without Self-Reference

In 1985, Yablo succeeded in constructing a semantic paradox that does not involve self-reference in the strict sense. Instead, it consists of an infinite chain of sentences, each sentence expressing the untruth of all the subsequent ones. More precisely, for each natural number i we define S_i to be the sentence "for all j>i, S_j is not true". We can then derive a contradiction as follows:

First we prove that none of the sentences S_i can be true. Assume to obtain a contradiction that S_i is true for some i. Then it is true that "for all j>i, S_j is not true". Thus none of the sentences S_j for j>i are true. In particular, S_i+1 is not true. S_i+1 is the sentence "for all j>i+1, S_j is not true". Since this sentence is not true, there must be some k>i+1 for which S_k is true. However, this contradicts that none of the sentences S_j with j>i are true.

We have now proved that none of the sentences S_i are true. Then, in particular, we have that for all j>0, S_j is not true. This is exactly what is expressed by S₀, so S₀ must be true. This is again a contradiction.

Yablo calls this paradox the ω-liar, but others usually refer to it as Yablo's paradox. Note that none of the sentences S_i refer to themselves (not even indirectly), but only to the ones that occur later in the sequence. Yablo's paradox is semantic, but as shown by Yablo (2006), similar set-theoretic paradoxes involving no self-reference can be formulated in certain set theories.

Yablo's paradox demonstrates that we can have logical paradoxes without self-reference – only a certain kind of non-wellfoundedness is needed to obtain a contradiction. There are obviously structural differences between the ordinary paradoxes of self-reference and Yablo's paradox: the ordinary paradoxes of self-reference involve a cyclic structure of reference, whereas Yablo's paradox involve an acyclic, but non-wellfounded, structure of reference. More precisely, the referential structure in self-referential paradoxes such as the liar is a reflexive relation on a singleton set, whereas the referential structure in Yablo's paradox is isomorphic to the usual less-than ordering on the natural numbers, which is an irreflexive relation. Even though there is this difference, Yablo's paradox still share most properties with the ordinary paradoxes of self-reference. When solving paradoxes we might thus choose to consider them all under one, and refer to them as paradoxes of non-wellfoundedness. In the following we will however stick to the term paradoxes of self-reference, even though most of what we say will apply to Yablo's paradox and related paradoxes of non-wellfoundedness as well.

2. Why the Paradoxes Matter

After having presented a number of paradoxes of self-reference and discussed some of their underlying similarities, we will now turn to a discussion of their significance. The significance of a paradox is its indication of a flaw or deficiency in our understanding of the central concepts involved in it. In case of the semantic paradoxes, it seems that it is our understanding of fundamental semantic concepts such as truth (in the liar paradox and Grelling's paradox) and definability (in Berry's and Richard's paradoxes) that are deficient. In case of the set-theoretic paradoxes, it is our understanding of the concept of a set. If we fully understood these concepts, we should to be able to deal with them without being led to contradictions. To illustrate this, consider the case of Zeno's classical paradox on Achilles and the Tortoise (see the entry Zeno's paradoxes for details). In this paradox we seem able to prove that the tortoise can win a race against the 10 times faster Achilles if given an arbitrarily small head start. Zeno used this paradox as an argument against the possibility of motion. It has later turned out that the paradox rests on an inadequate understanding of infinity. More precisely, it rests on an implicit assumption that any infinite series of positive reals must have an infinite sum. The later developments of the mathematics of infinite series has shown that this assumption is invalid, and thus the paradox dissolves. The original acceptance of Zeno's argument as a paradox was a symptom of the fact that the concept of infinity was not sufficiently well understood at the time. In analogy, it seems reasonable to expect that the existence of semantic and set-theoretic paradoxes is a symptom of the fact that the involved semantic and set-theoretic concepts are not yet sufficiently well understood.

The paradoxes present serious foundational problems in semantics and set theory: no claim can be made to a solid foundation for these subjects until a satisfactory solution to the paradoxes have been provided. Problems surface when it comes to formalising semantics (the concept of truth) and set theory. If formalising the intuitive, "naive" understanding of these subjects inconsistent systems linger as the paradoxes will be formalisable in these systems.

2.1 Consequences of the Semantic Paradoxes

The liar paradox is a significant barrier to the construction of formal theories of truth as it produces inconsistencies in these potential theories. A substantial amount of research in self-reference concentrates on formal theories of truth and ways to circumvent the liar paradox. There are two articles that have influenced the work on formal theories of truth and the liar paradox more than any other: Alfred Tarski's "The Concept of Truth in Formalised Languages" (1935) and Saul Kripke's "Outline of a Theory of Truth" (1975). Below we first introduce some of the ideas and results of Tarski's article. Kripke's article is discussed in Section 3.2.

Tarski gives a number of conditions that, as he puts it, any adequate definition of truth must satisfy. The central of these conditions is what is now most often referred to as Schema T (or the T-schema or Convention T or the Tarski biconditionals):

Schema T: φ ↔ T<φ>, for all sentences φ.

Here T is the predicate intended to express truth and <φ> is a name for the sentence φ. Applying the predicate T to the name <φ> gives the expression T<φ> intended to represent the phrase "φ is true". Schema T thus expresses that for every sentence φ, φ holds if and only if the sentence "φ is true" holds. The T-schema is usually regarded as a set of sentences within a formal theory. It is customary to use first-order arithmetic, that is, first-order predicate logic extended with a set of standard axioms for arithmetic like PA (Peano Arithmetic) or Robinson's Q. What is being said in the following will apply to any such first-order formalisation of arithmetic. In this setting, <φ> above denotes the Gödel code of φ, and T<φ> is short for T(<φ>). The reader not familiar with Gödel codings (also known as Gödel numberings) can just think of the mapping <·> as a naming device or quotation mechanism for formulae — just like quotation marks in natural language. Often used notational variant for <φ> are ^⌈φ^⌉ and ‘φ’.

Tarski showed that the liar paradox is formalisable in any formal theory containing his schema T, and thus any such theory must be inconsistent. This result is often referred to as Tarski's theorem on the undefinability of truth. The result is basically a formalisation of the liar paradox within first-order arithmetic extended with the T-schema. In order to construct such a formalisation it is necessary to be able to formulate self-referential sentences (like the liar sentence) within first-order arithmetic. This ability is provided by the diagonal lemma.

Diagonal lemma.
Let S be a theory extending first-order arithmetic. For every formula φ(x) there is a sentence ψ such that S ⊢ ψ ↔ φ<ψ>.

Here the notation S⊢α means that α is provable in the theory S, and φ<ψ> is short for φ(<ψ>). Assume a formula φ(x) is given that is intended to express some property of sentences – truth, for instance. Then the diagonal lemma gives the existence of a sentence ψ satisfying the biimplication ψ ↔ φ<ψ>. The sentence φ<ψ> can be thought of as expressing that the sentence ψ has the property expressed by φ(x). The biimplication thus expresses that ψ is equivalent to the sentence expressing that ψ has property φ. One can therefore think of ψ as a sentence expressing of itself that it has property φ. In the case of truth, it would be a sentence expressing of itself that it is true. The sentence ψ is of course not self-referential in a strict sense, but mathematically it behaves like one. It is therefore possible to use sentences generated by the diagonal lemma to formalise paradoxes based on self-referential sentences, like the liar. The diagonal lemma is sometimes called the fixed-point lemma, since the equivalence ψ ↔ φ<ψ> can be seen as expressing that ψ is a fixed-point of φ(x).

A theory in first-order predicate logic is called inconsistent if a logical contradiction is provable in it. Tarski's theorem (on the undefinability of truth) may now be stated and proved.

Tarski's theorem.
Any theory extending first-order arithmetic and containing schema T is inconsistent.
Proof. Assume the existence of a consistent formal theory S extending first-order arithmetic and containing schema T. We need to show that this assumption leads to a contradiction. The proof mimics the liar paradox. Apply the diagonal lemma to obtain a sentence λ satisfying λ ↔ ¬T <λ> in S. The sentence λ expresses of itself that it is not true, so λ corresponds to the liar sentence. Instantiating schema T with the sentence λ gives us λ ↔ T<λ>. We now have that both λ ↔ ¬T<λ> and λ ↔ T<λ> hold in S (are provable in S), and thus T<λ> ↔ ¬T<λ> must hold in S. This contradicts S being consistent. □

Note that the contradiction T<λ> ↔ ¬ T<λ> above expresses: The liar sentence is true if and only if it is not. Compare this to the informal liar presented in the beginning of the article. Tarski's theorem shows that, in the setting of first-order arithmetic, it is not possible to give what Tarski considers to be an ‘adequate theory of truth’. The central question then becomes: How may the formal setting or the requirements for an adequate theory of truth be modified to regain consistency — that is, to prevent the liar paradox from trivialising the system? There are many different answers to this question, as there are many different ways to regain consistency. In Section 3 we will review the most influential approaches to this problem.

2.2 Consequences of the Set-Theoretic Paradoxes

The set-theoretic paradoxes constitute a significant challenge to the foundations of mathematics. They show that it is impossible to have a concept of set satisfying the unrestricted comprehension principle (also called full comprehension or unrestricted abstraction):

Unrestricted comprehension:
∀u (u ∈ { x | φ(x) } ↔ φ(u)), for all formulae φ(x).

In an informal setting, the formulae φ(x) could be allowed to be arbitrary predicates. In a more formal setting they would be formulae of e.g. a suitable first-order language. The unrestricted comprehension principles says that for any property (expressed by φ) there is the set of those entities that satisfy the property. This sounds as a very reasonable principle, and it more or less captures the intuitive concept of a set. Unfortunately, the principle is not sound, as it gives rise to Russell's paradox. Consider the property of non-self-membership. This can be expressed by the formula x ∉ x. If we let φ(x) be the formula x ∉ x then the set { x | φ(x) } becomes the Russell set R, and we obtain the following instance of the unrestricted comprehension principle:

∀u (u ∈ R ↔ u ∉ u).

Analogous to the argument in Russell's paradox a contradiction is now obtained by instantiating u with R:

R ∈ R ↔ R ∉ R.

This contradiction expresses that the Russell set is a member of itself if and only if it is not. What has hereby been proven is the following.

Theorem (Inconsistency of Naive Set Theory).
Any theory containing the unrestricted comprehension principle is inconsistent.

Compare this theorem with Tarski's theorem. Tarski's theorem expresses that if we formalise the intuitively most obvious principle concerning truth we end up with an inconsistent theory. The theorem above expresses that the same thing happens when formalising the intuitively most obvious principle concerning set membership.

Given the inconsistency of unrestricted comprehension, the objective becomes to find a way to restrict it or the underlying logical principles to regain a consistent theory — that is, a set theory that will not be trivialised by Russell's paradox. Many alternative set theories excluding the unrestricted comprehension principle have been developed during the last century, among them the type theory of Russell and Whitehead, Simple Type theory (ST), Gödel-Bernays set theory (GB), Zermelo-Fraenkel set theory (ZF), and Quine's New Foundations (NF). These are all believed to be consistent, although no simple proofs of their consistency are known. However, no counter-examples to their consistency are known either. At least they all escape the known paradoxes of self-reference. We will return to a discussion of this in Section 3.

2.3 Consequences of the Epistemic Paradoxes

The epistemic paradoxes constitute a threat to the construction of formal theories of knowledge, as the paradoxes become formalisable in many such theories. Suppose we wish to construct a formal theory of knowability within an extension of first-order arithmetic. The reason for choosing to formalise knowability rather than knowledge is that knowledge is always relative to a certain agent at a certain point in time, whereas knowability is a universal concept like truth. We could have chosen to work directly with knowledge instead, but it would require more work and make the presentation unnecessarily complicated. To formalise knowability we introduce a special predicate K, and use sentences of the form K<φ> to express that φ is knowable. Analogous to the cases of truth and set membership, there must be certain logical principles that K needs to satisfy in order for our formal theory to qualify as an adequate theory of knowability. First of all, all knowable sentences must be true. This property can be formalised by the following logical principle:

A1. K<φ> → φ, for all sentences φ.

Of course this principle must itself be knowable, that is, we get the following logical principle:

A2. K<K<φ> → φ>, for all sentences φ.

In addition, all theorems of first-order arithmetic ought to be knowable:

A3. K<φ>, for all sentences φ of first-order arithmetic.

Furthermore, knowability must be closed under logical consequences:

A4. K<φ → ψ> → (K<φ> → K<ψ>), for all sentences φ.

Now, the principles A1–A4 is all it takes to formalise the paradox of the knower. More precisely, we have the following theorem due to Montague (1963).

Montague's theorem.
Any formal theory extending first-order arithmetic and containing axiom schemas A1–A4 is inconsistent.
Proof. Assume the existence of a consistent formal theory S extending first-order arithmetic and containing axiom schemas A1–A4. We need to show that this assumption leads to a contradiction. The proof mimics the paradox of the knower. Apply the diagonal lemma to obtain a sentence λ satisfying λ ↔ ¬K <λ> in S. The sentence λ expresses of itself that it is not knowable, so λ roughly corresponds to the knower sentence, KS. The first piece of argumentation used in the paradox of the knower led to the conclusion that KS is indeed true. This piece of argumentation is mimicked by the following piece of formal reasoning within S:

1. λ → ¬K<λ> by choice of λ

2. ¬K<λ> → λ by choice of λ

3. K<λ> → λ axiom A1

4. (K<λ> → λ) →
((λ → ¬K<λ>) → ¬K<λ>) propositional tautology

5. (λ → ¬K<λ>) → ¬ K<λ> modus ponens on 4,3

6. ¬K<λ> modus ponens on 5,1

7. λ modus ponens on 2,6

This proof shows that λ — our formal version of KS — is provable in S. The proof corresponds to the informal argument that KS is true. As argued in the paradox of the knower, any agent with sufficient reasoning capabilities will be able to prove the truth of KS, and thus come to know that KS holds. Thus KS must be knowable. What this means in the present formal framework is that we can also prove the knowability of λ in S:

8. K<λ → ¬K<λ>> by A3 and choice of λ

9. K<¬K<λ> → λ> by A3 and choice of λ

10. K<K<λ> → λ> axiom A2

11. K<(K<λ> → λ) →
((λ → ¬K<λ>) → ¬K<λ>)> axiom A3 on propositional tautology

12. K<(λ → ¬K<λ>) → ¬ K<λ>> axiom A4 on 11,10

13. K<¬K<λ>> axiom A4 on 12,8

14. K<λ> axiom A4 on 9,13

This completes the proof of the knowability of λ, corresponding to the informal argument that KS is known by some agent. Note the similarity between the two pieces of proof in lines 1–7 and 8–14. The only difference is that in the latter all formulae are preceded by an extra K. This is because lines 8–14 express the same line of reasoning as lines 1–7 with the only difference that the latter is a proof of the knowability of λ rather than the truth of λ. Having concluded that λ is both true and knowable, we now immediately obtain a contradiction as in the paradox of the knower:

15. ¬K<λ> modus ponens on 1,7

16. K<λ> ∧ ¬K<λ> conjunction of 14 and 15

This completes the proof of Montague's theorem. □

Montague's theorem shows that in the setting of first-order arithmetic we cannot have a theory of knowledge or knowability satisfying even the basic principles A1–A4. Montague's theorem is a generalisation of Tarski's theorem. If a predicate symbol K satisfies Tarski's schema T then it is easy to see that it will also satisfy axiom schemas A1–A4. Thus axiom schemas A1–A4 constitute a weakening of the T-schema, and Montague's theorem shows that even this much weaker version of the T-schema is sufficient to produce inconsistency.

Formalising knowledge as a predicate in a first-order logic is referred to as the syntactic treatment of knowledge. Alternatively, one can choose to formalise knowledge as an operator in a suitable modal logic. This is referred to as the semantic treatment of knowledge. In the semantic treatment of knowledge one generally avoids problems of self-reference, and thus inconsistency, but it is at the expense of the expressive power of the formalism.

2.4 Consequences Concerning Provability and Computability

The central argument given in the proof of Tarski's theorem is closely related to the central argument in Gödel's first incompleteness theorem (Gödel, 1931). Gödel's theorem can be given the following formulation.

Gödel's first incompleteness theorem.
If first-order arithmetic is ω-consistent then it is incomplete.

A theory is called ω-consistent if the following holds for every formula φ(x) containing x as its only free variable: If ⊢¬φ(n) for every natural number n then ⊬∃xφ(x). ω-consistency is a stronger condition than ordinary consistency, so any ω-consistent theory will also be consistent. A theory is incomplete if it contains a formula which can neither be proved nor disproved.

Proof sketch. Assume first-order arithmetic is both ω-consistent and complete. We need to show that this leads to a contradiction. Gödel constructs a formula Bew (for "Beweis") in formal arithmetic satisfying, for all φ and all n,

(1) ⊢Bew(n, <φ>) iff n is the Gödel code of a proof of φ.

Assuming the theory to be ω-consistent and complete we can prove that for all sentences φ,

(2) ⊢∃xBew(x, <φ>) iff ⊢ φ.

The proof of (2) runs like this. First we prove the implication from left to right. If ⊢∃xBew(x, <φ>) then there is some n such that ⊬¬Bew(n, <φ>), by ω-consistency. By completeness we get ⊢Bew(n, <φ>) for this n. By (1) above we get that n denotes a proof of φ. That is, φ is provable, so we have ⊢ φ. To prove the implication from right to left, note that if ⊢ φ then there must be an n such that ⊢ Bew(n, <φ>), by (1). From this we get ⊢∃xBew(x, <φ>), as required. This concludes the proof of (2). Now, when in a complete theory we have (2) we must also have:

(3) ⊢∃xBew(x, <φ>) ↔ φ, for all sentence φ.

If we let ∃xBew(x, <φ>) be abbreviated T(<φ>) then (3) becomes:

⊢T(<φ>) ↔ φ, for all sentence φ.

This is the T-schema! Thus if we assume first-order arithmetic to be ω-consistent and complete then schema T turns out to be interpretable in it. Now, Tarski's theorem above shows that there exists no such consistent theory, and we thus have a contradiction. □

In the proof above we reduced Gödel's incompleteness theorem to an application of Tarski's theorem in order to show the close link between the two. Gödel was well aware of this link, and indeed it seems that Gödel even proved Tarski's theorem before Tarski himself did (Feferman, 1984). Gödel's theorem can be interpreted as demonstrating a limitation in what can be achieved by purely formal procedures. It says that if first-order arithmetic is ω-consistent (which it is believed to be), then there must be arithmetical sentences that can neither be proved nor disproved by the formal procedures of first-order arithmetic. One might at first expect this limitation to be resolvable by the inclusion of additional axioms, but Gödel showed that the incompleteness result still holds when first-order arithmetic is extended with an arbitrary finite set of axiom schemas (or, more generally, an arbitrary recursive set of axioms). Thus we obtain a general limitation result saying that there cannot exist a formal proof procedure by which any given arithmetical sentence can be proved to hold or not to hold. For more details on Gödel's incompleteness theorem, see the entry on Kurt Gödel.

The limitation result of Gödel's theorem is closely related to another limitation result known as the undecidability of the halting problem. This is a result stating that there are limitations to what can be computed. We will present this result in the following. The result is based on the notion of a Turing machine, which is a generic model of a computer program running on a computer with an unlimited memory. Thus any program running on any computer can be thought of as a Turing machine (see the entry on Turing machines for more details). When running a Turing machine, it will either terminate after a finite number of computation steps, or will continue running forever. In case it terminates after a finite number of computation steps we say that it halts. The halting problem is the problem of finding a Turing machine that can decide whether other Turing machines halt or not. We say that a Turing machine H decides the halting problem if the following holds:

H takes as input a pair (<A>,x) consisting of the Gödel code <A> of a Turing machine A and an arbitrary string x. H returns the answer "yes" if the Turing machine A halts when given input x, and "no" otherwise.

Thus if a Turing machine H decides the halting problem, we can use it to determine for an arbitrary Turing machine A and arbitrary input x whether A will halt on input x or not. The undecidability of the halting problem is the following result, due to Turing (1937), stating that no such machine can exist:

Theorem (Undecidability of the Halting Problem).
There exists no Turing machine deciding the halting problem.
Proof. Assume the existence of a Turing machine H deciding the halting problem. We need to show that this assumption leads to a contradiction. The proof mimics Grelling's paradox. We call a Turing machine A heterological if A doesn't halt on input <A>, that is, if A doesn't halt when given its own Gödel code as input. Using H, we can construct a Turing machine G that halts if and only if it is given the Gödel code of a heterological Turing machine as input. G works as follows:

G takes as input the Gödel code of a Turing machine A. Then it runs H on input (<A>,<A>). If H on input (<A>,<A>) returns "no" we know that A is heterological, and G is halted. If, on the other hand, H on input (<A>,<A>) returns "yes" then G is forced into an infinite loop (that is, never halts).

In analogy to Grelling's paradox we can now ask whether G is a heterological Turing machine or not. This leads to the following sequence of equivalences:

G is heterological ⇔ G doesn't halt on input <G>

⇔ H returns "no" on input (<G>,<G>)

⇔ G halts on input <G>

⇔ G is not heterological.

This gives the required contradiction. □

From the two theorems above we see that in the areas of provability and computability the paradoxes of self-reference turn into limitation results: there are limits to what can be proven and what can be computed. This is actually quite similar to what happened in the areas of semantics, set theory and epistemology: The paradoxes of self-reference turned into theorems showing that there are limits to which properties we can consistently assume a truth predicate to have (Tarski's theorem), a set theory to have (inconsistency of the unrestricted comprehension principle), and a knowledge predicate to have (Montague's theorem). It is hard to accept these limitation results, because most of them conflict with our intuitions and expectations. The central role played by self-reference in all of them makes them even harder to accept, and definitely more puzzling. However, we are forced to accept them, and forced to accept the fact that in these areas we cannot have all we might (otherwise) reasonably ask for.

3. Solving the paradoxes

The present section takes a look at how to solve — or rather, circumvent — the paradoxes. To solve or circumvent a paradox one has to weaken some of the assumptions leading to the contradiction. It is very difficult to choose which assumptions to weaken, since each of the explicitly stated assumptions underlying a paradox appears to be "obviously true" — otherwise it would not be called a paradox. Below we will take a look at the most influential approaches to solving the paradoxes.

So far the presentation has been structured according to type of paradox, that is, the semantic, set-theoretic and epistemic paradoxes have been dealt with separately. However, it has also been demonstrated that these three types of paradoxes are similar in underlying structure, and it has been argued that a solution to one should be a solutions to all (the principle of uniform solution). Therefore, in the following the presentation will be structured not according to type of paradox but according to type of solution. Each type of solution considered in the following can be applied to any of the paradoxes of self-reference, although in most cases the constructions involved were originally developed with only one type of paradox in mind.

3.1 Building Explicit Hierarchies

Building hierarchies is a method to circumvent both the set-theoretic, semantic and epistemic paradoxes. Russell's original solution to his paradox — as well as Tarski's original solution to his undefinability of truth problem — was to build hierarchies. In Russell's case, this led to type theory. In Tarski's case, it led to what is now known as Tarski's hierarchy of languages. In both cases, the idea is to stratify the universe of discourse (sets, sentences) into levels. In type theory, these levels are called types. The fundamental idea of type theory is to introduce the constraint that any set of a given type may only contain elements of lower types (that is, may only contain sets which are located lower in the stratification). This effectively blocks Russell's paradox, since no set can then be a member of itself.

In Tarski's case, the stratification is obtained in the following way. Assume one wants to equip a language L₀ with a truth predicate. From Tarski's theorem (Section 2.1) it is known that this truth predicate cannot be part of the language L₀ itself — at least not as long as we want the truth predicate to satisfy schema T. Instead one builds a hierarchy of languages, L₀, L₁, L₂,…, where each language L_i+1 has a truth predicate T_i+1 that only applies to the sentences of L_j, j≤i. In this hierarchy, L₀ is called the object language and, for all i, L_i+1 is called the meta-language of L_i. The hierarchy effectively blocks the liar paradox, since now a sentence can only express the truth or untruth of sentences at lower levels, and thus a sentence such as the liar that expresses its own untruth cannot be formed.

Russell's type theory can be regarded as a solution to Russell's paradox, since type theory demonstrates how to "repair" set theory such that the paradox disappears. Similarly, Tarski's hierarchy can be regarded as a solution to the liar paradox. It is the same stratification idea that underlies both of Russell's and Tarski's solutions. The point to note is that Russell's paradox and the liar paradox depend essentially on circular notions (self-membership and self-reference). By making a stratification in which an object may only contain or refer to objects at lower levels, circularity disappears. In the case of the epistemic paradoxes, a similar stratification could be obtained by making an explicit distinction between first-order knowledge (knowledge about the external world), second-order knowledge (knowledge about first-order knowledge), third-order knowledge (knowledge about second-order knowledge), and so on.

Building explicit hierarchies is sufficient to avoid circularity, and thus sufficient to block the standard paradoxes of self-reference. However, there also exist paradoxes such as Yablo's that do not rely on circularity and self-reference. Such paradoxes can also be blocked by a hierarchy approach, but it is necessary to further require the hierarchy to be well-founded, that is, to have a lowest level. Otherwise, the paradoxes of non-wellfoundedness can still be formulated. For instance, Yablo's paradox may be formalised in a descending hierarchy of languages. A descending hierarchy of languages consists of languages L₀, L₋₁, L₋₂,… where each language L_−i has a truth predicate that only applies to the sentences of the languages L_−j, j>i. Similarly, a set-theoretic paradox of non-wellfoundedness may be formulated in a type theory allowing negative types. The conclusion drawn is that a stratification of the universe is not itself sufficient to avoid all paradoxes — the stratification also has to be well-founded.

Building an explicit (well-founded) hierarchy to solve the paradoxes is today by most considered an overly drastic and heavy-handed approach. The hierarchies introduce a number of complicating technicalities not present in a "flat universe", and even though the paradoxes do indeed disappear, so do all non-paradoxical occurrences of self-reference. Kripke (1975) gives the following illustrative example taken from ordinary discourse. Let N be the following statement, made by Nixon,

(N) All of Jones' utterances about Watergate are true,

and let J be the following statement, made by Jones,

(J) Most of Nixon's utterances about Watergate are false.

In a Tarskian language hierarchy, the sentence N would have to be on a higher level than all of Jones' utterances, and, conversely, the sentence J would have to be on a higher level than all of Nixon's utterances. Since N is an utterance of Nixon, and J is an utterance of Jones, N would have to be on a higher level than J, and J on a higher lever than N. This is obviously not possible, so in a hierarchy like the Tarskian, these sentences cannot even be formulated. The sentences N and J are indeed both indirectly self-referential, since N makes reference to a totality including J, and J makes reference to a totality including N. Nevertheless, in most cases N and J are harmless, and do not produce a paradox. A paradox is only produced in the special cases where all of Jones' utterances except possibly J are true, and exactly half of Nixon's utterances are false, disregarding N. Kripke uses the fact that N and J are only problematic in certain special cases as an argument against an approach that altogether excludes the possibility of formulating N and J.

Another argument against the hierarchy approach is that explicit stratification is not part of ordinary discourse, and thus it might be considered somewhat ad hoc to introduce it into formal settings with the sole purpose of circumventing the paradoxes. Not having an explicit stratification in ordinary discourse obviously doesn't imply the non-existence of an underlying, implicit, stratification, but at least it's not explicitly represented in our syntax.

The arguments given above are among the reasons why Russell's and Tarski's works have not been considered to furnish the final solutions to the paradoxes. Many alternative solutions have been proposed. One might for instance try to look for implicit hierarchies rather than explicit hierarchies. An implicit hierarchy is a hierarchy not explicitly reflected in the syntax of the language. In the following section we will consider some of the solutions to the paradoxes obtained by such implicit stratifications.

3.2 Building Implicit Hierarchies

Tarski's hierarchy approach to the semantic paradoxes dominated the field until 1975, when Kripke published his famous and highly influential paper, "Outline of a Theory of Truth". This paper has greatly shaped most later approaches to theories of truth and the semantic paradoxes. It should be noted, however, that ideas quite similar to Kripke's were developed simultaneously and independently by Martin and Woodruff (1975), and that a parallel approach in a set-theoretic setting was developed independently by Gilmore (1974).

3.2.1 Kripke's Theory of Truth

Kripke's ideas are based on an analysis of the problems involved in Tarski's hierarchy approach. Kripke lists a number of arguments against having a language hierarchy in which each sentence lives at a fixed level, determined by its syntactic form. He proposes an alternative solution which still uses the idea of having levels, but where the levels are not becoming an explicit part of the syntax. Rather, the levels become stages in an iterative construction of a truth predicate. To explain Kripke's construction, some additional technical machinery is required.

At each stage in Kripke's construction, the truth predicate is only partially defined, that is, it only applies to some of the sentences of the language. To deal with such partially defined predicates, a three-valued logic is employed — that is, a logic which operates with a third value, undefined, in addition to the truth values true and false. Often the third value is denoted "u" or "⊥" (bottom). A partially defined predicate only receives one of the classical truth values, true or false, when it is applied to one of the terms for which the predicate has been defined, and otherwise it receives the value undefined. There are several different three-valued logics available, differing in how they treat the third value. Here only one of them is briefly described, called Kleene's strong three-valued logic. More detailed information on this and related logics can be found in the entry on many-valued logic.

In Kleene's strong three-valued logic, the value ⊥ (undefined) can be interpreted as "not yet defined". So one could think of formulae with the value ⊥ as formulae that actually have a classical truth value (true or false), but which has just not been determined yet. This interpretation of undefined is reflected in the truth tables for the logic, given below. The upper truth table is for disjunction, the lower for negation:

∨ true false ⊥

true true true true

false true false ⊥

⊥ true ⊥ ⊥

∨	true	false	⊥
true	true	true	true
false	true	false	⊥
⊥	true	⊥	⊥

¬

true false

false true

⊥ ⊥

¬
true	false
false	true
⊥	⊥

These truth tables define the three-valued logic completely, as ∨ and ¬ are taken to form an adequate set of connectives and existential and universal quantification are treated as infinite disjunction and conjunction, respectively.

To handle partially defined truth predicates, it is necessary to introduce the notion of partial models. In a partial model for a first-order language, each n-place predicate symbol P is interpreted by a pair (U,V) of disjoint n-place relations on the domain of the model. U is called the extension of P and V its anti-extension. In the model, P is true of the objects in U, false of the objects in V, and undefined otherwise. In this way, any atomic sentence receives one of the truth values true, false or undefined in the model. Non-atomic formulae are given truth values in the model by using Kleene's strong three-valued logic for evaluating the connectives.

Given the definition of a partial model, a partially interpreted language is a pair (L,M) where L is a first-order language and M is a partial model of L. Kripke recursively defines a sequence of partially interpreted languages L₀, L₁, L₂,…, only differing in their interpretation of the truth predicate T. The first language, L₀, is taken to be an arbitrary language in which both the extension and anti-extension of T are the empty set. Thus in L₀, the truth predicate is completely undefined. For any α, the language L_α+1 is like L_α except that T is interpreted by the extension/anti-extension pair (U,V) given by:

U is the set of Gödel codes <φ> of sentences φ true in L_α.
V is the set of Gödel codes <φ> of sentences φ false in L_α.

This definition immediately gives that for all α,

(4) φ is true (false) in L_α ⇔ T(<φ>) is true (false) in L_α+1.

What has been constructed is a sequence L₀, L₁, L₂,… of partially interpreted languages such that T is interpreted in L_α+1 as the truth predicate for L_α. This is just like Tarski's hierarchy of languages, except that here there is no syntactic distinction between the different languages and their truth predicates.

The sequence L₀, L₁, L₂,… has an important property: For each α, the interpretation of T in L_α+1 extends the interpretation of T in L_α, that is, both the extension and anti-extension of T increase when moving from L_α to L_α+1. This means that one can define a new partially interpreted language L_ω by letting the extension of T be the union of all the extensions of T in L₀, L₁, L₂,…; and similarly for the anti-extension. Thus in L_ω, the interpretation of T extends the interpretation that T gets in all previous languages. This furnishes a strategy for continuing the iterative construction of a truth predicate into the transfinite: For each successor ordinal α+1, define L_α+1 from L_α exactly as in the finite case above; and for each limit ordinal σ, define L_σ from the preceding languages (L_i)_i<σ in the same way as L_ω was defined (for a detailed explanation of the ordinal numbers and their use in this context, see the entry on the revision theory of truth). A simple cardinality consideration now shows that this transfinite sequence of languages will eventually stabilise: There is an ordinal γ such that for all ordinals β, L_γ = L_γ+β. Since, in particular, L_γ = L_γ+1, the following instance of (4) is obtained:

(5) φ is true (false) in L_γ ⇔ T(<φ>) is true (false) in L_γ.

This shows that L_γ is actually a language containing its own truth predicate: Any sentence φ is true (false) if and only if the sentence expressing its truth, T(<φ>), is true (false). The equivalence (5) is nothing more than a semantic analogue of Tarski's schema T in a three-valued setting. The construction of the language L_γ was one of the major contributions of Kripke (1975). It shows that in a three-valued logical setting it is actually possible for a language to contain its own truth predicate. It is easy to see that the third value, undefined, is essential to making things work: If L_γ had been a totally interpreted language (that is, a language with no undefined sentences), then L_γ would satisfy schema T, by (5) above. However, it immediately contradicts Tarski's theorem that such a totally interpreted language can exist.

Among the sentences that receive the value undefined in L_γ is the liar sentence. The solution to the liar paradox implicit in Kripke's theory is this: Since both assuming that the liar sentence is true and that it is false leads to a contradiction it must be neither; it is undefined. The liar sentence is said to suffer from a truth-value gap. The idea of avoiding the liar paradox by allowing truth-value gaps did in fact appear in several places before Kripke's paper, but Kripke was among the first to make it an integral part of a genuine theory.

As with the hierarchy solution to the liar paradox, the truth-value gap solution is by many considered to be problematic. The main criticism is that by using a three-valued semantics, one gets an interpreted language which is expressively weak. One cannot, for instance, have a non-truth predicate in one of Kripke's languages (a sentence is non-true when it is either false or undefined). This is in fact noted by Kripke himself. If a partially interpreted language contained such a predicate, the following version of the liar paradox within the language could be made: "This sentence is non-true".

Kripke's theory circumvents the liar paradox by assigning it the value undefined. An alternative way to circumvent the liar paradox would be to assign it the value both true and false in a suitable paraconsistent logic. One of the simplest paraconsistent logics is LP, which is a three-valued logic with the same truth tables as Kleene's strong three-valued logic presented above — the only difference is that the third truth value is interpreted as both true and false rather than undefined. A reason for preferring a paraconsistent logic over a partial logic is that paradoxical sentences such as the liar can then be modelled as true contradictions rather than truth-value gaps. The view that there exists true contradictions is called dialetheism. See the entries on dialetheism and paraconsistent logic for more information.

3.2.2 Implicit Hierarchies in Set Theories

Building implicit rather than explicit hierarchies is also an idea that has been employed in set theory. New Foundations (NF) by Quine (1937) is a modification of simple type theory where the stratification into syntactic types has been replaced by a stratification on the comprehension principle:

NF comprehension:
∀u(u ∈ { x | φ(x) } ↔ φ(u)), for all stratified formulae φ(x).

A formula φ is stratified if there exists a mapping σ (a stratification) from the variables of φ to the natural numbers such that if u ∈ v is a subformula of φ then σ(v) = σ(u)+1 and if u = v is a subformula of φ then σ(v) = σ(u). Obviously the formula x ∉ x is not stratified, and thus the NF comprehension principle cannot be used to formulate Russell's paradox in the theory. Quine's New Foundations is essentially obtained from type theory by hiding the types from the syntax. Thus, the theory still makes use of a hierarchy approach to avoid the paradoxes, but the hierarchy is made implicit by not representing it in the syntax of formulae.

Zermelo-Fraenkel set theory (ZF) is another theory that builds on the idea of an implicit hierarchy to circumvent the paradoxes. However, it does so in a much less direct way than NF. In ZF, sets are built bottom-up, starting with the empty set and iterating a construction of bigger and bigger sets using the operations of union and power set. This produces a hierarchy with the empty set on the lowest level, level 0, and with the power set operation producing a set of level α+1 from a set of level α. Analogous to Kripke's iterative construction, the procedure is continued into the transfinite using the union operator at the limit ordinal levels. The hierarchy obtained is called the cumulative hierarchy. One of the axioms of ZF, the axiom of foundation, states that every set of ZF lives on a certain level in this cumulative hierarchy. In order words, the axiom of foundation states that there are no sets in ZF besides the ones that can be constructed bottom-up by the iterative procedure just described. Since in a cumulative hierarchy, there can be no sets containing themselves, no universal set, and no non-wellfounded sets, none of the known paradoxes can immediately be formulated in the theory. This does obviously not in itself ensure the consistency of ZF, but at least it illustrates how the idea of a set hierarchy plays a significant role in ZF as well. ZF has a privileged status among set theories, as it is today the most widely acknowledged candidate for a formal foundation of mathematics.

3.3 General Fixed Point Approaches

Kripke's iterative construction of a truth predicate presented above may be viewed as an instance of a more general fixed point approach towards building formal theories of truth. Fixed point approaches have become central to contemporary formal theories of truth. The main idea is to have a truth revision operator and then look for fixed points of this operator. At heart of such fixed point approaches is some kind of fixed point theorem that guarantees the existence of fixed points for certain kinds of operators. There are several different fixed point theorems available. Consider now one of the simpler ones.

Fixed point theorem.
Let τ be a monotone operator on a chain complete partial order (henceforth ccpo). Then τ has a least fixed point, that is, there is a least f such that τ(f) = f.

A ccpo is a partial order (D,<) in which every totally ordered subset of D has a least upper bound. The totally ordered subsets of D are called chains in D. A monotone operator on (D,<) is a mapping τ: D → D satisfying:

d₁ ≤ d₂ ⇒ τ(d₁) ≤ τ(d₂), for all d₁, d₂ ∈ D.

Kripke's construction fits into the fixed point theorem above in the following way. First note that the set of partially interpreted languages that only differ on the interpretation of T forms a ccpo: Simply define the ordering on these languages by L₁ ≤ L₂ iff the interpretation of T in L₂ extends the interpretation of T in L₁ (that is, the extension and anti-extension of T in L₁ are included in those of L₂). Then define a truth revision operator τ on these languages by:

(6) τ(L) = L′, where T(<φ>) is true (false) in L′ iff φ is true (false) in L.

Note that if L_α is one of the languages in Kripke's construction, then L_α+1 = τ(L_α). The idea of this truth revision operator τ is that if τ(L)=L′ then L′ will be a language in which T is interpreted as the truth predicate for L. If therefore τ(L)=L for some L, that is, if L is a fixed point of τ, then L will be a language containing its own truth predicate. This motivates the search for fixed points of τ. Since τ is easily seen to be monotone, by the fixed point theorem it has a least fixed point. It is not hard to see that this fixed point is exactly the language L_γ constructed in Kripke's theory of truth. Kripke's construction is thus recaptured in the setting of fixed points for monotone operators.

The point of introducing the additional machinery was not just to rediscover the language L_γ. The point is rather that a much more general and abstract framework which may lead to new theories of truth and give further insights into the semantic paradoxes is provided. It turns out that the truth revision operator τ defined above has many interesting fixed-points in addition to L_γ. It is also possible to obtain new interesting theories of truth by considering alternative ways of making the set of interpreted languages into a ccpo. One may for instance add an additional truth value and consider the situation in a four-valued logic, as considered by Fitting (1997); or one may remove the third truth value undefined and construct a ccpo in a completely classical setting. In the classical setting, attention is restricted to the totally interpreted languages (languages in which every sentence is either true or false), and an ordering on them is defined by: L₁ ≤ L₂ holds iff the extension of the truth predicate in L₁ is included in the extension of the truth predicate in L₂, that is, iff L₂ points out at least as many sentences as true as L₁. This gives a ccpo. By using the fixed point theorem in this setting on a suitably defined revision operator, it is fairly easy to prove the existence of a totally interpreted language containing a positive definition of truth. By this is meant that the interpreted language has a predicate T satisfying the following restricted version of the T-schema:

(7) φ ↔ T(<φ>), for all positive sentences φ,

where the positive sentences are those built using only the quantifiers ∃ and ∀ and the connectives ∧ and ∨ (that is, positive sentences have no occurrences of the negation sign). Since (7) is satisfiable in a totally interpreted language, the first-order theory containing the sentences of (7) as axioms must be consistent. This should be contrasted with Tarski's theorem stating that the unrestricted T-schema is inconsistent. If the unrestricted comprehension principle is similarly restricted to the positive formulae, we also get a consistent theory. This was originally shown by Gilmore (1974).

The fixed point approach is also the point of departure of the revision theory of truth developed by Belnap and Gupta (1993). The revision theory of truth is without doubt the most influential theory of truth and the semantic paradoxes that has been developed since the theory of Kripke. The revision theory considers the truth revision operator τ defined by (6) as an operator on the totally interpreted languages. On these languages τ doesn't have a fixed point: If it had such a fixed point L then L would be a totally interpreted language satisfying the full schema T, directly contradicting Tarski's theorem. Since τ doesn't have a fixed point on the totally interpreted languages, the revision theory instead considers transfinite sequences L₁, L₂, … , L_ω, L_ω+1, … of totally interpreted languages satisfying:

For any successor ordinal α+1, L_α+1 = τ(L_α).
For any limit ordinal σ and any sentence φ, if φ stabilises on the value true (false) in the sequence (L_α)_{α < σ} then φ is true (false) in L_σ.

In such a sequence, each sentence φ will either eventually stabilise on a classical truth value (true or false), or it will never stabilise. An example of a sentence that will never stabilise is the liar sentence: If the liar sentence is true in one of the languages L_α it will be false in L_α+1, and vice versa. The revision theory thus gives an account of truth that correctly models the behaviour of the liar sentence as one that never stabilises on a truth value. In the revision theory it is argued that this gives a more correct account of truth and self-reference than Kripke's theory in which the liar sentence is simply assigned the value undefined. A full account of Gupta and Belnap's theory can be found in the entry on the revision theory of truth.

Bibliography

Bartlett, S.J. and Peter Suber (eds.), 1987, Self-Reference — Reflections on Reflexivity, Dordrecht: Martinus Nijhoff Publishers.
Bartlett, S.J. (ed.), 1992, Reflexivity — A Source-Book in Self-Reference, Amsterdam: North-Holland Publishing Co.
Barwise, J. and J. Etchemendy, 1987, The Liar — An Essay on Truth and Circularity, New York: Oxford University Press.
Barwise, J. and L. Moss, 1996, Vicious Circles — On the Mathematics of Non-Wellfounded Phenomena, Stanford: CSLI Publications.
Bolander, T. and V.F. Hendricks and S.A. Pedersen (eds.), 2006, Self-Reference, Stanford: CSLI Publications.
Boolos, G., 1993, The Logic of Provability, Cambridge: Cambridge University Press.
Cantini, A., 1996, Logical Frameworks for Truth and Abstraction — An Axiomatic Study, Amsterdam: North-Holland Publishing Co.
Cantor, G., 1891, "Über eine Elementare Frage der Mannigfaltigkeitslehre," Jahresbericht der Deutschen Mathematiker-Vereinigung, 1: 75–78.
Chapuis, A. and A. Gupta, 2000, Circularity, Definition and Truth, New Delhi: Indian Council of Philosophical Research.
Erickson, G. W. and J. A. Fossa, 1998, Dictionary of Paradox, Lanham: University Press of America.
Feferman, S., 1984, "Kurt Gödel: conviction and caution", Philosophia Naturalis, 21: 546–562.
Feferman, S., 1984a, "Toward useful type-free theories I", The Journal of Symbolic Logic, 49: 75–111.
Feferman, S., 1991, "Reflecting on incompleteness", The Journal of Symbolic Logic, 56: 1–49.
Fitting, M., 1986, "Notes on the mathematical aspects of Kripke's theory of truth", Notre Dame Journal of Formal Logic, 27(1): 75–88.
Fitting, M., 1997, "A theory of truth that prefers falsehood", The Journal of Philosophical Logic, 26(5): 477–500.
Forster, T. E., 1995, Set Theory with a Universal Set, New York: The Clarendon Press.
Gilmore, P. C., 1974, "The consistency of partial set theory without extensionality", Axiomatic set theory (Proc. Sympos. Pure Math., Vol. XIII, Part II), pp. 147–153, Providence RI: Amer. Math. Soc.
Gödel, K., 1931, "Über formal unentscheidbare Satze der Principia Mathematica und verwandter Systeme I", Monatshefte für Mathematik und Physik, 38: 173–198.
Gupta, A. and N. Belnap, 1993, The Revision Theory of Truth, Cambridge MA: MIT Press.
van Heijenort, J., 1967, From Frege to Gödel. A source book in mathematical logic, 1879–1931, Cambridge (MA): Harvard University Press.
Kripke, S., 1975, "Outline of a Theory of Truth", The Journal of Philosophy, 72: 690–716.
Martin, R. L. and P. W. Woodruff, 1975, "On representing ‘True-in-L’ in L", Philosophia, 5: 213–217.
Martin, R. L., 1984, Recent Essays on Truth and the Liar Paradox, Oxford: Oxford University Press.
McGee, V., 1990, Truth, Vagueness, and Paradox — An Essay on the Logic of Truth, Indianapolis: Hackett Publishing Co.
McGee, V., 1992, "Maximal consistent sets of instances of Tarski's Schema (T)", The Journal of Philosophical Logic, 21(3): 235–241.
Montague, R., 1963, "Syntactical treatment of modality, with corollaries on reflection principles and finite axiomatizability", Acta Philosophica Fennica, 16: 153–167.
Perlis, D. and V. S. Subrahmanian, 1994, "Meta-languages, Reflection Principles and Self-Reference", Handbook of Logic in Artificial Intelligence and Logic Programming, 2: 323–358.
Priest, G., 1987, In Contradiction — A Study of the Transconsistent, Dordrecht: Martinus Nijhoff Publishers.
Priest, G., 1994, "The Structure of the Paradoxes of Self-Reference", Mind, 103: 25–34.
Priest, G., 1995, Beyond the Limits of Thought, Cambridge: Cambridge University Press.
Quine, W.V., 1937, "New foundations for mathematical logic", American Mathematical Monthly, 44: 70–80.
Rosado Haddock, G. E., 2001, "Recent truth theories: A case study", Axiomathes, 12(1): 87–115.
Russell, B., 1905, "On some difficulties in the theory of transfinite numbers and order types", Proceedings of the London Mathematical Society, 4: 29–53.
Sheard, M., 1994, "A guide to truth predicates in the modern era", The Journal of Symbolic Logic, 59(3): 1032–1054.
Simmons, K., 1993, Universality and the Liar — An Essay on Truth and the Diagonal Argument, Cambridge: Cambridge University Press.
Smullyan, R. M., 1992, Gödel's Incompleteness Theorems, Oxford: Oxford University Press.
Smullyan, R. M., 1994, Diagonalization and self-reference, Oxford: Oxford University Press.
Tarski, A., 1935, "Der Wahrheitsbegriff in den formalisierten Sprachen", Studia Philosophica, 1: 261–405.
Tarski, A., 1968, Undecidable Theories, Amsterdam: North-Holland Publishing Co.
Thomason, R., 1980, "A note of syntactical treatments of modality", Synthese, 44: 391–395.
Turing, A.M., 1936, "On Computable Numbers, With an Application to the Entscheidungsproblem", Proceedings of the London Mathematical Society, 42(2): 230–265; correction ibid., 43: 544–546 (1937).
Visser, A., 1989, "Semantics and the liar paradox", Handbook of Philosophical Logic, vol.IV, Dordrecht: Kluwer, pp. 617–706.
Yablo, Stephen, 1985, Yablo, "Truth and reflection", Journal of Philosophical Logic, 14(3): 297–349.
Yablo, Stephen, 1993, "Paradox without self-reference", Analysis, 53: 251–252.
Yablo, Stephen, 2006, "Circularity and Paradox", in Bolander, Hendricks, Pedersen (eds.), Self-reference, Stanford: CSLI Publications.
Zwicker, W.S., 1987, "Playing Games with Games: The Hypergame Paradox," The American Mathematical Monthly, 94(6): 507–514.

Other Internet Resources

Logical Paradoxes, by B. Hartley Slater, in the Internet Encyclopedia of Philosophy.
Truth, by Bradley Dowden and Norman Swartz, in the Internet Encyclopedia of Philosophy.

•	Grelling's paradox:	het ∈ ext(het)	⇔	het ∈ { P \| P ∉ ext(P) }
			⇔	het ∉ ext(het).
•	Russell's paradox:	R ∈ R	⇔	R ∈ { x \| x ∉ x}
			⇔	R ∉ R.

1.	λ → ¬K<λ>	by choice of λ
2.	¬K<λ> → λ	by choice of λ
3.	K<λ> → λ	axiom A1
4.	(K<λ> → λ) → ((λ → ¬K<λ>) → ¬K<λ>)	propositional tautology
5.	(λ → ¬K<λ>) → ¬ K<λ>	modus ponens on 4,3
6.	¬K<λ>	modus ponens on 5,1
7.	λ	modus ponens on 2,6

8.	K<λ → ¬K<λ>>	by A3 and choice of λ
9.	K<¬K<λ> → λ>	by A3 and choice of λ
10.	K<K<λ> → λ>	axiom A2
11.	K<(K<λ> → λ) → ((λ → ¬K<λ>) → ¬K<λ>)>	axiom A3 on propositional tautology
12.	K<(λ → ¬K<λ>) → ¬ K<λ>>	axiom A4 on 11,10
13.	K<¬K<λ>>	axiom A4 on 12,8
14.	K<λ>	axiom A4 on 9,13

15.	¬K<λ>	modus ponens on 1,7
16.	K<λ> ∧ ¬K<λ>	conjunction of 14 and 15

G is heterological	⇔	G doesn't halt on input <G>
	⇔	H returns "no" on input (<G>,<G>)
	⇔	G halts on input <G>
	⇔	G is not heterological.