Stanford Encyclopedia of Philosophy
This is a file in the archives of the Stanford Encyclopedia of Philosophy.

Type Theory

First published Wed Feb 8, 2006; substantive revision Fri Oct 20, 2006

The topic of type theory is fundamental both in logic and computer science. We limit ourselves here to sketch some aspects that are important in logic. For the importance of types in computer science, we refer the reader for instance to (Reynolds, 1983 and 1985).

1. Paradoxes and Russell's Type Theories

The theory of types was introduced by Russell in order to cope with some contradictions he found in his account of set theory (Russell, 1903). This contradiction was obtained by analysing a theorem of Cantor that no mapping

F : X → Pow(X)

(where Pow(X) is the class of subclasses of a class X) can be surjective; that is, F cannot be such that every member b of Pow(X) is equal to F(a) for some element a of X. This can be phrased “intuitively” as the fact that there are more subsets of X than elements of X. The proof of this fact is so simple and basic that it is worthwhile giving it here. Consider the following subset of X:

A = { xX | xF(x) }.

This subset cannot be in the range of F. For if A = F(a), for some a, then

aF(a) iff aA
iff aF(a)

which is a contradiction. Some remarks are in order. First, the proof does not use the law of excluded middle and is thus valid intuitionistically. Second, the method that is used, called diagonalisation was already present in the work of du Bois-Reymond for building real functions growing faster than any function in a given sequence of functions.

Russell analysed what happens if we apply this theorem to the case where A is the class of all classes, admitting that there is such a class. He was then lead to consider the special class of classes that do not belong to themselves

(*) R = { w | ww }

We then have

RR iff RR

It seems indeed that Cantor was already aware of the fact that the class of all sets cannot be considered itself to be a set.

Russell communicated this problem to Frege, and his letter, together with Frege's answer appear in van Heijenoort (1967). It is important to realise that the formulation (*) does not apply as it is to Frege's system. As Frege himself wrote in his reply to Russell, the expression “a predicate is predicated of itself” is not exact. Frege had a distinction between predicates (concepts) and objects. A (first-order) predicate applies to an object but it cannot have a predicate as argument. The exact formulation of the paradox in Frege's system uses the notion of the extension of a predicate P, which we designate as εP. The extension of a predicate is itself an object. The important axiom V is:

(Axiom V) εP = εQ   ≡   ∀x[P(x) ≡ Q(x)]

This axiom asserts that the extension of P is identical to the extension of Q if and only if P and Q are materially equivalent. We can then translate Russell's paradox (*) in Frege's system by defining the predicate

R(x) iff ∃P[x = εP ∧ ¬P(x)]

It can then been checked, using Axiom V in a crucial way, that

RR) ≡ ¬RR)

and we have a contradiction as well. (Notice that for defining the predicate R, we have used an impredicative existential quantification on predicates. It can be shown that the predicative version of Frege's system is consistent (see Heck, 1996 and for further refinements Ferreira, 2002).

It is clear from this account that an idea of types was already present in Frege's work: there we find a distinction between objects, predicates (or concepts), predicates of predicates, etc. (This point is stressed by Quine (1940).) This hierarchy is called the “extensional hierarchy” by Russell (1959), and its necessity was recognised by Russell as a consequence of his paradox.

2. Simple Type Theory and the λ-Calculus

As we saw above, the distinction: objects, predicates, predicate of predicates, etc., seems enough to block Russell's paradox (and this was recognised by Chwistek and Ramsey). We first describe the type structure as it is in Principia and later in this section we present the elegant formulation due to Church 1940 based on λ-calculus. The types can be defined as

  1. i is the type of individuals
  2. ( ) is the type of propositions
  3. if A1,…,An are types then (A1,…,An) is the type of n-ary relations over objects of respective types A1,…,An

For instance, the type of binary relations over individuals is (i, i), the type of binary connectives is (( ),( )), the type of quantifiers over individuals is ((i)).

For forming propositions we use this type structure: thus R(a1,…,an) is a proposition if R is of type (A1,…,An) and ai is of type Ai for i = 1,…,n. This restriction makes it impossible to form a proposition of the form P(P): the type of P should be of the form (A), and P can only be applied to arguments of type A, and thus cannot be applied to itself since A is not the same as (A).

However simple type theory is not predicative: we can define an object Q(x, y) of type (i, i) by

P[P(x) ⊃ P(y)]

Assume that we have two individuals a and b such that Q(a, b) holds. We can define P(x) to be Q(x, a). It is then clear that P(a) holds, since it is Q(a, a). Hence P(b) holds as well. We have proved, in an impredicative way, that Q(a, b) implies Q(b, a).

Alternative simpler formulations, which retain only the notion of classes, classes of classes, etc., were formulated by Gödel and Tarski. Actually this simpler version was used by Gödel in his 1931 paper on formally undecidable propositions. We have objects of type 0, for individuals, objects of type 1, for classes of individuals, objects of type 2, for classes of classes of individuals, and so on. Functions of two or more arguments, like relations, need not be included among primitive objects since one can define relations to be classes of ordered pairs, and ordered pairs to be classes of classes. For example, the ordered pair of individuals a, b can be defined to be {{a},{a,b}} where {x,y} denotes the class whose sole elements are x and y. (Wiener 1914 had suggested a similar reduction of relations to classes.) In this system, all propositions have the form a(b), where a is a sign of type n+1 and b a sign of type n. Thus this system is built on the concept of an arbitrary class or subset of objects of a given domain and on the fact that the collection of all subsets of the given domain can form a new domain of the next type. Starting from a given domain of individuals, this process is then iterated. As emphasised for instance in Scott 1993, in set theory this process of forming subsets is iterated into the transfinite.

In these versions of type theory, as in set theory, functions are not primitive objects, but are represented as functional relation. The addition function for instance is represented as a ternary relation by an object of type (i,i,i). An elegant formulation of the simple type theory which extends it by introducing functions as primitive objects was given by Church in 1940. It uses the λ-calculus notation (Barendregt, 1997). Since such a formulation is important in computer science, for the connection with category theory, and for Martin-Löf type theory, we describe it in some detail. The types of this system are defined inductively as follows

  1. there are two basic types i (the type of individuals) and o (the type of propositions)
  2. if A, B are types then AB, the type of functions from A to B, is a type

We can form in this way the types:

io (type of predicates)
(io) → o (type of predicates of predicates)

which correspond to the types (i) and ((i)) but also the new types

ii (type of functions)
(ii) → i (type of functionals)

It is convenient to write



A1 → (A2 → … → (AnB))

In this way


corresponds to the type (A1,…,An).

First-order logic considers only types of the form

i,…,ii (type of function symbols), and
i,…,io (type of predicate, relation symbols)

Notice that


stands for

A → (BC)

(association to the right).

For the terms of this logic, we shall not follow Church's account but a slight variation of it, due to Curry (who had similar ideas before Church's paper appeared) and which is presented in detail in R. Hindley's book on type theory. Like Church, we use λ-calculus, which provides a general notation for functions

M ::= x | M M | λx. M

Here we have used the so-called BNF notation, very convenient in computing science. This gives a syntactic specification of the λ-terms which, when expanded, means:

The notation for function application M N is a little different than the mathematical notation, which would be M(N). In general,

M1 M2 M3

stands for

(M1 M2) M3

(association to the left). The term λx. M represents the function which to N associates M[x:=N]. This notation is so convenient that one wonders why it is not widely used in mathematics. The main equation of λ-calculus is then (β-conversion)

x. M) N = M[x:=N]

which expresses the meaning of λx. M as a function. We have used M[x:=N] as a notation for the value of the expression that results when N is substituted for the variable x in the matrix M. One usually sees this equation as a rewrite rule (β-reduction)

x. M) NM[x:=N]

In untyped lambda calculus, it may be that such rewriting does not terminate. The canonical example is given by the term Δ = λx. x x and the application

Δ Δ → Δ Δ

(Notice the similarity with Russell's paradox.) The idea of Curry is then to look at types as predicates over lambda terms, writing M: A to express that M satisfies the predicate/type A. The meaning of


is then

M, if M: A then N M: B

which justifies the following rules

N: AB    M: A

     N M: B

   M: B [x: A]

λx. M: AB

In general one works with judgements of the form

x1:A1,...,xn:AnM: A

where x1,..., xn are distinct variables, and M is a term having all free variables among x1,..., xn. In order to be able to get Church's system, one adds some constants in order to form propositions. Typically

not: oo
imply: ooo
and: ooo
forall: (Ao) → o

The term

λx. ¬(x x)

represents the predicate of predicates that do not apply to themselves. This term does not have a type however, that is, it is not possible to find A such that

λx. ¬(x x) : (Ao) → o

which is the formal expression of the fact that Russell's paradox cannot be expressed. Leibniz equality

Q: iio

will be defined as

Q = λx . λy. ∀(λP. imply (P x) (P y))

One usually writes ∀x[M] instead of ∀(λx. M), and the definition of Q can then be rewritten as

Q = λx. λy. ∀P[imply (P x) (P y)]

This example again illustrates that we can formulate impredicative definitions in simple type theory.

The use of λ-terms and β-reduction is most convenient for representing the complex substitution rules that are needed in simple type theory. For instance, if we want to substitute the predicate λx. Q a x for P in the proposition

imply (P a) (P b)

we get

imply ((λx. Q a x) a) ((λx. Q a x) b)

and, using β-reduction,

imply (Q a a) (Q a b)

In summary, simple type theory forbids self-application but not the circularity present in impredicative definitions.

The λ-calculus formalism also allows for a clearer analysis of Russell's paradox. We can see it as the definition of the predicate

R x = ¬(x x)

If we think of β-reduction as the process of unfolding a definition, we see that there is a problem already with understanding the definition of R R

R R → ¬(R R) → ¬(¬(R R)) → …

In some sense, we have a non-wellfounded definition, which is as problematic as a contradiction (a proposition equivalent to its negation). One important theorem, the normalisation theorem, says that this cannot happen with simple types: if we have M: A then M is normalisable in a strong way (any sequence of reductions starting from M terminates).

For more information on this topic, we refer to the entry on Church's simple type theory.

3. Ramified Hierarchy and Impredicative Principles

Russell introduced another hierarchy, that was not motivated by any formal paradoxes expressed in a formal system, but rather by the fear of “circularity” and by informal paradoxes similar to the paradox of the liar. If a man says “I am lying”, then we have a situation reminiscent of Russell's paradox: a proposition which is equivalent to its own negation. Another informal such paradoxical situation is obtained if we define an integer to be the “least integer not definable in less than 100 words”. In order to avoid such informal paradoxes, Russell thought it necessary to introduce another kind of hierarchy, the so-called “ramified hierarchy”. Since this notion has been extremely influential in logic and especially proof theory, we describe it in some details.

In order to further motivate this hierarchy, here is one example due to Russell. If we say

Napoleon was Corsican.

we do not refer in this sentence to any assemblage of properties. The property “to be Corsican” is said to be predicative. If we say on the other hand

Napoleon had all the qualities of a great general

we are referring to a totality of qualities. The property “to have all qualities of a great general” is said to be impredicative.

Another example, also coming from Russell, shows how impredicative properties can potentially lead to problems reminiscent of the liar paradox. Suppose that we suggest the definition

A typical Englishman is one who possesses all the properties possessed by a majority of Englishmen.

It is clear that most Englishmen do not possess all the properties that most Englishmen possess. Therefore, a typical Englishman, according to this definition, should be untypical. The problem, according to Russell, is that the word “typical” has been defined by a reference to all properties and has been treated as itself a property. (It is remarkable that similar problems arise when defining the notion of random numbers, cf. Martin-Löf “Notes on constructive mathematics” (1970).) Russell introduced the ramified hierarchy in order to deal with the apparent circularity of such impredicative definitions. One should make a distinction between the first-order properties, like being Corsican, that do not refer to the totality of properties, and consider that the second-order properties refer only to the totality of first-order properties. One can then introduce third-order properties, that can refer to the totality of second-order property, and so on. This clearly eliminates all circularities connected to impredicative definitions.

At about the same time, Poincaré carried out a similar analysis. He stressed the importance of “predicative” classifications, and of not defining elements of a class using a quantification over this class (Poincaré, 1909). Poincaré used the following example. Assume that we have a collection with elements 0, 1 and an operation + satisfying

x+0 = 0
x+(y+1) = (x+y)+1

Let us say that a property is inductive if it holds of 0 and holds for x+1 if it holds for x.

An impredicative, and potentially “dangerous”, definition would be to define an element to be a number if it satisfies all inductive properties. It is then easy to show that this property “to be a number” is itself inductive. Indeed, 0 is a number since it satisfies all inductive properties, and if x satisfies all inductive properties then so does x+1. Similarly it is easy to show that x+y is a number if x,y are numbers. Indeed the property Q(z) that x+z is a number is inductive: Q(0) holds since x+0=x and if x+z is a number then so is x+(z+1) = (x+z)+1. This whole argument however is circular since the property “to be a number” is not predicative and should be treated with suspicion.

Instead, one should introduce a ramified hierarchy of properties and numbers. At the beginning, one has only first-order inductive properties, which do not refer in their definitions to a totality of properties, and one defines the numbers of order 1 to be the elements satisfying all first-order inductive properties. One can next consider the second-order inductive properties, that can refer to the collection of first-order properties, and the numbers of order 2, that are the elements satisfying the inductive properties of order 2. One can then similarly consider numbers of order 3, and so on. Poincaré emphasizes the fact that a number of order 2 is a fortiori a number of order 1, and more generally, a number of order n+1 is a fortiori a number of order n. We have thus a sequence of more and more restricted properties: inductive properties of order 1, 2, … and a sequence of more and more restricted collections of objects: numbers of order 1, 2, … Also, the property “to be a number of order n” is itself an inductive property of order n+1.

It does not seem possible to prove that x+y is a number of order n if x,y are numbers of order n. On the other hand, it is possible to show that if x is a number of order n+1, and y a number of order n+1 then x+y is a number of order n. Indeed, the property P(z) that “x+z is a number of order n” is an inductive property of order n+1: P(0) holds since x+0 = x is a number of order n+1, and hence of order n, and if P(z) holds, that is if x+z is a number of order n, then so is x+(z+1) = (x+z)+1, and so P(z+1) holds. Since y is a number of order n+1, and P(z) is an inductive property of order n+1, P(y) holds and so x+y is a number of order n. This example illustrates well the complexities introduced by the ramified hierarchy.

The complexities are further amplified if one, like Russell as for Frege, defines even basic objects like natural numbers as classes of classes. For instance the number 2 is defined as the class of all classes of individuals having exactly two elements. We again obtain natural numbers of different orders in the ramified hierarchy. Besides Russell himself, and despite all these complications, Chwistek tried to develop arithmetic in a ramified way, and the interest of such an analysis was stressed by Skolem. See Burgess and Hazen (1998) for a recent development.

Another mathematical example, often given, of an impredicative definition is the definition of least upper bound of a bounded class of real numbers. If we identify a real with the set of rationals that are less than this real, we see that this least upper bound can be defined as the union of all elements in this class. Let us identify subsets of the rationals as predicates. For example, for rational numbers q, P(q) holds iff q is a member of the subset identified as P. Now, we define the predicate LC (a subset of the rationals) to be the least upper bound of class C as:

q[LC(q) ↔ ∃P[C(P) ∧ P(q)]]

which is impredicative: we have defined a predicate L by an existential quantification over all predicates. In the ramified hierarchy, if C is a class of first-order classes of rationals, then L will be a second-order class of rationals. One obtains then not one notion or real numbers, but real numbers of different orders 1, 2, … The least upper bound of a collection of reals of order 1 will then be at least of order 2 in general.

As we saw earlier, yet another example of an impredicative definition is given by Leibniz' definition of equality. For Leibniz, the predicate “is equal to a” is true for b iff b satisfies all the predicates satisfied by a.

How should one deal with the complications introduced by the ramified hierarchy? Russell showed, in the introduction to the second edition to Principia Mathematica, that these complications can be avoided in some cases. He even thought, in Appendix B of the second edition of Principia Mathematica, that the ramified hierarchy of natural numbers of order 1,2,… collapses at order 5. However, Gödel later found a problem in his argument, and indeed, it was shown by Myhill 1974 that this hierarchy actually does not collapse at any finite level. A similar problem, discussed by Russell in the introduction to the second edition to Principia Mathematica arises in the proof of Cantor's theorem that there cannot be any injective functions from the collection of all predicates to the collection of all objects (the version of Russell's paradox in Frege's system that we presented in the introduction). Can this be done in a ramified hierarchy? Russell doubted that this could be done within a ramified hierarchy of predicates and this was indeed confirmed indeed later (Heck, 1996).

Because of these problems, Russell and Whitehead introduced in the first edition of Principia Mathematica the following reducibility axiom: the hierarchy of predicates, first-order, second-order, etc., collapses at level 1. This means that for any predicate of any order, there is a predicate of the first-order level which is equivalent to it. In the case of equality for instance, we postulate a first-order relation “a=b” which is equivalent to “a satisfies all properties that b satisfies”. The motivation for this axiom was purely pragmatic. Without it, all basic mathematical notions, like real or natural numbers are stratified into different orders. Also, despite the apparent circularity of impredicative definitions, the axiom of reducibility does not seem to lead to inconsistencies.

As noticed however first by Chwistek, and later by Ramsey, in the presence of the axiom of reducibility, there is actually no point in introducing the ramified hierarchy at all! It is much simpler to accept impredicative definitions from the start. The simple “extensional” hierarchy of individuals, classes, classes of classes, … is then enough. We get in this way the simpler systems formalised in Gödel 1931 or Church 1940 that were presented above.

The axiom of reducibility draws attention to the problematic status of impredicative definitions. To quote Weyl 1946, the axiom of reducibility “is a bold, an almost fantastic axiom; there is little justification for it in the real world in which we live, and none at all in the evidence on which our mind bases its constructions”! So far, no contradictions have been found using the reducibility axiom. However, as we shall see below, proof-theoretic investigations confirm the extreme strength of such a principle.

The idea of the ramified hierarchy has been extremely important in mathematical logic. Russell considered only the finite iteration of the hierarchy: first-order, second-order, etc., but from the beginning, the possibility of extending the ramification transfinitely was considered. Poincaré (1909) mentions the work of Koenig in this direction. For the example above of numbers of different order, he also defines a number to be inductive of order ω if it is inductive of all finite orders. He then points out that x+y is inductive of order ω if both x and y are. This shows that the introduction of transfinite orders can in some case play the role of the axiom of reducibility. Such transfinite extension of the ramified hierarchy was analysed further by Gödel who noticed the key fact that the following form of the reducibility axiom is actually provable: when one extends the ramified hierarchy of properties over the natural numbers into the transfinite this hiearchy collapses at level ω1, the least uncountable ordinal (Gödel, 1995, and Prawitz, 1970). Furthermore, while at all levels < ω1, the collection of predicates is countable, the collection of predicates at level ω1 is of cardinality ω1. This fact was a strong motivation behind Gödel's model of constructible sets. In this model the collection of all subsets of the set of natural numbers (represented by predicates) is of cardinality ω1 and is similar to the ramified hierarchy. This model satisfies in this way the Continuum Hypothesis, and gives a relative consistency proof of this axiom. (The motivation of Gödel was originally only to build a model where the collection of all subsets of natural numbers is well-ordered.)

The ramified hierarchy has been also the source of much work in proof theory. After the discovery by Gentzen that the consistency of Arithmetic could be proved by transfinite induction (over decidable predicates) along the ordinal ε0, the natural question was to find the corresponding ordinal for the different levels of the ramified hierarchy. Schütte (1960) found that for the first level of the ramified hierarchy, that is if we extend arithmetic by quantifying only over first-order properties, we get a system of ordinal strength εε0. For the second level we get the ordinal strength εεε0, etc. We recall that εα denotes the α-th ε-ordinal number, an ε-ordinal number being an ordinal β such that ωβ = β, see Schütte (1960).

Gödel stressed the fact that his approach to the problem of the continuum hypothesis was not constructive, since it needs the uncountable ordinal ω1, and it was natural to study the ramified hierarchy along constructive ordinals. After preliminary works of Lorenzen and Wang, Schütte analysed what happens if we proceed in the following more constructive way. Since arithmetic has for ordinal strength ε0 we consider first the iteration of the ramified hierarchy up to ε0. Schütte computed the ordinal strength of the resulting system and found an ordinal strength u(1)>ε0. We iterate then ramified hierarchy up to this ordinal u(1) and get a system of ordinal strength u(2)> u(1) , etc. Each u(k) can be computed in terms of the so-called Veblen hierarchy: u(k+1) is φu(k)(0). The limit of this process gives an ordinal called Γ0: if we iterate the ramified hierarchy up to the ordinal Γ0 we get a system of ordinal strength Γ0. Such an ordinal was obtained independently about the same time by S. Feferman. It has been claimed that Γ0 plays for predicative systems a role similar to ε0 for Arithmetic. Recent proof-theoretical works however are concerned with systems having bigger proof-theoretical ordinals that can be considered predicative (see for instance Palmgren (1995)).

Besides these proof theoretic investigations related to the ramified hierarchy, much work has been devoted in proof theory to analysing the consistency of the axiom of reducibility, or, equivalently, the consistency of impredicative definitions. Following Gentzen's analysis of the cut-elimination property in the sequent calculus, Takeuti found an elegant sequent formulation of simple type theory (without ramification) and made the bold conjecture that cut-elimination should hold for this system. This conjecture seemed at first extremely dubious given the circularity of impredicative quantification, which is well reflected in this formalism. The rule for quantifications is indeed

Γ ⊢ ∀X[A(X)]

Γ ⊢ A[X:=T]

where T is any term predicate, which may itself involve a quantification over all predicates. Thus the formula A[X:=T] may be itself much more complex than the formula A(X).

One early result is that cut-elimination for Takeuti's impredicative system implies in a finitary way the consistency of second-order Arithmetic. (One shows that this implies the consistency of a suitable form of infinity axiom, see Andrews, 2002.) Following work by Schütte, it was later shown by W. Tait and D. Prawitz that indeed the cut-elimination property holds (the proof of this has to use a stronger proof theoretic principle, as it should be according to the incompletness theorem.)

What is important here is that these studies have revealed the extreme power of impredicative quantification or, equivalently, the extreme power of the axiom of reducibility. This confirms in some way the intuitions of Poincaré and Russell. The proof-theoretic strength of second-order Arithmetic is way above all ramified extensions of Arithmetic considered by Schütte. On the other hand, despite the circularity of impredicative definitions, which is made so explicit in Takeuti's calculus, no paradoxes have been found yet in second-order Arithmetic.

Another research direction in proof theory has been to understand how much of impredicative quantification can be explained from principles that are available in intuitionistic mathematics. The strongest such principles are strong forms of inductive definitions. With such principles, one can explain a limited form a impredicative quantification, called Π11-comprehension, where one uses only one level of impredicative quantification over predicates. Interestingly, almost all known uses of impredicative quantifications: Leibniz equality, least upper bound, etc., can be done with Π11-comprehension. This reduction of Π11-comprehension was first achieved by Takeuti in a quite indirect way, and was later simplified by Buchholz and Schütte using the so-called Ω-rule. It can be seen as a constructive explanation of some restricted, but nontrivial, uses of impredicative definitions.

4. Type Theory/Set Theory

Type theory can be used as a foundation for mathematics, and indeed, it was presented as such by Russell in his 1908 paper, which appeared the same year as Zermelo's paper, presenting set theory as a foundation for mathematics.

It is clear intuitively how we can explain type theory in set theory: a type is simply interpreted as a set, and function types AB can be explained using the set theoretic notion of function (as a functional relation, i.e. a set of pairs of elements). The type Ao corresponds to the powerset operation.

The other direction is more interesting. How can we explain the notion of sets in term of types? There is an elegant solution, due to A. Miquel, which complements previous works by P. Aczel (1978) and which has also the advantage of explaining non necessarily well-founded sets a la Finsler. One simply interprets a set as a pointed graph (where the arrow in the graph represents the membership relation). This is very conveniently represented in type theory, a pointed graph being simply given by a type A and a pair of elements

a: A, R: AAo

We can then define in type theory when two such sets A, a, R and B, b, S are equal: this is the case iff there is a bisimulation T between A and B such that T a b holds. A bisimulation is a relation

T: ABo

such that whenever T x y and R x x′; hold, there exists y′ such that T xy′ and S y y′ hold, and whenever T x y and R y y′ hold, there exists x′ such that T xy′ and R x x hold. We can then define the membership relation: the set represented B, b, S is a member of the set represented by A, a, R iff there exists a1 such that R a1 a and A, a1, R and B, b, S are bisimilar.

It can then be checked that all the usual axioms of set theory extensionality, power set, union, comprehension over bounded formulae (and even antifoundation, so that the membership relation does not need to be well-founded) hold in this simple model. (A bounded formula is a formula where all quantifications are of the form ∀xa … or ∃xa …). In this way it can been shown that Church's simple type theory is equiconsistent with the bounded version of Zermelo's set theory.

5. Type Theory/Category Theory

There are deep connections between type theory and category theory. We limit ourselves to presenting two applications of type theory to category theory: the constructions of the free cartesian closed category and of the free topos (see the entry on category theory for an explanation of "cartesian closed" and "topos").

For building the free cartesian closed category, we extend simple type theory with the type 1 (unit type) and the product type A × B, for A, B types. The terms are extended by adding pairing operations and projections and a special element of type 1. As in Lambek and Scott (1986) one can then define a notion of typed conversions between terms, and show that this relation is decidable. One can then show (Lambek and Scott, 1986) that the category with types as objects and as morphisms from A to B the set of closed terms of type AB (with conversion as equality) is the free cartesian closed category. This can be used to show that equality between arrows in this category is decidable.

The theory of types of Church can also be used to build the free topos. For this we take as objects pairs A,E with A type and E a partial equivalence relation, that is a closed term E: AAo which is symmetric and transitive. We take as morphisms between A, E and B, F the relations R: ABo that are functional that is such that for any a: A satisfying E a a there exists one, and only one (modulo F) element b of B such that F b b and R a b. For the subobject classifier we take the pair o, E with E: ooo defined as

E M N = and (imply M N) (imply N M)

One can then show that this category forms a topos, indeed the free topos.

It should be noted that the type theory in Lambek and Scott (1986) uses a variation of type theory, introduced by Henkin and refined by P. Andrews (2002) which is to have an extensional equality as the only logical connective, i.e. a polymorphic constant

eq : AAo

and to define all logical connectives from this connective and constants T, F : o. For instance, one defines

P   =df   eq (λx. T) P

The equality at type o is logical equivalence.

One advantage of the intensional formulation is that it allows for a direct notation of proofs based on λ-calculus (Martin-Löf, 1971 and Coquand, 1986).

6. Extensions of Type System, Polymorphism, Paradoxes

We have seen the analogy between the operation A → A → o on types and the powerset operation on sets. In set theory, the powerset operation can be iterated transfinitely along the cumulative hierarchy. It is then natural to look for analogous transfinite versions of type theory.

One such extension of Church's simple type theory is obtained by adding universes (Martin-Löf, 1970). Adding a universe is a reflection process: we add a type U whose objects are the types considered so far. For Church's simple type theory we will have

o: U, i: U and AB: U if A: U, B: U

and, furthermore, A is a type if A: U. We can then consider types such as

(A: U) → AA

and functions such as

id = λAx. x : (A: U) → AA

The function id takes as argument a “small” type A: U and an element x of type A, and outputs an element of type A. More generally if T(A) is a type under the assumption A: U, one can form the dependent type

(A: U) → T(A)

That M is of this type means that M A: T(A) whenever A: U. We get in this way extensions of type theory whose strength is similar to the one of Zermelo's set theory (Miquel, 2001). More powerful form of universes are considered in (Palmgren, 1998). Miquel (2003) presents a version of type theory of strength equivalent to the one of Zermelo-Fraenkel.

One very strong form of universe is obtained by adding the axiom U: U. This was suggested by P. Martin-Löf in 1970. J.Y. Girard showed that the resulting type theory is inconsistent as a logical system (Girard, 1972). Though it seems at first that one could directly reproduce Russell's paradox using a set of all sets, such a direct paradox is actually not possible due to the difference between sets and types. Indeed the derivation of a contradiction in such a system is subtle and has been rather indirect (though, as noticed by Miquel (2001), it can now be reduced to Russell's paradox by representing sets as pointed graphs). J.Y. Girard first obtained his paradox for a weaker system. This paradox was refined later (Coquand, 1994 and Hurkens, 1995). (The notion of pure type system, introduced by Barendregt (1992), is convenient for getting a sharp formulation of these paradoxes.) Instead of the axiom U: U one assumes only

(A: U) → T(A) : U

if T(A) : U [A: U]. Notice the circularity, indeed of the same kind as the one that is rejected by the ramified hierarchy: we define an element of type U by quantifying over all elements of U. For instance the type

(A: U) → AA: U

will be the type of the polymorphic identity function. Despite this circularity, J.Y. Girard was able to show normalisation for type systems with this form of polymorphism. However, the extension of Church's simple type theory with polymorphism is inconsistent as a logical system, i.e. all propositions (terms of type o) are provable.

J.Y. Girard's motivation for considering a type system with polymorphism was to extend Gödel's Dialectica (Gödel, 1958) interpretation to second-order arithmetic. He proved normalisation using the reducibility method, that had been introduced by Tait (1967) while analysing Gödel, 1958. It is quite remarkable that the circularity inherent in impredicativity does not result in non-normalisable terms. (Girard's argument was then used to show that cut-elimination terminates in Takeuti's sequent calculus presented above.) A similar system was introduced independently by J. Reynolds (1974) while analysing the notion of polymorphism in computer science.

Martin-Löf's introduction of a type of all types comes from the identification of the concept of propositions and types, suggested by the work of Curry and Howard. It is worth recalling here his three motivating points:

  1. Russell's definition of types as ranges of significance of propositional functions
  2. the fact that one needs to quantify over all propositions (impredicativity of simple type theory)
  3. identification of proposition and types

Given (1) and (2) we should have a type of propositions (as in simple type theory), and given (3) this should also be the type of all types. Girard's paradox shows that one cannot have (1),(2) and (3) simultaneously. Martin-Löf's choice was to take away (2), restricting type theory to be predicative (and, indeed, the notion of universe appeared first in type theory as a predicative version of the type of all types). The alternative choice of taking away (3) is discussed by Coquand (1986). For more information on this topic, we refer to the entry on constructive type theory.


Related Entries

Russell, Bertrand | category theory | Frege, Gottlob | Frege, Gottlob: logic, theorem, and foundations for arithmetic | logic: paraconsistent | mathematics: inconsistent | Peano, Giuseppe | Principia Mathematica | type theory: Church's type theory | type theory: constructive