Tarski's Truth Definitions
In 1933 the Polish logician Alfred Tarski published a paper in which he discussed the criteria that a definition of ‘true sentence’ should meet, and gave examples of several such definitions for particular formal languages. In 1956 he and his colleague Robert Vaught published a revision of one of the 1933 truth definitions, to serve as a truth definition for model-theoretic languages. This entry will simply review the definitions and make no attempt to explore the implications of Tarski's work for semantics (natural language or programming languages) or for the philosophical study of truth.
- 1. The 1933 programme and the semantic conception
- 2. Some kinds of truth definition on the 1933 pattern
- 3. The 1956 definition and its offspring
- Bibliography
- Other Internet Resources
- Related Entries
1. The 1933 programme and the semantic conception
In the late 1920s Alfred Tarski embarked on a project to give rigorous definitions for notions useful in scientific methodology. In 1933 he published (in Polish) his analysis of the notion of a true sentence. This long paper undertook two tasks: first to say what should count as a satisfactory definition of ‘true sentence’ for a given formal language, and second to show that there do exist satisfactory definitions of ‘true sentence’ for a range of formal languages. We begin with the first task; Section 2 will consider the second.
We say that a language is fully interpreted if all its sentences have meanings that make them either true or false. All the languages that Tarski considered in the 1933 paper were fully interpreted, with one exception described in Section 2.2 below. This was the main difference between the 1933 definition and the later model-theoretic definition of 1956, which we shall examine in Section 3.
Tarski described several conditions that a satisfactory definition of truth should meet.
1.1 Object language and metalanguage
If the language under discussion (the object language) is L, then the definition should be given in another language known as the metalanguage, call it M. The metalanguage should contain a copy of the object language (so that anything one can say in L can be said in M too), and M should also be able to talk about the sentences of L and their syntax. Finally Tarski allowed M to contain notions from set theory, and a 1-ary predicate symbol True with the intended reading ‘is a true sentence of L’. The main purpose of the metalanguage was to formalise what was being said about the object language, and so Tarski also required that the metalanguage should carry with it a set of axioms expressing everything that one needs to assume for purposes of defining and justifying the truth definition. The truth definition itself was to be a definition of True in terms of the other expressions of the metalanguage. So the definition was to be in terms of syntax, set theory and the notions expressible in L, but not semantic notions like ‘denote’ or ‘mean’ (unless the object language happened to contain these notions).
Tarski assumed, in the manner of his time, that the object language L and the metalanguage M would be languages of some kind of higher order logic. Today it is more usual to take some kind of informal set theory as one's metalanguage; this would affect a few details of Tarski's paper but not its main thrust. Also today it is usual to define syntax in set-theoretic terms, so that for example a string of letters becomes a sequence. In fact one must use a set-theoretic syntax if one wants to work with an object language that has uncountably many symbols, as model theorists have done freely for over half a century now.
1.2 Formal correctness
The definition of True should be ‘formally correct’. This means that it should be a sentence of the form
For all x, True(x) if and only if φ(x),
where True never occurs in φ; or failing this, that the definition should be provably equivalent to a sentence of this form. The equivalence must be provable using axioms of the metalanguage that don't contain True. Definitions of the kind displayed above are usually called explicit, though Tarski in 1933 called them normal.
1.3 Material adequacy
The definition should be ‘materially adequate’ (trafny — a better translation would be ‘accurate’). This means that the objects satisfying φ should be exactly the objects that we would intuitively count as being true sentences of L, and that this fact should be provable from the axioms of the metalanguage. At first sight this is a paradoxical requirement: if we can prove what Tarski asks for, just from the axioms of the metalanguage, then we must already have a materially adequate formalisation of ‘true sentence of L’ within the metalanguage, suggesting an infinite regress. In fact Tarski escapes the paradox by using (in general) infinitely many sentences of M to express truth, namely all the sentences of the form
φ(s) if and only if ψ
whenever s is the name of a sentence S of L and ψ is the copy of S in the metalanguage. So the technical problem is to find a single formula φ that allows us to deduce all these sentences from the axioms of M; this formula φ will serve to give the explicit definition of True.
Tarski's own name for this criterion of material adequacy was Convention T. More generally his name for his approach to defining truth, using this criterion, was the semantic conception of truth.
As Tarski himself emphasised, Convention T rapidly leads to the liar paradox if the language L has enough resources to talk about its own semantics. (See the entry on the revision theory of truth.) Tarski's own conclusion was that a truth definition for a language L has to be given in a metalanguage which is essentially stronger than L.
There is a consequence for the foundations of mathematics. First-order Zermelo-Fraenkel set theory is widely regarded as the standard of mathematical correctness, in the sense that a proof is correct if and only if it can be formalised as a formal proof in set theory. We would like to be able to give a truth definition for set theory; but by Tarski's result this truth definition can't be given in set theory itself. The usual solution is to give the truth definition informally in English. But there are a number of ways of giving limited formal truth definitions for set theory. For example Azriel Levy showed that for every natural number n there is a Σn formula that is satisfied by all and only the set-theoretic names of true Σn sentences of set theory. The definition of Σn is too technical to give here, but three points are worth making. First, every sentence of set theory is provably equivalent to a Σn sentence for any large enough n. Second, the class of Σn formulas is closed under adding existential quantifiers at the beginning, but not under adding universal quantifiers. Third, the class is not closed under negation; this is how Levy escapes Tarski's paradox. (See the entry on set theory.) Essentially the same devices allow Jaakko Hintikka to give an internal truth definition for his Independence-Friendly Logic; this logic shares the second and third properties of Levy's classes of formulas.
2. Some kinds of truth definition on the 1933 pattern
In his 1933 paper Tarski went on to show that many fully interpreted formal languages do have a truth definition that satisfies his conditions. He gave four examples in that paper. One was a trivial definition for a finite language; it simply listed the finitely many true sentences. One was a definition by quantifier elimination; see Section 2.2 below. The remaining two, for different classes of language, were examples of what people today think of as the standard Tarski truth definition; they are forerunners of the 1956 model-theoretic definition.
2.1 The standard truth definitions
The two standard truth definitions are at first glance not definitions of truth at all, but definitions of a more complicated relation involving assignments a of objects to variables:
a satisfies the formula F.
In fact satisfaction reduces to truth in this sense: a satisfies F if and only if taking each free variable in F as a name of the object assigned to it by a makes F into a true sentence. So it follows that our intuitions about when a sentence is true can guide our intuitions about when an assignment satisfies a formula. But none of this can enter into the formal definition of truth, because ‘taking a variable as a name of an object’ is a semantic notion, and Tarski's truth definition has to be built only on notions from syntax and set theory (together with those in the object language); recall Section 1.1. In fact Tarski's reduction goes in the other direction: if F has no free variables, then to say that F is true is to say that every assignment satisfies it.
The reason why Tarski defines satisfaction directly, and then deduces a definition of truth, is that satisfaction obeys recursive conditions in the following sense: if F is a compound formula, then to know which assignments satisfy F, it's enough to know which assignments satisfy the immediate constituents of F. Here are two typical examples:
- The assignment a satisfies the formula ‘F and G’ if and only if a satisfies F and a satisfies G.
- The assignment a satisfies the formula ‘For all x, G’ if and only if for every individual i, if b is the assignment that assigns i to the variable x and is otherwise exactly like a, then b satisfies G.
We have to use a different approach for atomic formulas. But for these, at least assuming for simplicity that L has no function symbols, we can use the metalanguage copies #(R) of the predicate symbols R of the object language. Thus:
- The assignment a satisfies the formula R(x,y) if and only if #(R)(a(x),a(y)).
(Warning: the expression # is in the metametalanguage, not in the metalanguage M. We may or may not be able to find a formula of M that expresses # for predicate symbols; it depends on exactly what the language L is.)
One sometimes says that Tarski's definition of satisfaction is compositional, meaning that the class of assignments which satisfy a compound formula F is determined solely by (1) the syntactic rule used to construct F from its immediate constituents and (2) the classes of assignments that satisfy these immediate constituents. (This is sometimes phrased loosely as: satisfaction is defined recursively. But this formulation misses the central point, that (1) and (2) don't contain any syntactic information about the immediate constituents.) Compositionality explains why Tarski switched from truth to satisfaction. You can't define whether ‘For all x, G’ is true in terms of whether G is true, because in general G has a free variable x and so it isn't either true or false.
The name ‘compositionality’ is from a paper of Katz and Fodor in 1963 on natural language semantics. In talking about compositionality, we have moved to thinking of Tarski's definition as a semantics, i.e. a way of assigning ‘meanings’ to formulas. (Here we take the meaning of a sentence to be its truth value.) Compositionality means essentially that the meanings assigned to formulas give at least enough information to determine the truth values of sentences containing them. One can ask conversely whether Tarski's semantics provides only as much information as we need about each formula, in order to reach the truth values of sentences. If the answer is yes, we say that the semantics is fully abstract (for truth). One can show fairly easily, for any of the standard languages of logic, that Tarski's definition of satisfaction is in fact fully abstract.
As it stands, Tarski's definition of satisfaction is not an explicit definition, because satisfaction for one formula is defined in terms of satisfaction for other formulas. So to show that it is formally correct, we need a way of converting it to an explicit definition. One way to do this is as follows, using either higher order logic or set theory. Suppose we write S for a binary relation between assignments and formulas. We say that S is a satisfaction relation if for every formula G, S meets the conditions put for satisfaction of G by Tarski's definition. For example, if G is ‘G1 and G2’, S should satisfy the following condition for every assignment a:
S(a,G) if and only if S(a,G1) and S(a,G2).
We can define ‘satisfaction relation’ formally, using the recursive clauses and the conditions for atomic formulas in Tarski's recursive definition. Now we prove, by induction on the complexity of formulas, that there is exactly one satisfaction relation S. (There are some technical subtleties, but it can be done.) Finally we define
a satisfies F if and only if: there is a satisfaction relation S such that S(a,F).
It is then a technical exercise to show that this definition of satisfaction is materially adequate. Actually one must first write out the counterpart of Convention T for satisfaction of formulas, but I leave this to the reader.
2.2 The truth definition by quantifier elimination
The remaining truth definition in Tarski's 1933 paper — the third as they appear in the paper — is really a bundle of related truth definitions, all for the same object language L but in different interpretations. The quantifiers of L are assumed to range over a particular class, call it A; in fact they are second order quantifiers, so that really they range over the collection of subclasses of A. The class A is not named explicitly in the object language, and thus one can give separate truth definitions for different values of A, as Tarski proceeds to do. So for this section of the paper, Tarski allows one and the same sentence to be given different interpretations; this is the exception to the general claim that his object language sentences are fully interpreted. But Tarski stays on the straight and narrow: he talks about ‘truth’ only in the special case where A is the class of all individuals. For other values of A, he speaks not of ‘truth’ but of ‘correctness in the domain A’.
These truth or correctness definitions don't fall out of a definition of satisfaction. In fact they go by a much less direct route, which Tarski describes as a ‘purely accidental’ possibility that relies on the ‘specific peculiarities’ of the particular object language. It may be helpful to give a few more of the technical details than Tarski does, in a more familiar notation than Tarski's, in order to show what is involved. Tarski refers his readers to a paper of Thoralf Skolem in 1919 for the technicalities.
One can think of the language L as the first-order language with predicate symbols ⊆ and =. The language is interpreted as talking about the subclasses of the class A. In this language we can define:
- ‘x is the empty set’ (viz. x ⊆ every class).
- ‘x is an atom’ (viz. x is not empty, but every subclass of x not equal to x is empty).
- ‘x has exactly k members’ (where k is a finite number; viz. there are exactly k distinct atoms ⊆ x).
- ‘There are exactly k elements in A’ (viz. there is a class with exactly k members, but there is no class with exactly k+1 members).
Now we aim to prove:
Lemma. Every formula F of L is equivalent to (i.e. is satisfied by exactly the same assignments as) some boolean combination of sentences of the form ‘There are exactly k elements in A’ and formulas of the form ‘There are exactly k elements that are in v1, not in v2, not in v3 and in v4’ (or any other combination of this type, using only variables free in F).
The proof is by induction on the complexity of formulas. For atomic formulas it is easy. For boolean combinations of formulas it is easy, since a boolean combination of boolean combinations is again a boolean combination. For formulas beginning with ∀, we take the negation. This leaves just one case that involves any work, namely the case of a formula beginning with an existential quantifier. By induction hypothesis we can replace the part after the quantifier by a boolean combination of formulas of the kinds stated. So a typical case might be:
∃z (there are exactly two elements that are in z and x and not in y).
This holds if and only if there are at least two elements that are in x and not in y. We can write this in turn as: The number of elements in x and not in y is not 0 and is not 1; which is a boolean combination of allowed formulas. The general proof is very similar but more complicated.
When the lemma has been proved, we look at what it says about a sentence. Since the sentence has no free variables, the lemma tells us that it is equivalent to a boolean combination of statements saying that A has a given finite number of elements. So if we know how many elements A has, we can immediately calculate whether the sentence is ‘correct in the domain A’.
One more step and we are home. As we prove the lemma, we should gather up any facts that can be stated in L, are true in every domain, and are needed for proving the lemma. For example we shall almost certainly need the sentence saying that ⊆ is transitive. Write T for the set of all these sentences. (In Tarski's presentation T vanishes, since he is using higher order logic and the required statements about classes become theorems of logic.) Thus we reach, for example:
Theorem. If the domain A is infinite, then a sentence S of the language L is correct in A if and only if S is deducible from T and the sentences saying that the number of elements of A is not any finite number.
The class of all individuals is infinite (Tarski asserts), so the theorem applies when A is this class. And in this case Tarski has no inhibitions about saying not just ‘correct in A’ but ‘true’; so we have our truth definition.
The method we have described revolves almost entirely around removing existential quantifiers from the beginnings of formulas; so it is known as the method of quantifier elimination. It is not as far as you might think from the two standard definitions. In all cases Tarski assigns to each formula, by induction on the complexity of formulas, a description of the class of assignments that satisfy the formula. In the two previous truth definitions this class is described directly; in the quantifier elimination case it is described in terms of a boolean combination of formulas of a simple kind.
At around the same time as he was writing the 1933 paper, Tarski gave a truth definition by quantifier elimination for the first-order language of the field of real numbers. He published it separately, and at first only as an interesting way of characterising the relations definable by formulas. Later he gave a fuller account, emphasising that his method provided not just a truth definition but an algorithm for determining which sentences about the real numbers are true and which are false.
3. The 1956 definition and its offspring
In 1933 Tarski assumed that the formal languages that he was dealing with had two kinds of symbol (apart from punctuation), namely constants and variables. The constants included logical constants, but also any other terms of fixed meaning. The variables had no independent meaning and were simply part of the apparatus of quantification.
Model theory by contrast works with three levels of symbol. There are the logical constants (=, ¬, & for example), the variables (as before), and between these a middle group of symbols which have no fixed meaning but get a meaning through being applied to a particular structure. The symbols of this middle group include the nonlogical constants of the language, such as relation symbols, function symbols and constant individual symbols. They also include the quantifier symbols ∀ and ∃, since we need to refer to the structure to see what set they range over. This type of three-level language corresponds to mathematical usage; for example we write the addition operation of an abelian group as +, and this symbol stands for different functions in different groups.
So one has to work a little to apply the 1933 definition to model-theoretic languages. There are basically two approaches: (1) Take one structure A at a time, and regard the nonlogical constants as constants, interpreted in A. (2) Regard the nonlogical constants as variables, and use the 1933 definition to describe when a sentence is satisfied by an assignment of the ingredients of a structure A to these variables. There are problems with both these approaches, as Tarski himself describes in several places. The chief problem with (1) is that in model theory we very frequently want to use the same language in connection with two or more different structures — example when we are defining elementary embeddings between structures (see the entry on first-order model theory). The problem with (2) is more abstract: it is disruptive and bad practice to talk of formulas with free variables being ‘true’. (We saw in Section 2.2 how Tarski avoided talking about truth in connection with sentences that have varying interpretations.) What Tarski did in practice, from the appearance of his textbook in 1936 to the late 1940s, was to use a version of (2) and simply avoid talking about model-theoretic sentences being true in structures; instead he gave an indirect definition of what it is for a structure to be a ‘model of’ a sentence, and apologised that strictly this was an abuse of language. (Chapter VI of Tarski 1994 still contains relics of this old approach.)
By the late 1940s it had become clear that a direct model-theoretic truth definition was needed. Tarski and colleagues experimented with several ways of casting it. The version we use today is based on that published by Tarski and Robert Vaught in 1956. See the entry on classical logic for an exposition.
The right way to think of the model-theoretic definition is that we have sentences whose truth value varies according to the situation where they are used. So the nonlogical constants are not variables; they are definite descriptions whose reference depends on the context. Likewise the quantifiers have this indexical feature, that the domain over which they range depends on the context of use. In this spirit one can add other kinds of indexing. For example a Kripke structure is an indexed family of structures, with a relation on the index set; these structures and their close relatives are fundamental for the semantics of modal, temporal and intuitionist logic.
Already in the 1950s model theorists were interested in formal languages that include kinds of expression different from anything in Tarski's 1933 paper. Extending the truth definition to infinitary logics was no problem at all. Nor was there any serious problem about most of the generalised quantifiers proposed at the time. For example there is a quantifier Qxy with the intended meaning:
QxyF(x,y) if and only if there is an infinite set X of elements such that for all a and b in X, F(a,b).
This definition itself shows at once how the required clause in the truth definition should go.
In 1961 Leon Henkin pointed out two sorts of model-theoretic language that didn't immediately have a truth definition of Tarski's kind. The first had infinite strings of quantifiers:
∀v1 ∃v2 ∀v3 ∃v4 …R(v1,v2,v3, v4,…).
The second had quantifiers that are not linearly ordered. For ease of writing I use Hintikka's later notation for these:
∀v1 ∃v2 ∀v3 (∃v4/∀v1) R(v1,v2,v3, v4).
Here the slash after ∃v4 means that this quantifier is outside the scope of the earlier quantifier ∀v1 (and also outside that of the earlier existential quantifier).
Henkin pointed out that in both cases one could give a natural semantics in terms of Skolem functions. For example the second sentence can be paraphrased as
∃f∃g ∀v1 ∀v3R(v1,f(v1),v 3,g(v3)),
which has a straightforward Tarski truth condition in second order logic. Hintikka then observed that one can read the Skolem functions as winning strategies in a game, as in (the entry on) logic and games. In this way one can build up a compositional semantics, by assigning to each formula a game. A sentence is true if and only if the player Myself (in Hintikka's nomenclature) has a winning strategy for the game assigned to the sentence. This game semantics agrees with Tarski's on conventional first-order sentences. But it is far from fully abstract; probably one should think of it as an operational semantics, describing how a sentence is verified rather than whether it is true.
The problem of giving a Tarski-style semantics for Henkin's two languages turned out to be different in the two cases. With the first, the problem is that the syntax of the language is not well-founded: there is an infinite descending sequence of subformulas as one strips off the quantifiers one by one. Hence there is no hope of giving a definition of satisfaction by recursion on the complexity of formulas. The remedy is to note that the explicit form of Tarski's truth definition in Section 2.1 above didn't require a recursive definition; it needed only that the conditions on the satisfaction relation S pin it down uniquely. For Henkin's first style of language this is still true, though the reason is no longer the well-foundedness of the syntax.
For Henkin's second style of language, at least in Hintikka's notation, the syntax is well-founded, but the displacement of the quantifier scopes means that the usual quantifier clauses in the definition of satisfaction no longer work. To get a compositional and fully abstract semantics, one has to ask not what assignments of variables satisfy a formula, but what sets of assignments satisfy the formula ‘uniformly’, where ‘uniformly’ means ‘independent of assignments to certain variables, as shown by the slashes on quantifiers inside the formula’. Henkin's second example is of more than theoretical interest, because clashes between the semantic and the syntactic scope of quantifiers occur very often in natural languages.
Bibliography
- Feferman, S. 2004, "Tarski's conceptual analysis of semantical notions", in Sémantique et Épistémologie, ed. Ali Benmakhlouf, Casablanca : Editions Le Fennec,79-108.
- Henkin, L. 1961, "Some remarks on infinitely long formulas", in Infinitistic methods: Proceedings of the symposium on foundations of mathematics, Oxford: Pergamon Press,167-183.
- Hintikka, J. 1996, The Principles of Mathematics Revisited, Cambridge : Cambridge University Press.
- Hodges, W. 1997, "Compositional semantics for a language of imperfect information", Logic Journal of the IGPL 5, 539-563.
- Hodges, W. 2004, "What languages have Tarski truth definitions?", Annals of Pure and Applied Logic 126, 93-113.
- Katz, J. and Fodor, J. 1963, "The structure of a semantic theory", Language 39, 170-210.
- Levy, A. 1965, A hierarchy of formulas in set theory, Memoirs of American Mathematical Society 57.
- Skolem, T. 1919, "Untersuchungen über die Axiome des Klassenkalküls und über Produktations- und Summationsprobleme, welche gewisse Klassen von Aussagen betreffen", Videnskapsselskapets Skrifter, I. Matem.-naturv. klasse 3.
- Tarski, A. 1933, "The concept of truth in the languages of the deductive sciences" (Polish), Prace Towarzystwa Naukowego Warszawskiego, Wydzial III Nauk Matematyczno-Fizycznych 34, Warsaw; reprinted in Zygmunt 1995, pages 13-172; expanded English translation in Tarski 1983, pages 152-278.
- Tarski, A. 1944, "The semantic conception of truth", Philosophy and Phenomenlological Research 4, 13-47.
- Tarski, A. 1983, Logic, Semantics, Metamathematics, papers from 1923 to 1938, ed. John Corcoran, Indianapolis : Hackett Publishing Company.
- Tarski, A. 1994, Introduction to Logic and to the Methodology of the Deductive Sciences (new edition of book originally published in Polish in 1936), New York, Oxford University Press.
- Tarski, A. and Vaught, R. 1956, "Arithmetical extensions of relational systems", Compositio Mathematica 13, 81-102.
- Zygmunt, J. ed. 1995, Alfred Tarski, Pisma Logiczno-Filozoficzne, 1 Prawda, Warsaw : Wydawnictwo Naukowe PWN.
Other Internet Resources
[Please contact the author with suggestions.]
Related Entries
logic: and games | logic: infinitary | logic: intuitionistic | logic: temporal | model theory | model theory: first-order | truth: deflationary theory of | truth: revision theory of