We recall and refine some definitions from the entries on classical logic and model theory. A signature is a set of individual constants, predicate symbols and function symbols; each of the predicate symbols and function symbols has an arity (for example it is binary if its arity is 2). Each signature K gives rise to a first-order language, by building up formulas from the symbols in the signature together with logical symbols (including =) and punctuation.
If K is a signature, then a structure of signature K, say A, consists of the following items: a set called the domain of A, in symbols dom(A), whose members are the elements of A; for each individual constant c of K, an element c^A of dom(A); for each predicate symbol P of K of arity n, an n-ary relation P^A on dom(A); and for each function symbol F of K of arity n, an n-ary function F^A from n-tuples of elements of A to elements of A.
For example the field of real numbers forms a structure R whose elements are the real numbers, with signature consisting of the individual constant 0 to name the number zero, a 1-ary function symbol - for minus, and two 2-ary function symbols + and . for plus and times. At first sight we can't add a symbol to express 1/x, since all the named functions have to be defined on the whole domain of the structure, and there is no such real number as 1/0. But on second thoughts this is not a serious problem; any competent mathematician puts the condition x is not zero before dividing by x, and so it never matters what the value of 1/0 is, and we can harmlessly take it to be 42. But most model theorists are uncomfortable with any kind of division by zero, so they stick with plus, times and minus.
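A hedged illustration of the guarded convention just described (the formula is ours, not part of the entry): instead of using a term 1/x, one expresses "y is the multiplicative inverse of x" by a formula that first rules out x = 0, so that the arbitrary value chosen for 1/0 can never affect a truth value:

% LaTeX sketch: guarding division by a non-zero hypothesis
\[ \varphi(x,y) \;:=\; \neg (x = 0) \,\wedge\, x \cdot y = 1 . \]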
If L is the first-order language of signature K, then
Tarski's model-theoretic truth definition
tells us when a sentence of L is true in A, and when an
assignment of elements of A to variables satisfies a formula
of L in A. Instead of talking of assignments satisfying a
formula, model theorists often speak of the set of n-tuples
of elements of A that is defined by a formula
φ(v1,...,vn); the
connection is that an n-tuple
(a1,...,an) is in the
defined set if and only if the assignment taking each
vi to ai
satisfies the formula.
If φ is a sentence, we write
A ⊨ φ
to mean that φ is true in A, and if φ is a formula with n free variables and a is an n-tuple of elements of A, we write
A ⊨ φ[a]
to mean that the n-tuple a is in the set defined by φ.
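As a small worked example (ours, using only the signature of R given above): the formula φ(v1) := ∃w (w·w = v1) defines in R the set of non-negative real numbers, so

% LaTeX sketch: satisfaction and defined sets in the field R
\[ \mathbf{R} \models \varphi[2], \qquad \mathbf{R} \not\models \varphi[-1], \]

since 2 has a square root in R and −1 does not.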
Two L-structures that are models of exactly the same sentences of L are said to be elementarily equivalent. Elementary equivalence is an equivalence relation on the class of all L-structures. The set of all the sentences of L that are true in the L-structure A is called the complete theory of A, in symbols Th(A). A theory that is Th(A) for some structure A is said to be complete. (By the completeness theorem for first-order logic, for which see the entry on classical logic, a theory is complete if and only if it is maximal syntactically consistent.) The two structures A and B are elementarily equivalent if and only if Th(A) = Th(B).
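A classical illustration (standard, though not in the text): in the signature with one binary predicate <, the ordered sets of the rationals and of the reals are models of exactly the same sentences, since both are dense linear orderings without endpoints and that theory is complete; yet they are certainly not isomorphic:

\[ \mathrm{Th}(\mathbb{Q},<) \;=\; \mathrm{Th}(\mathbb{R},<), \qquad (\mathbb{Q},<) \not\cong (\mathbb{R},<). \]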
To continue the example of the field R of real numbers: It is often not at all obvious whether two given structures are or are not elementarily equivalent. One of the greatest achievements of the pre-history of model theory was Tarski's description in 1930 of Th(R) (which he published in full only after the war; see his book in the Bibliography below). This description implied among other things that the structures elementarily equivalent to R are exactly the real-closed fields, a class of fields which was already known to the algebraists in its own right.
When mathematicians introduce a class of structures, they like to
define what they count as the basic maps between these
structures. The basic maps between structures of the same signature K
are called homomorphisms, defined as follows. A
homomorphism from structure A to structure B is a function
f from dom(A) to dom(B) with the property
that for every atomic formula φ(v1,...,vn) and any n-tuple a = (a1,...,an) of elements of A,
if A ⊨ φ[a] then B ⊨ φ[b],
where b is (f(a1),...,f(an)). If the same condition holds with 'if and only if' in place of 'if ... then', then f is called an embedding of A into B.
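A minimal example of the difference (our illustration): in the signature consisting of 0 and +, the constant map f(n) = 0 from the group of integers to itself preserves all atomic formulas, because once every variable is assigned 0, every term evaluates to 0 and every equation between terms holds; so f is a homomorphism. But it is not an embedding, since the 'if and only if' direction fails:

% f collapses distinct elements, so it does not reflect atomic formulas
\[ \mathbf{Z} \not\models (v_1 = v_2)[1,0] \quad\text{but}\quad \mathbf{Z} \models (v_1 = v_2)[f(1),f(0)] . \]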
If A and B are structures of signature K with dom(A) a subset of dom(B), and the interpretations in A of the symbols in K are just the restrictions of their interpretations in B, then we say that A is a substructure of B and conversely that B is an extension of A. If moreover B has some elements that are not in A, we say that A is a proper substructure of B and B is a proper extension of A. If B is a structure and X is a nonempty subset of dom(B), then there is a unique smallest substructure of B whose domain contains all of X. It is known as the substructure of B generated by X, and we find it by first adding to X all the elements c^B where c is an individual constant, and then closing off under the functions F^B where F is a function symbol.
For example the substructure of the field R generated by the number 1 consists of 1, 0 (since it is named by the constant 0), 1+1, 1+1+1 etc., -1, -2 etc., in other words the ring of integers. (There is no need to close off under multiplication too, since the set of integers is already closed under multiplication.) If we had included a symbol for 1/x too, the substructure generated by 1 would have been the field of rational numbers. So the notion of substructure is sensitive to the choice of signature.
Let A and B be structures of signature K, and let e be a function from a set of elements of A to elements of B. We say that e is an elementary map if for every formula φ(v1,...,vn) of L and all elements a1,...,an in the domain of e,
A ⊨ φ[a1,...,an] if and only if B ⊨ φ[e(a1),...,e(an)].
We say that e is an elementary embedding of A into B if e is an elementary map and its domain is the whole domain of A. As the name implies, elementary embeddings are always embeddings.
If there is an elementary embedding from A to B
then A and B are elementarily equivalent. On the
other hand an embedding between elementarily equivalent structures,
or even between isomorphic structures, need not be elementary. (For
example, writing Z for the abelian group of
the integers with signature consisting of 0 and +, the embedding from
Z to Z that takes
each integer n to 2n is an embedding, and of course
Z is isomorphic to itself; but this
embedding is not elementary, since 1 satisfies the formula ¬∃y (y + y = v1), but 2 doesn't.)
We say that A is an elementary substructure of B, and B is an elementary extension of A, if A is a substructure of B and the inclusion map is an elementary embedding. It's immediate from the definitions that an elementary extension of an elementary extension of A is again an elementary extension of A.
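For example (an illustration of ours, resting on the quantifier elimination for dense linear orderings without endpoints): the inclusion of the ordered set of rationals in the ordered set of reals is an elementary embedding, so

% ≼ is the usual symbol for 'is an elementary substructure of'
\[ (\mathbb{Q},<) \;\preceq\; (\mathbb{R},<). \]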
Elementary embeddings are natural maps to consider within first-order model theory. Around 1950 Abraham Robinson was impressed that maps between algebraic structures in general seem hardly ever to be elementary, whereas some important maps (such as embeddings between two algebraically closed fields, or between two real-closed fields) turn out to be elementary. He was also surprised to find that this fact about algebraically closed fields is another way of stating a celebrated theorem called the Hilbert Nullstellensatz. These observations of Robinson have had a huge effect on the development of model theory. In Robinson's terminology, a first-order theory is model-complete if every embedding between models of the theory is elementary. This notion has found many uses, and it often appears in applications of model theory in algebra.
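A hedged sketch of the connection Robinson noticed: if an embedding between algebraically closed fields K ⊆ L is elementary, then any finite system of polynomial equations with coefficients in K that has a common zero in the larger field already has one in K, because solvability is expressed by an existential sentence with parameters in K:

% solvability transfers down an elementary embedding
\[ L \models \exists \bar{x}\, \bigl( p_1(\bar{x}) = 0 \wedge \cdots \wedge p_m(\bar{x}) = 0 \bigr) \;\Longrightarrow\; K \models \exists \bar{x}\, \bigl( p_1(\bar{x}) = 0 \wedge \cdots \wedge p_m(\bar{x}) = 0 \bigr). \]

This transfer of solvability is one classical way of stating the Nullstellensatz.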
Elementary embeddings have a number of properties that make them useful. We have space for four.
The downward Loewenheim-Skolem theorem:
Suppose L is a first-order language which has λ formulas, A is an L-structure and κ is a cardinal which is at least λ but less than the cardinality of A. Suppose also that X is a set of at most κ elements of A. Then A has an elementary substructure which has cardinality exactly κ and contains all the elements in X.
There is a proof of this in the entry on classical logic, using Skolem hulls.
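To instantiate (our example): take L to be the countable language of the field R, let λ and κ both be the first infinite cardinal, and let X be empty; the theorem then yields a countable elementary substructure of R. One concrete such substructure is the field of real algebraic numbers; that this particular subfield works follows from the model-completeness of the theory of real-closed fields, discussed below:

% a countable elementary substructure of the reals
\[ (\text{real algebraic numbers}) \;\preceq\; \mathbf{R}. \]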
The elementary chain theorem:
Suppose that L is a first-order language and A0, A1, ... is a sequence (of any length) of L-structures such that any structure in the sequence is an elementary substructure of all the later structures in the sequence. Then there is a unique smallest L-structure B which contains all the structures in the sequence as substructures; this structure B is an elementary extension of all the structures in the sequence.

The elementary amalgamation theorem:
Suppose L is a first-order language, A is an L-structure and B, C are two elementary extensions of A. Then there are an elementary extension D of B and an elementary embedding e of C into D such that (i) for each element a of A, e(a) = a, and (ii) if c is an element of C but not of A, then e(c) is not in B.
The latter theorem is a consequence of the compactness theorem in the next section.
The upward Loewenheim-Skolem theorem:
Suppose L is a first-order language which has λ formulas, A is an L-structure whose cardinality is an infinite cardinal μ, and κ is a cardinal which is at least as great as both λ and μ. Then A has an elementary extension whose cardinality is κ.
There is a proof of this in the entry on classical logic. The name of the theorem is a little unfortunate, since the theorem was first proved by Tarski, and Skolem didn't even believe it (because he didn't believe in uncountable cardinals).
There is another proof using the elementary amalgamation theorem and the elementary chain theorem. The compactness theorem and the diagram lemma (see below) allow us to prove that A has a proper elementary extension A′. Now use A′ and again A′ for the structures B and C in the elementary amalgamation theorem. Then D as in the theorem is an elementary extension of A′, and by (ii) in the theorem it must contain elements that are not in A′, so that it is a proper elementary extension. Repeat to get a proper elementary extension of D, and so on until you have an infinite elementary chain. Use the elementary chain theorem to find an elementary extension of A that sits on top of this chain. Keep repeating these moves until you have an elementary extension of A that has cardinality at least κ. Then if necessary use the downward Loewenheim-Skolem theorem to pull the cardinality down to exactly κ.
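Schematically (our summary of the construction just described, with ≺ for proper elementary extension):

% successor steps by amalgamation, limit steps by the elementary chain theorem
\[ A = A_0 \prec A_1 \prec A_2 \prec \cdots \prec A_\omega = \bigcup_{n<\omega} A_n \prec A_{\omega+1} \prec \cdots \]

where the construction runs until the union has cardinality at least κ.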
This kind of argument is very common in first-order model
theory.
The compactness theorem:
If T is a first-order theory, and every finite subset of T has a model, then T has a model.
There is a proof of this theorem in the entry on classical logic. The theorem has several useful paraphrases. For example it is equivalent to the following statement:
Suppose T is a first-order theory and ψ is a first-order sentence. If T ⊨ ψ, then there is a finite subset U of T such that U ⊨ ψ. (See the entry on model theory for the notion of entailment, ⊨, used here.)
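A standard application (our example, not in the text): let T consist of the complete theory of the natural numbers with 0, 1, + and ×, together with the sentences c ≠ 0, c ≠ 1, c ≠ 1+1, ... for a new constant c. Every finite subset of T has a model — interpret c as a large enough natural number — so by compactness T has a model, and any such model contains a 'nonstandard' element different from every standard natural number:

% finitely satisfiable, hence satisfiable
\[ T \;=\; \mathrm{Th}(\mathbb{N},0,1,+,\times) \;\cup\; \{\, c \neq \underbrace{1 + \cdots + 1}_{n} \;:\; n \in \mathbb{N} \,\}. \]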
Anatolii Mal'tsev first gave the compactness theorem in 1938 (for first-order logic of any signature), and used it in 1940/1 to prove several theorems about groups; this seems to have been the first application of model theory within classical mathematics. Leon Henkin rediscovered the theorem a few years later and gave some further applications. The theorem fails badly for nearly all infinitary languages.
The diagram lemma:
If B⁺ is a model of the diagram of A, and B is B⁺ with the new constants removed from the signature, then there is an embedding of A into B. (Namely, if an element of A is named by a new constant c, then map that element to the element of B⁺ named by c.)
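For illustration (our example, using the usual definition of the diagram as the set of atomic and negated atomic sentences true in A when each element of A is named by a new constant): let A have domain {a, b} and one binary predicate symbol P, with just the pair (a,b) in P^A. Then

\[ \mathrm{diag}(A) = \{\, \neg(c_a = c_b),\; P(c_a,c_b),\; \neg P(c_a,c_a),\; \neg P(c_b,c_a),\; \neg P(c_b,c_b) \,\}, \]

and in any model of this set the elements named by c_a and c_b form a copy of A, which gives the embedding promised by the lemma.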
Suppose L and M are first-order languages, L ∪ M is the smallest first-order language containing both L and M, and L ∩ M is the language consisting of all the formulas which are in both L and M. Suppose T is a theory in L, U is a theory in M, and no (L ∩ M)-structure is both a model of T and a model of U. Then there is a sentence φ of L ∩ M which is true in all models of T and false in all models of U. (This sentence φ is called the interpolant.) Moreover every predicate symbol with a positive occurrence in φ has a positive occurrence in some sentence of T and a negative occurrence in some sentence of U, and conversely every predicate symbol with a negative occurrence in φ has a negative occurrence in some sentence of T and a positive occurrence in some sentence of U.

There are several proofs of this theorem, and not all of them are model-theoretic. Without the last sentence, the theorem is known as Craig's interpolation theorem, since William Craig proved this version a few years before Roger Lyndon found the full statement in 1959. As Craig noted at the time, his interpolation theorem gives a neat proof of Evert Beth's definability theorem, which runs as follows.
Suppose that L is a first-order language and M is the first-order
language got by adding to L a new predicate symbol R. Suppose also
that T is a theory in M. We say that T implicitly defines R
if it is false that there are two M-structures which are models of T,
have the same elements and interpret all the symbols of L in the same
way but interpret the symbol R differently. We say that T
defines R explicitly if there is a formula φ(x1,...,xn) of L such that in every model of T, the formulas φ(x1,...,xn) and R(x1,...,xn) are satisfied by exactly the same n-tuples (a1,...,an) of
elements. It is easy to see that if T defines R explicitly then it
defines R implicitly. (This fact is known as Padoa's
method; Padoa used the failure of implicit definability as a way
of proving the failure of explicit definability.) Beth's theorem
is the converse:
Suppose that L, M, R and T are as above. If T defines R implicitly then T defines R explicitly.
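A worked instance of Padoa's method (our example): let L be the language with equality only, let M add a 1-ary predicate symbol R, and let T be {∃x R(x), ∃x ¬R(x)}. On a two-element domain {1, 2} there are two models of T which agree on all of L but interpret R differently; so T does not define R implicitly, and therefore, by the easy direction above, no explicit definition of R is possible:

% two models of T, same L-part, different interpretations of R
\[ R^{A} = \{1\}, \qquad R^{B} = \{2\}. \]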
The omitting types theorem:
Suppose L is a first-order language which has countably many formulas. Suppose T is a complete theory in L, and Φ is a set of formulas of L which all have the free variable x. Finally suppose that every model of T contains an element which satisfies all the formulas in Φ. Then there is a formula ψ(x) of L such that in every model of T there is an element satisfying ψ, and every element that satisfies ψ in any model of T also satisfies all the formulas in Φ. (In other words, there is a finite reason why the type Φ can't be omitted in any model of T.)
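A standard illustration (ours): let T be the complete theory of the natural numbers with 0 and the successor function S, and let Φ(x) be the set of formulas

% the type of a nonstandard element
\[ \Phi(x) = \{\, x \neq \underbrace{S \cdots S}_{n}\,0 \;:\; n \in \mathbb{N} \,\}. \]

No formula ψ(x) can do the job described in the theorem, because any ψ realised in every model of T is realised in the standard model by some standard number n, and that n fails the formula x ≠ S^n 0 in Φ. So by the theorem, read contrapositively, some model of T omits Φ — and indeed the standard model does.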
This theorem, which goes back to the mid 1950s, very definitely depends on the language being first-order and countable. It has several useful generalisations, for example model-theoretic forcing, which is an analogue of the forcing construction in set theory. In fact the games used for model-theoretic forcing (see the entry on logic and games) can be adapted to prove the omitting types theorem too. There are similar but more complicated theorems for uncountable first-order languages; some of these can be paraphrased as omitting types theorems for infinitary languages.
Let T be a theory consisting of strict universal Horn sentences. Then T has a model A with the property that for every model B of T there is a unique homomorphism from A to B. (Such a model A is called an initial model of T. It is unique up to isomorphism.)

This theorem is a generalisation, due to Mal'tsev, of a group-theoretic construction called construction by generators and relations. It is the main idea behind algebraic specification, which is one approach to the specification of systems in computer science. The required behaviour of the system is written down as a set of strict universal Horn sentences, and then the initial model of these sentences is an abstract version of the required system.
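A miniature algebraic specification (our example): take a signature with a constant 0, a 1-ary function symbol s and a 2-ary function symbol +, and let T consist of the equations

% strict universal Horn sentences (here, plain equations)
\[ \forall x \, (x + 0 = x), \qquad \forall x \forall y \, (x + s(y) = s(x + y)). \]

The initial model of T is, up to isomorphism, the natural numbers with zero, successor and addition, built from the closed terms of the signature; the unique homomorphism to any other model B of T sends the term s^n(0) to the corresponding element of B.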
Let A and B be structures of the same signature K. Their product A × B is the structure C of signature K whose domain is the set of ordered pairs (a,b) with a in dom(A) and b in dom(B); the function symbols of K are interpreted coordinatewise, and each predicate symbol P is interpreted so that (a,b) is in P^C if and only if a is in P^A and b is in P^B. The structures A and B are called the factors of A × B. In the same way we can form products of any number of structures. If all the factors of a product are the same structure A, the product is called a power of A. A theorem called the Feferman-Vaught theorem tells us how to work out the complete theory of the product from the complete theories of its factors.
This construction has some variants. We can define an equivalence relation on the domain of a product C, and then take a structure D whose elements are the equivalence classes; the predicate symbols are interpreted so as to make the natural map from dom(C) to dom(D) a homomorphism. In this case the structure D is called a reduced product of the factors of C. It is a reduced power of A if all the factors are equal to A; in this case the diagonal map from A to D is the one got by taking each element a to the equivalence class of the element (a,a,...).
Suppose we use a set I to index the factors in a product C. An ultrafilter over I is a set U of subsets of I with the properties that I is in U but the empty set is not, that any subset of I which includes a set in U is itself in U, that the intersection of two sets in U is in U, and that for every subset X of I, exactly one of X and its complement I − X is in U. Given an ultrafilter U over I, we can form the reduced product of the factors of C in which two elements are made equivalent if and only if the set of indices at which they agree is in U; when all the factors are the same structure A, the result is called an ultrapower of A, in symbols U-prod A.

If U is an ultrafilter then the diagonal map from A to U-prod A is an elementary embedding.

If the ultrafilter U is nonprincipal, i.e. contains no finite sets, then the diagonal map is not onto the domain of U-prod A, and in fact U-prod A is generally much larger than A. So we have a way of constructing large elementary extensions. The axiom of choice guarantees that every infinite set has many nonprincipal ultrafilters over it. Ultrapowers are an essential tool for handling large cardinals in set theory (see the entry on set theory).
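A sketch of the classic use (our example): take I to be the set of natural numbers, A the field R of real numbers and U a nonprincipal ultrafilter over I. In the ultrapower U-prod R, the equivalence class of the sequence (1, 1/2, 1/3, ...) is a positive element smaller than the diagonal image of every positive real r, since the set {n : 1/n < r} is cofinite and hence in U; that is, U-prod R contains infinitesimals:

% a positive infinitesimal in the ultrapower
\[ 0 \;<\; [(1, \tfrac{1}{2}, \tfrac{1}{3}, \dots)] \;<\; r \quad\text{for every standard real } r > 0. \]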
A remarkable (but in practice not terribly useful) theorem of Saharon Shelah tells us that a pair of structures A and B are elementarily equivalent if and only if they have ultrapowers that are isomorphic to each other.
Let B be a structure, X a set of elements of B, and b, c elements of B. We say that b and c have the same type over X if for every formula φ(u, v1,...,vn) of the language and every n-tuple d of elements of X,
B ⊨ φ[b,d] if and only if B ⊨ φ[c,d].
We say that A is saturated if whenever X is a set of elements of A, of cardinality less than that of A, and B is any elementary extension of A, we always have that every element of B has the same type over X as some element already in A.
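Two quick illustrations (ours): the countable ordering (Q,<) is saturated — here X must be finite, a type over X just describes a position relative to finitely many rational points, and density supplies an element of Q in that position. By contrast (R,<) is not saturated: take X to be the set of natural numbers, which has cardinality less than that of R, and take an elementary extension B of (R,<) containing an element b with

\[ n < b \quad\text{for every natural number } n \]

(such a B exists by compactness); no element of R has the same type over X as b, since no real number exceeds every natural number.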
This rather heavy definition gives little clue how useful saturated structures are. If every structure had a saturated elementary extension, many of the results of model theory would be much easier to prove. Unfortunately the existence of saturated elementary extensions depends on features of the surrounding universe of sets. There are technical ways around this obstacle, for example using weakenings of the notion of saturation. We have two main ways of constructing saturated elementary extensions. One is by ultrapowers, using cleverly constructed ultrafilters. The other is by taking unions of elementary chains, generalising the proof we gave for the upward Loewenheim-Skolem theorem.
The existence of partially saturated elementary extensions of the field R of real numbers is the main technical fact behind Abraham Robinson's nonstandard analysis. See Section 4 of the entry on model theory for more information on this. Though model theory provided the first steps in nonstandard analysis, this branch of analysis rapidly became a subject in its own right, and its links with first-order model theory today are rather slim.
Now there is a heuristic principle that many people have used, though
it seems to have no simple formulation. I suggest Few is
beautiful. The principle says that if a first-order theory T
constrains its models (of a particular cardinality) to be all similar
to each other, this can only be because the models of T have few
irregularities and asymmetries. So there should be a good structural
description of these models. One should expect that they are
good structures from the point of view of classical
mathematics. As a first step, one easily sees from the upward and
downward Loewenheim-Skolem theorems that if T has no finite models and all models of T of some fixed cardinality κ are isomorphic (T is κ-categorical), where κ is at least as large as the number of formulas in the language of T, then T must be a complete theory. From now on, T is a complete theory
with infinite models; and for simplicity we shall assume that the
language of T is countable.
In 1954 Jerzy Los announced that he could only find three kinds of example of countable theories T that are categorical in at least one infinite cardinality. Namely:
(1) theories categorical in every infinite cardinality, like the theory of infinite sets with no further structure;
(2) theories categorical in every uncountable cardinality but not in the countable one, like the theory of algebraically closed fields of a fixed characteristic;
(3) theories categorical in the countable cardinality but in no uncountable one, like the theory of dense linear orderings without endpoints.
Implicit in this list is Los's question: must a theory that is categorical in one uncountable cardinality be categorical in every uncountable cardinality?
This question of Los was a tremendous stimulus to research, and it led to a classic paper of Michael Morley in 1965 which showed that Los's three possibilities are in fact the only ones. One central idea of Morley's analysis was that models of an uncountably categorical theory have the smallest possible number of types of element; this led directly to the branch of model theory called stability theory, which studies theories that have a limited number of element types. These theories have the remarkable property that every infinite indiscernible sequence in any of their models is indiscernible under any linear ordering whatever; so these sequences are a kind of generalisation of bases of vector spaces. Another idea implicit in Morley's work, but much clarified by later work of John Baldwin and Alistair Lachlan, was that in any model of an uncountably categorical theory there is a central core (called a strongly minimal set) which carries a dependence relation obeying similar laws to linear dependence in vector spaces. In terms of this dependence relation one can define a dimension for the model, and what remains of the model outside the core is so closely tied to the core that the dimension determines the model up to isomorphism.
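A concrete picture of these ideas (our example, standard in the literature): the theory of infinite vector spaces over the field Q of rationals, in the signature with 0, + and a 1-ary function symbol for multiplication by each rational scalar, is categorical in every uncountable cardinality. Here the whole model is a strongly minimal set, the dependence relation is linear dependence, and the dimension of a model in the sense above is just its linear dimension:

% dimension determines the model up to isomorphism
\[ V \cong W \quad\Longleftrightarrow\quad \dim_{\mathbb{Q}} V = \dim_{\mathbb{Q}} W . \]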
Saharon Shelah developed Morley's ideas with tremendous resourcefulness and energy. His main aim was to stretch the Few is beautiful idea by showing that there are clear dividing lines between kinds of theory T. On one side of a dividing line are theories with some good structural property that forces the number of nonisomorphic models of a given cardinality to be small. On the other side, every theory has (for example) two models of the same cardinality that are not isomorphic but are extremely hard to tell apart. Shelah coined the name classification theory for this research, though the name never caught on. The text of Lascar listed below is an elegant introduction to this whole programme, from Los to Shelah. Meanwhile Shelah himself has extended it far beyond first-order logic. Even in the first-order case, Shelah had to invent new set-theoretic techniques (such as proper forcing) to carry out his constructions.
Partly because of the difficulty of communications between Siberia and the West, these results of Zil'ber took some time to digest, and in part they had to be rediscovered in the West. But when the message did finally get through, the result was a new branch of model theory which has come to be known as geometric model theory. The programme is broadly to classify structures according to (a) what groups or fields are interpretable in them (in the sense sketched in the entry on model theory) and (b) whether or not the structures have modular geometries; and then to use this classification to solve problems in model theory and geometry. Since the mid 1980s the leader of this research has been Ehud Hrushovski. In the early 1990s, using joint work with Zil'ber, Hrushovski gave a model-theoretic proof (the first complete proof to be found) of the geometric Mordell-Lang conjecture in all characteristics; this was a conjecture in classical diophantine geometry. The book edited by Bouscaren in the references below is devoted to Hrushovski's proof and the necessary background in model theory. Both (a) and (b) are fundamental to Hrushovski's argument.
Tarski's description of the complete theory of the field of real numbers implies that this structure has the following property: every set of elements definable by a first-order formula is a finite union of open intervals with named endpoints, together with some finite set of elements. A linearly ordered structure with this property is said to be o-minimal. (The idea of the name is that o-minimality is an analogue of Morley's strong minimality, in a form that makes sense for structures that carry a linear ordering, whence o- for ordering.)
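For instance (our example): in the field R of real numbers with its ordering, the formula v·v < 1+1 defines the set

\[ \{\, r \in \mathbb{R} : r^2 < 2 \,\} \;=\; (-\sqrt{2}, \sqrt{2}), \]

a single open interval — exactly the shape of set that o-minimality permits.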
In 1982 Lou van den Dries showed that the fact that the field of real numbers is o-minimal gives a large amount of useful information about the definable sets of higher dimension, such as the family of definable subsets of the real plane. Soon after this, Julia Knight, Anand Pillay and Charles Steinhorn noticed that if a structure A is o-minimal, then so is any structure elementarily equivalent to A, and that Van den Dries' analysis of higher-dimensional definable sets applies to all these structures. These results led to much activity on the frontier between model theory and function theory. Several old problems from model theory and function theory were solved. Alex Wilkie showed that the field of real numbers with a symbol for exponentiation is o-minimal and has a model-complete complete theory, and thereby gave a positive answer to Tarski's old problem of whether this structure allows a quantifier elimination (though his method was very far from the syntactic analysis that Tarski had in mind). We now know a wide range of ways of adding interesting features to the field of real numbers in such a way that the resulting structure is still o-minimal (and hence in some sense mathematically tractable). Van den Dries has urged that o-minimal structures provide a good setting for developing the tame topology programme of Alexander Grothendieck.
Wilfrid Hodges <W.Hodges@qmw.ac.uk>