# First-order Model Theory

*First published Sat Nov 10, 2001; substantive revision Thu Jan 25, 2024*

First-order model theory, also known as classical model theory, is a branch of mathematics that deals with the relationships between descriptions in first-order languages and the structures that satisfy these descriptions. From one point of view, this is a vibrant area of mathematical research that brings logical methods (in particular the theory of definition) to bear on deep problems of classical mathematics. From another point of view, first-order model theory is the paradigm for the rest of model theory; it is the area in which many of the broader ideas of model theory were first worked out.

- 1. First-order languages and structures
- 2. Elementary maps
- 3. Five grand theorems
- 4. Three useful constructions
- 5. Three successful programmes
- Bibliography
- Academic Tools
- Other Internet Resources
- Related Entries

## 1. First-order languages and structures

Mathematical model theory carries a heavy load of notation, and HTML
is not the best container for it. In what follows, syntactic objects
(languages, theories, sentences) are generally written in roman or
greek letters (for example L, T, φ), and set-theoretic objects
such as structures and their elements are written in italic
(*A*, *a*). Two exceptions are that variables are italic
(*x*, *y*) and that sequences of elements are written
with lower case roman letters (a, b).

We recall and refine some definitions from the entries on
classical logic
and
model theory.
A *signature* is a set of individual constants, predicate
symbols and function symbols; each of the predicate symbols and
function symbols has an *arity* (for example it is binary if
its arity is 2). Each signature K gives rise to a first-order
language, by building up formulas from the symbols in the signature
together with logical symbols (including =) and punctuation.

If K is a signature, then a *structure of signature* K, say
*A*, consists of the following items:

- A set called the
*domain*of*A*and written dom(*A*); it is usually assumed to be nonempty; - for each individual constant
*c*in K, an element*c*of dom(^{A}*A*); - for each predicate symbol
*P*of arity*n*, an*n*-ary relation*P*on dom(^{A}*A*); - for each function symbol
*F*of arity*n*, an*n*-ary function*F*from dom(^{A}*A*) to dom(*A*).

The *elements* of *A* are the elements of
dom(*A*). Likewise the *cardinality* or *power*
of *A* is the cardinality of its domain. Since we can recover
the signature K from the first-order language L that it generates, we
can and will refer to structures of signature K as
L-*structures*. We think of *c* as a name for the
element *c*^{A} in the structure *A*,
and likewise with the other symbols.

For example the field of real numbers forms a structure
* R* whose elements are the real numbers, with
signature consisting of the individual constant 0 to name the number
zero, a 1-ary function symbol − for minus, and two 2-ary
function symbols + and . for plus and times. At first sight we
can’t add a symbol to express 1/

*x*, since all the named functions have to be defined on the whole domain of the structure, and there is no such real number as 1/0. But on second thoughts this is not a serious problem; any competent mathematician puts the condition ‘

*x*is not zero’ before dividing by

*x*, and so it never matters what the value of 1/0 is, and we can harmlessly take it to be 42. But most model theorists are uncomfortable with any kind of division by zero, so they stick with plus, times and minus.

If L is the first-order language of signature K, then
Tarski’s model-theoretic truth definition
tells us when a sentence of L is true in *A*, and when an
assignment of elements of *A* to variables satisfies a formula
of L in *A*. Instead of talking of assignments satisfying a
formula, model theorists often speak of the set of *n*-tuples
of elements of *A* that is *defined* by a formula
φ(*v*_{1},…,*v*_{n});
the connection is that an *n*-tuple
(*a*_{1},…,*a*_{n}) is
in the defined set if and only if the assignment taking each
*v*_{i} to *a*_{i}
satisfies the formula.

If φ is a sentence, we write

A⊨ φ

to mean that φ is true in *A*, or in other words,
*A* is a model of φ. If
φ(*v*_{1},…,*v*_{n})
is a formula with free variables as shown, we write

A⊨ φ[a]

to mean that the *n*-tuple a is in the set defined by φ.
(The entry on
classical logic
uses the notation ‘*A,s* ⊨ φ’, where
*s* is any assignment to all the variables of L that assigns to
each variable *v*_{i} free in φ the
*i*-th element in the *n*-tuple a.)

Two L-structures that are models of exactly the same sentences of L
are said to be *elementarily equivalent*. Elementary
equivalence is an equivalence relation on the class of all
L-structures. The set of all the sentences of L that are true in the
L-structure *A* is called the *complete theory* of
*A*, in symbols Th(*A*). A theory that is Th(*A*)
for some structure *A* is said to be *complete*. (By the
completeness theorem for first-order logic, for which see the entry on
classical logic,
a theory is complete if and only if it is maximal syntactically
consistent.) The two structures *A* and *B* are
elementarily equivalent if and only if Th(*A*) =
Th(*B*).

To continue the example of the field * R* of
real numbers: It is often not at all obvious whether two given
structures are or are not elementarily equivalent. One of the greatest
achievements of the pre-history of model theory was Tarski’s
description in 1930 of Th(

*) (which he published in full only after the war; see his book in the Bibliography below). This description implied among other things that the structures elementarily equivalent to*

**R***are exactly the real-closed fields, a class of fields which was already known to the algebraists in its own right.*

**R**
When mathematicians introduce a class of structures, they like to
define what they count as the basic maps between these structures. The
basic maps between structures of the same signature K are called
*homomorphisms*, defined as follows. A *homomorphism from
structure A to structure B* is a function *f* from
dom(*A*) to dom(*B*) with the property that for every
atomic formula
φ(*v*_{1},…,*v*_{n})
and any *n*-tuple a =
(*a*_{1},…,*a*_{n}) of
elements of *A*,

A⊨ φ [a] ⇒B⊨ φ [b]

where b is
(*f*(*a*_{1}),…,*f*(*a*_{n})).
If we have ‘⇔’ in place of ‘⇒’ in
the quoted condition, we say that *f* is an *embedding*
of *A* into *B*. Since the language includes =, an
embedding of *A* into *B* is always one-to-one, though
it need not be onto the domain of *B*. If it is onto, then the
inverse map from dom(*B*) to dom(*A*) is also a
homomorphism, and both the embedding and its inverse are said to be
*isomorphisms*. We say that two structures are
*isomorphic* if there is an isomorphism from one to the other.
Isomorphism is an equivalence relation on the class of all structures
of a fixed signature K. If two structures are isomorphic then they
share all model-theoretic properties; in particular they are
elementarily equivalent.

If *A* and *B* are structures of signature K with
dom(*A*) a subset of dom(*B*), and the interpretations
in *A* of the symbols in K are just the restrictions of their
interpretations in *B*, then we say that *A* is a
*substructure* of *B* and conversely *B* is an
*extension* of *A*. If moreover *B* has some
elements that are not in *A*, we say that *A* is a
*proper substructure* of *B* and *B* is an
*proper extension* of *A*. If *B* is a structure
and *X* is a nonempty subset of dom(*B*), then there is
a unique smallest substructure of *B* whose domain contains all
of *X*. It is known as the *substructure of B generated by
X*, and we find it by first adding to *X* all the elements
*c ^{B}* where

*c*are individual constants of K, and then closing off under the functions

*F*where

^{B}*F*are function symbols of K.

For example the substructure of the field * R*
generated by the number 1 consists of 1, 0 (since it is named by the
constant 0), 1+1, 1+1+1 etc., −1, −2 etc., in other words
the ring of integers. (There is no need to close off under
multiplication too, since the set of integers is already closed under
multiplication.) If we had included a symbol for 1/

*x*too, the substructure generated by 1 would have been the field of rational numbers. So the notion of substructure is sensitive to the choice of signature.

## 2. Elementary maps

Let L be a first-order language and let *A* and *B* be
L-structures. Suppose *e* is a function which takes some
elements of *A* to elements of *B*. We say that
*e* is an *elementary map* if whenever a sequence of
elements *a*_{1}, …,
*a*_{n} in the domain of *e* satisfy a
formula
φ(*x*_{1},…,*x*_{n})
of L in *A*, their images under *e* satisfy the same
formula in *B*; in symbols

A⊨ φ(a_{1},…,a_{n}) ⇒B⊨ φ(e(a_{1}),…,e(a_{n})).

We say that *e* is an *elementary embedding* of
*A* into *B* if *e* is an elementary map and its
domain is the whole domain of *A*. As the name implies,
elementary embeddings are always embeddings.

If there is an elementary embedding from *A* to *B* then
*A* and *B* are elementarily equivalent. On the other
hand an embedding between elementarily equivalent structures, or even
between isomorphic structures, need not be elementary. (For example,
writing * Z* for the abelian group of the
integers with signature consisting of 0 and +, the embedding from

*to*

**Z***that takes each integer*

**Z***n*to 2

*n*is an embedding, and of course

*is isomorphic to itself; but this embedding is not elementary, since 1 satisfies the formula ¬∃*

**Z***y*(

*y*+

*y*=

*v*

_{1}), but 2 doesn’t.)

We say that *A* is an *elementary substructure* of
*B*, and *B* is an *elementary extension* of
*A*, if *A* is a substructure of *B* and the
inclusion map is an elementary embedding. It’s immediate from
the definitions that an elementary extension of an elementary
extension of *A* is again an elementary extension of
*A*.

Elementary embeddings are natural maps to consider within first-order
model theory. Around 1950 Abraham Robinson was impressed that maps
between algebraic structures in general seem hardly ever to be
elementary, whereas some important maps (such as embeddings between
two algebraically closed fields, or between two real-closed fields)
turn out to be elementary. He was also surprised to find that this
fact about algebraically closed fields is another way of stating a
celebrated theorem called the Hilbert Nullstellensatz. These
observations of Robinson have had a huge effect on the development of
model theory. In Robinson’s terminology, a first-order theory is
*model-complete* if every embedding between models of the
theory is elementary. This notion has found many uses, and it often
appears in applications of model theory in algebra.

A notion that is closely related to model-completeness, but should not
be confused with it, is elimination of quantifiers. Suppose L is a
first-order language, T is a theory in L, and Φ is a set of
formulas of L. We say that T has *elimination of quantifiers down
to* Φ if for every formula
φ(*x*_{1},…,*x*_{n})
of L there is a formula
ψ(*x*_{1},…,*x*_{n})
in Φ such that in every model of T, φ and ψ are satisfied
by exactly the same *n*-tuples of elements
(*a*_{1},…,*a*_{n}).
(The ‘method of elimination of quantifiers’ discussed in
section 2.2 of
Tarski’s truth definitions)
was a syntactic and pre-model-theoretic method for proving
elimination of quantifiers down to a particular set of formulas.) A
theory is said to *have quantifier elimination* if it has
elimination of quantifiers down to quantifier-free formulas.

The connection between model-completeness and elimination of quantifiers is as follows. Robinson showed that a theory is model-complete if and only if it has elimination of quantifiers down to existential formulas (i.e. formulas that either are quantifier-free or consist of one or more existential quantifiers followed by a quantifier-free formula). So theories that have quantifier elimination are model-complete, but the converse need not hold. Still, showing that a theory is model-complete is sometimes a useful first step towards showing that it has quantifier elimination.

To return to elementary embeddings: They have a number of properties that make them useful. We have space for four.

The downward Löwenheim-Skolem theorem:

Suppose L is a first-order language which has κ formulas,Ais an L-structure and λ is a cardinal which is at least κ but less than the cardinality ofA. Suppose also thatXis a set of at most λ elements ofA. ThenAhas an elementary substructure which has cardinality exactly λ and contains all the elements inX.

There is a proof of this in the entry on classical logic, using Skolem hulls. Note that λ must be infinite since every first-order language has infinitely many formulas.

The elementary chain theorem:

Suppose that L is a first-order language andA_{0},A_{1}, … is a sequence (of any length) of L-structures such that any structure in the sequence is an elementary substructure of all the later structures in the sequence. Then there is a unique smallest L-structureBwhich contains all the structures in the sequence as substructures; this structureBis an elementary extension of all the structures in the sequence.

The elementary amalgamation theorem:

Suppose L is a first-order language,Ais an L-structure andB,Care two elementary extensions ofA. Then there are an elementary extensionDofBand an elementary embeddingeofCintoDsuch that (i) for each elementaofA,e(a)=a, and (ii) ifcis an element ofCbut not ofA, thene(c)is not inB.

The elementary amalgamation theorem is a consequence of the compactness theorem in the next section.

The upward Löwenheim-Skolem theorem:

Suppose L is a first-order language which has κ formulas,Ais an L-structure whose cardinality is an infinite cardinal μ, and λ is a cardinal which is at least as great as both κ and μ. ThenAhas an elementary extension whose cardinality is λ.

This also follows from the compactness theorem, as shown in the entry on classical logic. The name of the theorem is a little unfortunate, since the theorem was first proved by Tarski, and Skolem didn’t even believe it (because he didn’t believe in uncountable cardinals).

There is another proof using the elementary amalgamation theorem and
the elementary chain theorem. One can show that the structure
*A* has a proper elementary extension *A*′. (There
is a proof of this using the compactness theorem and the diagram lemma
— see 3.1 and 3.2 below; another proof is by ultrapowers —
see 4.1 below.) Now use *A*′ and again *A*′
for the structures *B* and *C* in the elementary
amalgamation theorem. Then *D* as in the theorem is an
elementary extension of *A*, and by (ii) in the theorem, it
must contain elements that are not in *A*′, so that it is
a proper elementary extension. Repeat to get a proper elementary
extension of *D*, and so on until you have an infinite
elementary chain. Use the elementary chain theorem to find an
elementary extension of *A* that sits on top of this chain.
Keep repeating these moves until you have an elementary extension of
*A* that has cardinality at least λ. Then if necessary
use the downward Löwenheim-Skolem theorem to pull the cardinality
down to exactly λ. This kind of argument is very common in
first-order model theory. By careful choice of the amalgams at the
steps in the construction, we can often ensure that the top structure
has further properties that we might want (such as saturation, see 4.2
below).

## 3. Five grand theorems

The five theorems reported in this section are in some sense the pillars of classical model theory. All of them are theorems about first-order model theory. A great deal of the work done in the third quarter of the twentieth century was devoted to working out the consequences of these theorems within first-order model theory, and the extent to which similar theorems hold for languages that are not first-order.

### 3.1 The compactness theorem

If T is a first-order theory, and every finite subset of T has a model, then T has a model.

There is a proof of this theorem in the entry on classical logic. The theorem has several useful paraphrases. For example it is equivalent to the following statement:

Suppose T is a first-order theory and φ is a first-order sentence. If T ⊨ φ then there is a finite subset U of T such that U ⊨ φ.

(See the entry on model theory for the notion ⊨ of model-theoretic consequence. To derive the second statement from the first, note that ‘T ⊨ φ’ is true if and only if there is no model of the theory T ∪ {¬ φ}.)

Anatolii Mal’tsev first gave the compactness theorem in 1938 (for first-order logic of any signature), and used it in 1940/1 to prove several theorems about groups; this seems to have been the first application of model theory within classical mathematics. Leon Henkin and Abraham Robinson independently rediscovered the theorem a few years later and gave some further applications. The theorem fails badly for nearly all infinitary languages.

### 3.2 The diagram lemma

If *A* is an L-structure, then we form the *diagram* of
*A* as follows. First add to L a supply of new individual
constants to serve as names for all the elements of *A*. (This
illustrates how in first-order model theory we easily find ourselves
using uncountable signatures. The ‘symbols’ in these
signatures are abstract set-theoretic objects, not marks on a page.)
Then using L and these new constants, the *diagram* of
*A* is the set of all the atomic sentences and negations of
atomic sentences that are true in *A*.

IfB′ is a model of the diagram ofA, andBisB′ with the new constants removed from the signature, then there is an embedding ofAintoB.

Namely, if an element of *A* is named by a new constant
*c*, then map that element to the element of *B*′
named *c*. A variant of this lemma is used in the proof of the
elementary amalgamation theorem.

### 3.3 The Lyndon interpolation theorem

This theorem may have the longest pedigree of any theorem of model
theory, since it generalises the Laws of Distribution for syllogisms,
which go back at least to the early Renaissance. The theorem is
easiest to state if we assume that our first-order languages have
symbols ∧, ∨ and ¬, but not → or ⇔. Then an
occurrence of a predicate symbol R in a formula φ is said to be
*positive* (resp. *negative*) if it lies within the
scope of an even (resp. odd) number of occurrences of ¬.

Suppose L and M are first-order languages, L ∪ M is the smallest first-order language containing both L and M, and L ∩ M is the language consisting of all the formulas which are in both L and M. Suppose T is a theory in L, U is a theory in M, and no (L ∪ M)-structure is both a model of T and a model of U. Then there is a sentence φ of L ∩ M which is true in all models of T and false in all models of U. (This sentence φ is called theinterpolant.) Moreover every predicate symbol with a positive occurrence in φ has a positive occurrence in some sentence of T and a negative occurrence in some sentence of U, and conversely every predicate symbol with a negative occurrence in φ has a negative occurrence in some sentence of T and a positive occurrence in some sentence of U.

There are several proofs of this theorem, and not all of them are model-theoretic. Without the last sentence, the theorem is known as Craig’s interpolation theorem, since William Craig proved this version a few years before Roger Lyndon found the full statement in 1959. As Craig noted at the time, his interpolation theorem gives a neat proof of Evert Beth’s definability theorem, which runs as follows.

Suppose that L is a first-order language and M is the first-order
language got by adding to L a new predicate symbol R. Suppose also
that T is a theory in M. We say that T *implicitly defines* R
if it is false that there are two M-structures which are models of T,
have the same elements and interpret all the symbols of L in the same
way but interpret the symbol R differently. We say that T
*defines* R *explicitly* if there is a formula
φ(*x*_{1},…,*x*_{n})
of L such that in every model of T, the formulas φ and
R(*x*_{1},…,*x*_{n}) are
satisfied by exactly the same *n*-tuples
(*a*_{1},…,*a*_{n}) of
elements. It is easy to see that if T defines R explicitly then it
defines R implicitly. (This fact is known as *Padoa’s
method*; Padoa used the failure of implicit definability as a way
of proving the failure of explicit definability.) Beth’s theorem
is the converse:

Suppose that L, M, R and T are as above. If T defines R implicitly then T defines R explicitly.

### 3.4 The omitting types theorem

This theorem needs some preliminary definitions. Suppose L is a first-order language, T is a complete theory in L, and Φ is a set of formulas of L which all have the free variable*x*(and no other free variable). We say that an L-structure

*A*

*realises*Φ if there is an element of

*A*that satisfies all the formulas in Φ; if

*A*has no such element, we say that

*A*

*omits*Φ. A formula ψ(

*x*) of L, with just the free variable

*x*, is called a

*support*of Φ in T if the consequences of T include the sentence ∃

*x*ψ(

*x*) and the sentence ∀

*x*(ψ(

*x*) → φ(

*x*)) whenever φ(

*x*) is a formula in Φ. It is not hard to see that if there is a support of Φ in T then every model of T realises Φ. The converse is not always true, but the omitting types theorem tells us that it is true if we restrict ourselves to countable first-order languages:

Suppose L is a first-order language which has countably many formulas. Suppose T is a complete theory in L, and Φ is a set of formulas of L which all have the free variablex. Finally suppose that every model of T with just countably many elements realises Φ. Then there is a support of Φ in T. (In other words, there is a finite reason why the ‘type’ Φ can’t be omitted in any model of T.)

The omitting types theorem goes back to the mid 1950s. It very
definitely depends on the language being first-order and countable. It
has several useful generalisations, for example *model-theoretic
forcing*, which is an analogue of the forcing construction in
set theory.
In fact the games used for model-theoretic forcing (see the entry on
logic and games)
can be adapted to prove the omitting types theorem too. There are
similar but more complicated theorems for uncountable first-order
languages; some of these can be paraphrased as omitting types theorems
for
infinitary languages.

### 3.5 The initial model theorem

A quantifier-free formula is said to be a *Horn formula* (after
Alfred Horn) if it has one of the three forms

- ψ,
- φ
_{1}∧ … ∧ φ_{n}→ ψ, - ¬(φ
_{1}∧ … ∧ φ_{n}),

where the formulas φ_{1}, …,
φ_{n}, ψ are all atomic. A *universal Horn
sentence* (also known to the computer scientists as a *Horn
clause*) is a sentence that consists of universal quantifiers
followed by a quantifier-free Horn formula; it is said to be
*strict* if no negation sign occurs in it (i.e. if it
doesn’t come from a quantifier-free Horn formula of the third
kind).

Let T be a theory consisting of strict universal Horn sentences. Then T has a modelAwith the property that for every modelBof T there is a unique homomorphism fromAtoB. (Such a modelAis called aninitial modelof T. It is unique up to isomorphism.)

This theorem is a generalisation, due to Mal’tsev, of a
group-theoretic construction called *construction by generators and
relations*. It is the main idea behind *algebraic
specification*, which is one approach to the specification of
systems in computer science. The required behaviour of the system is
written down as a set of strict universal Horn sentences, and then the
initial model of these sentences is an abstract version of the
required system.

## 4. Three useful constructions

A construction is a procedure for building a structure. We have already seen several constructions in the theorems above: for example the omitting types construction and the initial model construction. Here are three more.

### 4.1 Products and reduced products

If *A* and *B* are L-structures, we form their
*product* *C* = *A* × *B* as
follows. The elements of *C* are the ordered pairs
(*a*,*b*) where *a* is an element of *A*
and *b* is an element of *B*. The predicate symbols are
interpreted ‘pointwise’, i.e. so that for example

(a,b)is in P^{C}if and only ifais in P^{A}andbis in P^{B}.

The structures *A* and *B* are called the
*factors* of *A* × *B*. In the same way we
can form products of any number of structures. If all the factors of a
product are the same structure *A*, the product is called a
*power* of *A*. A theorem called the *Feferman-Vaught
theorem* tells us how to work out the complete theory of the
product from the complete theories of its factors.

This construction has some variants. We can define an equivalence
relation on the domain of a product *C*, and then take a
structure *D* whose elements are the equivalence classes; the
predicate symbols are interpreted in *D* so as to make the
natural map from dom(*C*) to dom(*D*) a homomorphism. In
this case the structure *D* is called a *reduced
product* of the factors of *C*. It is a *reduced
power* of *A* if all the factors are equal to *A*;
in this case the *diagonal map* from *A* to *D*
is the one got by taking each element *a* to the equivalence
class of the element (*a*,*a*,…).

Suppose we use a set *I* to index the factors in a product
*C*. An *ultrafilter* over *I* is a set
*U* of subsets of *I* with the properties

- if sets
*X*and*Y*are in*U*then so is their intersection*X ∩ Y*; - if
*X*is in*U*and*X ⊆ Y ⊆ I*then*Y*is in*U*; - for each subset
*X*of*I*, exactly one of*X*and its complement*I*\ X is in*U*.

If we have an ultrafilter *U* over *I*, then we can
construct a reduced product from *C* by making two elements of
*C* equivalent if and only if the set of indices at which they
are equal is a set in the ultrafilter *U*. This is indeed an
equivalence relation on the domain of *C*, and the resulting
reduced product is called an *ultraproduct* of the factors of
*C*. If *C* is a power of *A* then this
ultraproduct is called an *ultrapower* of *A*, and it is
sometimes written *U*-prod *A*. A theorem called
*Łoś’s theorem* describes what sentences are
true in an ultraproduct. Its most useful consequence is the
following:

IfUis an ultrafilter then the diagonal map fromAtoU-prodAis an elementary embedding.

If the ultrafilter *U* is *nonprincipal*, i.e. contains
no finite sets, then the diagonal map is not onto the domain of
*U*-prod *A*, and in fact *U*-prod *A* is
generally much larger than *A*. So we have a way of
constructing large elementary extensions. The axiom of choice
guarantees that every infinite set has many nonprincipal ultrafilters
over it. Ultrapowers are an essential tool for handling large
cardinals in set theory (see the entry on
set theory).

A remarkable (but in practice not terribly useful) theorem of Saharon
Shelah tells us that a pair of structures *A* and *B*
are elementarily equivalent if and only if they have ultrapowers that
are isomorphic to each other.

### 4.2 Saturation

Suppose *A* is an L-structure, *X* is a set of elements
of *A*, *B* is an elementary extension of *A* and
*b*, *c* are two elements of *B*. Then *b*
and *c* are said to have *the same type over X* if for
every formula
φ(*v*_{1},…,*v*_{n+1})
of L and every *n*-tuple d of elements of *X*,

B⊨ φ[b,d] ⇔B⊨ φ[c,d].

We say that *A* is *saturated* if whenever *X* is
a set of elements of *A*, of cardinality less than that of
*A*, and *B* is any elementary extension of *A*,
we always have that every element of *B* has the same type over
*X* as some element already in *A*.

This rather heavy definition gives little clue how useful saturated structures are. If every structure had a saturated elementary extension, many of the results of model theory would be much easier to prove. Unfortunately the existence of saturated elementary extensions depends on features of the surrounding universe of sets. There are technical ways around this obstacle, for example using weakenings of the notion of saturation. We have two main ways of constructing elementary extensions with some degree of saturation. One is by ultrapowers, using cleverly constructed ultrafilters. The other is by taking unions of elementary chains, generalising the proof we gave for the upward Löwenheim-Skolem theorem.

The existence of partially saturated elementary extensions of the
field * R* of real numbers is the main
technical fact behind Abraham Robinson’s

*nonstandard analysis*. See Section 4 of the entry on model theory for more information on this. Though model theory provided the first steps in nonstandard analysis, this branch of analysis rapidly became a subject in its own right, and its links with first-order model theory today are rather slim.

### 4.3 Ehrenfeucht-Mostowski models

Let *A* be an L-structure, *X* a set of elements of
*A* and < a linear ordering of *X* (not necessarily
definable by a first-order formula). We say that (*X*,<) is
an *indiscernible sequence* in *A* if for every natural
number *n*, and all elements
*a*_{1},…,*a*_{n},*b*_{
1},…,*b* _{n} of *A* such
that *a*_{1} < … <
*a*_{n} and *b*_{1} <
… < *b*_{n}, the map taking each
*a*_{i} to the corresponding
*b*_{i} is an elementary map. If T is a theory
with infinite models, then T has models that are the Skolem hulls (see
the entry on
classical logic)
of indiscernible sequences. These models are known as
*Ehrenfeucht-Mostowski models*, after the two Polish model
theorists who first carried out this construction in the mid 1950s.
These models tend to be the opposite of saturated; we can arrange that
very few types over sets of elements are represented among their
elements. Some important distinctions between different models of set
theory can be expressed in terms of the indiscernible sequences within
these models; see the entry on
set theory.

## 5. Three successful programmes

Every healthy branch of mathematics needs a set of problems that form a serious challenge for its researchers. We close with a brief introduction to some of the research programmes that drove first-order model theory forwards in the second half of the twentieth century. The book of Marcja and Toffalori in the bibliography gives further information about these programmes. There are other current programmes besides these; see for example the handbook edited by Yuri Ershov, which is about model theory when the structures are built recursively.

### 5.1. Categoricity and classification

In 1904 Oswald Veblen described a theory as *categorical* if it
has just one model up to isomorphism, i.e. it has a model and all its
models are isomorphic to each other. (The name was suggested to him by
John Dewey, who also suggested the name ‘disjunctive’ for
other theories. This pair of terms come from traditional logic as
names of types of sentence.) The depressing news is that there are no
categorical first-order theories with infinite models; we can see this
at once from the upward Löwenheim-Skolem theorem. In fact if T is
a first-order theory with infinite models, then the strongest kind of
categoricity we can hope for in T is that for certain infinite
cardinals κ, T has exactly one model of cardinality κ, up
to isomorphism. This property of T is called
κ-*categoricity*.

Now there is a heuristic principle that many people have used, though it seems to have no simple formulation. We suggest ‘Few is beautiful’. The principle says that if a first-order theory T constrains its models (of a particular cardinality) to be all similar to each other, this can only be because the models of T have few irregularities and asymmetries. So there should be a good structural description of these models. One should expect that they are ‘good structures’ from the point of view of classical mathematics. As a first step, one easily sees from the upward and downward Löwenheim-Skolem theorems that if T is κ-categorical for some κ at least as large as the number of formulas in the language of T, then T must be a complete theory. From now on, T is a complete theory with infinite models; and for simplicity we will assume that the language of T is countable.

In 1954 Jerzy Łoś announced that he could only find three kinds of example of theories T that are κ-categorical. Namely:

- T is
*totally categorical*if it is κ-categorical for every infinite cardinal κ. A typical example is the complete theory of an infinite-dimensional vector space over a finite field. - T is
*uncountably categorical*(but not totally categorical) if it is κ-categorical precisely when κ is uncountable. Essentially the only example that Łoś could find was the complete theory of an algebraically closed field; this is uncountably categorical by a well-known theorem of Steinitz. - T is
*countably categorical*(but not uncountably categorical) if it is κ-categorical precisely when κ is countable. A typical example is the complete theory of a dense linear ordering with no first or last element; this is countably categorical by a well-known theorem of Cantor.

Łoś asked whether there are any other possibilities besides these three. (Of course most complete theories are not κ-categorical for any κ.)

This question of Łoś was a tremendous stimulus to research,
and it led to a classic paper of Michael Morley in 1965 which showed
that Łoś’s three possibilities are in fact the only
ones. One central idea of Morley’s analysis was that models of
an uncountably categorical theory have the smallest possible number of
types of element; this led directly to the branch of model theory
called *stability theory*, which studies theories that have a
limited number of element types. These theories have the remarkable
property that every infinite indiscernible sequence in any of their
models is indiscernible under any linear ordering whatever; so these
sequences are a kind of generalisation of bases of vector spaces.
Another idea implicit in Morley’s work, but much clarified by
later work of William Marsh, John Baldwin and Alistair Lachlan, was
that in any model of an uncountably categorical theory there is a
central core (called a *strongly minimal set*) which carries a
dependence relation obeying similar laws to linear dependence in
vector spaces. In terms of this dependence relation one can define a
dimension for the model, and what remains of the model outside the
core is so closely tied to the core that the dimension determines the
model up to isomorphism.

Saharon Shelah developed Morley’s ideas with great
resourcefulness and energy. His main aim was to stretch the ‘Few
is beautiful’ idea by showing that there are clear dividing
lines between kinds of theory T. On one side of a dividing line are
theories with some good structural property that forces the number of
nonisomorphic models of a given cardinality to be small. On the other
side, every theory has (for example) two models of the same
cardinality that are not isomorphic but are extremely hard to tell
apart. Shelah coined the name *classification theory* for this
research. The text of Lascar listed below is an elegant introduction
to this whole programme, from Łoś to Shelah. Meanwhile
Shelah himself has extended it far beyond first-order logic. Even in
the first-order case, Shelah had to invent new set-theoretic
techniques (such as proper forcing) to carry out his
constructions.

Classification theory has two related though fundamentally distinct
aims: to classify the models of a theory (or to show that such a
classification is impossible) by some relatively simple combinatorial
invariants and to classify theories themselves by the presence or
absence of some fundamental structures within their models. With the
second edition of Classification Theory, Shelah dropped the subtitle
“And the number of non-isomorphic models” in order to
emphasize the broader goals of the project. In general, a class of
theories may be recognized as classification theoretically robust if
it admits characterizations both in terms of cardinal invariants and
with respect to some absolute condition on formulae relative to the
theory. For example, by definition, a theory T in a language L is
stable if there is some cardinal κ ≥ |L| so that for every
model M of T of cardinality at most κ, the number of 1-types
over M is at most κ. Equivalently, a theory T is stable if no
formula has the *order property* relative to T. That is, there
is no L-formula φ(*x*;*y*) (where *x* and
*y* may be tuples of variables) so that for each natural number
n it is consistent with T that there be *a*_{1},
…, *a*_{n}; *b*_{1},
…, *b*_{n} so that
φ(*a*_{i},*b*_{j})
holds just in case *i* ≤ *j*.

### 5.2. Geometric model theory

Geometric model theory grew out of Michael Morley’s 1965 paper,
but in a different direction from Shelah’s work (though today it
makes regular use of technical tools developed by Shelah in his
classification programme). Morley had shown that models of an
uncountably categorical theory have structural properties that are
interesting in their own right, regardless of the complete theory of
the structure; so it became the custom to talk of *uncountably
categorical structures*, meaning models of uncountably categorical
theories. (And likewise *totally categorical structures*.)
Independently Boris Zilber in Siberia and Greg Cherlin in the United
States noticed that any infinite group that is definable in an
uncountably categorical structure must have many features in common
with the algebraic groups studied by algebraic geometers. Zilber in
particular showed that many methods from algebraic geometry generalise
to the model-theoretic case. His secret weapon was Bezout’s
Theorem from geometry, which he used to guide him to solutions of very
difficult model theoretic problems; for example his theorem that no
totally categorical theory can be axiomatised by a finite number of
axioms. (It was secret in the sense that it guided his intuition but
never appeared explicitly in his results.) Zilber also noticed an
important difference between the first and second of
Łoś’s examples above. Namely, in a vector space the
subspaces (i.e. the subsets closed under linear dependence) form a
modular lattice; but the algebraically closed subsets of an
algebraically closed field form a lattice that is not modular.

Partly because of the difficulty of communications between Siberia and
the West, these results of Zilber took some time to digest, and in
part they had to be rediscovered in the West. But when the message did
finally get through, the result was a new branch of model theory which
has come to be known as *geometric model theory*. The programme
is broadly to classify structures according to (a) what groups or
fields are interpretable in them (in the sense sketched in the entry
on
model theory)
and (b) whether or not the structures have ‘modular
geometries’; and then to use this classification to solve
problems in model theory and geometry. From the mid 1980s the leader
of this research was Ehud Hrushovski. In the early 1990s, using joint
work with Zilber, Hrushovski gave a model-theoretic proof (the first
complete proof to be found) of the geometric Mordell-Lang conjecture
in all characteristics; this was a conjecture in classical diophantine
geometry. Bouscaren (ed.) 1998 is devoted to Hrushovski’s proof
and the necessary background in model theory. Both (a) and (b) are
fundamental to Hrushovski’s argument.

### 5.3. O-minimality

Of the three programmes described here, this is the oldest, since it
grew out of Tarski’s description of the complete theory of the
field of real numbers (which he proved by the method of quantifier
elimination). In the course of giving this description, Tarski had
shown that every first-order formula φ(*x*) in the relevant
language, possibly with parameters, is satisfied by exactly the same
assignments as some boolean combination of formulas of the form *x
< s* or *t < x* where *s*, *t* are
constant terms naming parameters. Another way of saying this is
that

Every set of elements definable by a first-order formula is a finite union of open intervals with named endpoints, together with some finite set of elements.

A linearly ordered structure with this property is said to be
*o-minimal*. (The idea of the name is that o-minimality is an
analogue of ‘strong minimality’, in a form that makes
sense for structures that carry a linear ordering, whence
‘o-’ for ordering.)

In 1982 Lou van den Dries showed that the fact that the field of real
numbers is o-minimal gives a large amount of useful information about
the definable sets of higher dimension, such as the family of
definable subsets of the real plane. Soon after this, Julia Knight,
Anand Pillay and Charles Steinhorn noticed that if a structure
*A* is o-minimal, then so is any structure elementarily
equivalent to *A*, and that van den Dries’ analysis of
higher-dimensional definable set applies to all these structures.
These results led to much activity on the frontier between model
theory and function theory. Several old problems from model theory and
function theory were solved. Alex Wilkie showed that the field of real
numbers with a symbol for exponentiation is o-minimal and has a
model-complete complete theory, and thereby gave a positive answer to
Tarski’s old problem of whether this structure allows a
quantifier elimination, though his method was very far from the
syntactic analysis that Tarski had in mind. (This is one case where we
need to remember the difference between being model-complete and
having quantifier elimination; see section 2 above. The question
whether this particular theory has quantifier elimination is much
harder and is closely related to a deep conjecture of number theory
called Schanuel’s conjecture; see the paper of Macintyre and
Wilkie.) We now know a wide range of ways of adding interesting
features to the field of real numbers in such a way that the resulting
structure is still o-minimal (and hence in some sense mathematically
tractable). Van den Dries has urged that o-minimal structures provide
a good setting for developing the ‘tame topology’
programme of Alexander Grothendieck.

In 2006, Jonathan Pila and Alex Wilkie showed that provided one
removes the subsets defined using only polynomial inequalities,
subsets of *R*^{n} definable
in an o-minimal expansion of the real field have few rational points.
Subsequently, following a strategy first employed by Pila and Umberto
Zannier to reprove the Manin-Mumford conjecture various authors have
used this o-minimal counting theorem to solve some important open
problems in diophantine geometry.

Kobi Peterzil and Sergei Starchenko have developed a theory of
o-minimal *complex* analysis. Just as with the classical
approach to complex analysis, one may interpret the complex numbers as
the set of ordered pairs of real numbers with addition and
multiplication defined by the usual rules involving their real and
imaginary parts. Of their results in this area, their algebraicity
theorem, which asserts that if a subset of
*C*^{n} is complex analytic
(meaning that it is closed and is locally defined by the vanishing of
finitely many complex analytic functions) and definable in some
o-minimal expansion of the real field, then it must be algebraic, that
is defined by the vanishing of polynomial equations, is the most
striking result and has been strong consequences in the study of
functional transcendence and homogeneous dynamics.

All three of these programmes generated new techniques for proof,
constructions and classifications. As we should expect, researchers
have explored the range of application of each technique. One result
of this has been the emergence of several useful classes of
first-order theories which relate to more than one of the three
programmes. For example a central tool of Shelah’s
classification theory was his notion of *forking*, a
far-reaching generalisation of earlier algebraic notions of dependence
relation. The class of *simple* theories is defined by the fact
that forking has certain nice properties while the class of *rosy
theories* is characterised by the existence of a good notion of
independence coming from a further generalisation of forking called
*þ-forking*; several natural examples of simple theories
came to light in geometric model theory, and the complete theories of
o-minimal structures are examples of rosy theories. Other classes of
theories, denoted by increasing obscure abbreviations, such at the
class of NSOP_{1 }(“not the stronger order property,
one”) theories, have been isolated for which some of the seemingly
characteristic features of stable theories persist in a modified
form. For example, the afore mentioned
NSOP_{1} are defined by the absence of a certain tree
configuration but are also characterized by the presence of a good
theory of independence, which goes under the name of
“Kim-independence”. In parallel with these technical
advances, first-order model theory continues to grow more closely
involved with problems in number theory, functional analysis and other
branches of pure and, and even applied, mathematics.

In Shelah’s programme of classification theory, a central rôle
is played by *dividing lines*. That is, the class of all
theories should be divided into those having some property and those
which do not and this division into these two classes should be
authentic in the sense that the classes should admit various
definitions of different characters. For example, the class of stable
theories may be described in many facially dissimilar ways as, for
example, those theories with respect to which every type is definable,
those with respect to which no formula has the order property, or
those for which there is some infinite cardinal *κ* at
least as large as that of the language for which over every model of
size *κ* there are no more than *κ* many
*1*-types. Many of these classification theoretic classes are
defined by forbidding the existence of certain combinatorial
configurations. For example, the class of *dependent* or
*NIP* (for “Not the Independence Property”) theories is defined
by saying that for **no** formula
φ(*x*,*y*) is it possible to find a model *M*
and sequences
(*a _{i}*)

_{i=0}

^{∞}and (

*b*) where the

_{S}*b*sequence is indexed by the subsets of the natural numbers so that

*M*⊧ φ(

*a*) if and only if

_{i},b_{S}*i ∈ S*. Around the same time that Shelah isolated the independence property, Vladimir Vapnik and Alexey Chervonenkis discovered the same notion in machine learning theory. Consequences of the model theoretic analysis of NIP have since been drawn in theory of neural networks, extremal combinatorics, the theory of differential privacy, and machine learning theory more broadly.

## Bibliography

- Beth, E., 1953, “On Padoa’s method in the theory of
definition”,
*Nederlandse Akademie van Wetenschappen*, Proceedings (Series A), 56: 330–339. - Bouscaren, E. (ed.), 1998,
*Model Theory and Algebraic Geometry: An introduction to E. Hrushovski’s proof of the geometric Mordell-Lang conjecture*(Lecture Notes in Mathematics: Volume 1696), Berlin: Springer-Verlag. - Buechler, S., 1996,
*Essential Stability Theory*, Berlin: Springer-Verlag. - Chang, C. and Keisler, J., 1990,
*Model Theory*, Amsterdam: North-Holland. - Chatzidakis, Z.
*et al*. (eds.), 2008,*Model Theory with Applications to Algebra and Analysis*, Volumes 1 and 2, Cambridge: Cambridge University Press. - Dries, L. van den, 1998,
*Tame Topology and O-minimal Structures*, Cambridge: Cambridge University Press. - Ealy, C. and Onshuus, A., 2007, “Characterizing rosy
theories”,
*Journal of Symbolic Logic*, 72: 919–940. - Ehrig, H. and Mahr, B., 1985,
*Fundamentals of Algebraic Specification I: Equations and Initial Semantics*, Berlin: Springer-Verlag. - Ershov, Y. (ed.), 1998,
*Handbook of Recursive Mathematics I, Recursive Model Theory*, New York: Elsevier. - Hart, B., Lachlan, A. and Valeriote, M., 1996,
*Algebraic Model Theory*, Dordrecht: Kluwer. - Haskell, D., Pillay, A. and Steinhorn, C., 2000,
*Model Theory, Algebra, and Geometry*, Cambridge: Cambridge University Press. - Hodges, W., 1993,
*Model Theory*, Cambridge: Cambridge University Press. - –––, 1998, “The laws of distribution for
syllogisms”,
*Notre Dame Journal of Formal Logic*, 39: 221–230. - Karpinski, M. and A. Macintyre, 1997, Polynomial bounds for VC
dimension of sigmoidal and general Pfaffian neural networks, 1st
Annual Dagstuhl Seminar on Neural Computing (1994),
*Journal of Computer and System Sciences*, 54(1[2]): 169–176. - Lascar, D., 1986,
*Stability in Model Theory*, Harlow: Longman. - Macintyre, A. and Wilkie, A., 1996, “On the decidability of
the real exponential field”, in
*Kreiseliana: About and around Georg Kreisel*, P. Odifreddi (ed.), Wellesley MA : A. K. Peters, 441–467. - Marcja, A. and Toffalori, C., 2003,
*A Guide to Classical and Modern Model Theory*, Dordrecht: Kluwer. - Marker, D., 2002,
*Model Theory: An Introduction*, New York: Springer-Verlag. - Morley, M., 1965, “Categoricity in power”,
*Transactions of the American Mathematical Society*, 114: 514–538. - Peterzil, Y. and S. Starchenko, Sergei, 2009, Complex analytic
geometry and analytic-geometric categories,
*Journal für die reine und angewandte Mathematik*, 626: 39–74. - Pillay, A., 1996,
*Geometric Stability Theory*, Oxford: Oxford University Press. - Pila, J., 2011, “O-minimality and the André-Oort
conjecture for
**C**^{n}”,*Annals of Mathematics*(2), 173(3): 1779–1840. - Pila, J. and Wilkie, A., 2006, “The rational points of a
definable set”,
*Duke Mathematics Journal*, 133(3): 591–616. - Pila, J. and Zannier, U., 2008, “Rational points in periodic
analytic sets and the Manin-Mumford conjecture”,
*Rendiconti Lincei Matematica E Applicazioni*, 19(2): 149–162. - Poizat, B., 2000,
*A Course in Model Theory*, New York: Springer. - Shelah, S., 1990,
*Classification Theory*, Amsterdam: North-Holland. - Tarski, A., 1951,
*A Decision Method for Elementary Algebra and Geometry*, Berkeley: University of California Press. - Vaught, R., 1974, “Model theory before 1945”, in
*Proceedings of the Tarski Symposium*, L. Henkin,*et al*. (eds.), Providence RI : American Mathematical Society, 153–172. - Veblen, O., 1904, “A System of Axioms for Geometry”,
*Transactions of the American Mathematical Society*, 5(3): 343–384 - Wagner, F., 2000,
*Simple Theories*, Dordrecht: Kluwer Academic Publishers. - Wilkie, A., 1996, “Model completeness results for expansions
of the real field by restricted Pfaffian functions and the exponential
function”,
*Journal of the American Mathematical Society*, 9: 1051–1094.

## Academic Tools

How to cite this entry. Preview the PDF version of this entry at the Friends of the SEP Society. Look up topics and thinkers related to this entry at the Internet Philosophy Ontology Project (InPhO). Enhanced bibliography for this entry at PhilPapers, with links to its database.

## Other Internet Resources

- Hrushovski, E. and Loeser, F., 2010, Non-archimedean tame topology and stably dominated types, online manuscript at arXiv.org.
- Simon, P., 2012, Lecture notes on NIP theories, online manuscript at arXiv.org.
- Schematic representation of the classification of theories.