# Classical Logic

*First published Sat Sep 16, 2000; substantive revision Sun Mar 11, 2018*

Typically, a *logic* consists of a formal or informal language
together with a deductive system and/or a model-theoretic semantics.
The language has components that correspond to a part of a natural language like
English or Greek. The deductive system is to capture, codify, or
simply record *arguments* that are *valid* for the given
language, and the semantics is to capture, codify, or record the
meanings, or truth-conditions for at least part of the language.

The following sections provide the basics of a typical logic,
sometimes called “classical elementary logic” or “classical
first-order logic”. Section 2 develops a formal language, with a
rigorous syntax and grammar. The formal language is a recursively
defined collection of strings on a fixed alphabet. As such, it has
no meaning, or perhaps better, the meaning of its formulas is given
by the deductive system and the semantics. Some of the symbols have
counterparts in ordinary language. We define an *argument* to
be a non-empty collection of sentences in the formal language, one of
which is designated to be the conclusion. The other sentences (if
any) in an argument are its premises. Section 3 sets up a deductive
system for the language, in the spirit of natural deduction. An
argument is *derivable* if there is a deduction from some or all of
its premises to its conclusion. Section 4 provides a model-theoretic
semantics. An argument is *valid* if there is no
interpretation (in the semantics) in which its premises are all true
and its conclusion false. This reflects the longstanding view that a
valid argument is truth-preserving.

In Section 5, we turn to relationships between the deductive system
and the semantics, and in particular, the relationship between
derivability and validity. We show that an argument is derivable only
if it is valid. This pleasant feature, called *soundness*,
entails that no deduction takes one from true premises to a false
conclusion. Thus, deductions preserve truth. Then we establish a converse, called
*completeness*, that an argument is valid only if it is
derivable. This shows that the deductive system is rich enough
to provide a deduction for every valid argument. So there are enough
deductions: all and only valid arguments are derivable. We briefly
indicate other features of the logic, some of which are corollaries to
soundness and completeness.

The final section, Section 6, is devoted to the a brief examination of the philosophical position that classical logic is “the one right logic”.

- 1. Introduction
- 2. Language
- 3. Deduction
- 4. Semantics
- 5. Meta-theory
- 6. The One Right Logic?
- Bibliography
- Academic Tools
- Other Internet Resources
- Related Entries

## 1. Introduction

Today, logic is a branch of mathematics and a branch of philosophy.
In most large universities, both departments offer courses in logic,
and there is usually a lot of overlap between them. Formal languages,
deductive systems, and model-theoretic semantics are mathematical
objects and, as such, the logician is interested in their mathematical
properties and relations. Soundness, completeness, and most of the
other results reported below are typical examples. Philosophically,
logic is at least closely related to the study of *correct
reasoning*. Reasoning is an epistemic, mental activity. So logic
is at least closely allied with epistemology. Logic is also a central
branch of computer science, due, in part, to interesting computational
relations in logical systems, and, in part, to the close connection
between formal deductive argumentation and reasoning (see the entries
on
recursive functions,
computability and complexity, and
philosophy of computer science).

This raises questions concerning the philosophical relevance of the various mathematical aspects of logic. How do deducibility and validity, as properties of formal languages--sets of strings on a fixed alphabet--relate to correct reasoning? What do the mathematical results reported below have to do with the original philosophical issues concerning valid reasoning? This is an instance of the philosophical problem of explaining how mathematics applies to non-mathematical reality.

Typically, ordinary deductive reasoning takes place in a natural language, or perhaps a natural language augmented with some mathematical symbols. So our question begins with the relationship between a natural language and a formal language. Without attempting to be comprehensive, it may help to sketch several options on this matter.

One view is that the formal languages accurately exhibit actual
features of certain fragments of a natural language. Some
philosophers claim that declarative sentences of natural language
have underlying *logical forms* and that these forms are
displayed by formulas of a formal language. Other writers hold that
(successful) declarative sentences express *propositions*; and
formulas of formal languages somehow display the forms of these
propositions. On views like this, the components of a logic provide
the underlying deep structure of correct reasoning. A chunk of
reasoning in natural language is correct if the forms underlying the
sentences constitute a valid or deducible argument. See for example,
Montague [1974], Davidson [1984], Lycan [1984] (and the
entry on
logical form).

Another view, held at least in part by Gottlob Frege and Wilhelm
Leibniz, is that because natural languages are fraught with vagueness
and ambiguity, they should be *replaced* by formal languages. A
similar view, held by W. V. O. Quine (e.g., [1960], [1986]), is that a
natural language should be *regimented*, cleaned up for serious
scientific and metaphysical work. One desideratum of the enterprise is
that the logical structures in the regimented language should be
transparent. It should be easy to “read off” the logical
properties of each sentence. A regimented language is similar to a
formal language regarding, for example, the explicitly presented rigor
of its syntax and its truth conditions.

On a view like this, deducibility and validity represent
*idealizations* of correct reasoning in natural language. A
chunk of reasoning is correct to the extent that it corresponds to,
or can be regimented by, a valid or deducible argument in a formal
language.

When mathematicians and many philosophers engage in deductive
reasoning, they occasionally invoke formulas in a formal language to
help disambiguate, or otherwise clarify what they mean. In other
words, sometimes formulas in a formal language are *used* in
ordinary reasoning. This suggests that one might think of a formal
language as an
*addendum* to a natural language. Then our present question
concerns the relationship between this addendum and the original
language. What do deducibility and validity, as sharply defined on
the addendum, tell us about correct deductive reasoning in general?

Another view is that a formal language is a *mathematical
model* of a natural language in roughly the same sense as, say, a
collection of point masses is a model of a system of physical objects,
and the Bohr construction is a model of an atom. In other words, a
formal language displays certain features of natural languages, or
idealizations thereof, while ignoring or simplifying other
features. The purpose of mathematical models is to shed light on what
they are models of, without claiming that the model is accurate in all
respects or that the model should replace what it is a model of. On a
view like this, deducibility and validity represent mathematical
models of (perhaps different aspects of) correct reasoning in natural
languages. Correct chunks of deductive reasoning correspond, more or
less, to valid or deducible arguments; incorrect chunks of reasoning
roughly correspond to invalid or non-deducible arguments. See, for
example, Corcoran [1973], Shapiro [1998], and Cook [2002].

There is no need to adjudicate this matter here. Perhaps the truth lies in a combination of the above options, or maybe some other option is the correct, or most illuminating one. We raise the matter only to lend some philosophical perspective to the formal treatment that follows.

## 2. Language

Here we develop the basics of a formal language, or to be precise, a class of formal languages. Again, a formal language is a recursively defined set of strings on a fixed alphabet. Some aspects of the formal languages correspond to, or have counterparts in, natural languages like English. Technically, this “counterpart relation” is not part of the formal development, but we will mention it from time to time, to motivate some of the features and results.

### 2.1 Building blocks

We begin with analogues of *singular terms*, linguistic items
whose function is to denote a person or object. We call these
*terms*. We assume a stock of *individual constants*.
These are lower-case letters, near the beginning of the Roman
alphabet, with or without numerical subscripts:

We envisage a potential infinity of individual constants. In the present system each constant is a single character, and so individual constants do not have an internal syntax. Thus we have an infinite alphabet. This could be avoided by taking a constant like \(d_{22}\), for example, to consist of three characters, a lowercase “\(d\)” followed by a pair of subscript “2”s.

We also assume a stock of *individual variables*. These are
lower-case letters, near the end of the alphabet, with or without
numerical subscripts:

In ordinary mathematical reasoning, there are two functions terms need to fulfill. We need to be able to denote specific, but unspecified (or arbitrary) objects, and sometimes we need to express generality. In our system, we use some constants in the role of unspecified reference and variables to express generality. Both uses are recapitulated in the formal treatment below. Some logicians employ different symbols for unspecified objects (sometimes called “individual parameters”) and variables used to express generality.

Constants and variables are the only terms in our formal language, so
all of our terms are simple, corresponding to proper names and some
uses of pronouns. We call a term closed if it is not a variable. In general, we use \(v\) to represent variables, and \(t\)
to represent a closed term, an individual constant. Some authors also introduce *function
letters*, which allow complex terms corresponding to:
“\(7+4\)” and “the wife of Bill Clinton”, or
complex terms containing variables, like “the father of
\(x\)” and “\(x/y\)”. Logic books aimed at
mathematicians are likely to contain function letters, probably due to
the centrality of functions in mathematical discourse. Books aimed at
a more general audience (or at philosophy students), may leave out
function letters, since it simplifies the syntax and theory. We follow
the latter route here. This is an instance of a general tradeoff
between presenting a system with greater expressive resources, at the
cost of making its formal treatment more complex.

For each natural number \(n\), we introduce a stock of
\(n\)-place *predicate letters*. These are upper-case
letters at the beginning or middle of the alphabet. A superscript
indicates the number of places, and there may or may not be a
subscript. For example,

are three-place predicate letters. We often omit the superscript, when no confusion will result. We also add a special two-place predicate symbol “\(=\)” for identity.

Zero-place predicate letters are sometimes called “sentence letters”. They correspond to free-standing sentences whose internal structure does not matter. One-place predicate letters, called “monadic predicate letters”, correspond to linguistic items denoting properties, like “being a man”, “being red”, or “being a prime number”. Two-place predicate letters, called “binary predicate letters”, correspond to linguistic items denoting binary relations, like “is a parent of” or “is greater than”. Three-place predicate letters correspond to three-place relations, like “lies on a straight line between”. And so on.

The *non-logical terminology* of the language consists of its
individual constants and predicate letters. The symbol “\(=\)”, for
identity, is not a non-logical symbol. In taking identity to be
logical, we provide explicit treatment for it in the deductive system
and in the model-theoretic semantics. Most authors do the same, but
there is some controversy over the issue (Quine [1986, Chapter
5]). If \(K\) is a set of constants and predicate letters, then
we give the fundamentals of a language \(\LKe\)
built on this set of non-logical terminology. It may be called the
*first-order language with identity* on \(K\). A similar
language that lacks the symbol for identity (or which takes identity
to be non-logical) may be called
\(\mathcal{L}1K\),
the *first-order language without identity* on \(K\).

### 2.2 Atomic formulas

If \(V\) is an \(n\)-place predicate letter in \(K\),
and \(t_1, \ldots,t_n\)
are terms of \(K\),
then \(Vt_1 \ldots t_n\)
is an *atomic formula* of
\(\LKe\).
Notice that the terms \(t_1, \ldots,t_n\) need not be distinct. Examples of
atomic formulas include:

The last one is an analogue of a statement that a certain relation \((A)\) holds between three objects \((a, b, c)\). If \(t_1\) and \(t_2\) are terms, then \(t_1 =t_2\) is also an atomic formula of \(\LKe\). It corresponds to an assertion that \(t_1\) is identical to \(t_2\).

If an atomic formula has no variables, then it is called an
*atomic sentence*. If it does have variables, it is
called *open*. In the above list of examples, the first and
second are open; the rest are sentences.

### 2.3 Compound formulas

We now introduce the final items of the lexicon:

\[ \neg, \amp, \vee, \rightarrow, \forall, \exists, (, ) \]
We give a recursive definition of a *formula* of
\(\LKe\):

- All atomic formulas of \(\LKe\) are formulas of \(\LKe\).
- If \(\theta\) is a formula of \(\LKe\), then so is \(\neg \theta\).

A formula corresponding to \(\neg \theta\) thus says that it is not the case that \(\theta\). The symbol “\(\neg\)” is called “negation”, and is a unary connective.

- If \(\theta\) and \(\psi\) are formulas of \(\LKe\), then so is \((\theta \amp \psi)\).

The ampersand “\(\amp\)” corresponds to the English “and” (when “and” is used to connect sentences). So \((\theta \amp \psi)\) can be read “\(\theta\) and \(\psi\)”. The formula \((\theta \amp \psi)\) is called the “conjunction” of \(\theta\) and \(\psi\).

- If \(\theta\) and \(\psi\) are formulas of \(\LKe\), then so is \((\theta \vee \psi)\).

The symbol “\(\vee\)” corresponds to “either … or … or both”, so \((\theta \vee \psi)\) can be read “\(\theta\) or \(\psi\)”. The formula \((\theta \vee \psi)\) is called the “disjunction” of \(\theta\) and \(\psi\).

- 5. If \(\theta\) and \(\psi\) are formulas of \(\LKe\), then so is \((\theta \rightarrow \psi)\).

The arrow “\(\rightarrow\)” roughly corresponds to “if … then … ”, so \((\theta \rightarrow \psi)\) can be read “if \(\theta\) then \(\psi\)” or “\(\theta\) only if \(\psi\)”.

The symbols “\(\amp\)”, “\(\vee\)”, and “\(\rightarrow\)” are called “binary connectives”, since they serve to “connect” two formulas into one. Some authors introduce \((\theta \leftrightarrow \psi)\) as an abbreviation of \(((\theta \rightarrow \psi) \amp(\psi \rightarrow \theta))\). The symbol “\(\leftrightarrow\)” is an analogue of the locution “if and only if”.

- If \(\theta\) is a formula of \(\LKe\) and \(v\) is a variable, then \(\forall v \theta\) is a formula of \(\LKe\).

The symbol “\(\forall\)” is called a *universal
quantifier*, and is an analogue of “for all”; so
\(\forall v\theta\) can be read “for all \(v,
\theta\)”.

- If \(\theta\) is a formula of \(\LKe\) and \(v\) is a variable, then \(\exists v \theta\) is a formula of \(\LKe\).

The symbol “\(\exists\)” is called an
*existential quantifier*, and is an analogue of “there
exists” or “there is”; so \(\exists v \theta\)
can be read “there is a \(v\) such that \(\theta\)”.

- That’s all folks. That is, all formulas are constructed in accordance with rules (1)–(7).

Clause (8) allows us to do inductions on the complexity of formulas. If a certain property holds of the atomic formulas and is closed under the operations presented in clauses (2)–(7), then the property holds of all formulas. Here is a simple example:

**Theorem 1**. Every formula of
\(\LKe\)
has the same number of left and right parentheses. Moreover, each
left parenthesis corresponds to a unique right parenthesis, which
occurs to the right of the left parenthesis. Similarly, each right
parenthesis corresponds to a unique left parenthesis, which occurs to
the left of the given right parenthesis. If a parenthesis occurs
between a matched pair of parentheses, then its mate also occurs
within that matched pair. In other words, parentheses that occur
within a matched pair are themselves matched.

**Proof**: By clause (8), every formula is built up
from the atomic formulas using clauses (2)–(7). The atomic formulas
have no parentheses. Parentheses are introduced only in clauses (3)–(5), and
each time they are introduced as a matched set. So at any stage in
the construction of a formula, the parentheses are paired off.

We next define the notion of an occurrence of a variable being
*free* or *bound* in a formula. A variable that
immediately follows a quantifier (as in “\(\forall x\)”
and “\(\exists y\)”) is neither free nor bound. We do not
even think of those as occurrences of the variable. All variables that
occur in an atomic formula are free. If a variable occurs free (or
bound) in \(\theta\) or in \(\psi\), then that same occurrence is free
(or bound) in \(\neg \theta, (\theta \amp \psi), (\theta \vee \psi)\),
and \((\theta \rightarrow \psi)\). That is, the (unary and binary)
connectives do not change the status of variables that occur in
them. All occurrences of the variable \(v\) in \(\theta\) are bound in
\(\forall v \theta\) and \(\exists v \theta\). Any *free*
occurrences of \(v\) in \(\theta\) are bound by the initial
quantifier. All other variables that occur in \(\theta\) are free or
bound in \(\forall v \theta\) and \(\exists v \theta\), as they are in
\(\theta\).

For example, in the formula
\((\forall\)x(*Axy*
\(\vee Bx) \amp Bx)\), the occurrences of “\(x\)” in
*Axy* and in the first \(Bx\) are bound by the
quantifier. The occurrence of “\(y\)” and last
occurrence of “\(x\)” are free. In
\(\forall x(Ax \rightarrow \exists\)*xBx*), the
“\(x\)” in \(Ax\) is bound by the initial
universal quantifier, while the other occurrence of \(x\) is
bound by the existential quantifier. The above syntax allows this
“double-binding”. Although it does not create any
ambiguities (see below), we will avoid such formulas, as a matter of
taste and clarity.

The syntax also allows so-called vacuous binding, as in \(\forall\)x\(Bc\). These, too, will be avoided in what follows. Some treatments of logic rule out vacuous binding and double binding as a matter of syntax. That simplifies some of the treatments below, and complicates others.

Free variables correspond to place-holders, while bound variables
are used to express generality. If a formula has no free variables,
then it is called a *sentence*. If a formula has free
variables, it is called *open*.

### 2.4 Features of the syntax

Before turning to the deductive system and semantics, we mention a few features of the language, as developed so far. This helps draw the contrast between formal languages and natural languages like English.

We assume at the outset that all of the categories are disjoint. For example, no connective is also a quantifier or a variable, and the non-logical terms are not also parentheses or connectives. Also, the items within each category are distinct. For example, the sign for disjunction does not do double-duty as the negation symbol, and perhaps more significantly, no two-place predicate is also a one-place predicate.

One difference between natural languages like English and formal languages like \(\LKe\) is that the latter are not supposed to have any ambiguities. The policy that the different categories of symbols do not overlap, and that no symbol does double-duty, avoids the kind of ambiguity, sometimes called “equivocation”, that occurs when a single word has two meanings: “I’ll meet you at the bank.” But there are other kinds of ambiguity. Consider the English sentence:

John is married, and Mary is single, or Joe is crazy.

It can mean that John is married and either Mary is single or Joe is
crazy, or else it can mean that either both John is married and Mary
is single, or else Joe is crazy. An ambiguity like this, due to
different ways to parse the same sentence, is sometimes called an
“amphiboly”. If our formal language did not have the
parentheses in it, it would have amphibolies. For example, there would
be a “formula” \(A \amp B \vee\)*
C*. Is this supposed to be \(((A \amp B)
\vee C)\), or is it \((A \amp(B \vee C))\)? The parentheses resolve what would be an
amphiboly.

Can we be sure that there are no other amphibolies in our language? That is, can we be sure that each formula of \(\LKe\) can be put together in only one way? Our next task is to answer this question.

Let us temporarily use the term “unary marker” for the negation symbol \((\neg)\) or a quantifier followed by a variable (e.g., \(\forall x, \exists z)\).

**Lemma 2**. Each formula consists of a string of zero
or more unary markers followed by either an atomic formula or a
formula produced using a binary connective, via one of clauses
(3)–(5).

**Proof**: We proceed by induction on the complexity of
the formula or, in other words, on the number of formation rules that
are applied. The Lemma clearly holds for atomic formulas. Let \(n\) be
a natural number, and suppose that the Lemma holds for any formula
constructed from \(n\) or fewer instances of clauses
(2)–(7). Let \(\theta\) be a formula constructed from \(n+1\)
instances. The Lemma holds if the last clause used to construct
\(\theta\) was either (3), (4), or (5). If the last clause used to
construct \(\theta\) was (2), then \(\theta\) is \(\neg \psi\). Since
\(\psi\) was constructed with \(n\) instances of the rule, the Lemma
holds for \(\psi\) (by the induction hypothesis), and so it holds for
\(\theta\). Similar reasoning shows the Lemma to hold for \(\theta\)
if the last clause was (6) or (7). By clause (8), this exhausts the
cases, and so the Lemma holds for \(\theta\), by induction.

**Lemma 3**. If a formula
\(\theta\) contains a left parenthesis,
then it ends with a right parenthesis, which matches the leftmost left
parenthesis in
\(\theta\).

**Proof**: Here we also proceed by induction on the
number of instances of (2)–(7) used to construct the
formula. Clearly, the Lemma holds for atomic formulas, since they
have no parentheses. Suppose, then, that the Lemma holds for formulas
constructed with \(n\) or fewer instances of (2)–(7), and let
\(\theta\) be constructed with
\(n+1\)
instances. If the last clause applied was (3)–(5), then the Lemma
holds since
\(\theta\) itself begins with a left
parenthesis
and ends with the matching right parenthesis. If the last clause
applied was
(2), then
\(\theta\) is
\(\neg \psi\),
and the induction hypothesis applies to
\(\psi\).
Similarly, if the last clause applied was (6) or (7), then
\(\theta\)
consists of a quantifier, a variable, and a formula to which we can
apply the
induction hypothesis. It follows that the Lemma holds for
\(\theta\).

**Lemma 4**. Each formula contains at least one atomic
formula.

The proof proceeds by induction on the number of instances of (2)–(7) used to construct the formula, and we leave it as an exercise.

**Theorem 5**. Let \(\alpha, \beta\) be nonempty
sequences of characters on our alphabet, such that \(\alpha \beta\)
(i.e \(\alpha\) followed by \(\beta)\) is a formula. Then \(\alpha\)
is *not* a formula.

**Proof**: By Theorem 1 and Lemma 3, if \(\alpha\)
contains a left parenthesis, then the right parenthesis that matches
the leftmost left parenthesis in \(\alpha \beta\) comes at the end of
\(\alpha \beta\), and so the matching right parenthesis is in
\(\beta\). So, \(\alpha\) has more left parentheses than right
parentheses. By Theorem \(1, \alpha\) is not a formula. So now suppose
that \(\alpha\) does not contain any left parentheses. By Lemma \(2,
\alpha \beta\) consists of a string of zero or more unary markers
followed by either an atomic formula or a formula produced using a
binary connective, via one of clauses (3)–(5). If the latter
formula was produced via one of clauses (3)–(5), then it begins
with a left parenthesis. Since \(\alpha\) does not contain any
parentheses, it must be a string of unary markers. But then \(\alpha\)
does not contain any atomic formulas, and so by Lemma \(4, \alpha\) is
not a formula. The only case left is where \(\alpha \beta\) consists
of a string of unary markers followed by an atomic formula, either in
the form \(t_1 =t_2\) or \(Pt_1 \ldots t_n\). Again, if \(\alpha\)
just consisted of unary markers, it would not be a formula, and so
\(\alpha\) must consist of the unary markers that start \(\alpha
\beta\), followed by either \(t_1\) by itself, \(t_1 =\) by itself, or
the predicate letter \(P\), and perhaps some (but not all) of the
terms \(t_1, \ldots,t_n\). In the first two cases, \(\alpha\) does
not contain an atomic formula, by the policy that the categories do
not overlap. Since \(P\) is an \(n\)-place predicate letter, by the
policy that the predicate letters are distinct, \(P\) is not an
\(m\)-place predicate letter for any \(m \ne n\). So the part of
\(\alpha\) that consists of \(P\) followed by the terms is not an
atomic formula. In all of these cases, then, \(\alpha\) does not
contain an atomic formula. By Lemma \(4, \alpha\) is not a
formula.

We are finally in position to show that there is no amphiboly in our language.

**Theorem 6**. Let \(\theta\) be any formula of
\(\LKe\). If \(\theta\) is not atomic, then there is one and only one
among (2)–(7) that was the last clause applied to construct
\(\theta\). That is, \(\theta\) could not be produced by two
different clauses. Moreover, no formula produced by clauses
(2)–(7) is atomic.

**Proof**: By Clause (8), either \(\theta\) is atomic or
it was produced by one of clauses (2)–(7). Thus, the first
symbol in \(\theta\) must be either a predicate letter, a term, a
unary marker, or a left parenthesis. If the first symbol in \(\theta\)
is a predicate letter or term, then \(\theta\) is atomic. In this
case, \(\theta\) was not produced by any of (2)–(7), since all
such formulas begin with something other than a predicate letter or
term. If the first symbol in \(\theta\) is a negation sign
“\(\neg\)”, then was \(\theta\) produced by clause (2),
and not by any other clause (since the other clauses produce formulas
that begin with either a quantifier or a left parenthesis). Similarly,
if \(\theta\) begins with a universal quantifier, then it was produced
by clause (6), and not by any other clause, and if \(\theta\) begins
with an existential quantifier, then it was produced by clause (7),
and not by any other clause. The only case left is where \(\theta\)
begins with a left parenthesis. In this case, it must have been
produced by one of (3)–(5), and not by any other clause. We only
need to rule out the possibility that \(\theta\) was produced by more
than one of (3)–(5). To take an example, suppose that \(\theta\)
was produced by (3) and (4). Then \(\theta\) is \((\psi_1 \amp
\psi_2)\) and \(\theta\) is also \((\psi_3 \vee \psi_4)\), where
\(\psi_1, \psi_2, \psi_3\), and \(\psi_4\) are themselves
formulas. That is, \((\psi_1 \amp \psi_2)\) is the very same formula
as \((\psi_3 \vee \psi_4)\). By Theorem \(5, \psi_1\) cannot be a
proper part of \(\psi_3\), nor can \(\psi_3\) be a proper part of
\(\psi_1\). So \(\psi_1\) must be the same formula as \(\psi_3\). But
then “\(\amp\)” must be the same symbol as
“\(\vee\)”, and this contradicts the policy that all of
the symbols are different. So \(\theta\) was not produced by both
Clause (3) and Clause (4). Similar reasoning takes care of the other
combinations.

This result is sometimes called “unique readability”. It
shows that each formula is produced from the atomic formulas via the
various clauses in exactly one way. If \(\theta\) was produced by
clause (2), then its *main connective* is the initial
“\(\neg\)”. If \(\theta\) was produced by clauses (3),
(4), or (5), then its *main connective* is the introduced
“\(\amp\)”, “\(\vee\)”, or
“\(\rightarrow\)”, respectively. If \(\theta\) was
produced by clauses (6) or (7), then its *main connective* is
the initial quantifier. We apologize for the tedious details. We
included them to indicate the level of precision and rigor for the
syntax.

## 3. Deduction

We now introduce a *deductive system*, \(D\), for our
languages. As above, we define an *argument* to be a non-empty
collection of sentences in the formal language, one of which is
designated to be the *conclusion*. If there are any other
sentences in the argument, they are its
*premises*.^{[1]}
By convention, we use “\(\Gamma\)”,
“\(\Gamma'\)”, “\(\Gamma_1\)”, etc, to range
over sets of sentences, and we use the letters “\(\phi\)”,
“\(\psi\)”, “\(\theta\)”, uppercase or
lowercase, with or without subscripts, to range over single
sentences. We write “\(\Gamma, \Gamma'\)” for the union of
\(\Gamma\) and \(\Gamma'\), and “\(\Gamma, \phi\)” for the
union of \(\Gamma\) with \(\{\phi\}\).

We write an argument in the form \(\langle \Gamma, \phi \rangle\), where \(\Gamma\) is a set of sentences, the premises, and \(\phi\) is a single sentence, the conclusion. Remember that \(\Gamma\) may be empty. We write \(\Gamma \vdash \phi\) to indicate that \(\phi\) is deducible from \(\Gamma\), or, in other words, that the argument \(\langle \Gamma, \phi \rangle\) is deducible in \(D\). We may write \(\Gamma \vdash_D \phi\) to emphasize the deductive system \(D\). We write \(\vdash \phi\) or \(\vdash_D \phi\) to indicate that \(\phi\) can be deduced (in \(D)\) from the empty set of premises.

The rules in \(D\) are chosen to match logical relations concerning the English analogues of the logical terminology in the language. Again, we define the deducibility relation by recursion. We start with a rule of assumptions:

- (As) If \(\phi\) is a member of \(\Gamma\), then \(\Gamma \vdash \phi\).

We thus have that \(\{\phi \}\vdash \phi\); each premise follows from itself. We next present two clauses for each connective and quantifier. The clauses indicate how to “introduce” and “eliminate” sentences in which each symbol is the main connective.

First, recall that “\(\amp\)” is an analogue of the English connective “and”. Intuitively, one can deduce a sentence in the form \((\theta \amp \psi)\) if one has deduced \(\theta\) and one has deduced \(\psi\). Conversely, one can deduce \(\theta\) from \((\theta \amp \psi)\) and one can deduce \(\psi\) from \((\theta \amp \psi)\):

- \((\amp \mathrm{I})\) If \(\Gamma_1 \vdash \theta\) and \(\Gamma_2 \vdash \psi\), then \(\Gamma_1, \Gamma_2 \vdash(\theta \amp \psi)\).
- \((\amp \mathrm{E})\) If \(\Gamma \vdash(\theta \amp \psi)\) then \(\Gamma \vdash \theta\); and if \(\Gamma \vdash(\theta \amp \psi)\) then \(\Gamma \vdash \psi\).

The name “&I” stands for “&-introduction”; “&E” stands for “&-elimination”.

Since, the symbol “\(\vee\)” corresponds to the English “or”, \((\theta \vee \psi)\) should be deducible from \(\theta\), and \((\theta \vee \psi)\) should also be deducible from \(\psi\):

- \((\vee \mathrm{I})\) If \(\Gamma \vdash \theta\) then \(\Gamma \vdash(\theta \vee \psi)\); if \(\Gamma \vdash \psi\) then \(\Gamma \vdash(\theta \vee \psi)\).

The elimination rule is a bit more complicated. Suppose that “\(\theta\) or \(\psi\)” is true. Suppose also that \(\phi\) follows from \(\theta\) and that \(\phi\) follows from \(\psi\). One can reason that if \(\theta\) is true, then \(\phi\) is true. If instead \(\psi\) is true, we still have that \(\phi\) is true. So either way, \(\phi\) must be true.

- \((\vee \mathrm{E})\) If \(\Gamma_1 \vdash(\theta \vee \psi), \Gamma_2, \theta \vdash \phi\) and \(\Gamma_3, \psi \vdash \phi\), then \(\Gamma_1, \Gamma_2, \Gamma_3 \vdash \phi\).

For the next clauses, recall that the symbol, “\(\rightarrow\)”, is an analogue of the English “if … then … ” construction. If one knows, or assumes \((\theta \rightarrow \psi)\) and also knows, or assumes \(\theta\), then one can conclude \(\psi\). Conversely, if one deduces \(\psi\) from an assumption \(\theta\), then one can conclude that \((\theta \rightarrow \psi)\).

- \(({\rightarrow}\mathrm{I})\) If \(\Gamma, \theta \vdash \psi\), then \(\Gamma \vdash(\theta \rightarrow \psi)\).
- \(({\rightarrow}\mathrm{E})\) If \(\Gamma_1 \vdash(\theta \rightarrow \psi)\) and \(\Gamma_2 \vdash \theta\), then \(\Gamma_1, \Gamma_2 \vdash \psi\).

This elimination rule is sometimes called “modus ponens”. In some logic texts, the introduction rule is proved as a “deduction theorem”.

Our next clauses are for the negation sign, “\(\neg\)”. The
underlying idea is that a sentence \(\psi\) is inconsistent with its
negation \(\neg \psi\). They cannot both be true. We call a pair of
sentences \(\psi, \neg \psi\) *contradictory opposites*. If one
can deduce such a pair from an assumption \(\theta\), then one can
conclude that \(\theta\) is false, or, in other words, one can conclude
\(\neg \theta\).

- \((\neg \mathrm{I})\) If \(\Gamma_1, \theta \vdash \psi\) and \(\Gamma_2, \theta \vdash \neg \psi\), then \(\Gamma_1, \Gamma_2 \vdash \neg \theta\).

By (As), we have that \(\{A,\neg A\}\vdash A\) and
\(\{\)*A,\(\neg\)A*\(\}\vdash \neg A\). So by \(\neg\)I we have
that \(\{A\}\vdash \neg \neg A\). However, we do not have the converse
yet. Intuitively, \(\neg \neg \theta\) corresponds to “it is not
the case that it is not the case that” . One might think that
this last is equivalent to \(\theta\), and we have a rule to that
effect:

- (DNE) If \(\Gamma \vdash \neg \neg \theta\), then \(\Gamma \vdash \theta\).

The name DNE stands for “double-negation
elimination”. There is some controversy over this inference. It
is rejected by philosophers and mathematicians who do not hold that
each meaningful sentence is either true or not
true. *Intuitionistic logic* does not sanction the inference in
question (see, for example Dummett [2000], or the entry on
intuitionistic logic, or
history of intuitionistic logic),
but, again, classical logic does.

To illustrate the parts of the deductive system \(D\) presented thus far, we show that \(\vdash(A \vee \neg A)\):

- \(\{\neg(A \vee \neg A), A\}\vdash \neg(A \vee \neg A)\), by (As)
- \(\{\neg(A \vee \neg A), A\}\vdash A\), by (As).
- \(\{\neg(A \vee \neg A), A\}\vdash(A \vee \neg A)\), by \((\vee\)I), from (ii).
- \(\{\neg(A \vee \neg A)\}\vdash \neg A\), by \((\neg\)I), from (i) and (iii).
- \(\{\neg(A \vee \neg A), \neg A\}\vdash \neg(A \vee \neg A)\), by (As)
- \(\{\neg(A \vee \neg A), \neg A\}\vdash \neg A\), by (As)
- \(\{\neg(A \vee \neg A), \neg A\}\vdash(A \vee \neg A)\), by \((\vee\)I), from (vi).
- \(\{\neg(A \vee \neg A)\}\vdash \neg \neg A\), by \((\neg\)I), from (v) and (vii).
- \(\vdash \neg \neg(A \vee \neg A)\), by \((\neg\)I), from (iv) and (viii).
- \(\vdash(A \vee \neg A)\), by (DNE), from (ix).

The principle \((\theta \vee \neg \theta)\) is sometimes called the
*law of excluded middle*. It is not valid in intuitionistic
logic.

Let \(\theta, \neg \theta\) be a pair of contradictory opposites, and let \(\psi\) be any sentence at all. By (As) we have \(\{\theta, \neg \theta, \neg \psi \}\vdash \theta\) and \(\{\theta, \neg \theta, \neg \psi \}\vdash \neg \theta\). So by \((\neg\)I), \(\{\theta, \neg \theta \}\vdash \neg \neg \psi\). So, by (DNE) we have \(\{\theta , \neg \theta \}\vdash \psi\) . That is, anything at all follows from a pair of contradictory opposites. Some logicians introduce a rule to codify a similar inference:

If \(\Gamma_1 \vdash \theta\) and \(\Gamma_2 \vdash \neg \theta\), then for any sentence \(\psi, \Gamma_1, \Gamma_2 \vdash \psi\)

The inference is sometimes called *ex falso quodlibet* or, more
colorfully, *explosion*. Some call it
“\(\neg\)-elimination”, but perhaps this stretches the notion
of “elimination” a bit. We do not officially
include *ex falso quodlibet* as a separate rule in \(D\),
but as will be shown below (Theorem 10), each instance of it is
derivable in our system \(D\).

Some logicians object to *ex falso quodlibet*, on the ground
that the sentence \(\psi\) may be *irrelevant* to any of the
premises in \(\Gamma\). Suppose, for example, that one starts with some
premises \(\Gamma\) about human nature and facts about certain people,
and then deduces both the sentence “Clinton had extra-marital
sexual relations” and “Clinton did not have extra-marital
sexual relations”. One can perhaps conclude that there is
something wrong with the premises \(\Gamma\). But should we be allowed to
then deduce *anything at all* from \(\Gamma\)? Should we be
allowed to deduce “The economy is sound”?

A small minority of logicians, called *dialetheists*, hold
that some contradictions are actually true. For them, *ex falso
quodlibet* is not truth-preserving.

Deductive systems that demur from *ex falso quodlibet* are
called *paraconsistent*. Most relevant logics are
paraconsistent. See the entries on
relevance logic,
paraconsistent logic,
and
dialetheism.
Or see Anderson and Belnap [1975], Anderson, Belnap, and Dunn [1992],
and Tennant [1997] for fuller overviews of relevant logic; and Priest
[2006],[2006a] for dialetheism. Deep philosophical issues concerning
the nature of
logical consequence
are involved. Far be it for
an article in a philosophy encyclopedia to avoid philosophical issues,
but space considerations preclude a fuller treatment of this issue
here. Suffice it to note that the inference *ex falso
quodlibet* is sanctioned in systems of *classical logic*,
the subject of this article. It is essential to establishing the
balance between the deductive system and the semantics (see §5
below).

The next pieces of \(D\) are the clauses for the quantifiers. Let
\(\theta\) be a formula, \(v\) a
variable, and \(t\) a term (i.e., a variable or a constant). Then
define
\(\theta(v|t)\) to be
the result of substituting \(t\) for each *free* occurrence of
\(v\) in
\(\theta\). So, if
\(\theta\) is \((Qx \amp \exists\)*xPxy*), then
\(\theta(x|c)\) is
\((Qc \amp \exists\)*xPxy*). The last
occurrence of \(x\) is not free.

A sentence in the form \(\forall v \theta\) is an analogue of the English “for every \(v, \theta\) holds”. So one should be able to infer \(\theta(v|t)\) from \(\forall v \theta\) for any closed term \(t\). Recall that the only closed terms in our system are constants.

- \((\forall \mathrm{E})\) If \(\Gamma \vdash \forall v \theta\), then \(\Gamma \vdash \theta(v|t)\), for any closed term \(t\).

The idea here is that if \(\forall v \theta\) is true, then \(\theta\) should hold of \(t\), no matter what \(t\) is.

The introduction clause for the universal quantifier is a bit more complicated. Suppose that a sentence \(\theta\) contains a closed term \(t\), and that \(\theta\) has been deduced from a set of premises \(\Gamma\). If the closed term \(t\) does not occur in any member of \(\Gamma\), then \(\theta\) will hold no matter which object \(t\) may denote. That is, \(\forall v \theta\) follows.

- \((\forall \mathrm{I})\) For any closed term \(t\), if \(\Gamma\vdash\theta (v|t)\), then \(\Gamma\vdash\forall v\theta\) provided that \(t\) is not in \(\Gamma\) or \(\theta\).

This rule \((\forall \mathbf{I})\) corresponds to a common inference in
mathematics. Suppose that a mathematician says “let \(n\)
be a natural number” and goes on to show that \(n\) has a
certain property \(P\), without assuming anything
about \(n\) (except that it is a natural number). She then
reminds the reader that \(n\) is “arbitrary”, and
concludes that \(P\) holds for *all* natural numbers. The
condition that the term
\(t\) not occur in any premise is what guarantees that it is
indeed “arbitrary”. It could be any object, and so anything we
conclude about it holds for all objects.

The existential quantifier is an analogue of the English expression “there exists”, or perhaps just “there is”. If we have established (or assumed) that a given object \(t\) has a given property, then it follows that there is something that has that property.

- \((\exists \mathrm{I})\) For any closed term \(t\), if \(\Gamma\vdash\theta (v|t)\) then \(\Gamma\vdash\exists v\theta\).

The elimination rule for \(\exists\) is not quite as simple:

- \((\exists \mathrm{E})\) For any closed term \(t\), if \(\Gamma_1\vdash\exists v\theta\) and \(\Gamma_2, \theta(v|t)\vdash\phi\), then \(\Gamma_1 ,\Gamma_2\vdash\phi\), provided that \(t\) does not occur in \(\phi\), \(\Gamma_2\) or \(\theta\).

This elimination rule also corresponds to a common inference. Suppose
that a mathematician assumes or somehow concludes that there is a
natural number with a given property \(P\). She then says “let
\(n\) be such a natural number, so that \(Pn\)”, and goes on to
establish a sentence \(\phi\), which does not mention the number
\(n\). If the derivation of \(\phi\) does not invoke anything about
\(n\) (other than the assumption that it has the given property
\(P)\), then \(n\) could have been any number that has the property
\(P\). That is, \(n\) is an *arbitrary* number with property
\(P\). It does not matter which number \(n\) is. Since
\(\phi\) does not mention \(n\), it follows from the assertion that
something has property \(P\). The provisions added to \((\exists\)E)
are to guarantee that \(t\) is “arbitrary”.

The final items are the rules for the identity sign “=”. The introduction rule is about a simple as can be:

- \(({=}\mathrm{I})\) \(\Gamma \vdash t=t\), where \(t\) is any closed term.

This “inference” corresponds to the truism that everything is identical to itself. The elimination rule corresponds to a principle that if \(a\) is identical to \(b\), then anything true of \(a\) is also true of \(b\).

- \(({=}\mathrm{E})\) For any closed terms \(t_1\) and \(t_2\), if \(\Gamma_1 \vdash t_1 =t_2\) and \(\Gamma_2 \vdash \theta\), then \(\Gamma_1, \Gamma_2 \vdash \theta'\), where \(\theta'\) is obtained from \(\theta\) by replacing one or more occurances of \(t_1\) with \(t_2\).

The rule \(({=}\mathrm{E})\) indicates a certain restriction in the expressive resources of our language. Suppose, for example, that Harry is identical to Donald (since his mischievous parents gave him two names). According to most people’s intuitions, it would not follow from this and “Dick knows that Harry is wicked” that “Dick knows that Donald is wicked”, for the reason that Dick might not know that Harry is identical to Donald. Contexts like this, in which identicals cannot safely be substituted for each other, are called “opaque”. We assume that our language \(\LKe\) has no opaque contexts.

One final clause completes the description of the deductive system \(D\):

- (*) That’s all folks. \(\Gamma \vdash \theta\) only if \(\theta\) follows from members of \(\Gamma\) by the above rules.

Again, this clause allows proofs by induction on the rules used to establish an argument. If a property of arguments holds of all instances of (As) and \(({=}\mathrm{I})\), and if the other rules preserve the property, then every argument that is deducible in \(D\) enjoys the property in question.

Before moving on to the model theory for \(\LKe\), we pause to note a few features of the deductive system. To illustrate the level of rigor, we begin with a lemma that if a sentence does not contain a particular closed term, we can make small changes to the set of sentences we prove it from without problems. We allow ourselves the liberty here of extending some previous notation: for any terms \(t\) and \(t'\), and any formula \(\theta\), we say that \(\theta(t|t')\) is the result of replacing all free occurrences of \(t\) in \(\theta\) with \(t'\).

**Lemma 7.** If \(\Gamma_1\) and \(\Gamma_2\) differ
only in that wherever \(\Gamma_1\) contains \(\theta\), \(\Gamma_2\)
contains \(\theta(t|t')\), then for any sentence \(\phi\) not
containing \(t\) or \(t'\), if \(\Gamma_1\vdash\phi\) then
\(\Gamma_2\vdash\phi\).

**Proof:** The proof proceeds by induction on the number
of steps in the proof of \(\phi\). Crucial to this proof is the fact
that \(\theta=\theta(t|t')\) whenever \(\theta\) does not contain
\(t\) or \(t'\). When the number of steps in the proof of \(\phi\) is
one, this means that the last (and only) rule applied is (As) or
(=I). Then, since \(\phi\) does not contain \(t\) or \(t'\), if
\(\Gamma_1\vdash\phi\) we simply apply the same rule ((As) or (=I)) to
\(\Gamma_2\) to get \(\Gamma_2\vdash\phi\). Assume that there are
\(n>1\) steps in the proof of \(\phi\), and that Lemma 7 holds for any
proof with less than \(n\) steps. Suppose that the \(n^{th}\) rule
applied to \(\Gamma_1\) was (\(\amp I\)). Then \(\phi\) is
\(\psi\amp\chi\), and \(\Gamma_1\vdash\phi\amp\chi\). But then we know
that previous steps in the proof include \(\Gamma_1\vdash\psi\) and
\(\Gamma_1\vdash\chi\), and by induction, we have
\(\Gamma_2\vdash\psi\) and \(\Gamma_2\vdash\chi\), since neither
\(\psi\) nor \(\chi\) contain \(t\) or \(t'\). So, we simply apply
(\(\amp I\)) to \(\Gamma_2\) to get \(\Gamma_2\vdash\psi\amp\chi\) as
required. Suppose now that the last step applied in the proof of
\(\Gamma_1\vdash\phi\) was (\(\amp E\)). Then, at a previous step in
the proof of \(\phi\), we know \(\Gamma_1\vdash\phi\amp\psi\) for some
sentence \(\psi\). If \(\psi\) does not contain \(t\), then we simply
apply (\(\amp E\)) to \(\Gamma_2\) to obtain the desired result. The
only complication is if \(\psi\) contains \(t\). Then we would have
that \(\Gamma_2\vdash (\phi\amp\psi)(t|t')\). But, since
\((\phi\amp\psi)(t|t')\) is \(\phi(t|t')\amp\psi(t|t')\), and
\(\phi(t|t')\) is just \(\phi\), we can just apply (\(\amp E\)) to get
\(\Gamma_2\vdash\phi\) as required. The cases for the other rules are
similar.

**Theorem 8. The rule of Weakening.** If \(\Gamma_1
\vdash \phi\) and \(\Gamma_1 \subseteq \Gamma_2\), then \(\Gamma_2
\vdash \phi\).

**Proof:** Again, we proceed by induction on the number
of rules that were used to arrive at \(\Gamma_1 \vdash \phi\).
Suppose that \(n\gt 0\) is a natural number, and that the theorem
holds for any argument that was derived using fewer than \(n\)
rules. Suppose that \(\Gamma_1 \vdash \phi\) using exactly \(n\)
rules. If \(n=1\), then the rule is either (As) or \((=\)I). In these
cases, \(\Gamma_2 \vdash \phi\) by the same rule. If the last rule
applied was (&I), then \(\phi\) has the form \((\theta \amp
\psi)\), and we have \(\Gamma_3 \vdash \theta\) and \(\Gamma_4 \vdash
\psi\), with \(\Gamma_1 = \Gamma_3, \Gamma_4\). We apply the
induction hypothesis to the deductions of \(\theta\) and \(\psi\), to
get \(\Gamma_2 \vdash \theta\) and \(\Gamma_2 \vdash \psi\). and then
apply (&I) to the result to get \(\Gamma_2 \vdash \phi\). Most of
the other cases are exactly like this. Slight complications arise only
in the rules \((\forall\)I) and \((\exists\)E), because there we have
to pay attention to the conditions for the rules.

Suppose that the last rule applied to get \(\Gamma_1 \vdash \phi\) is \((\forall\)I). So \(\phi\) is a sentence of the form \(\forall v\theta\), and we have \(\Gamma_1 \vdash \theta (v|t)\) and \(t\) does occur in any member of \(\Gamma_1\) or in \(\theta\). The problem is that \(t\) may occur in a member of \(\Gamma_2\), and so we cannot just invoke the induction hypothesis and apply \((\forall\)I) to the result. So, let \(t'\) be a term not occurring in any sentence in \(\Gamma_2\). Let \(\Gamma'\) be the result of substituting \(t'\) for all \(t\) in \(\Gamma_2\). Then, since \(t\) does not occur in \(\Gamma_1\), \(\Gamma_1\subseteq\Gamma'\). So, the induction hypothesis gives us \(\Gamma'\vdash\theta (v|t)\), and we know that \(\Gamma'\) does not contain \(t\), so we can apply (\(\forall I\)) to get \(\Gamma'\vdash\forall v\theta\). But \(\forall v\theta\) does not contain \(t\) or \(t'\), so \(\Gamma_2\vdash\forall v\theta\) by Lemma 7.

Suppose that the last rule applied was \((\exists\)E), we have \(\Gamma_3 \vdash \exists v\theta\) and \(\Gamma_4, \theta (v|t) \vdash \phi\), with \(\Gamma_1\) being \(\Gamma_3, \Gamma_4\), and \(t\) not in \(\phi\), \(\Gamma_4\) or \(\theta\). If \(t\) does not occur free in \(\Gamma_2\), we apply the induction hypothesis to get \(\Gamma_2 \vdash \exists v\theta\), and then \((\exists\)E) to end up with \(\Gamma_2 \vdash \phi\). If \(t\) does occur free in \(\Gamma_2\), then we follow a similar proceedure to \(\forall I\), using Lemma 7.

Theorem 8 allows us to add on premises at will. It follows that \(\Gamma \vdash \phi\) if and only if there is a subset \(\Gamma'\subseteq \Gamma\) such that \(\Gamma'\vdash \phi\). Some systems of relevant logic do not have weakening, nor does substructural logic (See the entries on relevance logic, substructural logics, and linear logic).

By clause (*), all derivations are established in a finite number of steps. So we have

**Theorem 9**. \(\Gamma \vdash \phi\) if and only if
there is a finite \(\Gamma'\subseteq \Gamma\) such that
\(\Gamma'\vdash \phi\).

**Theorem 10**. The rule of *ex falso quodlibet*
is a “derived rule” of \(D\): if \(\Gamma_1 \vdash
\theta\) and \(\Gamma_2 \vdash \neg \theta\), then \(\Gamma_1,\Gamma_2
\vdash \psi\), for any sentence \(\psi\).

**Proof:** Suppose that \(\Gamma_1 \vdash \theta\) and
\(\Gamma_2 \vdash \neg \theta\). Then by Theorem \(8, \Gamma_1,\neg
\psi \vdash \theta\), and \(\Gamma_2,\neg \psi \vdash \neg
\theta\). So by \((\neg\)I), \(\Gamma_1, \Gamma_2 \vdash \neg \neg
\psi\). By (DNE), \(\Gamma_1, \Gamma_2 \vdash \psi\).

**Theorem 11. The rule of Cut**. If \(\Gamma_1 \vdash
\psi\) and \(\Gamma_2, \psi \vdash \theta\), then \(\Gamma_1, \Gamma_2
\vdash \theta\).

**Proof:** Suppose \(\Gamma_1 \vdash \psi\) and
\(\Gamma_2, \psi \vdash \theta\). We proceed by induction on the
number of rules used to establish \(\Gamma_2, \psi \vdash \theta\).
Suppose that \(n\) is a natural number, and that the theorem holds for
any argument that was derived using fewer than \(n\) rules. Suppose
that \(\Gamma_2, \psi \vdash \theta\) was derived using exactly \(n\)
rules. If the last rule used was \((=\)I), then \(\Gamma_1, \Gamma_2
\vdash \theta\) is also an instance of \((=\)I). If \(\Gamma_2, \psi
\vdash \theta\) is an instance of (As), then either \(\theta\) is
\(\psi\), or \(\theta\) is a member of \(\Gamma_2\). In the former
case, we have \(\Gamma_1 \vdash \theta\) by supposition, and get
\(\Gamma_1, \Gamma_2 \vdash \theta\) by Weakening (Theorem 8). In the
latter case, \(\Gamma_1, \Gamma_2 \vdash \theta\) is itself an
instance of (As). Suppose that \(\Gamma_2, \psi \vdash \theta\) was
obtained using (&E). Then we have \(\Gamma_2, \psi \vdash(\theta
\amp \phi)\). The induction hypothesis gives us \(\Gamma_1, \Gamma_2
\vdash(\theta \amp \phi)\), and (&E) produces \(\Gamma_1, \Gamma_2
\vdash \theta\). The remaining cases are similar.

Theorem 11 allows us to chain together inferences. This fits the practice of establishing theorems and lemmas and then using those theorems and lemmas later, at will. The cut principle is, some think, essential to reasoning. In some logical systems, the cut principle is a deep theorem; in others it is invalid. The system here was designed, in part, to make the proof of Theorem 11 straightforward.

If \(\Gamma \vdash_D \theta\), then
we say that the sentence \(\theta\) is a *deductive consequence*
of the set of sentences \(\Gamma\), and that the argument
\(\langle \Gamma,\theta \rangle\) is *deductively valid*. A sentence
\(\theta\) is a *logical theorem*, or a *deductive logical
truth*,
if \(\vdash_D \theta\). That is,
\(\theta\) is a logical theorem if it is a deductive consequence of the
empty set. A set \(\Gamma\) of sentences is *consistent* if there
is no sentence \(\theta\) such that
\(\Gamma \vdash_D \theta\) and
\(\Gamma \vdash_D \neg \theta\).
That is, a set is consistent if it does not entail a pair of
contradictory opposite sentencess.

**Theorem 12**. A set \(\Gamma\) is consistent if and
only if there is a sentence \(\theta\) such that it is not the case
that \(\Gamma \vdash \theta\).

**Proof:** Suppose that \(\Gamma\) is consistent and let
\(\theta\) be any sentence. Then either it is not the case that
\(\Gamma \vdash \theta\) or it is not the case that \(\Gamma \vdash
\neg \theta\). For the converse, suppose that \(\Gamma\) is
inconsistent and let \(\psi\) be any sentence. We have that there is a
sentence such that both \(\Gamma \vdash \theta\) and \(\Gamma \vdash
\neg \theta\). By *ex falso quodlibet* (Theorem 10), \(\Gamma
\vdash \psi\).

Define a set \(\Gamma\) of sentences of the language \(\LKe\) to
be *maximally consistent* if \(\Gamma\) is consistent and for
every sentence \(\theta\) of \(\LKe\), if \(\theta\) is not in
\(\Gamma\), then \(\Gamma,\theta\) is inconsistent. In other words,
\(\Gamma\) is maximally consistent if \(\Gamma\) is consistent, and
adding any sentence in the language not already in \(\Gamma\) renders
it inconsistent. Notice that if \(\Gamma\) is maximally consistent
then \(\Gamma \vdash \theta\) if and only if \(\theta\) is in
\(\Gamma\).

**Theorem 13. The Lindenbaum Lemma.** Let \(\Gamma\) be
any consistent set of sentences of \(\LKe .\) Then there is a set
\(\Gamma'\) of sentences of \(\LKe\) such that \(\Gamma \subseteq
\Gamma'\) and \(\Gamma'\) is maximally consistent.

**Proof:** Although this theorem holds in general, we
assume here that the set \(K\) of non-logical terminology is either
finite or denumerably infinite (i.e., the size of the natural numbers,
usually called \(\aleph_0)\). It follows that there is an enumeration
\(\theta_0, \theta_1,\ldots\) of the sentences of \(\LKe\), such that
every sentence of \(\LKe\) eventually occurs in the list. Define a
sequence of sets of sentences, by recursion, as follows: \(\Gamma_0\)
is \(\Gamma\); for each natural number \(n\), if \(\Gamma_n,
\theta_n\) is consistent, then let \(\Gamma_{n+1} = \Gamma_n,
\theta_n\). Otherwise, let \(\Gamma_{n+1} = \Gamma_n\). Let
\(\Gamma'\) be the union of all of the sets \(\Gamma_n\). Intuitively,
the idea is to go through the sentences of \(\LKe\), throwing each one
into \(\Gamma'\) if doing so produces a consistent set. Notice that
each \(\Gamma_n\) is consistent. Suppose that \(\Gamma'\) is
inconsistent. Then there is a sentence \(\theta\) such that
\(\Gamma'\vdash \theta\) and \(\Gamma'\vdash \neg \theta\). By Theorem
9 and Weakening (Theorem 8), there is finite subset \(\Gamma''\) of
\(\Gamma'\) such that \(\Gamma''\vdash \theta\) and \(\Gamma''\vdash
\neg \theta\). Because \(\Gamma''\) is finite, there is a natural
number \(n\) such that every member of \(\Gamma''\) is in
\(\Gamma_n\). So, by Weakening again, \(\Gamma_n \vdash \theta\) and
\(\Gamma_n \vdash \neg \theta\). So \(\Gamma_n\) is inconsistent,
which contradicts the construction. So \(\Gamma'\) is consistent. Now
suppose that a sentence \(\theta\) is not in \(\Gamma'\). We have to
show that \(\Gamma', \theta\) is inconsistent. The sentence \(\theta\)
must occur in the aforementioned list of sentences; say that \(\theta\)
is \(\theta_m\). Since \(\theta_m\) is not in \(\Gamma'\), then it is
not in \(\Gamma_{m+1}\). This happens only if \(\Gamma_m,
\theta_m\) is inconsistent. So a pair of contradictory opposites can
be deduced from \(\Gamma_m,\theta_m\). By Weakening, a pair of
contradictory opposites can be deduced from \(\Gamma', \theta_m\). So
\(\Gamma', \theta_m\) is inconsistent. Thus, \(\Gamma'\) is maximally
consistent.

Notice that this proof uses a principle corresponding to the law of excluded middle. In the construction of \(\Gamma'\), we assumed that, at each stage, either \(\Gamma_n\) is consistent or it is not. Intuitionists, who demur from excluded middle, do not accept the Lindenbaum lemma.

## 4. Semantics

Let \(K\) be a set of non-logical terminology. An
*interpretation* for the language \(\LKe\) is a structure \(M =
\langle d,I\rangle\), where \(d\) is a non-empty set, called
the *domain-of-discourse*, or simply the
*domain*, of the interpretation, and \(I\) is an
*interpretation function*. Informally, the domain is what we
interpret the language
\(\LKe\) to be
about. It is what the variables range over. The interpretation
function assigns appropriate extensions to the non-logical terms. In
particular,

If \(c\) is a constant in \(K\), then \(I(c)\) is a member of the domain \(d\).

Thus we assume that every constant denotes something. Systems where
this is not assumed are called *free logics* (see the entry
on free logic). Continuing,

If \(P^0\) is a zero-place predicate letter in \(K\), then \(I(P)\) is a truth value, either truth or falsehood.

If \(Q^1\) is a one-place predicate letter in \(K\), then \(I(Q)\) is a subset of \(d\). Intuitively, \(I(Q)\) is the set of members of the domain that the predicate \(Q\) holds of. For example, \(I(Q)\) might be the set of red members of the domain.

If \(R^2\) is a two-place predicate letter in \(K\), then \(I(R)\) is a set of ordered pairs of members of \(d\). Intuitively, \(I(R)\) is the set of pairs of members of the domain that the relation \(R\) holds between. For example, \(I(R)\) might be the set of pairs \(\langle a,b\rangle\) such that \(a\) and \(b\) are the members of the domain for which \(a\) loves \(b\).

In general, if *S\(^n\)* is an \(n\)-place predicate letter in
\(K\), then \(I(S)\) is a set of ordered \(n\)-tuples of members of
\(d\).

Define \(s\) to be a *variable-assignment*, or simply an
*assignment*, on an interpretation \(M\), if \(s\) is a
function from the variables to the domain \(d\) of \(M\). The role of
variable-assignments is to assign denotations to the *free*
variables of open formulas. (In a sense, the quantifiers determine the
“meaning” of the bound variables.)

Let \(t\) be a term of \(\LKe\). We define the *denotation* of
\(t\) in \(M\) under \(s\), in terms of the interpretation function
and variable-assignment:

If \(t\) is a constant, then \(D_{M,s}(t)\) is \(I(t)\), and if \(t\) is a variable, then \(D_{M,s}(t)\) is \(s(t)\).

That is, the interpretation \(M\) assigns denotations to the constants, while the variable-assignment assigns denotations to the (free) variables. If the language contained function symbols, the denotation function would be defined by recursion.

We now define a relation of *satisfaction* between
interpretations, variable-assignments, and formulas of \(\LKe\). If
\(\phi\) is a formula of \(\LKe, M\) is an interpretation for
\(\LKe\), and \(s\) is a variable-assignment on \(M\), then we write
\(M,s\vDash \phi\) for \(M\) *satisfies* \(\phi\) *under the
assignment* \(s\). The idea is that \(M,s\vDash \phi\) is an
analogue of “\(\phi\) comes out true when interpreted as in
\(M\) via \(s\)”.

We proceed by recursion on the complexity of the formulas of \(\LKe\).

If \(t_1\) and \(t_2\) are terms, then \(M,s\vDash t_1 =t_2\) if and only if \(D_{M,s}(t_1)\) is the same as \(D_{M,s}(t_2)\).

This is about as straightforward as it gets. An identity \(t_1 =t_2\) comes out true if and only if the terms \(t_1\) and \(t_2\) denote the same thing.

If \(P^0\) is a zero-place predicate letter in \(K\), then \(M,s\vDash P\) if and only if \(I(P)\) is truth.

If *S\(^n\)* is an \(n\)-place predicate letter in \(K\) and
\(t_1, \ldots,t_n\) are terms, then \(M,s\vDash St_1 \ldots t_n\) if
and only if the \(n\)-tuple \(\langle D_{M,s}(t_1),
\ldots,D_{M,s}(t_n)\rangle\) is in \(I(S)\).

This takes care of the atomic formulas. We now proceed to the compound formulas of the language, more or less following the meanings of the English counterparts of the logical terminology.

\(M,s\vDash \neg \theta\) if and only if it is not the case that \(M,s\vDash \theta\).

\(M,s\vDash(\theta \amp \psi)\) if and only if both \(M,s\vDash \theta\) and \(M,s\vDash \psi\).

\(M,s\vDash(\theta \vee \psi)\) if and only if either \(M,s\vDash \theta\) or \(M,s\vDash \psi\).

\(M,s\vDash(\theta \rightarrow \psi)\) if and only if either it is not the case that \(M,s\vDash \theta\), or \(M,s\vDash \psi\).

\(M,s\vDash \forall v\theta\) if and only if \(M,s'\vDash \theta\), for every assignment \(s'\) that agrees with \(s\) except possibly at the variable \(v\).

The idea here is that \(\forall v\theta\) comes out true if and only if \(\theta\) comes out true no matter what is assigned to the variable \(v\). The final clause is similar.

\(M,s\vDash \exists v\theta\) if and only if \(M,s'\vDash \theta\), for some assignment \(s'\) that agrees with \(s\) except possibly at the variable \(v\).

So \(\exists v\theta\) comes out true if there is an assignment to \(v\) that makes \(\theta\) true.

Theorem 6, unique readability, assures us that this definition is coherent. At each stage in breaking down a formula, there is exactly one clause to be applied, and so we never get contradictory verdicts concerning satisfaction.

As indicated, the role of variable-assignments is to give denotations to the free variables. We now show that variable-assignments play no other role.

**Theorem 14.** For any formula \(\theta\), if \(s_1\)
and \(s_2\) agree on the free variables in \(\theta\), then \(M,s_1
\vDash \theta\) if and only if \(M,s_2 \vDash \theta\).

**Proof:** We proceed by induction on the complexity of
the formula \(\theta\). The theorem clearly holds if \(\theta\) is
atomic, since in those cases only the values of the
variable-assignments at the variables in \(\theta\) figure in the
definition. Assume, then, that the theorem holds for all formulas less
complex than \(\theta\). And suppose that \(s_1\) and \(s_2\) agree on
the free variables of \(\theta\). Assume, first, that \(\theta\) is a
negation, \(\neg \psi\). Then, by the induction hypothesis, \(M,s_1
\vDash \psi\) if and only if \(M,s_2 \vDash \psi\). So, by the clause
for negation, \(M,s_1 \vDash \neg \psi\) if and only if \(M,s_2 \vDash
\neg \psi\). The cases where the main connective in \(\theta\) is
binary are also straightforward. Suppose that \(\theta\) is \(\exists
v\psi\), and that \(M,s_1 \vDash \exists v\psi\). Then there is an
assignment \(s_1'\) that agrees with \(s_1\) except possibly at \(v\)
such that \(M,s_1'\vDash \psi\). Let \(s_2'\) be the assignment that
agrees with \(s_2\) on the free variables not in \(\psi\) and agrees
with \(s_1'\) on the others. Then, by the induction hypothesis,
\(M,s_2'\vDash \psi\). Notice that \(s_2'\) agrees with \(s_2\) on
every variable except possibly \(v\). So \(M,s_2 \vDash \exists
v\psi\). The converse is the same, and the case where \(\theta\)
begins with a universal quantifier is similar.

By Theorem 14, if \(\theta\) is a sentence, and \(s_1, s_2\), are any two variable-assignments, then \(M,s_1 \vDash \theta\) if and only if \(M,s_2 \vDash \theta\). So we can just write \(M\vDash \theta\) if \(M,s\vDash \theta\) for some, or all, variable-assignments \(s\). So we define

\(M\vDash \theta\) where \(\theta\) is a sentence just in case \(M,s\vDash\theta\) for all variable assignments \(s\).

In this case, we call \(M\) a * model * of \(\theta\).

Suppose that \(K'\subseteq K\) are two sets of non-logical terms. If
\(M = \langle d,I\rangle\) is an interpretation of \(\LKe\), then we
define the *restriction* of \(M\) to \(\mathcal{L}1K'{=}\) to be the
interpretation \(M'=\langle d,I'\rangle\) such that \(I'\) is the
restriction of \(I\) to \(K'\). That is, \(M\) and \(M'\) have the
same domain and agree on the non-logical terminology in \(K'\). A
straightforward induction establishes the following:

**Theorem 15**. If \(M'\) is the restriction of \(M\) to
\(\mathcal{L}1K'{=}\), then for every sentence \(\theta\) of
\(\mathcal{L}1K'\), \(M\vDash\theta\) if and only if \(M'\vDash \theta\).

**Theorem 16.** If two interpretations \(M_1\) and
\(M_2\) have the same domain and agree on all of the non-logical
terminology of a sentence \(\theta\), then \(M_1\vDash\theta\) if and
only if \(M_2\vDash \theta\).

In short, the satisfaction of a sentence \(\theta\) only depends on the domain of discourse and the interpretation of the non-logical terminology in \(\theta\).

We say that an argument \(\langle \Gamma,\theta \rangle\)
is *semantically valid*, or just *valid*, written
\(\Gamma \vDash \theta\), if for every interpretation \(M\) of the
language, if \(M\vDash\psi\), for every member \(\psi\) of \(\Gamma\),
then \(M\vDash\theta\). If \(\Gamma \vDash \theta\), we also say that
\(\theta\) is a *logical consequence*, or *semantic
consequence*, or
*model-theoretic consequence* of \(\Gamma\). The definition
corresponds to the informal idea that an argument is valid if it is
not possible for its premises to all be true and its conclusion
false. Our definition of logical consequence also sanctions the common
thesis that a valid argument is truth-preserving--to the extent that
satisfaction represents truth. Officially, an argument in \(\LKe\) is
valid if its conclusion comes out true under every interpretation of
the language in which the premises are true. Validity is the
model-theoretic counterpart to deducibility.

A sentence \(\theta\) is *logically true*, or
*valid*, if \(M\vDash \theta\), for every interpretation
\(M\). A sentence is logically true if and only if
it is a consequence of the empty set. If \(\theta\) is logically true,
then for any set \(\Gamma\) of sentences, \(\Gamma \vDash \theta\).
Logical truth is the model-theoretic counterpart of theoremhood.

A sentence \(\theta\) is *satisfiable* if there is an
interpretation \(M\) such
that \(M\vDash \theta\). That is, \(\theta\) is satisfiable if
there is an interpretation that satisfies it. A set
\(\Gamma\) of sentences is satisfiable if there is an interpretation
\(M\) such that \(M\vDash\theta\), for every sentence \(\theta\) in \(\Gamma\). If \(\Gamma\) is
a set of sentences and if \(M\vDash \theta\) for each sentence
\(\theta\) in \(\Gamma\), then we say that \(M\) is a *model
of* \(\Gamma\). So a set of sentences is satisfiable if it has a
model. Satisfiability is the model-theoretic counterpart to
consistency.

Notice that \(\Gamma \vDash \theta\) if and only if the set
\(\Gamma,\neg \theta\) is not satisfiable. It follows that if a set
\(\Gamma\) is not satisfiable, then if \(\theta\) is any sentence,
\(\Gamma \vDash \theta\). This is a model-theoretic counterpart
to *ex falso quodlibet* (see Theorem 10). We have the
following, as an analogue to Theorem 12:

**Theorem 17**. Let \(\Gamma\) be a set of sentences. The
following are equivalent: (a) \(\Gamma\) is satisfiable; (b) there is
no sentence \(\theta\) such that both \(\Gamma \vDash \theta\) and
\(\Gamma \vDash \neg \theta\); (c) there is some sentence \(\psi\) such
that it is not the case that \(\Gamma \vDash \psi\).

**Proof:** (a)\(\Rightarrow\)(b): Suppose that \(\Gamma\)
is satisfiable and let \(\theta\) be any sentence. There is an
interpretation \(M\) such that \(M\vDash \psi\)
for every member \(\psi\) of \(\Gamma\). By the clause for negations,
we cannot have both \(M\vDash \theta\) and \(M\vDash \neg
\theta\). So either \(\langle \Gamma,\theta \rangle\) is not valid or
else \(\langle \Gamma,\neg \theta \rangle\) is not valid.
(b)\(\Rightarrow\)(c): This is immediate. (c)\(\Rightarrow\)(a):
Suppose that it is not the case that \(\Gamma \vDash \psi\). Then
there is an interpretation \(M\) such that
\(M\vDash \theta\), for every sentence \(\theta\) in \(\Gamma\) and
it is not the case that \(M\vDash \psi\). A fortiori, \(M\)
satisfies every member of \(\Gamma\), and so \(\Gamma\) is
satisfiable.

## 5. Meta-theory

We now present some results that relate the deductive notions to their model-theoretic counterparts. The first one is probably the most straightforward. We motivated both the various rules of the deductive system \(D\) and the various clauses in the definition of satisfaction in terms of the meaning of the English counterparts to the logical terminology (more or less, with the same simplifications in both cases). So one would expect that an argument is deducible, or deductively valid, only if it is semantically valid.

**Theorem 18. Soundness.** For any sentence \(\theta\) and
set \(\Gamma\) of sentences, if \(\Gamma \vdash_D \theta\), then
\(\Gamma \vDash \theta\).

**Proof:** We proceed by induction on the number of
clauses used to establish \(\Gamma \vdash \theta\). So let \(n\) be a
natural number, and assume that the theorem holds for any argument
established as deductively valid with fewer than \(n\) steps. And
suppose that \(\Gamma \vdash \theta\) was established using exactly
\(n\) steps. If the last rule applied was \((=\)I) then \(\theta\) is
a sentence in the form \(t=t\), and so \(\theta\) is logically true. A
fortiori, \(\Gamma \vDash \theta\). If the last rule applied was
(As), then \(\theta\) is a member of \(\Gamma\), and so of course any
interpretation that satisfies every member of
\(\Gamma\) also satisfies \(\theta\). Suppose the last rule applied is
(&I). So \(\theta\) has the form \((\phi \amp \psi)\), and we have
\(\Gamma_1 \vdash \phi\) and \(\Gamma_2 \vdash \psi\), with \(\Gamma =
\Gamma_1, \Gamma_2\). The induction hypothesis gives us \(\Gamma_1
\vDash \phi\) and \(\Gamma_2 \vDash \psi\). Suppose that \(M\)
satisfies every member of \(\Gamma\). Then \(M\) satisfies every
member of \(\Gamma_1\), and so \(M\) satisfies \(\phi\). Similarly,
\(M\) satisfies every member of \(\Gamma_2\), and so \(M\)
satisfies \(\psi\). Thus, by the clause for “\(\amp\)” in
the definition of satisfaction, \(M\) satisfies \(\theta\). So
\(\Gamma \vDash \theta\).

Suppose the last clause applied was \((\exists\mathrm{E})\). So we have \(\Gamma_1 \vdash \exists v\phi\) and \(\Gamma_2, \phi(v|t) \vdash \theta\), where \(\Gamma = \Gamma_1, \Gamma_2\), and \(t\) does not occur in \(\phi , \theta \), or in any member of \(\Gamma_2\).

We need to show that \(\Gamma\vDash\theta\). By the induction hypothesis, we have that \(\Gamma_1\vDash\exists v\phi\) and \(\Gamma_2, \phi(v|t)\vDash\theta\). Let \(M\) be an interpretation such that \(M\) makes every member of \(\Gamma\) true. So, \(M\) makes every member of \(\Gamma_1\) and \(\Gamma_2\) true. Then \(M,s\vDash\exists v\phi\) for all variable assignments \(s\), so there is an \(s'\) such that \(M,s'\vDash\phi\). Let \(M'\) differ from \(M\) only in that \(I_{M'}(t)=s'(v)\). Then, \(M',s'\vDash\phi(v|t)\) and \(M',s'\vDash\Gamma_2\) since \(t\) does not occur in \(\phi\) or \(\Gamma_2\). So, \(M',s'\vDash\theta\). Since \(t\) does not occur in \(\theta\) and \(M'\) differs from \(M\) only with respect to \(I_{M'}(t)\), \(M,s'\vDash\theta\). Since \(\theta\) is a sentence, \(s'\) doesn't matter, so \(M\vDash\theta\) as desired. Notice the role of the restrictions on \((\exists\)E) here. The other cases are about as straightforward.

**Corollary 19.** Let \(\Gamma\) be a set of sentences. If
\(\Gamma\) is satisfiable, then \(\Gamma\) is consistent.

**Proof:** Suppose that \(\Gamma\) is satisfiable. So let
\(M\) be an interpretation such that \(M\)
satisfies every member of \(\Gamma\). Assume that \(\Gamma\) is
inconsistent. Then there is a sentence \(\theta\) such that \(\Gamma
\vdash \theta\) and \(\Gamma \vdash \neg \theta\). By soundness
(Theorem 18), \(\Gamma \vDash \theta\) and \(\Gamma \vDash \neg
\theta\). So we have that \(M\vDash \theta\) and \(M\vDash \neg
\theta\). But this is impossible, given the clause for negation in the
definition of satisfaction.

Even though the deductive system \(D\) and the model-theoretic semantics were developed with the meanings of the logical terminology in mind, one should not automatically expect the converse to soundness (or Corollary 19) to hold. For all we know so far, we may not have included enough rules of inference to deduce every valid argument. The converses to soundness and Corollary 19 are among the most important and influential results in mathematical logic. We begin with the latter.

**Theorem 20. Completeness. Gödel [1930].** Let
\(\Gamma\) be a set of sentences. If \(\Gamma\) is consistent, then
\(\Gamma\) is satisfiable.

**Proof:** The proof of completeness is rather
complex. We only sketch it here. Let \(\Gamma\) be a consistent set of
sentences of \(\LKe\). Again, we assume for simplicity that the set
\(K\) of non-logical terminology is either finite or countably
infinite (although the theorem holds even if \(K\) is
uncountable). The task at hand is to find an interpretation \(M\) such
that \(M\) satisfies every member of \(\Gamma\). Consider the language
obtained from \(\LKe\) by adding a denumerably infinite stock of new
individual constants \(c_0, c_1,\ldots\) We stipulate that the
constants, \(c_0, c_1,\ldots\), are all different from each other and
none of them occur in \(K\). One interesting feature of this
construction, due to Leon Henkin, is that we build an interpretation
of the language from the language itself, using some of the constants
as members of the domain of discourse. Let \(\theta_0 (x), \theta_1
(x),\ldots\) be an enumeration of the formulas of the expanded
language with at most one free variable, so that each formula with at
most one free variable occurs in the list eventually. Define a
sequence \(\Gamma_0, \Gamma_1,\ldots\) of sets of sentences (of the
expanded language) by recursion as follows: \(\Gamma_0 = \Gamma\); and
\(\Gamma_{n+1} = \Gamma_n,(\exists x\theta_n \rightarrow
\theta_{n}(x|c_i))\), where \(c_i\) is the first constant in the above
list that does not occur in \(\theta_n\) or in any member of
\(\Gamma_n\). The underlying idea here is that if \(\exists
x\theta_n\)is true, then \(c_i\) is to be one such \(x\). Let
\(\Gamma\) be the union of the sets \(\Gamma_n\).

We sketch a proof that \(\Gamma'\) is consistent. Suppose that \(\Gamma'\) is inconsistent. By Theorem 9, there is a finite subset of \(\Gamma\) that is inconsistent, and so one of the sets \(\Gamma_m\) is inconsistent. By hypothesis, \(\Gamma_0 = \Gamma\) is consistent. Let \(n\) be the smallest number such that \(\Gamma_n\) is consistent, but \(\Gamma_{n+1} = \Gamma_n,(\exists x\theta_n \rightarrow \theta_{n}(x|c_i))\) is inconsistent. By \((\neg\)I), we have that

\[\tag{1} \Gamma_n \vdash \neg(\exists x\theta_n \rightarrow \theta_n(x|c_i)). \] By *ex falso quodlibet* (Theorem 10), \(\Gamma_n, \neg
\exists x\theta_n, \exists x\theta_n \vdash \theta_n (x|c_i)\). So by
\((\rightarrow\)I), \(\Gamma_n, \neg \exists x\theta_n \vdash(\exists
x\theta_n \rightarrow \theta_n (x|c_i))\). From this and (1), we have
\(\Gamma_n \vdash \neg \neg \exists x\theta_n\), by \((\neg\)I), and
by (DNE) we have

By (As), \(\Gamma_n, \theta_n (x|c_i), \exists x\theta_n \vdash \theta_n (x|c_i)\). So by \((\rightarrow\)I), \(\Gamma_n, \theta_n (x|c_i)\vdash(\exists x\theta_{n} \rightarrow \theta_{n}(x|c_i))\). From this and (1), we have \(\Gamma_n \vdash \neg \theta_n (x|c_i)\), by \((\neg\)I). Let \(t\) be a term that does not occur in \(\theta_n\) or in any member of \(\Gamma_n\). By uniform substitution of \(t\) for \(c_i\), we can turn the derivation of \(\Gamma_n \vdash \neg \theta_n (x|c_i)\) into \(\Gamma_n \vdash \neg \theta_n (x|t)\). By \((\forall\)I), we have

\[\tag{3} \Gamma_n \vdash \forall v\neg \theta_n (x|v). \]
By (As) we have \(\{\forall v\neg \theta_n (x|v),\theta_n\}\vdash
\theta_n\) and by \((\forall\)E) we have \(\{\forall v\neg \theta_n
(x|v), \theta_n\}\vdash \neg \theta_n\). So \(\{\forall v\neg
\theta_n (x|v), \theta_n\}\) is inconsistent. Let \(\phi\) be any
sentence of the language. By *ex falso quodlibet* (Theorem 10), we have that
\(\{\forall v\neg \theta_n (x|v),\theta_n\}\vdash \phi\) and
\(\{\forall v\neg \theta_n (x|v), \theta_n\}\vdash \neg \phi\). So
with (2), we have that \(\Gamma_n, \forall v\neg \theta_n (x|v)\vdash
\phi\) and \(\Gamma_n, \forall v\neg \theta_n (x|v)\vdash \neg \phi\),
by \((\exists\)E). By Cut (Theorem 11), \(\Gamma_n \vdash \phi\) and
\(\Gamma_n \vdash \neg \phi\). So \(\Gamma_n\) is inconsistent,
contradicting the assumption. So \(\Gamma'\) is consistent.

Applying the Lindenbaum Lemma (Theorem 13), let \(\Gamma''\) be a maximally consistent set of sentences (of the expanded language) that contains \(\Gamma'\). So, of course, \(\Gamma''\) contains \(\Gamma\). We can now define an interpretation \(M\) such that \(M\) satisfies every member of \(\Gamma''\).

If we did not have a sign for identity in the language, we would let the domain of \(M\) be the collection of new constants \(\{c_0, c_1, \ldots \}\). But as it is, there may be a sentence in the form \(c_{i}=c_{j}\), with \(i\ne j\), in \(\Gamma''\). If so, we cannot have both \(c_i\) and \(c_j\) in the domain of the interpretation (as they are distinct constants). So we define the domain \(d\) of \(M\) to be the set \(\{c_i\) | there is no \(j\lt i\) such that \(c_{i}=c_{j}\) is in \(\Gamma''\}\). In other words, a constant \(c_i\) is in the domain of \(M\) if \(\Gamma''\) does not declare it to be identical to an earlier constant in the list. Notice that for each new constant \(c_i\), there is exactly one \(j\le i\) such that \(c_j\) is in \(d\) and the sentence \(c_{i}=c_{j}\) is in \(\Gamma''\).

We now define the interpretation function \(I\). Let \(a\) be any constant in the expanded language. By \((=\)I) and \((\exists\)I), \(\Gamma''\vdash \exists x x=a\), and so \(\exists x x=a \in \Gamma''\). By the construction of \(\Gamma'\), there is a sentence in the form \((\exists x x=a \rightarrow c_i =a)\) in \(\Gamma''\). We have that \(c_i =a\) is in \(\Gamma''\). As above, there is exactly one \(c_j\) in \(d\) such that \(c_{i}=c_{j}\) is in \(\Gamma''\). Let \(I(a)=c_j\). Notice that if \(c_i\) is a constant in the domain \(d\), then \(I\)(c\(_i)=c_i\). That is each \(c_i\) in \(d\) denotes itself.

Let \(P\) be a zero-place predicate letter in \(K\). Then \(I(P)\) is truth if \(P\) is in \(\Gamma''\) and \(I(P)\) is falsehood otherwise. Let \(Q\) be a one-place predicate letter in \(K\). Then \(I(Q)\) is the set of constants \(\{\)c\(_i | c_i\) is in \(d\) and the sentence \(Qc\) is in \(\Gamma''\}\). Let \(R\) be a binary predicate letter in \(K\). Then \(I(R)\) is the set of pairs of constants \(\{\langle c_i,c_j\rangle | c_i\) is in \(d, c_j\) is in \(d\), and the sentence \(Rc_{i}c_{j}\) is in \(\Gamma''\}\). Three-place predicates, etc. are interpreted similarly. In effect, \(I\) interprets the non-logical terminology as they are in \(\Gamma''\).

The final item in this proof is a lemma that for every sentence \(\theta\) in the expanded language, \(M\vDash \theta\) if and only if \(\theta\) is in \(\Gamma''\). This proceeds by induction on the complexity of \(\theta\). The case where \(\theta\) is atomic follows from the definitions of \(M\) (i.e., the domain \(d\) and the interpretation function \(I\)). The other cases follow from the various clauses in the definition of satisfaction.

Since \(\Gamma \subseteq \Gamma''\), we have that \(M\) satisfies every member of \(\Gamma\). By Theorem 15, the restriction of \(M\) to the original language \(\LKe\) and \(s\) also satisfies every member of \(\Gamma\). Thus \(\Gamma\) is satisfiable.

A converse to Soundness (Theorem 18) is a straightforward corollary:

**Theorem 21.** For any sentence \(\theta\) and set
\(\Gamma\) of sentences, if \(\Gamma \vDash \theta\), then \(\Gamma
\vdash_D \theta\).

**Proof:** Suppose that \(\Gamma \vDash \theta\). Then
there is no interpretation \(M\) such
that *M* satisfies every member of \(\Gamma\) but does not
satisfy \(\theta\). So the set \(\Gamma,\neg \theta\) is not
satisfiable. By Completeness (Theorem 20), \(\Gamma,\neg \theta\) is
inconsistent. So there is a sentence \(\phi\) such that \(\Gamma,\neg
\theta \vdash \phi\) and \(\Gamma,\neg \theta \vdash \neg \phi\). By
\((\neg\)I), \(\Gamma \vdash \neg \neg \theta\), and by (DNE) \(\Gamma
\vdash \theta\).

Our next item is a corollary of Theorem 9, Soundness (Theorem 18), and Completeness:

**Corollary 22. Compactness.** A set \(\Gamma\) of
sentences is satisfiable if and only if every finite subset of
\(\Gamma\) is satisfiable.

**Proof:** If \(M\) satisfies every member of
\(\Gamma\), then \(M\) satisfies every member of each finite subset
of \(\Gamma\). For the converse, suppose that \(\Gamma\) is not
satisfiable. Then we show that some finite subset of \(\Gamma\) is not
satisfiable. By Completeness (Theorem 20), \(\Gamma\) is
inconsistent. By Theorem 9 (and Weakening), there is a finite subset
\(\Gamma'\subseteq \Gamma\) such that \(\Gamma'\) is inconsistent. By
Corollary \(19, \Gamma'\) is not satisfiable.

Soundness and completeness together entail that an argument is deducible if and only if it is valid, and a set of sentences is consistent if and only if it is satisfiable. So we can go back and forth between model-theoretic and proof-theoretic notions, transferring properties of one to the other. Compactness holds in the model theory because all derivations use only a finite number of premises.

Recall that in the proof of Completeness (Theorem 20), we made the simplifying assumption that the set \(K\) of non-logical constants is either finite or denumerably infinite. The interpretation we produced was itself either finite or denumerably infinite. Thus, we have the following:

**Corollary 23. Löwenheim-Skolem Theorem.** Let
\(\Gamma\) be a satisfiable set of sentences of the language
\(\LKe\). If \(\Gamma\) is either finite or denumerably infinite, then
\(\Gamma\) has a model whose domain is either finite or denumerably
infinite.

In general, let \(\Gamma\) be a satisfiable set of sentences of \(\LKe\), and let \(\kappa\) be the larger of the size of \(\Gamma\) and denumerably infinite. Then \(\Gamma\) has a model whose domain is at most size \(\kappa\).

There is a stronger version of Corollary 23. Let \(M_1 =\langle
d_1,I_1\rangle\) and \(M_2 =\langle d_2,I_2\rangle\) be
interpretations of the language \(\LKe\). Define \(M_1\) to be
a *submodel* of \(M_2\) if \(d_1 \subseteq d_2, I_1 (c) = I_2
(c)\) for each constant \(c\), and \(I_1\) is the restriction of
\(I_2\) to \(d_1\). For example, if \(R\) is a binary relation letter
in \(K\), then for all \(a,b\) in \(d_1\), the pair \(\langle
a,b\rangle\) is in \(I_1 (R)\) if and only if \(\langle a,b\rangle\)
is in \(I_2 (R)\). If we had included function letters among the
non-logical terminology, we would also require that \(d_1\) be closed
under their interpretations in \(M_2\). Notice that if \(M_1\) is a
submodel of \(M_2\), then any variable-assignment on \(M_1\) is also a
variable-assignment on \(M_2\).

Say that two interpretations \(M_1 =\langle d_1,I_1\rangle, M_2
=\langle d_2,I_2\rangle\) are *equivalent* if one of them is a
submodel of the other, and for any formula of the language and any
variable-assignment \(s\) on the submodel, \(M_1,s\vDash \theta\) if
and only if \(M_2,s\vDash \theta\). Notice that if two
interpretations are equivalent, then they satisfy the same
sentences.

**Theorem 25. Downward Löwenheim-Skolem Theorem.**
Let \(M = \langle d,I\rangle\) be an interpretation of the language
\(\LKe\). Let \(d_1\) be any subset of \(d\), and let \(\kappa\) be
the maximum of the size of \(K\), the size of \(d_1\), and denumerably
infinite. Then there is a submodel \(M' = \langle d',I'\rangle\) of
\(M\) such that (1) \(d'\) is not larger than \(\kappa\), and (2)
\(M\) and \(M'\) are equivalent. In particular, if the set \(K\) of
non-logical terminology is either finite or denumerably infinite, then
any interpretation has an equivalent submodel whose domain is either
finite or denumerably infinite.

**Proof:** Like completeness, this proof is complex, and
we rest content with a sketch. The downward Löwenheim-Skolem
theorem invokes the axiom of choice, and indeed, is equivalent to the
axiom of choice (see the entry on
the axiom of choice).
So let \(C\) be a choice function on the powerset of \(d\), so that
for each non-empty subset \(e\subseteq d, C(e)\) is a member of
\(e\). We stipulate that if \(e\) is the empty set, then \(C(e)\) is
\(C(d)\).

Let \(s\) be a variable-assignment on \(M\), let \(\theta\) be a
formula of \(\LKe\), and let \(v\) be a variable. Define the
\(v\)-*witness of* \(\theta\) *over s*, written \(w_v
(\theta,s)\), as follows: Let \(q\) be the set of all elements \(c\in
d\) such that there is a variable-assignment \(s'\) on \(M\) that
agrees with \(s\) on every variable except possibly \(v\), such that
\(M,s'\vDash \theta\), and \(s'(v)=c\). Then \(w_v (\theta,s) =
C(q)\). Notice that if \(M,s\vDash \exists v\theta\), then \(q\) is
the set of elements of the domain that can go for \(v\) in
\(\theta\). Indeed, \(M,s\vDash \exists v\theta\) if and only if \(q\)
is non-empty. So if \(M,s\vDash \exists v\theta\), then \(w_v
(\theta,s)\) (i.e., \(C(q))\) is a chosen element of the domain that
can go for \(v\) in \(\theta\). In a sense, it is a
“witness” that verifies \(M,s\vDash \exists v\theta\).

If \(e\) is a non-empty subset of the domain \(d\), then define a
variable-assignment \(s\) to be an \(e\)-*assignment* if for
all variables \(u, s(u)\) is in \(e\). That is, \(s\) is an
\(e\)-assignment if \(s\) assigns an element of \(e\) to each
variable. Define \(sk(e)\), the
*Skolem-hull* of \(e\), to be the set:

That is, the Skolem-Hull of \(e\) is the set \(e\) together with every \(v\)-witness of every formula over every \(e\)-assignment. Roughly, the idea is to start with \(e\) and then throw in enough elements to make each existentially quantified formula true. But we cannot rest content with the Skolem-hull, however. Once we throw the “witnesses” into the domain, we need to deal with \(sk(e)\) assignments. In effect, we need a set which is its own Skolem-hull, and also contains the given subset \(d_1\).

We define a sequence of non-empty sets \(e_0, e_1,\ldots\) as follows: if the given subset \(d_1\) of \(d\) is empty and there are no constants in \(K\), then let \(e_0\) be \(C(d)\), the choice function applied to the entire domain; otherwise let \(e_0\) be the union of \(d_1\) and the denotations under \(I\) of the constants in \(K\). For each natural number \(n, e_{n+1}\) is \(sk(e_n)\). Finally, let \(d'\) be the union of the sets \(e_n\), and let \(I'\) be the restriction of \(I\) to \(d'\). Our interpretation is \(M' = \langle d',I'\rangle\).

Clearly, \(d_1\) is a subset of \(d'\), and so \(M'\) is a submodel of \(M\). Let \(\kappa\) be the maximum of the size of \(K\), the size of \(d_1\), and denumerably infinite. A calculation reveals that the size of \(d'\) is at most \(\kappa\), based on the fact that there are at most \(\kappa\)-many formulas, and thus, at most \(\kappa\)-many witnesses at each stage. Notice, incidentally, that this calculation relies on the fact that a denumerable union of sets of size at most \(\kappa\) is itself at most \(\kappa\). This also relies on the axiom of choice.

The final item is to show that \(M'\) is equivalent to \(M\): For every formula \(\theta\) and every variable-assignment \(s\) on \(M'\),

\[ M,s\vDash \theta \text{ if and only if } M',s\vDash \theta. \]The proof proceeds by induction on the complexity of \(\theta\). Unfortunately, space constraints require that we leave this step as an exercise.

Another corollary to Compactness (Corollary 22) is the opposite of the Löwenheim-Skolem theorem:

**Theorem 26. Upward Löwenheim-Skolem Theorem.**
Let \(\Gamma\) be any set of sentences of \(\LKe,\) such that for each
natural number \(n\), there is an interpretation \(M_n = \langle
d_n,I_n\rangle\), such that
\(d_n\) has at least \(n\) elements, and \(M_n\) satisfies every
member of \(\Gamma\). In other words, \(\Gamma\) is satisfiable and
there is no finite upper bound to the size of the interpretations that
satisfy every member of \(\Gamma\). Then for any infinite cardinal
\(\kappa\), there is an interpretation \(M=\langle d,I\rangle\), such that the size of \(d\) is *at
least* \(\kappa\) and \(M\) satisfies every member of
\(\Gamma\).

**Proof:** Add a collection of new constants
\(\{c_{\alpha} | \alpha \lt \kappa \}\), of size \(\kappa\), to the
language, so that if \(c\) is a constant in \(K\), then \(c_{\alpha}\)
is different from \(c\), and if \(\alpha \lt \beta \lt \kappa\), then
\(c_{\alpha}\) is a different constant than \(c_{\beta}\). Consider
the set of sentences \(\Gamma'\) consisting of \(\Gamma\) together with
the set \(\{\neg c_{\alpha}=c_{\beta} | \alpha \ne \beta \}\).
That is, \(\Gamma'\) consists of \(\Gamma\) together with statements
to the effect that any two different new constants denote different
objects. Let \(\Gamma''\) be any finite subset of \(\Gamma'\), and let
\(m\) be the number of new constants that occur in \(\Gamma''\). Then
expand the interpretation \(M_m\) to an interpretation \(M_m'\) of the
new language, by interpreting each of the new constants in
\(\Gamma''\) as a different member of the domain \(d_m\). By
hypothesis, there are enough members of \(d_m\) to do this. One can
interpret the other new constants at will. So \(M_m\) is a restriction
of \(M_m'\). By hypothesis (and Theorem 15), \(M'_m\) satisfies
every member of \(\Gamma\). Also \(M'_m\) satisfies the members of
\(\{\neg c_{\alpha}=c_{\beta} | \alpha \ne \beta \}\) that are in
\(\Gamma''\). So \(M'_m\) satisfies every member of \(\Gamma''\).
By compactness, there is an interpretation \(M = \langle d,I\rangle\)
such that \(M\) satisfies every
member of \(\Gamma'\). Since \(\Gamma'\) contains every member of
\(\{\neg c_{\alpha}=c_{\beta} | \alpha \ne \beta \}\), the domain
\(d\) of \(M\) must be of size at least \(\kappa\), since each of the
new constants must have a different denotation. By Theorem 15, the
restriction of \(M\) to the original language \(\LKe\) satisfies every
member of \(\Gamma\).

Combined, the proofs of the downward and upward Löwenheim-Skolem
theorems show that for any satisfiable set \(\Gamma\) of sentences, if
there is no finite bound on the models of \(\Gamma\), then for any
infinite cardinal \(\kappa\), there is a model of \(\Gamma\) whose
domain has size *exactly* \(\kappa\). Moreover, if \(M\) is any
interpretation whose domain is infinite, then for any infinite
cardinal \(\kappa\), there is an interpretation \(M'\) whose domain
has size exactly \(\kappa\) such that \(M\) and \(M'\) are
equivalent.

These results indicate a weakness in the expressive resources of first-order languages like \(\LKe\). No satisfiable set of sentences can guarantee that its models are all denumerably infinite, nor can any satisfiable set of sentences guarantee that its models are uncountable. So in a sense, first-order languages cannot express the notion of “denumerably infinite”, at least not in the model theory. (See the entry on second-order and higher-order logic.)

Let \(A\) be any set of sentences in a first-order language \(\LKe\),
where \(K\) includes terminology for arithmetic, and assume that every
member of \(A\) is true of the natural numbers. We can even let \(A\)
be the set of all sentences in \(\LKe\) that are true of the natural
numbers. Then \(A\) has uncountable models, indeed models of any
infinite cardinality. Such interpretations are among those that are
sometimes called *unintended*, or *non-standard* models
of arithmetic. Let \(B\) be any set of first-order sentences that are
true of the real numbers, and let \(C\) be any first-order
axiomatization of set theory. Then if \(B\) and \(C\) are satisfiable
(in infinite interpretations), then each of them has denumerably
infinite models. That is, any first-order, satisfiable set theory or
theory of the real numbers, has (unintended) models the size of the
natural numbers. This is despite the fact that a sentence (seemingly)
stating that the universe is uncountable is provable in most
set-theories. This situation, known as the
*Skolem paradox*, has generated much discussion, but we must
refer the reader elsewhere for a sample of it (see the entry on
Skolem’s paradox and
Shapiro 1996).

## 6. The One Right Logic?

Logic and reasoning go hand in hand. We say that someone has reasoned poorly about something if they have not reasoned logically, or that an argument is bad because it is not logically valid. To date, research has been devoted to exactly just what types of logical systems are appropriate for guiding our reasoning. Traditionally, classical logic has been the logic suggested as the ideal for guiding reasoning (for example, see Quine [1986], Resnik [1996] or Rumfitt [2015]). For this reason, classical logic has often been called “the one right logic”. See Priest [2006a] for a description of how being the best reasoning-guiding logic could make a logic the one right logic.

That classical logic has been given as the answer to which logic ought to guide reasoning is not unexpected. It has rules which are more or less intuitive, and is surprisingly simple for how strong it is. Plus, it is both sound and complete, which is an added bonus. There are some issues, though. As indicated in Section 5, there are certain expressive limitations to classical logic. Thus, much literature has been written challenging this status quo. This literature in general stems from three positions. The first is that classical logic is not reason-guiding because some other single logic is. Examples of this type of argument can be found in Brouwer [1949], Heyting [1956] and Dummett [2000] who argue that intuitionistic logic is correct, and Anderson and Belnap [1975], who argue relevance logic is correct, among many others. Further, some people propose that an extension of classical logic which can express the notion of “denumerably infinite” (see Shapiro [1991]). The second objection to the claim that classical logic is the one right logic comes from a different perspective: logical pluralists claim that classical logic is not the (single) one right logic, because more than one logic is right. See Beall and Restall [2006] and Shapiro [2014] for examples of this type of view (see also the entry on logical pluralism). Finally, the last objection to the claim that classical logic is the one right logic is that logic(s) is not reasoning-guiding, and so there is no one right logic.

Suffice it to say that, though classical logic has traditionally been thought of as “the one right logic”, this is not accepted by everyone. An interesting feature of these debates, though, is that they demonstrate clearly the strengths and weaknesses of various logics (including classical logic) when it comes to capturing reasoning.

## Bibliography

- Anderson, A. and N. Belnap [1975],
*Entailment: The logic of relevance and necessity I*, Princeton: Princeton University Press. - Anderson, A. and N. Belnap, and M. Dunn [1992],
*Entailment: The logic of relevance and necessity II*, Princeton: Princeton University Press. - Beall, Jc and G. Restall [2006],
*Logical Pluralism*, Oxford: Oxford University Press. - Brouwer, L.E.J. [1949], “Consciousness, Philosophy and
Mathematics”,
*Journal of Symbolic Logic*, 14(2): 132–133. - Cook, R. [2002], “Vagueness and mathematical
precision”,
*Mind*, 111: 227–247. - Corcoran, J. [1973], “Gaps between logical theory and
mathematical practice”,
*The methodological unity of science*, ed. by M. Bunge, Dordrecht: D. Reidel, 23–50. - Davidson, D. [1984],
*Inquiries into truth and interpretation*, Oxford: Clarendon Press. - Dummett, M. [2000],
*Elements of intuitionism*, second edition, Oxford: Oxford University Press. - Gödel, K. [1930], “Die Vollständigkeit der Axiome des
logischen Funktionenkalkuls”,
*Montatshefte für Mathematik und Physik 37*, 349–360; translated as “The completeness of the axioms of the functional calculus of logic”, in van Heijenoort [1967], 582–591. - Heyting, A. [1956],
*Intuitionism*, Amsterdam: North-Holland Publishing. - Lycan, W. [1984],
*Logical form in natural language*, Cambridge, MA: The MIT Press. - Montague, R. [1974],
*Formal philosophy*, ed. by R. Thomason, New Haven: Yale University Press. - Priest, G. [2006],
*In contradiction, a study of the transconsistent*, second, revised edition, Oxford: Clarendon Press. - ––– [2006a],
*Doubt truth to be a liar*, Oxford: Clarendon Press. - Quine, W. V. O. [1960],
*Word and object*, Cambridge, MA: The MIT Press. - ––– [1986],
*Philosophy of logic*, second edition, Englewood Cliffs: Prentice-Hall. - Resnik, M. [1996], “Ought there to be but one true
logic”, in
*Logic and Reality: Essays on the Legacy of Arthur Prior*, J. Copeland (ed.), Oxford: Oxford University Press, 489-517. - Rumfitt, I. [2015],
*The Boundary Stones of Thought: An Essay in the Philosophy of Logic*, Oxford: Oxford University Press. - Shapiro, S. [1991],
*Foundations without Foundationalism*, Oxford: Clarendon Press. - ––– [1996],
*The limits of logic: Second-order logic and the Skolem paradox*,*The international research library of philosophy*, Dartmouth Publishing Company, 1996. (An anthology containing many of the significant later papers on the Skolem paradox.) - ––– [1998], “Logical consequence: models and
modality”, in
*The philosophy of mathematics today*, edited by M. Schirn, Oxford: Oxford University Press, 131-156. - ––– [2014],
*Varieties of Logic*, Oxford: Oxford University Press. - Tennant, N. [1997],
*The taming of the true*, Oxford: Clarendon Press. - Van Heijenoort, J [1967],
*From Frege to Gödel*, Cambridge, Massachusetts: Harvard University Press. An anthology containing many of the major historical papers on mathematical logic in the early decades of the twentieth century.

### Further Reading

There are many fine textbooks on mathematical logic. A sample follows.- Boolos, G., J. P. Burgess, and R. Jeffrey
[2007],
*Computability and logic*, fifth edition, Cambridge, England: Cambridge University Press. Elementary and intermediate level. - Bergmann, M., J. Moor and J. Nelson [2013],
*The logic book*, sixth edition, New York: McGraw-Hill. Elementary and intermediate level. - Church, A. [1956],
*Introduction to mathematical logic*, Princeton: Princeton University Press. Classic textbook. - Enderton, H. [1972],
*A mathematical introduction to logic*, New York: Academic Press. Textbook in mathematical logic, aimed at a mathematical audience. - Forbes, G. [1994],
*Modern Logic*, Oxford: Oxford University Press. Elementary textbook. - Mendelson, E. [1987],
*Introduction to mathematical logic*, third edition, Princeton: van Nostrand. Intermediate.

## Academic Tools

How to cite this entry. Preview the PDF version of this entry at the Friends of the SEP Society. Look up this entry topic at the Internet Philosophy Ontology Project (InPhO). Enhanced bibliography for this entry at PhilPapers, with links to its database.

## Other Internet Resources

[Please contact the author with suggestions.]