Classical Logic
Typically, a logic consists of a formal or informal language together with a deductive system and/or a model-theoretic semantics. The language has components that correspond to a part of a natural language like English or Greek. The deductive system is to capture, codify, or simply record arguments that are valid for the given language, and the semantics is to capture, codify, or record the meanings, or truth-conditions for at least part of the language.
The following sections provide the basics of a typical logic, sometimes called “classical elementary logic” or “classical first-order logic”. Section 2 develops a formal language, with a rigorous syntax and grammar. The formal language is a recursively defined collection of strings on a fixed alphabet. As such, it has no meaning, or perhaps better, the meaning of its formulas is given by the deductive system and the semantics. Some of the symbols have counterparts in ordinary language. We define an argument to be a non-empty collection of sentences in the formal language, one of which is designated to be the conclusion. The other sentences (if any) in an argument are its premises. Section 3 sets up a deductive system for the language, in the spirit of natural deduction. An argument is derivable if there is a deduction from some or all of its premises to its conclusion. Section 4 provides a model-theoretic semantics. An argument is valid if there is no interpretation (in the semantics) in which its premises are all true and its conclusion false. This reflects the longstanding view that a valid argument is truth-preserving.
In Section 5, we turn to relationships between the deductive system and the semantics, and in particular, the relationship between derivability and validity. We show that an argument is derivable only if it is valid. This pleasant feature, called soundness, entails that no deduction takes one from true premises to a false conclusion. Thus, deductions preserve truth. Then we establish a converse, called completeness, that an argument is valid only if it is derivable. This shows that the deductive system is rich enough to provide a deduction for every valid argument. So there are enough deductions: all and only valid arguments are derivable. We briefly indicate other features of the logic, some of which are corollaries to soundness and completeness.
The final section, Section 6, is devoted to a brief examination of the philosophical position that classical logic is “the one right logic”.
- 1. Introduction
- 2. Language
- 3. Deduction
- 4. Semantics
- 5. Meta-theory
- 6. The One Right Logic?
- Bibliography
- Academic Tools
- Other Internet Resources
- Related Entries
1. Introduction
Today, logic is a branch of mathematics and a branch of philosophy. In most large universities, both departments offer courses in logic, and there is usually a lot of overlap between them. Formal languages, deductive systems, and model-theoretic semantics are mathematical objects and, as such, the logician is interested in their mathematical properties and relations. Soundness, completeness, and most of the other results reported below are typical examples. Philosophically, logic is at least closely related to the study of correct reasoning. Reasoning is an epistemic, mental activity. So logic is at least closely allied with epistemology. Logic is also a central branch of computer science, due, in part, to interesting computational relations in logical systems, and, in part, to the close connection between formal deductive argumentation and reasoning (see the entries on recursive functions, computability and complexity, and philosophy of computer science).
This raises questions concerning the philosophical relevance of the various mathematical aspects of logic. How do deducibility and validity, as properties of formal languages – sets of strings on a fixed alphabet – relate to correct reasoning? What do the mathematical results reported below have to do with the original philosophical issues concerning valid reasoning? This is an instance of the philosophical problem of explaining how mathematics applies to non-mathematical reality.
Typically, ordinary deductive reasoning takes place in a natural language, or perhaps a natural language augmented with some mathematical symbols. So our question begins with the relationship between a natural language and a formal language. Without attempting to be comprehensive, it may help to sketch several options on this matter.
One view is that the formal languages accurately exhibit actual features of certain fragments of a natural language. Some philosophers claim that declarative sentences of natural language have underlying logical forms and that these forms are displayed by formulas of a formal language. Other writers hold that (successful) declarative sentences express propositions; and formulas of formal languages somehow display the forms of these propositions. On views like this, the components of a logic provide the underlying deep structure of correct reasoning. A chunk of reasoning in natural language is correct if the forms underlying the sentences constitute a valid or deducible argument. See for example, Montague [1974], Davidson [1984], Lycan [1984] (and the entry on logical form).
Another view, held at least in part by Gottlob Frege and Gottfried Wilhelm Leibniz, is that because natural languages are fraught with vagueness and ambiguity, they should be replaced by formal languages. A similar view, held by W. V. O. Quine (e.g., [1960], [1986]), is that a natural language should be regimented, cleaned up for serious scientific and metaphysical work. One desideratum of the enterprise is that the logical structures in the regimented language should be transparent. It should be easy to “read off” the logical properties of each sentence. A regimented language is similar to a formal language regarding, for example, the explicitly presented rigor of its syntax and its truth conditions.
On a view like this, deducibility and validity represent idealizations of correct reasoning in natural language. A chunk of reasoning is correct to the extent that it corresponds to, or can be regimented by, a valid or deducible argument in a formal language.
When mathematicians and many philosophers engage in deductive reasoning, they occasionally invoke formulas in a formal language to help disambiguate, or otherwise clarify what they mean. In other words, sometimes formulas in a formal language are used in ordinary reasoning. This suggests that one might think of a formal language as an addendum to a natural language. Then our present question concerns the relationship between this addendum and the original language. What do deducibility and validity, as sharply defined on the addendum, tell us about correct deductive reasoning in general?
Another view is that a formal language is a mathematical model of a natural language in roughly the same sense as, say, a collection of point masses is a model of a system of physical objects, and the Bohr construction is a model of an atom. In other words, a formal language displays certain features of natural languages, or idealizations thereof, while ignoring or simplifying other features. The purpose of mathematical models is to shed light on what they are models of, without claiming that the model is accurate in all respects or that the model should replace what it is a model of. On a view like this, deducibility and validity represent mathematical models of (perhaps different aspects of) correct reasoning in natural languages. Correct chunks of deductive reasoning correspond, more or less, to valid or deducible arguments; incorrect chunks of reasoning roughly correspond to invalid or non-deducible arguments. See, for example, Corcoran [1973], Shapiro [1998], and Cook [2002].
There is no need to adjudicate this matter here. Perhaps the truth lies in a combination of the above options, or maybe some other option is the correct, or most illuminating one. We raise the matter only to lend some philosophical perspective to the formal treatment that follows.
2. Language
Here we develop the basics of a formal language, or to be precise, a class of formal languages. Again, a formal language is a recursively defined set of strings on a fixed alphabet. Some aspects of the formal languages correspond to, or have counterparts in, natural languages like English. Technically, this “counterpart relation” is not part of the formal development, but we will mention it from time to time, to motivate some of the features and results.
2.1 Building blocks
We begin with analogues of singular terms, linguistic items whose function is to denote a person or object. We call these terms. We assume a stock of individual constants. These are lower-case letters, near the beginning of the Roman alphabet, with or without numerical subscripts:
a, a₁, b₂₃, c, d₂₂, etc.

We envisage a potential infinity of individual constants. In the present system each constant is a single character, and so individual constants do not have an internal syntax. Thus we have an infinite alphabet. This could be avoided by taking a constant like d₂₂, for example, to consist of three characters, a lowercase “d” followed by a pair of subscript “2”s.
We also assume a stock of individual variables. These are lower-case letters, near the end of the alphabet, with or without numerical subscripts:
w, x, y₁₂, z, z₄, etc.

In ordinary mathematical reasoning, there are two functions terms need to fulfill. We need to be able to denote specific, but unspecified (or arbitrary) objects, and sometimes we need to express generality. In our system, we use some constants in the role of unspecified reference and variables to express generality. Both uses are recapitulated in the formal treatment below. Some logicians employ different symbols for unspecified objects (sometimes called “individual parameters”) and variables used to express generality.
Constants and variables are the only terms in our formal language, so all of our terms are simple, corresponding to proper names and some uses of pronouns. We call a term closed if it is not a variable. In general, we use v to represent variables, and t to represent a closed term, an individual constant. Some authors also introduce function letters, which allow complex terms corresponding to: “7+4” and “the wife of Bill Clinton”, or complex terms containing variables, like “the father of x” and “x/y”. Logic books aimed at mathematicians are likely to contain function letters, probably due to the centrality of functions in mathematical discourse. Books aimed at a more general audience (or at philosophy students), may leave out function letters, since it simplifies the syntax and theory. We follow the latter route here. This is an instance of a general tradeoff between presenting a system with greater expressive resources, at the cost of making its formal treatment more complex.
For each natural number n, we introduce a stock of n-place predicate letters. These are upper-case letters at the beginning or middle of the alphabet. A superscript indicates the number of places, and there may or may not be a subscript. For example,
A³, B³₂, P³, etc.

are three-place predicate letters. We often omit the superscript, when no confusion will result. We also add a special two-place predicate symbol “=” for identity.
Zero-place predicate letters are sometimes called “sentence letters”. They correspond to free-standing sentences whose internal structure does not matter. One-place predicate letters, called “monadic predicate letters”, correspond to linguistic items denoting properties, like “being a man”, “being red”, or “being a prime number”. Two-place predicate letters, called “binary predicate letters”, correspond to linguistic items denoting binary relations, like “is a parent of” or “is greater than”. Three-place predicate letters correspond to three-place relations, like “lies on a straight line between”. And so on.
The non-logical terminology of the language consists of its individual constants and predicate letters. The symbol “=”, for identity, is not a non-logical symbol. In taking identity to be logical, we provide explicit treatment for it in the deductive system and in the model-theoretic semantics. Most authors do the same, but there is some controversy over the issue (Quine [1986, Chapter 5]). If K is a set of constants and predicate letters, then we give the fundamentals of a language L1K= built on this set of non-logical terminology. It may be called the first-order language with identity on K. A similar language that lacks the symbol for identity (or which takes identity to be non-logical) may be called L1K, the first-order language without identity on K.
2.2 Atomic formulas
If V is an n-place predicate letter in K, and t1,…,tn are terms of K, then Vt1…tn is an atomic formula of L1K=. Notice that the terms t1,…,tn need not be distinct. Examples of atomic formulas include:
P⁴xaab, C¹x, C¹a, D⁰, A³abc.

The last one is an analogue of a statement that a certain relation (A) holds between three objects (a,b,c). If t1 and t2 are terms, then t1=t2 is also an atomic formula of L1K=. It corresponds to an assertion that t1 is identical to t2.
If an atomic formula has no variables, then it is called an atomic sentence. If it does have variables, it is called open. In the above list of examples, the first and second are open; the rest are sentences.
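The open/sentence distinction for atomic formulas can be sketched in a few lines of code. The helper below is an informal illustration only (its conventions, such as treating the letters w–z as the variables, follow this entry, but the function itself is not part of the formal development):

```python
# A rough illustration (not part of the formal development): an atomic
# formula is open iff at least one of its terms is a variable. Here
# variables are the lowercase letters w-z, per the conventions above.

VARIABLES = set("wxyz")

def is_open_atomic(formula: str) -> bool:
    """True iff the atomic formula contains a variable among its terms."""
    return any(ch in VARIABLES for ch in formula)

for f in ["P⁴xaab", "C¹x", "C¹a", "D⁰", "A³abc"]:
    print(f, "open" if is_open_atomic(f) else "sentence")
```

Run on the five examples above, it classifies the first two as open and the rest as sentences, matching the text.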
2.3 Compound formulas
We now introduce the final items of the lexicon:
¬, &, ∨, →, ∀, ∃, (, )

We give a recursive definition of a formula of L1K=:
1. All atomic formulas of L1K= are formulas of L1K=.
2. If θ is a formula of L1K=, then so is ¬θ.
A formula corresponding to ¬θ thus says that it is not the case that θ. The symbol “¬” is called “negation”, and is a unary connective.
3. If θ and ψ are formulas of L1K=, then so is (θ&ψ).
The ampersand “&” corresponds to the English “and” (when “and” is used to connect sentences). So (θ&ψ) can be read “θ and ψ”. The formula (θ&ψ) is called the “conjunction” of θ and ψ.
4. If θ and ψ are formulas of L1K=, then so is (θ∨ψ).
The symbol “∨” corresponds to “either … or … or both”, so (θ∨ψ) can be read “θ or ψ”. The formula (θ∨ψ) is called the “disjunction” of θ and ψ.
5. If θ and ψ are formulas of L1K=, then so is (θ→ψ).
The arrow “→” roughly corresponds to “if … then … ”, so (θ→ψ) can be read “if θ then ψ” or “θ only if ψ”.
The symbols “&”, “∨”, and “→” are called “binary connectives”, since they serve to “connect” two formulas into one. Some authors introduce (θ↔ψ) as an abbreviation of ((θ→ψ)&(ψ→θ)). The symbol “↔” is an analogue of the locution “if and only if”.
6. If θ is a formula of L1K= and v is a variable, then ∀vθ is a formula of L1K=.
The symbol “∀” is called a universal quantifier, and is an analogue of “for all”; so ∀vθ can be read “for all v,θ”.
7. If θ is a formula of L1K= and v is a variable, then ∃vθ is a formula of L1K=.
The symbol “∃” is called an existential quantifier, and is an analogue of “there exists” or “there is”; so ∃vθ can be read “there is a v such that θ”.
8. That’s all folks. That is, all formulas are constructed in accordance with clauses (1)–(7).
Clause (8) allows us to do inductions on the complexity of formulas. If a certain property holds of the atomic formulas and is closed under the operations presented in clauses (2)–(7), then the property holds of all formulas. Here is a simple example:
Theorem 1. Every formula of L1K= has the same number of left and right parentheses. Moreover, each left parenthesis corresponds to a unique right parenthesis, which occurs to the right of the left parenthesis. Similarly, each right parenthesis corresponds to a unique left parenthesis, which occurs to the left of the given right parenthesis. If a parenthesis occurs between a matched pair of parentheses, then its mate also occurs within that matched pair. In other words, parentheses that occur within a matched pair are themselves matched.
Proof: By clause (8), every formula is built up from the atomic formulas using clauses (2)–(7). The atomic formulas have no parentheses. Parentheses are introduced only in clauses (3)–(5), and each time they are introduced as a matched set. So at any stage in the construction of a formula, the parentheses are paired off.
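Theorem 1 amounts to the familiar fact that parentheses in a formula pair off like a well-nested string of brackets, which is easy to check mechanically. A minimal sketch in Python (the function name and the treatment of formulas as plain strings are my own, not part of the formal development):

```python
# Checking the parenthesis-balance property of Theorem 1 on a formula
# written as a plain string: scan left to right, tracking nesting depth.

def parens_balanced(formula: str) -> bool:
    """Return True iff left and right parentheses pair off correctly."""
    depth = 0
    for ch in formula:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:          # a right parenthesis with no mate to its left
                return False
    return depth == 0              # every left parenthesis found its mate

print(parens_balanced("((A&B)∨C)"))   # True
print(parens_balanced("((A&B)∨C"))    # False
```

The depth counter never going negative, and ending at zero, is exactly the "each right parenthesis has a unique mate to its left" condition of the theorem.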
We next define the notion of an occurrence of a variable being free or bound in a formula. A variable that immediately follows a quantifier (as in “∀x” and “∃y”) is neither free nor bound. We do not even think of those as occurrences of the variable. All variables that occur in an atomic formula are free. If a variable occurs free (or bound) in θ or in ψ, then that same occurrence is free (or bound) in ¬θ,(θ&ψ),(θ∨ψ), and (θ→ψ). That is, the (unary and binary) connectives do not change the status of variables that occur in them. All occurrences of the variable v in θ are bound in ∀vθ and ∃vθ. Any free occurrences of v in θ are bound by the initial quantifier. All other variables that occur in θ are free or bound in ∀vθ and ∃vθ, as they are in θ.
For example, in the formula (∀x(Axy ∨Bx)&Bx), the occurrences of “x” in Axy and in the first Bx are bound by the quantifier. The occurrence of “y” and last occurrence of “x” are free. In ∀x(Ax→∃xBx), the “x” in Ax is bound by the initial universal quantifier, while the other occurrence of x is bound by the existential quantifier. The above syntax allows this “double-binding”. Although it does not create any ambiguities (see below), we will avoid such formulas, as a matter of taste and clarity.
The syntax also allows so-called vacuous binding, as in ∀xBc. These, too, will be avoided in what follows. Some treatments of logic rule out vacuous binding and double binding as a matter of syntax. That simplifies some of the treatments below, and complicates others.
Free variables correspond to place-holders, while bound variables are used to express generality. If a formula has no free variables, then it is called a sentence. If a formula has free variables, it is called open.
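The recursive clauses for free and bound occurrences translate directly into code. Here is a hedged sketch, using a small abstract syntax tree of my own devising (tuples tagged "atom", "not", "and", "or", "if", "all", "exists") rather than the official string syntax:

```python
# A sketch of the free-variable computation; variables are the
# letters w-z, following the conventions of this entry.

def free_vars(phi):
    kind = phi[0]
    if kind == "atom":                  # all variables in an atomic formula are free
        return {t for t in phi[2] if t in "wxyz"}
    if kind == "not":
        return free_vars(phi[1])
    if kind in ("and", "or", "if"):     # connectives leave free/bound status alone
        return free_vars(phi[1]) | free_vars(phi[2])
    if kind in ("all", "exists"):       # a quantifier binds its own variable
        return free_vars(phi[2]) - {phi[1]}
    raise ValueError(f"unknown node: {kind}")

# (∀x(Axy ∨ Bx) & Bx): "y" and the final "x" are free, as in the text.
phi = ("and",
       ("all", "x", ("or", ("atom", "A", ["x", "y"]),
                           ("atom", "B", ["x"]))),
       ("atom", "B", ["x"]))
print(sorted(free_vars(phi)))   # ['x', 'y']
```

On this encoding, a formula is a sentence exactly when the function returns the empty set.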
2.4 Features of the syntax
Before turning to the deductive system and semantics, we mention a few features of the language, as developed so far. This helps draw the contrast between formal languages and natural languages like English.
We assume at the outset that all of the categories are disjoint. For example, no connective is also a quantifier or a variable, and the non-logical terms are not also parentheses or connectives. Also, the items within each category are distinct. For example, the sign for disjunction does not do double-duty as the negation symbol, and perhaps more significantly, no two-place predicate is also a one-place predicate.
One difference between natural languages like English and formal languages like L1K= is that the latter are not supposed to have any ambiguities. The policy that the different categories of symbols do not overlap, and that no symbol does double-duty, avoids the kind of ambiguity, sometimes called “equivocation”, that occurs when a single word has two meanings: “I’ll meet you at the bank.” But there are other kinds of ambiguity. Consider the English sentence:
John is married, and Mary is single, or Joe is crazy.
It can mean that John is married and either Mary is single or Joe is crazy, or else it can mean that either both John is married and Mary is single, or else Joe is crazy. An ambiguity like this, due to different ways to parse the same sentence, is sometimes called an “amphiboly”. If our formal language did not have the parentheses in it, it would have amphibolies. For example, there would be a “formula” A&B∨ C. Is this supposed to be ((A&B)∨C), or is it (A&(B∨C))? The parentheses resolve what would be an amphiboly.
Can we be sure that there are no other amphibolies in our language? That is, can we be sure that each formula of L1K= can be put together in only one way? Our next task is to answer this question.
Let us temporarily use the term “unary marker” for the negation symbol (¬) or a quantifier followed by a variable (e.g., ∀x,∃z).
Lemma 2. Each formula consists of a string of zero or more unary markers followed by either an atomic formula or a formula produced using a binary connective, via one of clauses (3)–(5).
Proof: We proceed by induction on the complexity of the formula or, in other words, on the number of formation rules that are applied. The Lemma clearly holds for atomic formulas. Let n be a natural number, and suppose that the Lemma holds for any formula constructed from n or fewer instances of clauses (2)–(7). Let θ be a formula constructed from n+1 instances. The Lemma holds if the last clause used to construct θ was either (3), (4), or (5). If the last clause used to construct θ was (2), then θ is ¬ψ. Since ψ was constructed with n instances of the rule, the Lemma holds for ψ (by the induction hypothesis), and so it holds for θ. Similar reasoning shows the Lemma to hold for θ if the last clause was (6) or (7). By clause (8), this exhausts the cases, and so the Lemma holds for θ, by induction.
Lemma 3. If a formula θ contains a left parenthesis, then it ends with a right parenthesis, which matches the leftmost left parenthesis in θ.
Proof: Here we also proceed by induction on the number of instances of (2)–(7) used to construct the formula. Clearly, the Lemma holds for atomic formulas, since they have no parentheses. Suppose, then, that the Lemma holds for formulas constructed with n or fewer instances of (2)–(7), and let θ be constructed with n+1 instances. If the last clause applied was (3)–(5), then the Lemma holds since θ itself begins with a left parenthesis and ends with the matching right parenthesis. If the last clause applied was (2), then θ is ¬ψ, and the induction hypothesis applies to ψ. Similarly, if the last clause applied was (6) or (7), then θ consists of a quantifier, a variable, and a formula to which we can apply the induction hypothesis. It follows that the Lemma holds for θ.
Lemma 4. Each formula contains at least one atomic formula.
The proof proceeds by induction on the number of instances of (2)–(7) used to construct the formula, and we leave it as an exercise.
Theorem 5. Let α,β be nonempty sequences of characters on our alphabet, such that αβ (i.e., α followed by β) is a formula. Then α is not a formula.
Proof: By Theorem 1 and Lemma 3, if α contains a left parenthesis, then the right parenthesis that matches the leftmost left parenthesis in αβ comes at the end of αβ, and so the matching right parenthesis is in β. So, α has more left parentheses than right parentheses. By Theorem 1, α is not a formula. So now suppose that α does not contain any left parentheses. By Lemma 2, αβ consists of a string of zero or more unary markers followed by either an atomic formula or a formula produced using a binary connective, via one of clauses (3)–(5). If the latter formula was produced via one of clauses (3)–(5), then it begins with a left parenthesis. Since α does not contain any parentheses, it must be a string of unary markers. But then α does not contain any atomic formulas, and so by Lemma 4, α is not a formula. The only case left is where αβ consists of a string of unary markers followed by an atomic formula, either in the form t1=t2 or Pt1…tn. Again, if α just consisted of unary markers, it would not be a formula, and so α must consist of the unary markers that start αβ, followed by either t1 by itself, t1= by itself, or the predicate letter P, and perhaps some (but not all) of the terms t1,…,tn. In the first two cases, α does not contain an atomic formula, by the policy that the categories do not overlap. Since P is an n-place predicate letter, by the policy that the predicate letters are distinct, P is not an m-place predicate letter for any m≠n. So the part of α that consists of P followed by the terms is not an atomic formula. In all of these cases, then, α does not contain an atomic formula. By Lemma 4, α is not a formula.
We are finally in position to show that there is no amphiboly in our language.
Theorem 6. Let θ be any formula of L1K=. If θ is not atomic, then there is one and only one among (2)–(7) that was the last clause applied to construct θ. That is, θ could not be produced by two different clauses. Moreover, no formula produced by clauses (2)–(7) is atomic.
Proof: By Clause (8), either θ is atomic or it was produced by one of clauses (2)–(7). Thus, the first symbol in θ must be either a predicate letter, a term, a unary marker, or a left parenthesis. If the first symbol in θ is a predicate letter or term, then θ is atomic. In this case, θ was not produced by any of (2)–(7), since all such formulas begin with something other than a predicate letter or term. If the first symbol in θ is a negation sign “¬”, then θ was produced by clause (2), and not by any other clause (since the other clauses produce formulas that begin with either a quantifier or a left parenthesis). Similarly, if θ begins with a universal quantifier, then it was produced by clause (6), and not by any other clause, and if θ begins with an existential quantifier, then it was produced by clause (7), and not by any other clause. The only case left is where θ begins with a left parenthesis. In this case, it must have been produced by one of (3)–(5), and not by any other clause. We only need to rule out the possibility that θ was produced by more than one of (3)–(5). To take an example, suppose that θ was produced by (3) and (4). Then θ is (ψ1&ψ2) and θ is also (ψ3∨ψ4), where ψ1,ψ2,ψ3, and ψ4 are themselves formulas. That is, (ψ1&ψ2) is the very same formula as (ψ3∨ψ4). By Theorem 5, ψ1 cannot be a proper part of ψ3, nor can ψ3 be a proper part of ψ1. So ψ1 must be the same formula as ψ3. But then “&” must be the same symbol as “∨”, and this contradicts the policy that all of the symbols are different. So θ was not produced by both Clause (3) and Clause (4). Similar reasoning takes care of the other combinations.
This result is sometimes called “unique readability”. It shows that each formula is produced from the atomic formulas via the various clauses in exactly one way. If θ was produced by clause (2), then its main connective is the initial “¬”. If θ was produced by clauses (3), (4), or (5), then its main connective is the introduced “&”, “∨”, or “→”, respectively. If θ was produced by clauses (6) or (7), then its main connective is the initial quantifier. We apologize for the tedious details. We included them to indicate the level of precision and rigor for the syntax.
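Unique readability is what makes mechanical parsing possible: in a formula produced by clauses (3)–(5), the main connective is the one binary connective occurring at parenthesis depth 1. A rough Python sketch of this observation (the function name and string conventions are mine; it assumes single-character variables after quantifiers, as in this entry):

```python
# Finding the main connective of a formula string, relying on the
# unique-readability property: scan for the binary connective that
# sits at parenthesis depth 1.

def main_connective(theta: str):
    """Return (connective, position), or flag the unary/atomic cases."""
    if theta.startswith("¬"):
        return ("¬", 0)
    if theta.startswith(("∀", "∃")):
        return (theta[:2], 0)               # quantifier plus its variable
    if not theta.startswith("("):
        return ("atomic", None)
    depth = 0
    for i, ch in enumerate(theta):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch in "&∨→" and depth == 1:    # the unique connective at depth 1
            return (ch, i)
    raise ValueError("not a formula produced by clauses (3)-(5)")

print(main_connective("((A&B)∨C)"))   # ('∨', 6)
print(main_connective("(A&(B∨C))"))   # ('&', 2)
```

The two formulas that an unparenthesized A&B∨C might abbreviate come apart here: the parentheses determine a unique connective at depth 1, so there is no amphiboly.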
3. Deduction
We now introduce a deductive system, D, for our languages. As above, we define an argument to be a non-empty collection of sentences in the formal language, one of which is designated to be the conclusion. If there are any other sentences in the argument, they are its premises.[1] By convention, we use “Γ”, “Γ′”, “Γ1”, etc., to range over sets of sentences, and we use the letters “ϕ”, “ψ”, “θ”, uppercase or lowercase, with or without subscripts, to range over single sentences. We write “Γ,Γ′” for the union of Γ and Γ′, and “Γ,ϕ” for the union of Γ with {ϕ}.
We write an argument in the form ⟨Γ,ϕ⟩, where Γ is a set of sentences, the premises, and ϕ is a single sentence, the conclusion. Remember that Γ may be empty. We write Γ⊢ϕ to indicate that ϕ is deducible from Γ, or, in other words, that the argument ⟨Γ,ϕ⟩ is deducible in D. We may write Γ⊢Dϕ to emphasize the deductive system D. We write ⊢ϕ or ⊢Dϕ to indicate that ϕ can be deduced (in D) from the empty set of premises.
The rules in D are chosen to match logical relations concerning the English analogues of the logical terminology in the language. Again, we define the deducibility relation by recursion. We start with a rule of assumptions:
- (As)
- If ϕ is a member of Γ, then Γ⊢ϕ.
We thus have that {ϕ}⊢ϕ; each premise follows from itself. We next present two clauses for each connective and quantifier. The clauses indicate how to “introduce” and “eliminate” sentences in which each symbol is the main connective.
First, recall that “&” is an analogue of the English connective “and”. Intuitively, one can deduce a sentence in the form (θ&ψ) if one has deduced θ and one has deduced ψ. Conversely, one can deduce θ from (θ&ψ) and one can deduce ψ from (θ&ψ):
- (&I)
- If Γ1⊢θ and Γ2⊢ψ, then Γ1,Γ2⊢(θ&ψ).
- (&E)
- If Γ⊢(θ&ψ) then Γ⊢θ; and if Γ⊢(θ&ψ) then Γ⊢ψ.
The name “&I” stands for “&-introduction”; “&E” stands for “&-elimination”.
Since the symbol “∨” corresponds to the English “or”, (θ∨ψ) should be deducible from θ, and (θ∨ψ) should also be deducible from ψ:
- (∨I)
- If Γ⊢θ then Γ⊢(θ∨ψ); if Γ⊢ψ then Γ⊢(θ∨ψ).
The elimination rule is a bit more complicated. Suppose that “θ or ψ” is true. Suppose also that ϕ follows from θ and that ϕ follows from ψ. One can reason that if θ is true, then ϕ is true. If instead ψ is true, we still have that ϕ is true. So either way, ϕ must be true.
- (∨E)
- If Γ1⊢(θ∨ψ),Γ2,θ⊢ϕ and Γ3,ψ⊢ϕ, then Γ1,Γ2,Γ3⊢ϕ.
For the next clauses, recall that the symbol, “→”, is an analogue of the English “if … then … ” construction. If one knows, or assumes (θ→ψ) and also knows, or assumes θ, then one can conclude ψ. Conversely, if one deduces ψ from an assumption θ, then one can conclude that (θ→ψ).
- (→I)
- If Γ,θ⊢ψ, then Γ⊢(θ→ψ).
- (→E)
- If Γ1⊢(θ→ψ) and Γ2⊢θ, then Γ1,Γ2⊢ψ.
This elimination rule is sometimes called “modus ponens”. In some logic texts, the introduction rule is proved as a “deduction theorem”.
Our next clauses are for the negation sign, “¬”. The underlying idea is that a sentence ψ is inconsistent with its negation ¬ψ. They cannot both be true. We call a pair of sentences ψ,¬ψ contradictory opposites. If one can deduce such a pair from an assumption θ, then one can conclude that θ is false, or, in other words, one can conclude ¬θ.
- (¬I)
- If Γ1,θ⊢ψ and Γ2,θ⊢¬ψ, then Γ1,Γ2⊢¬θ.
By (As), we have that {A,¬A}⊢A and {A,¬A}⊢¬A. So by ¬I we have that {A}⊢¬¬A. However, we do not have the converse yet. Intuitively, ¬¬θ corresponds to “it is not the case that it is not the case that θ”. One might think that this last is equivalent to θ, and we have a rule to that effect:
- (DNE)
- If Γ⊢¬¬θ, then Γ⊢θ.
The name DNE stands for “double-negation elimination”. There is some controversy over this inference. It is rejected by philosophers and mathematicians who do not hold that each meaningful sentence is either true or not true. Intuitionistic logic does not sanction the inference in question (see, for example Dummett [2000], or the entry on intuitionistic logic, or history of intuitionistic logic), but, again, classical logic does.
To illustrate the parts of the deductive system D presented thus far, we show that ⊢(A∨¬A):
- (i) {¬(A∨¬A),A}⊢¬(A∨¬A), by (As).
- (ii) {¬(A∨¬A),A}⊢A, by (As).
- (iii) {¬(A∨¬A),A}⊢(A∨¬A), by (∨I), from (ii).
- (iv) {¬(A∨¬A)}⊢¬A, by (¬I), from (i) and (iii).
- (v) {¬(A∨¬A),¬A}⊢¬(A∨¬A), by (As).
- (vi) {¬(A∨¬A),¬A}⊢¬A, by (As).
- (vii) {¬(A∨¬A),¬A}⊢(A∨¬A), by (∨I), from (vi).
- (viii) {¬(A∨¬A)}⊢¬¬A, by (¬I), from (v) and (vii).
- (ix) ⊢¬¬(A∨¬A), by (¬I), from (iv) and (viii).
- (x) ⊢(A∨¬A), by (DNE), from (ix).
The principle (θ∨¬θ) is sometimes called the law of excluded middle. It is not valid in intuitionistic logic.
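The derivation above can be replayed mechanically. The following Python sketch encodes each step as a pair (Γ, φ), with Γ a frozenset of sentence strings, and encodes each rule of D used in the derivation as a small function that checks its side conditions. The encoding is my own and is only meant to illustrate how the rules chain together:

```python
# A toy replay of the derivation of ⊢(A∨¬A); each rule asserts its
# side conditions, so a misapplied rule raises an error.

def As(gamma, phi):                 # (As): if φ ∈ Γ then Γ ⊢ φ
    assert phi in gamma
    return (gamma, phi)

def orI(step, psi, left=True):      # (∨I): from Γ ⊢ θ infer Γ ⊢ (θ∨ψ)
    gamma, theta = step
    return (gamma, f"({theta}∨{psi})" if left else f"({psi}∨{theta})")

def negI(step1, step2, theta):      # (¬I): from Γ1,θ ⊢ ψ and Γ2,θ ⊢ ¬ψ infer Γ1,Γ2 ⊢ ¬θ
    (g1, psi), (g2, negpsi) = step1, step2
    assert negpsi == "¬" + psi and theta in g1 and theta in g2
    return ((g1 | g2) - {theta}, "¬" + theta)

def DNE(step):                      # (DNE): from Γ ⊢ ¬¬θ infer Γ ⊢ θ
    gamma, phi = step
    assert phi.startswith("¬¬")
    return (gamma, phi[2:])

G = frozenset({"¬(A∨¬A)", "A"})
H = frozenset({"¬(A∨¬A)", "¬A"})
s1 = As(G, "¬(A∨¬A)")               # (i)
s2 = As(G, "A")                     # (ii)
s3 = orI(s2, "¬A")                  # (iii)  {¬(A∨¬A),A} ⊢ (A∨¬A)
s4 = negI(s3, s1, "A")              # (iv)   {¬(A∨¬A)} ⊢ ¬A
s5 = As(H, "¬(A∨¬A)")               # (v)
s6 = As(H, "¬A")                    # (vi)
s7 = orI(s6, "A", left=False)       # (vii)  {¬(A∨¬A),¬A} ⊢ (A∨¬A)
s8 = negI(s7, s5, "¬A")             # (viii) {¬(A∨¬A)} ⊢ ¬¬A
s9 = negI(s4, s8, "¬(A∨¬A)")        # (ix)   ⊢ ¬¬(A∨¬A)
s10 = DNE(s9)                       # (x)    ⊢ (A∨¬A)
print(s10)
```

The final step yields the pair with an empty premise set and conclusion (A∨¬A), i.e., ⊢(A∨¬A).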
Let θ,¬θ be a pair of contradictory opposites, and let ψ be any sentence at all. By (As) we have {θ,¬θ,¬ψ}⊢θ and {θ,¬θ,¬ψ}⊢¬θ. So by (¬I), {θ,¬θ}⊢¬¬ψ. So, by (DNE) we have {θ,¬θ}⊢ψ. That is, anything at all follows from a pair of contradictory opposites. Some logicians introduce a rule to codify a similar inference:
If Γ1⊢θ and Γ2⊢¬θ, then for any sentence ψ, Γ1,Γ2⊢ψ.
The inference is sometimes called ex falso quodlibet or, more colorfully, explosion. Some call it “¬-elimination”, but perhaps this stretches the notion of “elimination” a bit. We do not officially include ex falso quodlibet as a separate rule in D, but as will be shown below (Theorem 10), each instance of it is derivable in our system D.
Some logicians object to ex falso quodlibet, on the ground that the sentence ψ may be irrelevant to any of the premises in Γ. Suppose, for example, that one starts with some premises Γ about human nature and facts about certain people, and then deduces both the sentence “Clinton had extra-marital sexual relations” and “Clinton did not have extra-marital sexual relations”. One can perhaps conclude that there is something wrong with the premises Γ. But should we be allowed to then deduce anything at all from Γ? Should we be allowed to deduce “The economy is sound”?
A small minority of logicians, called dialetheists, hold that some contradictions are actually true. For them, ex falso quodlibet is not truth-preserving.
Deductive systems that demur from ex falso quodlibet are called paraconsistent. Most relevant logics are paraconsistent. See the entries on relevance logic, paraconsistent logic, and dialetheism. Or see Anderson and Belnap [1975], Anderson, Belnap, and Dunn [1992], and Tennant [1997] for fuller overviews of relevant logic; and Priest [2006a,b], for dialetheism. Deep philosophical issues concerning the nature of logical consequence are involved. Far be it for an article in a philosophy encyclopedia to avoid philosophical issues, but space considerations preclude a fuller treatment of this issue here. Suffice it to note that the inference ex falso quodlibet is sanctioned in systems of classical logic, the subject of this article. It is essential to establishing the balance between the deductive system and the semantics (see §5 below).
The next pieces of D are the clauses for the quantifiers. Let θ be a formula, v a variable, and t a term (i.e., a variable or a constant). Then define θ(v|t) to be the result of substituting t for each free occurrence of v in θ. So, if θ is (Qx&∃xPxy), then θ(x|c) is (Qc&∃xPxy). The last occurrence of x is not free.
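The substitution operation θ(v|t) can be sketched in code. The following is a minimal illustration only, not the article’s official definition: formulas are represented here as nested tuples (an encoding assumed purely for illustration), and the function walks the formula, stopping at any quantifier that binds v, so that only free occurrences are replaced.

```python
# A sketch of the substitution operation theta(v|t), assuming a tuple-based
# representation of formulas (illustrative, not the article's official syntax).
# Atomic: ('P', 'x', 'y') or ('=', t1, t2); connectives: ('not', f),
# ('and', f, g), ('or', f, g), ('implies', f, g);
# quantifiers: ('forall', 'x', f), ('exists', 'x', f).

def substitute(formula, v, t):
    """Replace every FREE occurrence of variable v in formula with term t."""
    op = formula[0]
    if op in ('forall', 'exists'):
        var, body = formula[1], formula[2]
        if var == v:
            # v is bound here, so occurrences inside the body are not free
            return formula
        return (op, var, substitute(body, v, t))
    if op == 'not':
        return (op, substitute(formula[1], v, t))
    if op in ('and', 'or', 'implies'):
        return (op, substitute(formula[1], v, t), substitute(formula[2], v, t))
    # atomic formula: replace v wherever it occurs among the terms
    return (op,) + tuple(t if arg == v else arg for arg in formula[1:])

# The article's example: if theta is (Qx & ∃xPxy), then theta(x|c) is (Qc & ∃xPxy)
theta = ('and', ('Q', 'x'), ('exists', 'x', ('P', 'x', 'y')))
print(substitute(theta, 'x', 'c'))
# the occurrence of x bound by ∃x is left untouched
```

The base case mirrors the informal definition exactly: substitution recurses through connectives, but a quantifier binding v blocks it, which is why the last occurrence of x in the article’s example survives.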
A sentence in the form ∀vθ is an analogue of the English “for every v,θ holds”. So one should be able to infer θ(v|t) from ∀vθ for any closed term t. Recall that the only closed terms in our system are constants.
- (∀E)
- If Γ⊢∀vθ, then Γ⊢θ(v|t), for any closed term t.
The idea here is that if ∀vθ is true, then θ should hold of t, no matter what t is.
The introduction clause for the universal quantifier is a bit more complicated. Suppose that a sentence θ contains a closed term t, and that θ has been deduced from a set of premises Γ. If the closed term t does not occur in any member of Γ, then θ will hold no matter which object t may denote. That is, ∀vθ follows.
- (∀I)
- For any closed term t, if Γ⊢θ(v|t), then Γ⊢∀vθ, provided that t does not occur in any member of Γ or in θ.
This rule (∀I) corresponds to a common inference in mathematics. Suppose that a mathematician says “let n be a natural number” and goes on to show that n has a certain property P, without assuming anything about n (except that it is a natural number). She then reminds the reader that n is “arbitrary”, and concludes that P holds for all natural numbers. The condition that the term t not occur in any premise is what guarantees that it is indeed “arbitrary”. It could be any object, and so anything we conclude about it holds for all objects.
The existential quantifier is an analogue of the English expression “there exists”, or perhaps just “there is”. If we have established (or assumed) that a given object t has a given property, then it follows that there is something that has that property.
- (∃I)
- For any closed term t, if Γ⊢θ(v|t) then Γ⊢∃vθ.
The elimination rule for ∃ is not quite as simple:
- (∃E)
- For any closed term t, if Γ1⊢∃vθ and Γ2,θ(v|t)⊢ϕ, then Γ1,Γ2⊢ϕ, provided that t does not occur in ϕ, Γ2 or θ.
This elimination rule also corresponds to a common inference. Suppose that a mathematician assumes or somehow concludes that there is a natural number with a given property P. She then says “let n be such a natural number, so that Pn”, and goes on to establish a sentence ϕ, which does not mention the number n. If the derivation of ϕ does not invoke anything about n (other than the assumption that it has the given property P), then n could have been any number that has the property P. That is, n is an arbitrary number with property P. It does not matter which number n is. Since ϕ does not mention n, it follows from the assertion that something has property P. The provisions added to (∃E) are to guarantee that t is “arbitrary”.
The final items are the rules for the identity sign “=”. The introduction rule is about as simple as can be:
- (=I)
- Γ⊢t=t, where t is any closed term.
This “inference” corresponds to the truism that everything is identical to itself. The elimination rule corresponds to a principle that if a is identical to b, then anything true of a is also true of b.
- (=E)
- For any closed terms t1 and t2, if Γ1⊢t1=t2 and Γ2⊢θ, then Γ1,Γ2⊢θ′, where θ′ is obtained from θ by replacing one or more occurrences of t1 with t2.
The rule (=E) indicates a certain restriction in the expressive resources of our language. Suppose, for example, that Harry is identical to Donald (since his mischievous parents gave him two names). According to most people’s intuitions, it would not follow from this and “Dick knows that Harry is wicked” that “Dick knows that Donald is wicked”, for the reason that Dick might not know that Harry is identical to Donald. Contexts like this, in which identicals cannot safely be substituted for each other, are called “opaque”. We assume that our language L1K= has no opaque contexts.
One final clause completes the description of the deductive system D:
- (*)
- That’s all folks. Γ⊢θ only if θ follows from members of Γ by the above rules.
Again, this clause allows proofs by induction on the rules used to establish an argument. If a property of arguments holds of all instances of (As) and (=I), and if the other rules preserve the property, then every argument that is deducible in D enjoys the property in question.
Before moving on to the model theory for L1K=, we pause to note a few features of the deductive system. To illustrate the level of rigor, we begin with a lemma that if a sentence does not contain a particular closed term, we can make small changes to the set of sentences we prove it from without problems. We allow ourselves the liberty here of extending some previous notation: for any terms t and t′, and any formula θ, we say that θ(t|t′) is the result of replacing all free occurrences of t in θ with t′.
Lemma 7. If Γ1 and Γ2 differ only in that wherever Γ1 contains θ, Γ2 contains θ(t|t′), then for any sentence ϕ not containing t or t′, if Γ1⊢ϕ then Γ2⊢ϕ.
Proof: The proof proceeds by induction on the number of steps in the proof of ϕ. Crucial to this proof is the fact that θ=θ(t|t′) whenever θ does not contain t or t′. When the number of steps in the proof of ϕ is one, the last (and only) rule applied is (As) or (=I). Then, since ϕ does not contain t or t′, if Γ1⊢ϕ we simply apply the same rule ((As) or (=I)) to Γ2 to get Γ2⊢ϕ. Assume that there are n>1 steps in the proof of ϕ, and that Lemma 7 holds for any proof with fewer than n steps. Suppose that the nth rule applied to Γ1 was (&I). Then ϕ is ψ&χ, and Γ1⊢ψ&χ. But then we know that previous steps in the proof include Γ1⊢ψ and Γ1⊢χ, and by induction, we have Γ2⊢ψ and Γ2⊢χ, since neither ψ nor χ contains t or t′. So, we simply apply (&I) to Γ2 to get Γ2⊢ψ&χ as required. Suppose now that the last step applied in the proof of Γ1⊢ϕ was (&E). Then, at a previous step in the proof of ϕ, we know Γ1⊢ϕ&ψ for some sentence ψ. If ψ does not contain t, then we simply apply (&E) to Γ2 to obtain the desired result. The only complication is if ψ contains t. Then we would have that Γ2⊢(ϕ&ψ)(t|t′). But, since (ϕ&ψ)(t|t′) is ϕ(t|t′)&ψ(t|t′), and ϕ(t|t′) is just ϕ, we can just apply (&E) to get Γ2⊢ϕ as required. The cases for the other rules are similar.
Theorem 8. The rule of Weakening. If Γ1⊢ϕ and Γ1⊆Γ2, then Γ2⊢ϕ.
Proof: Again, we proceed by induction on the number of rules that were used to arrive at Γ1⊢ϕ. Suppose that n>0 is a natural number, and that the theorem holds for any argument that was derived using fewer than n rules. Suppose that Γ1⊢ϕ was derived using exactly n rules. If n=1, then the rule is either (As) or (=I). In these cases, Γ2⊢ϕ by the same rule. If the last rule applied was (&I), then ϕ has the form (θ&ψ), and we have Γ3⊢θ and Γ4⊢ψ, with Γ1=Γ3,Γ4. We apply the induction hypothesis to the deductions of θ and ψ to get Γ2⊢θ and Γ2⊢ψ, and then apply (&I) to the result to get Γ2⊢ϕ. Most of the other cases are exactly like this. Slight complications arise only in the rules (∀I) and (∃E), because there we have to pay attention to the conditions for the rules.
Suppose that the last rule applied to get Γ1⊢ϕ is (∀I). So ϕ is a sentence of the form ∀vθ, and we have Γ1⊢θ(v|t) and t does not occur in any member of Γ1 or in θ. The problem is that t may occur in a member of Γ2, and so we cannot just invoke the induction hypothesis and apply (∀I) to the result. So, let t′ be a term not occurring in any sentence in Γ2. Let Γ′ be the result of substituting t′ for all t in Γ2. Then, since t does not occur in Γ1, Γ1⊆Γ′. So, the induction hypothesis gives us Γ′⊢θ(v|t), and we know that Γ′ does not contain t, so we can apply (∀I) to get Γ′⊢∀vθ. But ∀vθ does not contain t or t′, so Γ2⊢∀vθ by Lemma 7.
Suppose that the last rule applied was (∃E): we have Γ3⊢∃vθ and Γ4,θ(v|t)⊢ϕ, with Γ1 being Γ3,Γ4, and t not in ϕ, Γ4, or θ. If t does not occur in Γ2, we apply the induction hypothesis to get Γ2⊢∃vθ, and then (∃E) to end up with Γ2⊢ϕ. If t does occur in Γ2, then we follow a procedure similar to the one for (∀I), using Lemma 7.
Theorem 8 allows us to add premises at will. It follows that Γ⊢ϕ if and only if there is a subset Γ′⊆Γ such that Γ′⊢ϕ. Weakening fails in some systems of relevant logic and in other substructural logics (see the entries on relevance logic, substructural logics, and linear logic).
By clause (*), all derivations are established in a finite number of steps. So we have
Theorem 9. Γ⊢ϕ if and only if there is a finite Γ′⊆Γ such that Γ′⊢ϕ.
Theorem 10. The rule of ex falso quodlibet is a “derived rule” of D: if Γ1⊢θ and Γ2⊢¬θ, then Γ1,Γ2⊢ψ, for any sentence ψ.
Proof: Suppose that Γ1⊢θ and Γ2⊢¬θ. Then by Theorem 8,Γ1,¬ψ⊢θ, and Γ2,¬ψ⊢¬θ. So by (¬I), Γ1,Γ2⊢¬¬ψ. By (DNE), Γ1,Γ2⊢ψ.
Theorem 11. The rule of Cut. If Γ1⊢ψ and Γ2,ψ⊢θ, then Γ1,Γ2⊢θ.
Proof: Suppose Γ1⊢ψ and Γ2,ψ⊢θ. We proceed by induction on the number of rules used to establish Γ2,ψ⊢θ. Suppose that n is a natural number, and that the theorem holds for any argument that was derived using fewer than n rules. Suppose that Γ2,ψ⊢θ was derived using exactly n rules. If the last rule used was (=I), then Γ1,Γ2⊢θ is also an instance of (=I). If Γ2,ψ⊢θ is an instance of (As), then either θ is ψ, or θ is a member of Γ2. In the former case, we have Γ1⊢θ by supposition, and get Γ1,Γ2⊢θ by Weakening (Theorem 8). In the latter case, Γ1,Γ2⊢θ is itself an instance of (As). Suppose that Γ2,ψ⊢θ was obtained using (&E). Then we have Γ2,ψ⊢(θ&ϕ). The induction hypothesis gives us Γ1,Γ2⊢(θ&ϕ), and (&E) produces Γ1,Γ2⊢θ. The remaining cases are similar.
Theorem 11 allows us to chain together inferences. This fits the practice of establishing theorems and lemmas and then using those theorems and lemmas later, at will. The cut principle is, some think, essential to reasoning. In some logical systems, the cut principle is a deep theorem; in others it is invalid. The system here was designed, in part, to make the proof of Theorem 11 straightforward.
If Γ⊢Dθ, then we say that the sentence θ is a deductive consequence of the set of sentences Γ, and that the argument ⟨Γ,θ⟩ is deductively valid. A sentence θ is a logical theorem, or a deductive logical truth, if ⊢Dθ. That is, θ is a logical theorem if it is a deductive consequence of the empty set. A set Γ of sentences is consistent if there is no sentence θ such that Γ⊢Dθ and Γ⊢D¬θ. That is, a set is consistent if it does not entail a pair of contradictory opposite sentences.
Theorem 12. A set Γ is consistent if and only if there is a sentence θ such that it is not the case that Γ⊢θ.
Proof: Suppose that Γ is consistent and let θ be any sentence. Then either it is not the case that Γ⊢θ or it is not the case that Γ⊢¬θ. For the converse, suppose that Γ is inconsistent and let ψ be any sentence. Then there is a sentence θ such that both Γ⊢θ and Γ⊢¬θ. By ex falso quodlibet (Theorem 10), Γ⊢ψ.
Define a set Γ of sentences of the language L1K= to be maximally consistent if Γ is consistent and for every sentence θ of L1K=, if θ is not in Γ, then Γ,θ is inconsistent. In other words, Γ is maximally consistent if Γ is consistent, and adding any sentence in the language not already in Γ renders it inconsistent. Notice that if Γ is maximally consistent then Γ⊢θ if and only if θ is in Γ.
Theorem 13. The Lindenbaum Lemma. Let Γ be any consistent set of sentences of L1K=. Then there is a set Γ′ of sentences of L1K= such that Γ⊆Γ′ and Γ′ is maximally consistent.
Proof: Although this theorem holds in general, we assume here that the set K of non-logical terminology is either finite or denumerably infinite (i.e., the size of the natural numbers, usually called ℵ0). It follows that there is an enumeration θ0,θ1,… of the sentences of L1K=, such that every sentence of L1K= eventually occurs in the list. Define a sequence of sets of sentences, by recursion, as follows: Γ0 is Γ; for each natural number n, if Γn,θn is consistent, then let Γn+1=Γn,θn. Otherwise, let Γn+1=Γn. Let Γ′ be the union of all of the sets Γn. Intuitively, the idea is to go through the sentences of L1K=, throwing each one into Γ′ if doing so produces a consistent set. Notice that each Γn is consistent. Suppose that Γ′ is inconsistent. Then there is a sentence θ such that Γ′⊢θ and Γ′⊢¬θ. By Theorem 9 and Weakening (Theorem 8), there is a finite subset Γ″ of Γ′ such that Γ″⊢θ and Γ″⊢¬θ. Because Γ″ is finite, there is a natural number n such that every member of Γ″ is in Γn. So, by Weakening again, Γn⊢θ and Γn⊢¬θ. So Γn is inconsistent, which contradicts the construction. So Γ′ is consistent. Now suppose that a sentence θ is not in Γ′. We have to show that Γ′,θ is inconsistent. The sentence θ must occur in the aforementioned list of sentences; say that θ is θm. Since θm is not in Γ′, it is not in Γm+1. This happens only if Γm,θm is inconsistent. So a pair of contradictory opposites can be deduced from Γm,θm. By Weakening, a pair of contradictory opposites can be deduced from Γ′,θm. So Γ′,θm is inconsistent. Thus, Γ′ is maximally consistent.
Notice that this proof uses a principle corresponding to the law of excluded middle. In the construction of Γ′, we assumed that, at each stage, either Γn is consistent or it is not. Intuitionists, who demur from excluded middle, do not accept the Lindenbaum lemma.
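The construction in the proof can be illustrated with a toy propositional version, sketched below under two simplifying assumptions: the enumeration is finite, and consistency is tested by truth-table satisfiability (which, at the propositional level, classically coincides with deductive consistency given soundness and completeness). The tuple encoding of formulas is purely illustrative.

```python
from itertools import product

# Toy, propositional-level illustration of the Lindenbaum construction.
# Atoms are strings; compounds are tuples such as ('not', f), ('and', f, g).
ATOMS = ['p', 'q']

def evaluate(formula, val):
    """Truth value of a propositional formula under a valuation dict."""
    if isinstance(formula, str):          # an atom
        return val[formula]
    op = formula[0]
    if op == 'not':
        return not evaluate(formula[1], val)
    if op == 'and':
        return evaluate(formula[1], val) and evaluate(formula[2], val)
    if op == 'or':
        return evaluate(formula[1], val) or evaluate(formula[2], val)
    raise ValueError(op)

def consistent(gamma):
    """Stand-in for consistency: some valuation makes every member true."""
    return any(all(evaluate(f, dict(zip(ATOMS, row))) for f in gamma)
               for row in product([True, False], repeat=len(ATOMS)))

def lindenbaum(gamma, enumeration):
    """Throw each sentence into gamma whenever doing so keeps it consistent."""
    result = list(gamma)
    for theta in enumeration:
        if theta not in result and consistent(result + [theta]):
            result.append(theta)
    return result

# a finite enumeration of some sentences of the toy language
sentences = ['p', ('not', 'p'), 'q', ('not', 'q'), ('and', 'p', 'q')]
print(lindenbaum([('not', 'q')], sentences))  # [('not', 'q'), 'p']
```

Note how the sketch mirrors the classical character of the proof: at each stage it simply tests whether the extended set is consistent or not, the very either/or step that intuitionists decline.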
4. Semantics
Let K be a set of non-logical terminology. An interpretation for the language L1K= is a structure M=⟨d,I⟩, where d is a non-empty set, called the domain-of-discourse, or simply the domain, of the interpretation, and I is an interpretation function. Informally, the domain is what we interpret the language L1K= to be about. It is what the variables range over. The interpretation function assigns appropriate extensions to the non-logical terms. In particular,
If c is a constant in K, then I(c) is a member of the domain d.
Thus we assume that every constant denotes something. Systems where this is not assumed are called free logics (see the entry on free logic). Continuing,
If P0 is a zero-place predicate letter in K, then I(P) is a truth value, either truth or falsehood.
If Q1 is a one-place predicate letter in K, then I(Q) is a subset of d. Intuitively, I(Q) is the set of members of the domain that the predicate Q holds of. For example, I(Q) might be the set of red members of the domain.
If R2 is a two-place predicate letter in K, then I(R) is a set of ordered pairs of members of d. Intuitively, I(R) is the set of pairs of members of the domain that the relation R holds between. For example, I(R) might be the set of pairs ⟨a,b⟩ such that a and b are members of the domain and a loves b.
In general, if Sn is an n-place predicate letter in K, then I(S) is a set of ordered n-tuples of members of d.
Define s to be a variable-assignment, or simply an assignment, on an interpretation M, if s is a function from the variables to the domain d of M. The role of variable-assignments is to assign denotations to the free variables of open formulas. (In a sense, the quantifiers determine the “meaning” of the bound variables.)
Let t be a term of L1K=. We define the denotation of t in M under s, in terms of the interpretation function and variable-assignment:
If t is a constant, then DM,s(t) is I(t), and if t is a variable, then DM,s(t) is s(t).
That is, the interpretation M assigns denotations to the constants, while the variable-assignment assigns denotations to the (free) variables. If the language contained function symbols, the denotation function would be defined by recursion.
We now define a relation of satisfaction between interpretations, variable-assignments, and formulas of L1K=. If ϕ is a formula of L1K=, M is an interpretation for L1K=, and s is a variable-assignment on M, then we write M,s⊨ϕ for M satisfies ϕ under the assignment s. The idea is that M,s⊨ϕ is an analogue of “ϕ comes out true when interpreted as in M via s”.
We proceed by recursion on the complexity of the formulas of L1K=.
If t1 and t2 are terms, then M,s⊨t1=t2 if and only if DM,s(t1) is the same as DM,s(t2).
This is about as straightforward as it gets. An identity t1=t2 comes out true if and only if the terms t1 and t2 denote the same thing.
If P0 is a zero-place predicate letter in K, then M,s⊨P if and only if I(P) is truth.
If Sn is an n-place predicate letter in K and t1,…,tn are terms, then M,s⊨St1…tn if and only if the n-tuple ⟨DM,s(t1),…,DM,s(tn)⟩ is in I(S).
This takes care of the atomic formulas. We now proceed to the compound formulas of the language, more or less following the meanings of the English counterparts of the logical terminology.
M,s⊨¬θ if and only if it is not the case that M,s⊨θ.
M,s⊨(θ&ψ) if and only if both M,s⊨θ and M,s⊨ψ.
M,s⊨(θ∨ψ) if and only if either M,s⊨θ or M,s⊨ψ.
M,s⊨(θ→ψ) if and only if either it is not the case that M,s⊨θ, or M,s⊨ψ.
M,s⊨∀vθ if and only if M,s′⊨θ, for every assignment s′ that agrees with s except possibly at the variable v.
The idea here is that ∀vθ comes out true if and only if θ comes out true no matter what is assigned to the variable v. The final clause is similar.
M,s⊨∃vθ if and only if M,s′⊨θ, for some assignment s′ that agrees with s except possibly at the variable v.
So ∃vθ comes out true if there is an assignment to v that makes θ true.
Theorem 6, unique readability, assures us that this definition is coherent. At each stage in breaking down a formula, there is exactly one clause to be applied, and so we never get contradictory verdicts concerning satisfaction.
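For a finite domain, the recursive satisfaction clauses translate directly into an evaluator. The sketch below assumes the same illustrative tuple encoding of formulas used nowhere officially in the article: an interpretation is represented by its domain together with a dictionary playing the role of I, and a dictionary plays the role of the variable-assignment s.

```python
# A sketch of the satisfaction relation M,s |= phi over a FINITE domain.
# Formulas are tuples (illustrative encoding): ('=', t1, t2), ('P', t1, ...),
# ('not', f), ('and', f, g), ('or', f, g), ('implies', f, g),
# ('forall', 'x', f), ('exists', 'x', f).

def denote(I, s, t):
    """Constants are interpreted by I; variables by the assignment s."""
    return I[t] if t in I else s[t]

def satisfies(domain, I, s, formula):
    op = formula[0]
    if op == '=':
        return denote(I, s, formula[1]) == denote(I, s, formula[2])
    if op == 'not':
        return not satisfies(domain, I, s, formula[1])
    if op == 'and':
        return satisfies(domain, I, s, formula[1]) and satisfies(domain, I, s, formula[2])
    if op == 'or':
        return satisfies(domain, I, s, formula[1]) or satisfies(domain, I, s, formula[2])
    if op == 'implies':
        return (not satisfies(domain, I, s, formula[1])) or satisfies(domain, I, s, formula[2])
    if op in ('forall', 'exists'):
        v, body = formula[1], formula[2]
        # each s' agrees with s except possibly at the variable v
        results = (satisfies(domain, I, {**s, v: a}, body) for a in domain)
        return all(results) if op == 'forall' else any(results)
    # otherwise an n-place predicate letter: I[op] is a set of n-tuples
    return tuple(denote(I, s, t) for t in formula[1:]) in I[op]

# Example: domain {1,2,3}; Q holds of 2 only; the constant c denotes 2.
domain = {1, 2, 3}
I = {'Q': {(2,)}, 'c': 2}
print(satisfies(domain, I, {}, ('Q', 'c')))                   # True
print(satisfies(domain, I, {}, ('exists', 'x', ('Q', 'x'))))  # True
print(satisfies(domain, I, {}, ('forall', 'x', ('Q', 'x'))))  # False
```

The quantifier clauses make the restriction to finite domains essential: the evaluator checks every reassignment of v, exactly as the ∀ and ∃ clauses prescribe, which is only mechanically possible when d is finite.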
As indicated, the role of variable-assignments is to give denotations to the free variables. We now show that variable-assignments play no other role.
Theorem 14. For any formula θ, if s1 and s2 agree on the free variables in θ, then M,s1⊨θ if and only if M,s2⊨θ.
Proof: We proceed by induction on the complexity of the formula θ. The theorem clearly holds if θ is atomic, since in those cases only the values of the variable-assignments at the variables in θ figure in the definition. Assume, then, that the theorem holds for all formulas less complex than θ, and suppose that s1 and s2 agree on the free variables of θ. Assume, first, that θ is a negation, ¬ψ. Then, by the induction hypothesis, M,s1⊨ψ if and only if M,s2⊨ψ. So, by the clause for negation, M,s1⊨¬ψ if and only if M,s2⊨¬ψ. The cases where the main connective in θ is binary are also straightforward. Suppose that θ is ∃vψ, and that M,s1⊨∃vψ. Then there is an assignment s1′ that agrees with s1 except possibly at v such that M,s1′⊨ψ. Let s2′ be the assignment that agrees with s2 on the free variables not in ψ and agrees with s1′ on the others. Then, by the induction hypothesis, M,s2′⊨ψ. Notice that s2′ agrees with s2 on every variable except possibly v. So M,s2⊨∃vψ. The converse is the same, and the case where θ begins with a universal quantifier is similar.
By Theorem 14, if θ is a sentence, and s1, s2 are any two variable-assignments, then M,s1⊨θ if and only if M,s2⊨θ. So we can just write M⊨θ if M,s⊨θ for some, or all, variable-assignments s. So we define
M⊨θ, where θ is a sentence, just in case M,s⊨θ for all variable-assignments s.
In this case, we call M a model of θ.
Suppose that K′⊆K are two sets of non-logical terms. If M=⟨d,I⟩ is an interpretation of L1K=, then we define the restriction of M to L1K′= to be the interpretation M′=⟨d,I′⟩ such that I′ is the restriction of I to K′. That is, M and M′ have the same domain and agree on the non-logical terminology in K′. A straightforward induction establishes the following:
Theorem 15. If M′ is the restriction of M to L1K′=, then for every sentence θ of L1K′=, M⊨θ if and only if M′⊨θ.
Theorem 16. If two interpretations M1 and M2 have the same domain and agree on all of the non-logical terminology of a sentence θ, then M1⊨θ if and only if M2⊨θ.
In short, the satisfaction of a sentence θ only depends on the domain of discourse and the interpretation of the non-logical terminology in θ.
We say that an argument ⟨Γ,θ⟩ is semantically valid, or just valid, written Γ⊨θ, if for every interpretation M of the language, if M⊨ψ, for every member ψ of Γ, then M⊨θ. If Γ⊨θ, we also say that θ is a logical consequence, or semantic consequence, or model-theoretic consequence of Γ. The definition corresponds to the informal idea that an argument is valid if it is not possible for its premises to all be true and its conclusion false. Our definition of logical consequence also sanctions the common thesis that a valid argument is truth-preserving – to the extent that satisfaction represents truth. Officially, an argument in L1K= is valid if its conclusion comes out true under every interpretation of the language in which the premises are true. Validity is the model-theoretic counterpart to deducibility.
A sentence θ is logically true, or valid, if M⊨θ, for every interpretation M. A sentence is logically true if and only if it is a consequence of the empty set. If θ is logically true, then for any set Γ of sentences, Γ⊨θ. Logical truth is the model-theoretic counterpart of theoremhood.
A sentence θ is satisfiable if there is an interpretation M such that M⊨θ. That is, θ is satisfiable if there is an interpretation that satisfies it. A set Γ of sentences is satisfiable if there is an interpretation M such that M⊨θ, for every sentence θ in Γ. If Γ is a set of sentences and if M⊨θ for each sentence θ in Γ, then we say that M is a model of Γ. So a set of sentences is satisfiable if it has a model. Satisfiability is the model-theoretic counterpart to consistency.
Notice that Γ⊨θ if and only if the set Γ,¬θ is not satisfiable. It follows that if a set Γ is not satisfiable, then if θ is any sentence, Γ⊨θ. This is a model-theoretic counterpart to ex falso quodlibet (see Theorem 10). We have the following, as an analogue to Theorem 12:
Theorem 17. Let Γ be a set of sentences. The following are equivalent: (a) Γ is satisfiable; (b) there is no sentence θ such that both Γ⊨θ and Γ⊨¬θ; (c) there is some sentence ψ such that it is not the case that Γ⊨ψ.
Proof: (a)⇒(b): Suppose that Γ is satisfiable and let θ be any sentence. There is an interpretation M such that M⊨ψ for every member ψ of Γ. By the clause for negations, we cannot have both M⊨θ and M⊨¬θ. So either ⟨Γ,θ⟩ is not valid or else ⟨Γ,¬θ⟩ is not valid. (b)⇒(c): This is immediate. (c)⇒(a): Suppose that it is not the case that Γ⊨ψ. Then there is an interpretation M such that M⊨θ, for every sentence θ in Γ, and it is not the case that M⊨ψ. A fortiori, M satisfies every member of Γ, and so Γ is satisfiable.
5. Meta-theory
We now present some results that relate the deductive notions to their model-theoretic counterparts. The first one is probably the most straightforward. We motivated both the various rules of the deductive system D and the various clauses in the definition of satisfaction in terms of the meaning of the English counterparts to the logical terminology (more or less, with the same simplifications in both cases). So one would expect that an argument is deducible, or deductively valid, only if it is semantically valid.
Theorem 18. Soundness. For any sentence θ and set Γ of sentences, if Γ⊢Dθ, then Γ⊨θ.
Proof: We proceed by induction on the number of clauses used to establish Γ⊢θ. So let n be a natural number, and assume that the theorem holds for any argument established as deductively valid with fewer than n steps. And suppose that Γ⊢θ was established using exactly n steps. If the last rule applied was (=I), then θ is a sentence in the form t=t, and so θ is logically true. A fortiori, Γ⊨θ. If the last rule applied was (As), then θ is a member of Γ, and so of course any interpretation that satisfies every member of Γ also satisfies θ. Suppose the last rule applied is (&I). So θ has the form (ϕ&ψ), and we have Γ1⊢ϕ and Γ2⊢ψ, with Γ=Γ1,Γ2. The induction hypothesis gives us Γ1⊨ϕ and Γ2⊨ψ. Suppose that M satisfies every member of Γ. Then M satisfies every member of Γ1, and so M satisfies ϕ. Similarly, M satisfies every member of Γ2, and so M satisfies ψ. Thus, by the clause for “&” in the definition of satisfaction, M satisfies θ. So Γ⊨θ.
Suppose the last clause applied was (∃E). So we have Γ1⊢∃vϕ and Γ2,ϕ(v|t)⊢θ, where Γ=Γ1,Γ2, and t does not occur in ϕ, θ, or in any member of Γ2.
We need to show that Γ⊨θ. By the induction hypothesis, we have that Γ1⊨∃vϕ and Γ2,ϕ(v|t)⊨θ. Let M be an interpretation such that M makes every member of Γ true. So, M makes every member of Γ1 and Γ2 true. Then M,s⊨∃vϕ for all variable-assignments s, so there is an s′ such that M,s′⊨ϕ. Let M′ differ from M only in that IM′(t)=s′(v). Then M′,s′⊨ϕ(v|t) and M′,s′⊨Γ2, since t does not occur in ϕ or Γ2. So, M′,s′⊨θ. Since t does not occur in θ and M′ differs from M only with respect to IM′(t), M,s′⊨θ. Since θ is a sentence, s′ does not matter, so M⊨θ as desired. Notice the role of the restrictions on (∃E) here. The other cases are about as straightforward.
Corollary 19. Let Γ be a set of sentences. If Γ is satisfiable, then Γ is consistent.
Proof: Suppose that Γ is satisfiable. So let M be an interpretation such that M satisfies every member of Γ. Assume that Γ is inconsistent. Then there is a sentence θ such that Γ⊢θ and Γ⊢¬θ. By soundness (Theorem 18), Γ⊨θ and Γ⊨¬θ. So we have that M⊨θ and M⊨¬θ. But this is impossible, given the clause for negation in the definition of satisfaction.
Even though the deductive system D and the model-theoretic semantics were developed with the meanings of the logical terminology in mind, one should not automatically expect the converse to soundness (or Corollary 19) to hold. For all we know so far, we may not have included enough rules of inference to deduce every valid argument. The converses to soundness and Corollary 19 are among the most important and influential results in mathematical logic. We begin with the latter.
Theorem 20. Completeness. Gödel [1930]. Let \Gamma be a set of sentences. If \Gamma is consistent, then \Gamma is satisfiable.
Proof: The proof of completeness is rather complex. We only sketch it here. Let \Gamma be a consistent set of sentences of \LKe. Again, we assume for simplicity that the set K of non-logical terminology is either finite or countably infinite (although the theorem holds even if K is uncountable). The task at hand is to find an interpretation M such that M satisfies every member of \Gamma. Consider the language obtained from \LKe by adding a denumerably infinite stock of new individual constants c_0, c_1,\ldots We stipulate that the constants, c_0, c_1,\ldots, are all different from each other and none of them occur in K. One interesting feature of this construction, due to Leon Henkin, is that we build an interpretation of the language from the language itself, using some of the constants as members of the domain of discourse. Let \theta_0 (x), \theta_1 (x),\ldots be an enumeration of the formulas of the expanded language with at most one free variable, so that each formula with at most one free variable occurs in the list eventually. Define a sequence \Gamma_0, \Gamma_1,\ldots of sets of sentences (of the expanded language) by recursion as follows: \Gamma_0 = \Gamma; and \Gamma_{n+1} = \Gamma_n,(\exists x\theta_n \rightarrow \theta_{n}(x|c_i)), where c_i is the first constant in the above list that does not occur in \theta_n or in any member of \Gamma_n. The underlying idea here is that if \exists x\theta_nis true, then c_i is to be one such x. Let \Gamma be the union of the sets \Gamma_n.
We sketch a proof that \Gamma' is consistent. Suppose that \Gamma' is inconsistent. By Theorem 9, there is a finite subset of \Gamma that is inconsistent, and so one of the sets \Gamma_m is inconsistent. By hypothesis, \Gamma_0 = \Gamma is consistent. Let n be the smallest number such that \Gamma_n is consistent, but \Gamma_{n+1} = \Gamma_n,(\exists x\theta_n \rightarrow \theta_{n}(x|c_i)) is inconsistent. By (\negI), we have that
\tag{1} \Gamma_n \vdash \neg(\exists x\theta_n \rightarrow \theta_n(x|c_i)).By ex falso quodlibet (Theorem 10), \Gamma_n, \neg \exists x\theta_n, \exists x\theta_n \vdash \theta_n (x|c_i). So by (\rightarrowI), \Gamma_n, \neg \exists x\theta_n \vdash(\exists x\theta_n \rightarrow \theta_n (x|c_i)). From this and (1), we have \Gamma_n \vdash \neg \neg \exists x\theta_n, by (\negI), and by (DNE) we have
\tag{2} \Gamma_n \vdash \exists x\theta_n .By (As), \Gamma_n, \theta_n (x|c_i), \exists x\theta_n \vdash \theta_n (x|c_i). So by (\rightarrowI), \Gamma_n, \theta_n (x|c_i)\vdash(\exists x\theta_{n} \rightarrow \theta_{n}(x|c_i)). From this and (1), we have \Gamma_n \vdash \neg \theta_n (x|c_i), by (\negI). Let t be a term that does not occur in \theta_n or in any member of \Gamma_n. By uniform substitution of t for c_i, we can turn the derivation of \Gamma_n \vdash \neg \theta_n (x|c_i) into \Gamma_n \vdash \neg \theta_n (x|t). By (\forallI), we have
\tag{3} \Gamma_n \vdash \forall v\neg \theta_n (x|v).
By (As) we have \{\forall v\neg \theta_n (x|v),\theta_n\}\vdash \theta_n and by (\forallE) we have \{\forall v\neg \theta_n (x|v), \theta_n\}\vdash \neg \theta_n. So \{\forall v\neg \theta_n (x|v), \theta_n\} is inconsistent. Let \phi be any sentence of the language. By ex falso quodlibet (Theorem 10), we have that \{\forall v\neg \theta_n (x|v),\theta_n\}\vdash \phi and \{\forall v\neg \theta_n (x|v), \theta_n\}\vdash \neg \phi. So with (2), we have that \Gamma_n, \forall v\neg \theta_n (x|v)\vdash \phi and \Gamma_n, \forall v\neg \theta_n (x|v)\vdash \neg \phi, by (\existsE). By Cut (Theorem 11), \Gamma_n \vdash \phi and \Gamma_n \vdash \neg \phi. So \Gamma_n is inconsistent, contradicting the assumption. So \Gamma' is consistent.
Applying the Lindenbaum Lemma (Theorem 13), let \Gamma'' be a maximally consistent set of sentences (of the expanded language) that contains \Gamma'. So, of course, \Gamma'' contains \Gamma. We can now define an interpretation M such that M satisfies every member of \Gamma''.
If we did not have a sign for identity in the language, we would let the domain of M be the collection of new constants \{c_0, c_1, \ldots \}. But as it is, there may be a sentence in the form c_{i}=c_{j}, with i\ne j, in \Gamma''. If so, we cannot have both c_i and c_j in the domain of the interpretation (as they are distinct constants). So we define the domain d of M to be the set \{c_i | there is no j\lt i such that c_{i}=c_{j} is in \Gamma''\}. In other words, a constant c_i is in the domain of M if \Gamma'' does not declare it to be identical to an earlier constant in the list. Notice that for each new constant c_i, there is exactly one j\le i such that c_j is in d and the sentence c_{i}=c_{j} is in \Gamma''.
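The construction of the domain d amounts to a simple algorithm: scan the constants in order, and keep each one that \Gamma'' does not identify with an earlier constant. The sketch below is a hypothetical illustration (the function and variable names are ours, not from the text) on a finite stock of constants, assuming the set of identity sentences has already been closed under reflexivity, symmetry, and transitivity, as the identities in a maximally consistent set would be.

```python
def domain_and_denotation(n_constants, identities):
    """Pick the domain and denotations for constants c_0 .. c_{n-1}.

    `identities` is the set of pairs (i, j) such that the sentence
    c_i = c_j belongs to the maximally consistent set.  We assume it
    is closed under reflexivity, symmetry, and transitivity, as the
    identity sentences in Gamma'' would be.

    c_i goes into the domain iff no j < i has c_i = c_j declared;
    each c_i denotes the earliest constant it is identified with."""
    identities = set(identities)
    domain = [i for i in range(n_constants)
              if not any((i, j) in identities for j in range(i))]
    denotation = {i: next(j for j in range(i + 1) if (i, j) in identities)
                  for i in range(n_constants)}
    return domain, denotation
```

For instance, with four constants and the single non-trivial identity c_2 = c_0, the domain keeps c_0, c_1, and c_3, and c_2 denotes c_0.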
We now define the interpretation function I. Let a be any constant in the expanded language. By (=I) and (\existsI), \Gamma''\vdash \exists x x=a, and so \exists x x=a \in \Gamma''. By the construction of \Gamma', there is a sentence in the form (\exists x x=a \rightarrow c_i =a) in \Gamma''. Since \Gamma'' is maximally consistent, c_i =a is in \Gamma''. As above, there is exactly one c_j in d such that c_{i}=c_{j} is in \Gamma''. Let I(a)=c_j. Notice that if c_i is a constant in the domain d, then I(c_i)=c_i. That is, each c_i in d denotes itself.
Let P be a zero-place predicate letter in K. Then I(P) is truth if P is in \Gamma'' and I(P) is falsehood otherwise. Let Q be a one-place predicate letter in K. Then I(Q) is the set of constants \{c_i | c_i is in d and the sentence Qc_i is in \Gamma''\}. Let R be a binary predicate letter in K. Then I(R) is the set of pairs of constants \{\langle c_i,c_j\rangle | c_i is in d, c_j is in d, and the sentence Rc_{i}c_{j} is in \Gamma''\}. Three-place predicates, etc., are interpreted similarly. In effect, I interprets the non-logical terminology just as \Gamma'' says it is to be interpreted.
The final item in this proof is a lemma that for every sentence \theta in the expanded language, M\vDash \theta if and only if \theta is in \Gamma''. This proceeds by induction on the complexity of \theta. The case where \theta is atomic follows from the definitions of M (i.e., the domain d and the interpretation function I). The other cases follow from the various clauses in the definition of satisfaction.
Since \Gamma \subseteq \Gamma'', we have that M satisfies every member of \Gamma. By Theorem 15, the restriction of M to the original language \LKe also satisfies every member of \Gamma. Thus \Gamma is satisfiable.
A converse to Soundness (Theorem 18) is a straightforward corollary:
Theorem 21. For any sentence \theta and set \Gamma of sentences, if \Gamma \vDash \theta, then \Gamma \vdash_D \theta.
Proof: Suppose that \Gamma \vDash \theta. Then there is no interpretation M such that M satisfies every member of \Gamma but does not satisfy \theta. So the set \Gamma,\neg \theta is not satisfiable. By Completeness (Theorem 20), \Gamma,\neg \theta is inconsistent. So there is a sentence \phi such that \Gamma,\neg \theta \vdash \phi and \Gamma,\neg \theta \vdash \neg \phi. By (\negI), \Gamma \vdash \neg \neg \theta, and by (DNE) \Gamma \vdash \theta.
Our next item is a corollary of Theorem 9, Soundness (Theorem 18), and Completeness:
Corollary 22. Compactness. A set \Gamma of sentences is satisfiable if and only if every finite subset of \Gamma is satisfiable.
Proof: If M satisfies every member of \Gamma, then M satisfies every member of each finite subset of \Gamma. For the converse, suppose that \Gamma is not satisfiable. Then we show that some finite subset of \Gamma is not satisfiable. By Completeness (Theorem 20), \Gamma is inconsistent. By Theorem 9 (and Weakening), there is a finite subset \Gamma'\subseteq \Gamma such that \Gamma' is inconsistent. By Corollary 19, \Gamma' is not satisfiable.
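A standard illustration of the power of Compactness may help. For each natural number n, let \lambda_n be a sentence saying that there are at least n elements:

```latex
\lambda_n := \exists x_1 \ldots \exists x_n \bigwedge_{1 \le i < j \le n} \neg x_i = x_j
```

Every finite subset of \{\lambda_n | n \ge 1\} is satisfiable in a sufficiently large finite domain, so by Compactness the whole set is satisfiable. But any interpretation satisfying every \lambda_n must have an infinite domain. The same idea drives the proof of the upward Löwenheim-Skolem theorem.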
Soundness and completeness together entail that an argument is deducible if and only if it is valid, and a set of sentences is consistent if and only if it is satisfiable. So we can go back and forth between model-theoretic and proof-theoretic notions, transferring properties of one to the other. Compactness holds in the model theory because all derivations use only a finite number of premises.
Recall that in the proof of Completeness (Theorem 20), we made the simplifying assumption that the set K of non-logical constants is either finite or denumerably infinite. The interpretation we produced was itself either finite or denumerably infinite. Thus, we have the following:
Corollary 23. Löwenheim-Skolem Theorem. Let \Gamma be a satisfiable set of sentences of the language \LKe. If \Gamma is either finite or denumerably infinite, then \Gamma has a model whose domain is either finite or denumerably infinite.
In general, let \Gamma be a satisfiable set of sentences of \LKe, and let \kappa be the larger of the size of \Gamma and denumerably infinite. Then \Gamma has a model whose domain is at most size \kappa.
There is a stronger version of Corollary 23. Let M_1 =\langle d_1,I_1\rangle and M_2 =\langle d_2,I_2\rangle be interpretations of the language \LKe. Define M_1 to be a submodel of M_2 if d_1 \subseteq d_2, I_1 (c) = I_2 (c) for each constant c, and I_1 is the restriction of I_2 to d_1. For example, if R is a binary relation letter in K, then for all a,b in d_1, the pair \langle a,b\rangle is in I_1 (R) if and only if \langle a,b\rangle is in I_2 (R). If we had included function letters among the non-logical terminology, we would also require that d_1 be closed under their interpretations in M_2. Notice that if M_1 is a submodel of M_2, then any variable-assignment on M_1 is also a variable-assignment on M_2.
Say that two interpretations M_1 =\langle d_1,I_1\rangle, M_2 =\langle d_2,I_2\rangle are equivalent if one of them is a submodel of the other, and for any formula \theta of the language and any variable-assignment s on the submodel, M_1,s\vDash \theta if and only if M_2,s\vDash \theta. Notice that if two interpretations are equivalent, then they satisfy the same sentences.
Theorem 25. Downward Löwenheim-Skolem Theorem. Let M = \langle d,I\rangle be an interpretation of the language \LKe. Let d_1 be any subset of d, and let \kappa be the maximum of the size of K, the size of d_1, and denumerably infinite. Then there is a submodel M' = \langle d',I'\rangle of M such that (1) d' is not larger than \kappa, and (2) M and M' are equivalent. In particular, if the set K of non-logical terminology is either finite or denumerably infinite, then any interpretation has an equivalent submodel whose domain is either finite or denumerably infinite.
Proof: Like completeness, this proof is complex, and we rest content with a sketch. The downward Löwenheim-Skolem theorem invokes the axiom of choice, and indeed, is equivalent to the axiom of choice (see the entry on the axiom of choice). So let C be a choice function on the powerset of d, so that for each non-empty subset e\subseteq d, C(e) is a member of e. We stipulate that if e is the empty set, then C(e) is C(d).
Let s be a variable-assignment on M, let \theta be a formula of \LKe, and let v be a variable. Define the v-witness of \theta over s, written w_v (\theta,s), as follows: Let q be the set of all elements c\in d such that there is a variable-assignment s' on M that agrees with s on every variable except possibly v, such that M,s'\vDash \theta, and s'(v)=c. Then w_v (\theta,s) = C(q). Notice that if M,s\vDash \exists v\theta, then q is the set of elements of the domain that can go for v in \theta. Indeed, M,s\vDash \exists v\theta if and only if q is non-empty. So if M,s\vDash \exists v\theta, then w_v (\theta,s) (i.e., C(q)) is a chosen element of the domain that can go for v in \theta. In a sense, it is a “witness” that verifies M,s\vDash \exists v\theta.
If e is a non-empty subset of the domain d, then define a variable-assignment s to be an e-assignment if for all variables u, s(u) is in e. That is, s is an e-assignment if s assigns an element of e to each variable. Define sk(e), the Skolem-hull of e, to be the set:
\begin{align*} e \cup \{w_v (\theta,s)|& \theta \text{ is a formula in } \LKe, \\ & v \text{ is a variable, and } \\ & s \text{ is an } e\text{-assignment} \}. \end{align*}
That is, the Skolem-hull of e is the set e together with every v-witness of every formula over every e-assignment. Roughly, the idea is to start with e and then throw in enough elements to make each existentially quantified formula true. We cannot rest content with the Skolem-hull, however. Once we throw the “witnesses” into the domain, we need to deal with sk(e)-assignments. In effect, we need a set which is its own Skolem-hull, and which also contains the given subset d_1.
We define a sequence of non-empty sets e_0, e_1,\ldots as follows: if the given subset d_1 of d is empty and there are no constants in K, then let e_0 be C(d), the choice function applied to the entire domain; otherwise let e_0 be the union of d_1 and the denotations under I of the constants in K. For each natural number n, e_{n+1} is sk(e_n). Finally, let d' be the union of the sets e_n, and let I' be the restriction of I to d'. Our interpretation is M' = \langle d',I'\rangle.
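The passage from e_0 to d' is an instance of a familiar pattern: iterate a closure operation until nothing new appears. The toy sketch below is hypothetical (in the actual proof the witnesses come from the choice function C, and the iteration runs through all the sets e_n rather than stopping); it closes a seed set under some unary “witness” functions on a finite domain.

```python
def close_under(seed, functions):
    """Iterate e_{n+1} = e_n plus all witnesses f(x) for x in e_n,
    returning the union of the stages.  On a finite domain the
    iteration reaches a fixed point, so we can stop there."""
    closed = set(seed)
    while True:
        new = {f(x) for f in functions for x in closed} - closed
        if not new:
            return closed
        closed |= new
```

Closing \{3\} under the map x \mapsto \lfloor x/2 \rfloor, for example, yields \{0, 1, 3\}: the result is its own closure, just as d' is its own Skolem-hull.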
Clearly, d_1 is a subset of d', and so M' is a submodel of M. Let \kappa be the maximum of the size of K, the size of d_1, and denumerably infinite. A calculation reveals that the size of d' is at most \kappa, based on the fact that there are at most \kappa-many formulas, and thus, at most \kappa-many witnesses at each stage. Notice, incidentally, that this calculation relies on the fact that a denumerable union of sets of size at most \kappa is itself of size at most \kappa. This also relies on the axiom of choice.
The final item is to show that M' is equivalent to M: For every formula \theta and every variable-assignment s on M',
M,s\vDash \theta \text{ if and only if } M',s\vDash \theta.
The proof proceeds by induction on the complexity of \theta. Unfortunately, space constraints require that we leave this step as an exercise.
Another corollary to Compactness (Corollary 22) is the upward counterpart of the Löwenheim-Skolem theorem:
Theorem 26. Upward Löwenheim-Skolem Theorem. Let \Gamma be any set of sentences of \LKe, such that for each natural number n, there is an interpretation M_n = \langle d_n,I_n\rangle, such that d_n has at least n elements, and M_n satisfies every member of \Gamma. In other words, \Gamma is satisfiable and there is no finite upper bound to the size of the interpretations that satisfy every member of \Gamma. Then for any infinite cardinal \kappa, there is an interpretation M=\langle d,I\rangle, such that the size of d is at least \kappa and M satisfies every member of \Gamma.
Proof: Add a collection of new constants \{c_{\alpha} | \alpha \lt \kappa \}, of size \kappa, to the language, so that if c is a constant in K, then c_{\alpha} is different from c, and if \alpha \lt \beta \lt \kappa, then c_{\alpha} is a different constant than c_{\beta}. Consider the set of sentences \Gamma' consisting of \Gamma together with the set \{\neg c_{\alpha}=c_{\beta} | \alpha \ne \beta \}. That is, \Gamma' consists of \Gamma together with statements to the effect that any two different new constants denote different objects. Let \Gamma'' be any finite subset of \Gamma', and let m be the number of new constants that occur in \Gamma''. Then expand the interpretation M_m to an interpretation M_m' of the new language, by interpreting each of the new constants in \Gamma'' as a different member of the domain d_m. By hypothesis, there are enough members of d_m to do this. One can interpret the other new constants at will. So M_m is a restriction of M_m'. By hypothesis (and Theorem 15), M'_m satisfies every member of \Gamma. Also M'_m satisfies the members of \{\neg c_{\alpha}=c_{\beta} | \alpha \ne \beta \} that are in \Gamma''. So M'_m satisfies every member of \Gamma''. By compactness, there is an interpretation M = \langle d,I\rangle such that M satisfies every member of \Gamma'. Since \Gamma' contains every member of \{\neg c_{\alpha}=c_{\beta} | \alpha \ne \beta \}, the domain d of M must be of size at least \kappa, since each of the new constants must have a different denotation. By Theorem 15, the restriction of M to the original language \LKe satisfies every member of \Gamma.
Combined, the proofs of the downward and upward Löwenheim-Skolem theorems show that for any satisfiable set \Gamma of sentences, if there is no finite bound on the models of \Gamma, then for any infinite cardinal \kappa, there is a model of \Gamma whose domain has size exactly \kappa. Moreover, if M is any interpretation whose domain is infinite, then for any infinite cardinal \kappa, there is an interpretation M' whose domain has size exactly \kappa such that M and M' are equivalent.
These results indicate a weakness in the expressive resources of first-order languages like \LKe. No satisfiable set of sentences can guarantee that its models are all denumerably infinite, nor can any satisfiable set of sentences guarantee that its models are uncountable. So in a sense, first-order languages cannot express the notion of “denumerably infinite”, at least not in the model theory. (See the entry on second-order and higher-order logic.)
Let A be any set of sentences in a first-order language \LKe, where K includes terminology for arithmetic, and assume that every member of A is true of the natural numbers. We can even let A be the set of all sentences in \LKe that are true of the natural numbers. Then A has uncountable models, indeed models of any infinite cardinality. Such interpretations are among those that are sometimes called unintended, or non-standard models of arithmetic. Let B be any set of first-order sentences that are true of the real numbers, and let C be any first-order axiomatization of set theory. If B and C are satisfiable (in infinite interpretations), then each of them has denumerably infinite models. That is, any first-order, satisfiable set theory or theory of the real numbers has (unintended) models the size of the natural numbers. This is despite the fact that a sentence (seemingly) stating that the universe is uncountable is provable in most set theories. This situation, known as the Skolem paradox, has generated much discussion, but we must refer the reader elsewhere for a sample of it (see the entry on Skolem’s paradox and Shapiro 1996).
6. The One Right Logic?
Surely, logic has something to do with correct reasoning, or at least correct deductive reasoning. The details of the connection are subtle and controversial; see Harman [1984] for an influential study. It is common to say that someone has reasoned poorly if they have not reasoned logically, or that a given (deductive) argument is bad, and must be retracted, if it is shown to be invalid.
Some philosophers and logicians have maintained that there is a single logical system that is uniquely correct, in its role of characterizing validity. Among those, some, perhaps most, favor classical, first-order logic as uniquely correct, as the One True Logic. See, for example, Quine [1986], Resnik [1996], Rumfitt [2015], Williamson [2017], and a host of others.
That classical, first-order logic should be given this role is perhaps not surprising. It has rules which are more or less intuitive, and it is relatively simple given its strength. As we have seen in Section 5, classical, first-order logic has interesting and important meta-theoretic properties, such as soundness and completeness, that have led to many important mathematical and logical studies.
However, as noted, the main meta-theoretic properties of classical, first-order logic lead to expressive limitations of the formal languages and model-theoretic semantics. Key notions, like finitude, countability, minimal closure, natural number, and the like cannot be expressed.
Barwise [1985, 5] once remarked:
As logicians, we do our subject a disservice by convincing others that logic is first-order and then convincing them that almost none of the concepts of modern mathematics can really be captured in first-order logic.
And Wang [1974, 154]:
When we are interested in set theory or classical analysis, the Löwenheim-Skolem theorem is usually taken as a sort of defect... of the first-order logic... [W]hat is established [by these theorems] is not that first-order logic is the only possible logic but rather that it is the only possible logic when we in a sense deny reality to the concept of [the] uncountable...
Other criticisms of classical, first-order logic have also been lodged. There are issues with its ability to deal with certain paradoxes (see, for example, the entry on Russell’s paradox), its apparent overgeneration of beliefs (see the entry on the normative status of logic), and some argue that it validates arguments that do not match the way we ordinarily take ourselves to reason (see, for example, the entry on relevance logic).
There are two main options available to those who are critical of classical, first-order logic, as the One True Logic. One is to propose some other logic as the One True Logic. Priest [2006a] describes the methodology one might use to settle on the One True Logic.
The other main option is to simply deny that there is a single logic that qualifies as the One True Logic. One instance of this is a kind of logical nihilism, the thesis that there is no correct logic. Another is logical pluralism, the thesis that a variety of different logics all qualify as correct, or best, or even the true logic, at least in various contexts.
Of course, this is not the place to pursue this matter in detail. See Beall and Restall [2006] and Shapiro [2014] for examples of pluralism, and the entry on logical pluralism for an overview of the terrain for both logical pluralism and logical nihilism.
We close with brief sketches of some of the main alternatives to classical, first-order logic, providing references to other work and entries to this Encyclopedia. See also the second half of Shapiro and Kouri Kissel [2022].
6.1 Rivals to classical, first-order logic
Approximations
In recent years, some work has been done to “approximate” classical logic. The idea is to get as close to classical logic as possible, in order to preserve some of its benefits, while avoiding some of its limitations, for example by staying closer to intuitive inference or by handling phenomena like vagueness and the paradoxes.
For example, Barrio, Pailos and Szmuc [2020] show that we can approximate classical logic in something called the ST-hierarchy (ST for strict-tolerant, from Cobreros, Egre, Ripley and van Rooij [2012a,b]). This allows them to avoid certain classical problems at each level of the hierarchy, like some of the paradoxes, while at the same time maintaining many of the benefits of the strength of classical logic when considering the full hierarchy.
Dave Ripley [2013] provides a sequent-calculus version of “classical logic” that he argues solves some of the paradoxes. Notably, he claims it solves at least the Sorites and Liar paradoxes (see the entries on the sorites paradox and the liar paradox). The system conservatively extends classical logic. Ripley claims that this is what makes it classical. However, the system is not transitive, and does not have a Cut rule.
There are, of course, some questions about whether these new logics are really classical, but it is informative work nonetheless.
Expansions
One way to extend classical, first-order logic is to add additional operators to the underlying formal language. Modal logic adds operators which designate necessity and possibility. So, we can say that a proposition is possibly true, or necessarily true, rather than just true.
W. V. O. Quine [1953] once argued that it is not coherent for quantifiers to bind variables inside modal operators, but opinion on this matter has since changed considerably (see, for example, Barcan Marcus [1990]). There is now a thriving industry of developing modal logics to capture various kinds of modality and temporal operators. See the entry on modal logic.
All of the formal languages sketched above have only one sort of variable. These are sometimes called first-order variables. Each interpretation of the language has a domain, which is the range of these first-order variables. It is what the language is about, according to the given interpretation. Second-order variables range over properties, sets, classes, relations, or functions of the items in that domain. Third-order variables range over properties, classes, relations of whatever is in the range of the second-order variables. And it goes on from there.
A formal language is called second-order if it has second-order variables and first-order variables, and no others; third-order if it has third-order, second-order, and first-order variables and no others; etc. A formal language is higher-order if it is at least second-order.
As noted, it is not an exaggeration to say that classical, first-order logic is the paradigm of contemporary logical theory. Most textbooks do not mention higher-order languages at all, and most of the rest give them scant treatment.
A number of different deductive systems and model-theoretic semantics have been proposed for second- and higher-order languages. For the semantics, the main additional feature of the model-theory is to specify a range of the higher-order variables.
In Henkin semantics, each interpretation specifies a specific range for the higher-order variables. For monadic second-order variables, each interpretation specifies a non-empty subset of the powerset of the domain; for two-place second-order variables, a non-empty set of ordered pairs of members of the domain; etc. With Henkin semantics, the meta-theoretic results noted above all continue to hold: there is a deductive system that is sound and complete, the logic is compact, and the downward and upward Löwenheim-Skolem theorems hold.
In so-called standard semantics, sometimes called full semantics, monadic second-order variables range over the entire powerset of the domain; two-place second-order variables range over the entire class of ordered pairs of members of the domain, etc. It can be shown that second-order languages, with standard semantics, can characterize many mathematical notions and structures, up to isomorphism. Examples include the notions of finitude, countability, well-foundedness, minimal closure, and structures like the natural numbers, the real numbers, and the complex numbers. As a result, none of the limitative theorems of classical, first-order logic hold: there is no effective deductive system that is both sound and complete, the logic is not compact, and both Löwenheim-Skolem theorems fail. Some, such as Quine [1986], argue that second-order logic, with standard semantics, is not really logic, but is a form of mathematics, set theory in particular. For more on this, see Shapiro [1991] and the entry on higher-order logic, along with the many references cited there.
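To give one concrete example of this expressive strength: under standard semantics, the single second-order induction axiom

```latex
\forall X \bigl( (X0 \wedge \forall x (Xx \rightarrow Xsx)) \rightarrow \forall x\, Xx \bigr)
```

quantifies over all subsets of the domain; together with axioms saying that the successor function s is one-to-one and that 0 is not a successor, it characterizes the natural numbers up to isomorphism (this is essentially Dedekind’s categoricity theorem). Under Henkin semantics, by contrast, \forall X ranges only over whatever subsets the interpretation happens to supply, and the categoricity argument fails.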
One might also consider generalized quantifiers as an expansion of classical first-order logic (see the entry on generalized quantifiers). These quantifiers allow for an expansion beyond the classical “all” and “some”, and can accommodate quantifiers like “most”, “less than half”, “usually”, etc. They are useful from both a logical and a linguistic perspective. For example, Kennedy and Väänänen [2021] use generalized quantifiers to argue that “uncountable” is a logical notion.
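Over a finite domain, the standard clause for the binary quantifier “most” is that Most x (\varphi, \psi) holds just in case more \varphi s satisfy \psi than fail it. A minimal sketch (the function names are ours, for illustration):

```python
def most(domain, phi, psi):
    """Evaluate the binary generalized quantifier 'most' on a finite
    domain: true iff more phi-elements satisfy psi than fail it,
    i.e. |phi & psi| > |phi - psi|."""
    phis = {x for x in domain if phi(x)}
    satisfy = len({x for x in phis if psi(x)})
    return satisfy > len(phis) - satisfy
```

On the domain \{0,\ldots,9\}, for example, “most elements are less than 6” comes out true (6 against 4), while “most elements are less than 5” does not (5 against 5).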
6.2 Sublogics of classical, first-order logic
Some philosophers and logicians argue that classical, first-order logic is too strong: it declares that some argument-forms are valid which are not. Here we sketch two kinds of proposals.
Intuitionistic logic
Advocates of intuitionistic logic reject the validity of the (so-called) Law of Excluded Middle:
\Phi \vee \neg \Phi,
and other inferences related to this, such as Double Negation Elimination (DNE):
{\rm If}\ \Gamma \vdash \neg\neg\Phi \ {\rm then}\ \Gamma \vdash \Phi
Roughly speaking, there are two main motivations for these restrictions. The traditional intuitionists L. E. J. Brouwer (e.g., [1964a], [1964b]) and Arend Heyting (e.g., [1956]) held that the essence of mathematics is idealized mental construction. Consider, for example, the proposition that for every natural number n, there is a prime number m \gt n such that m \lt n!+2. For Brouwer, this proposition invokes a procedure that, given any natural number n, produces a prime number m that is greater than n but less than n!+2. The proposition expresses the existence of such a procedure. Given this orientation, we have no reason to hold that for any mathematical proposition \Phi, we can establish either the procedure associated with \Phi or the procedure associated with \neg \Phi.
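The proposition about primes is a good example of the “procedure” reading, since the usual proof is already constructive: no prime \le n divides n!+1, so the least prime factor of n!+1 is a prime m with n \lt m \lt n!+2. A sketch of the procedure (hypothetical code, ours rather than Brouwer’s):

```python
from math import factorial

def witness_prime(n):
    """Given n >= 1, produce a prime m with n < m < n! + 2, by taking
    the least prime factor of n! + 1 (Euclid's construction: every
    prime <= n divides n!, so none divides n! + 1)."""
    value = factorial(n) + 1
    d = 2
    while d * d <= value:
        if value % d == 0:
            return d  # least prime factor
        d += 1
    return value  # n! + 1 is itself prime
```

For instance, witness_prime(4) returns 5, since 4! + 1 = 25 = 5 \cdot 5, and indeed 4 \lt 5 \lt 26.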
Michael Dummett (e.g., [1978]) provides general arguments concerning how language functions, as a vehicle of communication, to argue that intuitionistic logic is uniquely correct, the One True Logic, not just for mathematics.
For an overview of intuitionistic logic, and its philosophical motivation, see the entry on intuitionistic logic.
Relevance and paraconsistency
This time the target inference, to be declared invalid, is the one we called ex falso quodlibet above, abbreviated (EFQ):
{\rm If} \ \Gamma_1 \vdash \Theta \ {\rm and} \ \Gamma_2 \vdash \neg\Theta \ {\rm then} \ \Gamma_1, \Gamma_2 \vdash \Psi
We can focus attention on one kind of instance of this: \Phi, \neg\Phi \vdash \Psi, sometimes colorfully called “explosion”. It says that anything at all follows from a contradiction.
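In the deductive system of Section 3, (EFQ) in this form can be recovered from the other rules, which shows what rejecting it costs. A sketch, using the rule names from above:

```latex
\begin{align*}
&\Gamma_1, \neg\Psi \vdash \Theta && \text{from } \Gamma_1 \vdash \Theta \text{, by Weakening}\\
&\Gamma_2, \neg\Psi \vdash \neg\Theta && \text{from } \Gamma_2 \vdash \neg\Theta \text{, by Weakening}\\
&\Gamma_1, \Gamma_2 \vdash \neg\neg\Psi && \text{by } (\neg\mathrm{I})\\
&\Gamma_1, \Gamma_2 \vdash \Psi && \text{by (DNE)}
\end{align*}
```

A logician who rejects (EFQ) must therefore fault at least one of these steps; relevance logicians typically object to the irrelevant Weakening step, while intuitionists reject (DNE) but retain a form of (EFQ) itself.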
Logics that regard (EFQ) as invalid are called paraconsistent. Broadly speaking, there are two camps of logicians advocating for paraconsistent systems, either as candidates for the One True Logic or as instances of pluralism. One camp consists of logicians who insist that in a valid argument, the premises must be relevant to the conclusion. Typically, relevance logicians also demur from certain classical logical truths called paradoxes of material implication, such as (\Phi \rightarrow (\Psi \rightarrow \Phi)) and (\Phi \rightarrow (\Psi \rightarrow \Psi)).
For more, see the entry on relevance logic, or Kerr [2019]. Classic works include Anderson and Belnap [1975], Anderson Belnap and Dunn [1992], and Read [1988]. Neil Tennant’s [2017] core logic is both relevant and intuitionistic.
The other main camp of logicians who prefer a paraconsistent logic (or paraconsistent logics) are advocates of dialetheism, the view that some contradictions, some sentences in the form (\Phi \wedge \neg \Phi), are true. One supposed example is when \Phi is a statement of a semantic paradox, such as the Liar. Consider, for example, a sentence \Phi that says that \Phi is not true.
In a system in which (EFQ) holds, any true contradiction would entail every sentence of the formal language, thus rendering the language and theory trivial. So, clearly, any logic for dialetheism would have to be paraconsistent. See the entry on dialetheism. The classic work here is Priest [2006a].
Of course, the small sample presented here does not include every logical system proposed as a rival to classical, first-order logic, again either as a candidate for the One True Logic, or as a further instance of logical pluralism. See, for example, the entries on substructural logics, fuzzy logic, and many others.
Bibliography
- Anderson, Alan and Nuel Belnap, 1975, Entailment: The logic of relevance and necessity I, Princeton: Princeton University Press.
- Anderson, Alan, Nuel Belnap, and J. Michael Dunn, 1992, Entailment: The logic of relevance and necessity II, Princeton: Princeton University Press.
- Barcan Marcus, Ruth, 1990, “A Backwards Look at Quine’s Animadversions on Modalities,” in R. Barrett and R. Gibson (eds.), Perspectives on Quine, Cambridge: Blackwell.
- Barrio, Eduardo Alejandro, Federico Pailos, and Damian Szmuc, 2020, “A Hierarchy of Classical and Paraconsistent Logics”, Journal of Philosophical Logic, 49: 93–120. doi:10.1007/s10992-019-09513-z
- Barwise, Jon, 1985, “Model-theoretic logics: background and aims”, in Model-Theoretic Logics, Jon Barwise and Solomon Feferman (eds.), New York: Springer-Verlag, pp. 3–23.
- Beall, Jc and Greg Restall, 2006, Logical Pluralism, Oxford: Oxford University Press.
- Brouwer, L. E. J., 1964a, “Consciousness, Philosophy and Mathematics”, in Philosophy of Mathematics: Selected Readings, P. Benacerraf and H. Putnam (eds.), Englewood Cliffs, NJ: Prentice-Hall.
- –––, 1964b, “Intuitionism and Formalism”, in Philosophy of Mathematics: Selected Readings, P. Benacerraf and H. Putnam (eds.), Englewood Cliffs, NJ: Prentice-Hall, 77–89.
- Cobreros, Pablo, Paul Egré, David Ripley, and Robert van Rooij, 2012a, “Tolerance and Mixed Consequence in the S’valuationist Setting”, Studia Logica, 100(4): 855–877.
- –––, 2012b, “Tolerant, classical, strict”, Journal of Philosophical Logic, 41(2): 347–385.
- Cook, Roy, 2002, “Vagueness and mathematical precision”, Mind, 111: 227–247.
- Corcoran, John, 1973, “Gaps between logical theory and mathematical practice”, The methodological unity of science, ed. by M. Bunge, Dordrecht: D. Reidel, 23–50.
- Davidson, Donald, 1984, Inquiries into truth and interpretation, Oxford: Clarendon Press.
- Dummett, Michael, 2000, Elements of intuitionism, second edition, Oxford: Oxford University Press.
- –––, 1978, “The philosophical basis of intuitionistic logic”, in Truth and other enigmas, Cambridge, MA, Harvard University Press, 215–247.
- Gödel, Kurt, 1930, “Die Vollständigkeit der Axiome des logischen Funktionenkalküls”, Monatshefte für Mathematik und Physik, 37: 349–360; translated as “The completeness of the axioms of the functional calculus of logic”, in van Heijenoort 1967, 582–591.
- Harman, Gilbert, 1984, “Logic and reasoning”, Synthese, 60, 107–127.
- Heyting, A., 1956, Intuitionism, Amsterdam: North-Holland Publishing.
- Kennedy, Juliette, and Jouko Väänänen, 2021, “Logicality and Model Classes”, Bulletin of Symbolic Logic, 27(4): 385–414.
- Kerr, Alison Duncan, 2019, “A plea for KR”, Synthese, 198(4): 3047–3071.
- Lycan, William, 1984, Logical form in natural language, Cambridge, MA: The MIT Press.
- Montague, Richard, 1974, Formal philosophy, R. Thomason (ed.), New Haven: Yale University Press.
- Priest, Graham, 2006a, In contradiction, a study of the transconsistent, second, revised edition, Oxford: Clarendon Press.
- –––, 2006b, Doubt truth to be a liar, Oxford: Clarendon Press.
- Quine, W. V. O., 1960, Word and object, Cambridge, MA: The MIT Press.
- –––, 1953, “Three grades of modal involvement”, Proceedings of the XIth International Congress of Philosophy, 14, Amsterdam, North Holland Publishing Company, 65–81.
- –––, 1986, Philosophy of logic, second edition, Cambridge, Massachusetts: Harvard University Press.
- Read, Stephen, 1988, Relevant logic, Oxford: Oxford University Press.
- Resnik, Michael, 1996, “Ought there to be but one true logic”, in Logic and Reality: Essays on the Legacy of Arthur Prior, J. Copeland (ed.), Oxford: Oxford University Press, 489–517.
- Ripley, David, 2013, “Paradoxes and Failures of Cut”, Australasian Journal of Philosophy, 91(1): 139–164.
- Rumfitt, Ian, 2015, The Boundary Stones of Thought: An Essay in the Philosophy of Logic, Oxford: Oxford University Press.
- Shapiro, Stewart, 1991, Foundations without Foundationalism, Oxford: Clarendon Press.
- –––, 1996, The limits of logic: Second-order logic and the Skolem paradox, The international research library of philosophy, Aldershot: Dartmouth Publishing Company. (An anthology containing many of the significant later papers on the Skolem paradox.)
- –––, 1998, “Logical consequence: models and modality”, in The philosophy of mathematics today, edited by M. Schirn, Oxford: Oxford University Press, 131–156.
- –––, 2014, Varieties of Logic, Oxford: Oxford University Press.
- Shapiro, Stewart, and Teresa Kouri Kissel, 2022, Classical First-Order Logic (Cambridge Elements in Philosophy and Logic), Cambridge: Cambridge University Press.
- Tennant, Neil, 1997, The taming of the true, Oxford: Clarendon Press.
- van Heijenoort, Jean, 1967, From Frege to Gödel, Cambridge, Massachusetts: Harvard University Press. (An anthology containing many of the major historical papers on mathematical logic in the early decades of the twentieth century.)
- Wang, Hao, 1974, From Mathematics to Philosophy, London, Routledge and Kegan Paul.
- Williamson, Timothy, 2017, “Semantic paradoxes and abductive methodology”, in Reflections on the Liar, Bradley Armour-Garb (ed.), Oxford: Oxford University Press, 325–346.
Further Reading
There are many fine textbooks on mathematical logic. A sample follows.
- Boolos, George S., John P. Burgess, and Richard C. Jeffrey, 2007, Computability and logic, fifth edition, Cambridge, England: Cambridge University Press. Elementary and intermediate level.
- Bergmann, Merrie, James Moor, and Jack Nelson, 2013, The logic book, sixth edition, New York: McGraw-Hill. Elementary and intermediate level.
- Church, Alonzo, 1956, Introduction to mathematical logic, Princeton: Princeton University Press. Classic textbook.
- Enderton, Herbert, 1972, A mathematical introduction to logic, New York: Academic Press. Textbook in mathematical logic, aimed at a mathematical audience.
- Forbes, Graeme, 1994, Modern Logic, Oxford: Oxford University Press. Elementary textbook.
- Mendelson, Elliott, 1987, Introduction to mathematical logic, third edition, Princeton: van Nostrand. Intermediate.