# The Notation in *Principia Mathematica*

*First published Thu Aug 19, 2004; substantive revision Tue Jun 28, 2011*

*Principia Mathematica* [PM] by A.N. Whitehead and Bertrand
Russell, published 1910–1913 in three volumes by Cambridge
University Press, contains a derivation of large portions of
mathematics using notions and principles of symbolic logic. The
notation in that work has been superseded by the subsequent
development of logic during the 20th century, to the extent
that the beginner has trouble reading PM at all. This article provides
an introduction to the symbolism of PM, showing how that symbolism can
be translated into a more contemporary notation which should be
familiar to anyone who has had a first course in symbolic logic. This
translation is offered as an aid to learning the original notation,
which itself is a subject of scholarly dispute, and embodies
substantive logical doctrines so that it cannot simply be replaced by
contemporary symbolism. Learning the notation, then, is a first step
to learning the distinctive logical doctrines of *Principia
Mathematica*.

- 1. Why Learn the Symbolism in *Principia Mathematica?*
- 2. Primitive Symbols
- 3. The Use of Dots for Punctuation
- 4. Propositional Functions
- 5. The Missing Notation for Types and Orders
- 6. Variables
- 7. Predicative Functions and Identity
- 8. Definite Descriptions
- 9. Classes
- 10. Going On—Logic with Relations in Extension and Descriptive Functions
- Bibliography
- Academic Tools
- Other Internet Resources
- Related Entries

## 1. Why Learn the Symbolism in *Principia Mathematica?*

*Principia Mathematica* [PM] was written jointly by Alfred
North Whitehead and Bertrand Russell over several years, and published
in three volumes, which appeared between 1910 and 1913. It presents a
system of symbolic logic and then turns to the foundations of
mathematics to carry out the logicist project of defining mathematical
notions in terms of logical notions and proving the fundamental axioms
of mathematics as theorems of logic. While hugely important in the
development of logic, philosophy of mathematics and more broadly of
“Early Analytic Philosophy”, the work itself is no longer
studied for these topics. As a result the very notation of the work
has become alien to contemporary students of logic, and that has
become a barrier to the study of *Principia Mathematica*.

This entry is intended to assist the student of PM in reading the symbolic portion of the work. What follows is a partial translation of the symbolism into a more contemporary notation, which should be familiar from other articles in this Encyclopedia, and which is quite standard in contemporary textbooks of symbolic logic. No complete algorithm is supplied; rather, various suggestions are offered to help the reader learn the symbolism of PM. Many issues of interpretation would be prejudged by only using contemporary notation, and many details that are unique to PM depend on that notation. It will be seen below, with some of the more contentious aspects of the notation, that doctrines of substance are built into the notation of PM. Replacing the notation with a more modern symbolism would drastically alter the very content of the book.

## 2. Primitive Symbols

Below the reader will find, in the order in which they are introduced in PM, the following symbols, which are briefly described. More detail is provided in what follows:

| Symbol | Description |
| --- | --- |
| ∗ | pronounced “star”; indicates a number, or chapter, as in ∗1, or ∗20. |
| · | a centered dot (an old British decimal point); indicates a numbered sentence in the order by first digit (all the 0s preceding all the 1s etc.), then second digit, and so on. The first definitions and propositions of ∗1 illustrate this “lexicographical” ordering: 1·01, 1·1, 1·11, 1·2, 1·3, 1·4, 1·5, 1·6, 1·7, 1·71, 1·72. |
| \(\vdash\) | the assertion-sign; indicates an assertion, either an axiom (i.e., a primitive proposition, also annotated “\(\Pp\)”) or a theorem. |
| \(\Df\) | the definition sign; follows a definition. |
| \(.\), \( :\), \(:.\), \(::\), etc. | are dots used for delimiting punctuation; in contemporary logic, we use ( ), [ ], \(\{\ \}\), etc. |
| \(p, q, r\), etc. | are propositional variables. |
| \(\lor\), \(\supset\), \(\osim \), \(\equiv\), \(\sdot\) | are the familiar sentential connectives, corresponding to “or”, “if-then”, “not”, “if and only if” and “and”, respectively. [In the Second Edition of PM, 1925–27, the Sheffer Stroke “\(\mid\)” is the one primitive connective. It means “not both … and ___”.] |
| \(x, y, z\), etc. | are individual variables, which are to be read with “typical ambiguity”, i.e., with their logical types to be filled in (see below). |
| \(a, b, c\), etc. | are individual constants, and stand for individuals (of the lowest type). These occur only in the Introduction to PM, and not in the official system. |
| \(xRy, aRb, R(x)\), etc. | are atomic predications, in which the objects named by the variables or constants stand in the relation \(R\) or have the property \(R\). These occur only in the Introduction. “\(a\)” and “\(b\)” occur as constants only in the Second Edition. The predications \(R(x), R(x,y)\), etc., are used only in the Second Edition. |
| \(\phi\), \(\psi\), \(\chi\), etc., and \(f, g\), etc. | are variables which range over propositional functions, no matter whether those functions are simple or complex. |
| \(\phi x\), \(\psi x\), \(\phi(x,y)\), etc. | open atomic formulas in which both “\(x\)” and “\(\phi\)” are free. [An alternative interpretation is to view “\(\phi x\)” as a schematic letter standing for a formula in which the variable “\(x\)” is free.] |
| \(\hat{\phantom{x}}\) | the circumflex; when placed over a variable in an open formula (as in “\(\phi \hat{x}\)”) results in a term for a function. [This matter is controversial. See Landini 1998.] When the circumflected variable precedes a complex variable, the result indicates a class, as in \(\hat{x}\phi x\). |
| \(\phi\hat{x}, \psi\hat{x}, \phi(\hat{x},\hat{z}),\) etc. | Terms for propositional functions. Here are examples of such terms which are constants: “\(\hat{x}\) is happy”, “\(\hat{x}\) is bald and \(\hat{x}\) is happy”, “\(4 \lt \hat{x} \lt 6\)”, etc. If we apply, for example, the function “\(\hat{x}\) is bald and \(\hat{x}\) is happy” to the particular individual \(b\), the result is the proposition “\(b\) is bald and \(b\) is happy”. |
| \(\exists\) and ( ) | are the quantifiers “there exists” and “for all” (“every”), respectively. For example, where \(\phi x\) is a simple or complex open formula, \((\exists x)\phi x\) asserts “There exists an \(x\) such that \(\phi x\)”, \((\exists \phi)\phi x\) asserts “There exists a propositional function \(\phi\) such that \(\phi x\)”, \((x)\phi x\) asserts “Every \(x\) is such that \(\phi x\)”, and \((\phi)\phi x\) asserts “Every propositional function \(\phi\) is such that \(\phi x\)”. [These were used by Peano. More recently, \(\forall\) has been added for symmetry with \(\exists\). Some scholars see the quantifiers \((\phi)\) and \((\exists \phi)\) as substitutional.] |
| \(\phi x \supset_x \psi x\), \(\phi x \equiv_x \psi x\) | This is notation that is used to abbreviate universally quantified conditionals and biconditionals. In modern notation, these become \(\forall x(\phi x \supset \psi x)\) and \(\forall x(\phi x \equiv \psi x)\), respectively. See the definitions for this notation at the end of Section 3.2 below. |
| \(\bang\) | pronounced “shriek”; indicates that a function is predicative, as in \(\phi \bang x\) or \(\phi\bang \hat{x}\). See Section 7. |
| = | the identity symbol; expresses identity, which is a defined notion in PM, not primitive as in contemporary logic. |
| \(\atoi\) | read as “the”; is the inverted iota or description operator and is used in expressions for definite descriptions, such as \((\atoi x)\phi x\) (which is read: the \(x\) such that \(\phi x\)). |
| \([(\atoi x)\phi x]\) | a definite description in brackets; this is a scope indicator for definite descriptions. |
| \(E\bang \) | is defined at ∗14·02, in the context \(E\bang (\atoi x)\phi x\), to mean that the description \((\atoi x)\phi x\) is proper, i.e., there is exactly one \(\phi\). |
| \(\exists\bang \) | is defined at ∗24·03, in the context \(\exists \bang \alpha\), to mean that the class \(\alpha\) is non-empty, i.e., has a member. |
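
The “lexicographical” ordering of proposition numbers described above for the centered dot can be mimicked by comparing the digits after the dot character by character rather than as a decimal fraction. The following Python sketch is offered only as an illustration; the helper name `pm_sort_key` and the representation of numbers as strings such as `"1·01"` are assumptions made for the example and do not come from PM.

```python
def pm_sort_key(number: str):
    """Sort key for a PM proposition number such as '1·01' (hypothetical helper)."""
    chapter, digits = number.split("·")
    # Chapters compare numerically; the digits after the centered dot compare
    # character by character, so '01' < '1' < '11' < '2'.
    return (int(chapter), digits)


# A shuffled selection of the numbers used in ∗1.
numbers = ["1·2", "1·72", "1·01", "1·11", "1·7", "1·1", "1·71", "1·3"]
ordered = sorted(numbers, key=pm_sort_key)
print(ordered)
```

Sorting the digit strings as strings, rather than converting them to numbers, is what places 1·11 before 1·2 even though eleven is greater than two.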

## 3. The Use of Dots for Punctuation

An immediate obstacle to reading PM is the unfamiliar use of dots for punctuation, instead of the more common parentheses and brackets. The system is precise, and can be learned with just a little practice. The use of dots for punctuation is not unique to PM. Originating with Peano, it was later used in works by Alonzo Church, W.V.O. Quine, and others, but it has now largely disappeared. The best way to learn to use it is to look at a few samples which are translated to formulae using parentheses, and thus to get the feel for it. What follows is an explanation as presented in PM, pages 9–10, followed by a number of examples which illustrate each of its clauses:

The use of dots. Dots on the line of the symbols have two uses, one to bracket off propositions, the other to indicate the logical product of two propositions. Dots immediately preceded or followed by “\(\lor\)” or “\(\supset\)” or “\(\equiv\)” or “\(\vdash\)”, or by “\((x)\)”, “\((x,y)\)”, “\((x,y,z)\)” … or “\((\exists x)\)”, “\((\exists x,y)\)”, “\((\exists x,y,z)\)” … or “\([(\atoi x)(\phi x)]\)” or “\([R‘y]\)” or analogous expressions, serve to bracket off a proposition; dots occurring otherwise serve to mark a logical product. The general principle is that a larger number of dots indicates an outside bracket, a smaller number indicates an inside bracket. The exact rule as to the scope of the bracket indicated by dots is arrived at by dividing the occurrences of dots into three groups which we will name I, II, and III. Group I consists of dots adjoining a sign of implication \((\supset)\) or equivalence \((\equiv)\) or of disjunction \((\lor)\) or of equality by definition \((=\Df)\). Group II consists of dots following brackets indicative of an apparent variable, such as \((x)\) or \((x,y)\) or \((\exists x)\) or \((\exists x,y)\) or \([(\atoi x)(\phi x)]\) or analogous expressions. Group III consists of dots which stand between propositions in order to indicate a logical product. Group I is of greater force than Group II, and Group II than Group III. The scope of the bracket indicated by any collection of dots extends backwards or forwards beyond any smaller number of dots, or any equal number from a group of less force, until we reach either the end of the asserted proposition or a greater number of dots or an equal number belonging to a group of equal or superior force. Dots indicating a logical product have a scope which works both backwards and forwards; other dots only work away from the adjacent sign of disjunction, implication, or equivalence, or forward from the adjacent symbol of one of the other kinds enumerated in Group II.
Some examples will serve to illustrate the use of dots. (PM, 9–10)

### 3.1 Some Basic Examples

Consider the following series of extended examples, in which we examine propositions in PM and then discuss how to translate them step by step into modern notation. (Symbols below are sometimes used as names for themselves, thus avoiding some otherwise needed quotation marks. Russell is often accused of confusing use and mention, so there may well be some danger in this practice.)

#### Example 1

\[\tag*{∗1·2} {\vdash} \colon p \lor p \ldot {\supset} \ldot p \quad\Pp \]This is the second assertion of “star” 1. It is in fact an axiom or “Primitive Proposition” as indicated by the “\(\Pp\)”. That this is an assertion (axiom or theorem) and not a definition is indicated by the use of “\(\vdash\)”. (By contrast, a definition would omit the assertion sign but conclude with a “\(\Df\)” sign.) Now the first step in the process of translating ∗1·2 into modern notation is to note the colon. Recall, from the above quoted passage, that “a larger number of dots indicates an outside bracket, a smaller number indicates an inside bracket”. Thus, the colon here (which consists of a larger number of dots than the single dots occurring on the line in ∗1·2) represents an outside bracket. So, the first step is to translate ∗1·2 to:

\[ \vdash[ p \lor p \ldot {\supset} \ldot p] \]So the brackets “[” and “]” represent the colon in ∗1·2. The scope of the colon thus extends past any smaller number of dots (i.e., one dot) to the end of the formula.

Next, the dots around the “\(\supset\)” are represented in modern notation by the parentheses around the antecedent and consequent. Recall, in the above passage, we find “… dots only work away from the adjacent sign of disjunction, implication, or equivalence …”. Thus, the next step in the translation process is to move to the formula: \[ \vdash [(p \lor p) \supset(p)] \]

Finally, standard modern conventions allow us to delete the outer brackets and the parentheses around single letters, yielding:

\[ \vdash(p \lor p) \supset p \]Our next example involves conjunction, which is indicated by simple juxtaposition of atomic sentences, or with a dot when a substitution instance might be considered, as in the definition of conjunction in the following:

#### Example 2

\[ \tag*{∗3·01} p \sdot q \ldot {=} \ldot \osim(\osim p \lor \osim q) \quad\Df \]Here we have a case in which dots indicate both a “logical product” (i.e., conjunction) and delimiting brackets. As a first step in translating ∗3·01 into modern notation, we replace the first dot by an ampersand (and its corresponding scope delimiters) and replace “\(\ldot {=} \ldot\)” by “\(=_{df}\)”, to yield:

\[ (p \amp q) =_{df} [\osim (\osim p \lor \osim q)] \]The above step clearly illustrates how a “dot indicating a logical product has a scope which works both backwards and forwards”. Note that the first dot in ∗3·01, i.e., between the \(p\) and \(q\), is really optional, given the above quotation from PM. However, since we may sometimes want to substitute entire formulas for \(p\) and \(q\), the dot indicates the extent of the substituted formulas. Thus, we might have, as a substitution instance: \(r \lor s \sdot q \supset s\) (in PM notation) or \((r \lor s) \amp(q \supset s)\) (in contemporary symbols).

Finally, our modern conventions allow us to eliminate the outer parentheses from the definiendum and the brackets “[” and “]” from the definiens, yielding:

\[ p \amp q =_{df} \osim (\osim p \lor \osim q) \]Notice that the scope of the negation sign “\(\osim \)” in ∗3·01 is not indicated with dots, even in the PM system, but rather requires parentheses.

#### Example 3

\[ \tag*{∗9·01} \osim \{(x) \sdot \phi x\} \ldot {=} \ldot (\exists x) \sdot \osim \phi x \quad\Df \]If we apply the rule “dots only work away from the adjacent sign of disjunction, implication, or equivalence, or forward from the adjacent symbol of one of the other kinds enumerated in Group II” (where Group II includes “\((\exists x)\)”), then the modern equivalent would be: \[ \osim (x)\phi x =_{df} (\exists x)\osim \phi x \] or \[ \osim \forall x\phi x =_{df} \exists x\osim \phi x \]

### 3.2 The Force of Connectives

The ranking of connectives in terms of relative “force”,
or *scope*, is a standard convention in contemporary logic. If
there are no explicit parentheses to indicate the scope of a
connective, those which have precedence in the ranking are presumed to
be the principal connective, and so on for subformulas. Thus, instead
of formulating the following DeMorgan’s law as the cumbersome:

\[ [(\osim p) \lor (\osim q)] \equiv [\osim (p \amp q)] \]

we nowadays write it as:

\[ \osim p \lor \osim q \equiv\osim (p \amp q) \]This simpler formulation is natural because \(\equiv\) takes precedence over (has wider “scope” than) \(\lor\) and &, and the latter take precedence over \(\osim \). Indeed parentheses are often unneeded around \(\equiv\), given a further convention on which \(\equiv\) takes precedence over \(\supset\). Thus, the formula \(p \supset q \equiv\osim p\lor q\) becomes unambiguous. We might represent these conventions by listing the connectives in groups with those with widest scope at the top:

\[\begin{array}{c} \equiv \\ \supset \\ \amp, \lor \\ \osim \end{array}\]For Whitehead and Russell, however, the symbols \(\supset\), \(\equiv\), \(\lor\) and \(\ldots =\ldots \Df\), in Group I, are of equal force. Group II consists of the variable binding expressions, quantifiers and scope indicators for definite descriptions, and Group III consists of conjunctions. Negation is below all of these. So the ranking in PM would be:

\[\begin{array}{c} \supset, \equiv, \lor \text{ and } \ldots =\ldots \quad\Df \\ (x), (x,y) \ldots (\exists x), (\exists x,y) \ldots [(\atoi x)\phi x] \\ p \sdot q \quad \text{(conjunction)} \\ \osim \end{array}\]This is what Whitehead and Russell seem to mean when they say “Group I is of greater force than Group II, and Group II than Group III.” Consider the following:

#### Example 4

\[ \tag*{∗3·12} {\vdash} \colon \osim p \ldot {\lor} \ldot \osim q \ldot {\lor} \ldot p \sdot q \]This theorem illustrates how to read multiple uses of the same number of dots within one formula. The first two dots around the \(\lor\) simply “work away” from the connective. The second “extends” until it meets with the next of the same number (the third single dot). That third dot, and the fourth “work away” from the second \(\lor\), and the final dot indicates a conjunction with narrowest scope. The result, formulated with all possible punctuation for maximum explicitness, is:

\[ \{[(\osim p) \lor (\osim q)] \lor (p \amp q)\} \]If we employ all the standard conventions for dropping parentheses, this becomes:

\[ (\osim p \lor \osim q) \lor (p \amp q) \]This illustrates the passage in the above quotation which says “The scope of the bracket indicated by any collection of dots extends backwards or forwards beyond any smaller number of dots, or any equal number from a group of less force, until we reach either the end of the asserted proposition or a greater number of dots or an equal number belonging to a group of equal or superior force.”

Before we look at a wider range of examples, a detailed example involving quantified variables will prove to be instructive. Whitehead and Russell follow Peano’s practice of expressing universally quantified conditionals (such as “All \(\phi\)s are \(\psi\)s”) with the bound variable subscripted under the conditional sign. Similarly with universally quantified biconditionals (“All and only \(\phi\)s are \(\psi\)s”). That is, the expressions “\(\phi x \supset_x \psi x\)” and “\(\phi x \equiv_x \psi x\)” are defined as follows:

\[ \tag*{∗10·02} \phi x \supset_x \psi x \ldot {=} \ldot (x) \ldot \phi x \supset \psi x \quad\Df \] \[ \tag*{∗10·03} \phi x \equiv_x \psi x \ldot {=} \ldot (x) \ldot \phi x \equiv \psi x \quad\Df \]and correspond to the following more modern formulas, respectively:

\[ \forall x(\phi x \supset \psi x) \] \[ \forall x(\phi x \equiv \psi x) \]As an exercise the reader might be inclined to formulate a rigorous algorithm for converting PM into a particular contemporary symbolism (with conventions for dropping parentheses), but the best way to learn the system is to look over a few more examples of translations, and then simply begin to read formulae directly.
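
For readers who do want to experiment with such an algorithm, the following Python sketch handles a small propositional fragment only. It is not a rigorous or complete translation procedure: it assumes whitespace-separated ASCII tokens (`v`, `>` and `=` standing in for \(\lor\), \(\supset\) and \(\equiv\); `.`, `:`, `:.` and `::` for the dot groups; `|-` for the assertion sign), treats negated letters such as `~p` as single tokens, and ignores quantifiers, descriptions and parentheses altogether. The function names are hypothetical, and the rule that ties between equally strong connectives go to the rightmost one is an assumption chosen to agree with the reading of ∗3·12 in Example 4.

```python
DOTS = {".": 1, ":": 2, ":.": 3, "::": 4}   # dot groups and the number of dots in each
CONN = {"v": "∨", ">": "⊃", "=": "≡"}       # ASCII stand-ins for the Group I connectives


def translate(tokens):
    """Return a fully parenthesized modern formula for a list of PM tokens."""
    best = None  # (force, index of the split point, modern operator)
    for i, tok in enumerate(tokens):
        if tok not in DOTS:
            continue
        before = tokens[i - 1] if i > 0 else None
        after = tokens[i + 1] if i + 1 < len(tokens) else None
        if after in CONN:        # Group I dot standing to the left of a connective
            cand = ((DOTS[tok], 1), i + 1, CONN[after])
        elif before in CONN:     # Group I dot standing to the right of a connective
            cand = ((DOTS[tok], 1), i - 1, CONN[before])
        else:                    # Group III dot: a logical product
            cand = ((DOTS[tok], 0), i, "&")
        # More dots win; with equal dots Group I beats Group III;
        # remaining ties go to the rightmost candidate (an assumption).
        if best is None or cand[0] >= best[0]:
            best = cand
    if best is None:             # no dots left in this segment
        if len(tokens) == 1:
            return tokens[0]
        if len(tokens) == 3 and tokens[1] in CONN:
            return f"({tokens[0]} {CONN[tokens[1]]} {tokens[2]})"
        raise ValueError(f"cannot parse: {tokens}")
    _, split, op = best
    left, right = tokens[:split], tokens[split + 1:]
    if left and left[-1] in DOTS:     # drop the dot that bracketed the left side
        left = left[:-1]
    if right and right[0] in DOTS:    # drop the dot that bracketed the right side
        right = right[1:]
    return f"({translate(left)} {op} {translate(right)})"


def pm_to_modern(formula):
    """Strip the assertion sign and its dots, then translate the rest."""
    tokens = formula.split()
    if tokens and tokens[0] == "|-":
        tokens = tokens[1:]
    if tokens and tokens[0] in DOTS:
        tokens = tokens[1:]
    return translate(tokens)


print(pm_to_modern("|- : p v p . > . p"))            # ∗1·2
print(pm_to_modern("|- : ~p . v . ~q . v . p . q"))  # ∗3·12
```

The sketch always splits at the strongest dot it can find and then recurses on the two sides, which mirrors the principle that a larger number of dots indicates an outside bracket. The output keeps every pair of parentheses, so the usual conventions for dropping them still have to be applied by hand.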

### 3.3 More Examples

In the examples below, each formula number is followed first by
*Principia* notation and then its modern translation. Notice
that in ∗1·5 parentheses are used for punctuation in addition
to dots. (Primitive Propositions ∗1·2, ∗1·3,
∗1·4, ∗1·5, and ∗1·6 together constitute the
axioms for propositional logic in PM.) Proposition ∗1·5 was
shown to be redundant by Paul Bernays in 1926. It can be derived from
appropriate instances of the others and the rule of modus ponens.

| Number | *Principia* notation | Modern translation |
| --- | --- | --- |
| ∗1·3 | \({\vdash} \colon q \ldot {\supset} \ldot p \lor q \quad\Pp\) | \(q \supset p \lor q\) |
| ∗1·4 | \({\vdash} \colon p \lor q \ldot {\supset} \ldot q \lor p \quad\Pp\) | \(p \lor q \supset q \lor p\) |
| ∗1·5 | \({\vdash} \colon p \lor (q \lor r ) \ldot {\supset} \ldot q \lor (p \lor r ) \quad\Pp\) | \(p \lor (q \lor r ) \supset q \lor (p \lor r )\) |
| ∗1·6 | \({\vdash} \colondot q \supset r \ldot {\supset} \colon p \lor q \ldot {\supset} \ldot p \lor r \quad\Pp\) | \((q \supset r ) \supset(p \lor q \supset p \lor r )\) |
| ∗2·03 | \({\vdash} \colon p \supset \osim q \ldot {\supset} \ldot q \supset\osim p \) | \((p \supset\osim q) \supset(q \supset\osim p)\) |
| ∗3·3 | \({\vdash} \colondot p \sdot q \ldot {\supset} \ldot r \colon {\supset} \colon p \ldot {\supset} \ldot q \supset r\) | \([(p \amp q) \supset r] \supset [p \supset(q \supset r)]\) |
| ∗4·15 | \({\vdash} \colondot p \sdot q \ldot {\supset} \ldot \osim r \colon {\equiv} \colon q \sdot r \ldot {\supset} \ldot \osim p\) | \(p \amp q \supset\osim r \equiv q \amp r \supset\osim p\) |
| ∗5·71 | \({\vdash} \colondot q \supset\osim r \ldot {\supset} \colon p \lor q \sdot r \ldot {\equiv} \ldot p \sdot r\) | \((q \supset\osim r) \supset [(p \lor q) \amp r \equiv p \amp r]\) |
| ∗9·04 | \( p \ldot {\lor} \ldot (x) \ldot \phi x \colon {=} \ldot (x) \ldot \phi x \lor p \quad\Df\) | \(p \lor \forall x\phi x =_{df} \forall x(\phi x \lor p)\) |
| ∗9·521 | \({\vdash} \colons (\exists x) \ldot \phi x \ldot {\supset} \ldot q \colon {\supset} \colondot (\exists x) \ldot \phi x \ldot {\lor} \ldot r \colon {\supset} \ldot q \lor r\) | \([(\exists x\phi x) \supset q] \supset [((\exists x\phi x) \lor r) \supset (q \lor r)]\) |
| ∗10·55 | \({\vdash} \colondot (\exists x) \ldot \phi x \sdot \psi x \colon \phi x \supset_x \psi x \colon {\equiv} \colon (\exists x) \ldot \phi x \colon \phi x \supset_x \psi x\) | \(\exists x(\phi x \amp \psi x) \amp \forall x(\phi x \supset \psi x) \equiv \exists x\phi x \amp \forall x(\phi x \supset \psi x)\) |
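
Once the Primitive Propositions ∗1·2 to ∗1·6 have been put into modern notation, one way to check the translations is to confirm that each is a truth-functional tautology. The Python sketch below does this by brute force over all assignments. It uses the modern, classical reading of \(\lor\) and \(\supset\), and its helper names (`implies`, `is_tautology`, `AXIOMS`) are assumptions of the example. It says nothing about derivability within PM, and in particular it does not reproduce Bernays's result about ∗1·5.

```python
from itertools import product


def implies(a, b):
    """Material implication, the modern reading of the horseshoe."""
    return (not a) or b


# Modern translations of the five primitive propositions for propositional logic.
AXIOMS = {
    "∗1·2": lambda p, q, r: implies(p or p, p),
    "∗1·3": lambda p, q, r: implies(q, p or q),
    "∗1·4": lambda p, q, r: implies(p or q, q or p),
    "∗1·5": lambda p, q, r: implies(p or (q or r), q or (p or r)),
    "∗1·6": lambda p, q, r: implies(implies(q, r), implies(p or q, p or r)),
}


def is_tautology(formula):
    """True if the formula holds under every assignment to p, q and r."""
    return all(formula(p, q, r) for p, q, r in product([True, False], repeat=3))


results = {name: is_tautology(f) for name, f in AXIOMS.items()}
print(results)
```

A mistranslation that misplaces a bracket will often, though not always, fail this test, so it serves as a quick sanity check rather than a proof that the translation is faithful.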

## 4. Propositional Functions

There are two kinds of functions in PM. Propositional functions such as “\(\hat{x}\) is a natural number” are to be distinguished from the more familiar mathematical functions, which are called “descriptive functions” (PM, 31). Descriptive functions are defined using relations and definite descriptions. Examples of descriptive functions are \(x + y\) and “the successor of \(n\)”.

Focusing on propositional functions, Whitehead and Russell distinguish
between expressions with a free variable (such as “\(x\) is
hurt”) and names of functions (such as “\(\hat{x}\) is
hurt”) (PM, 14–15). The propositions which result from the
formula by assigning allowable values to the free variable
“\(x\)” are said to be the “ambiguous values” of
the function. Expressions using the circumflex notation, such as
\(\phi \hat{x}\), occur only in the introductory material to the
technical sections of PM and not in the technical sections themselves
(with the exception of the sections on the theory of classes),
prompting some scholars to say that such expressions do not really
occur in the formal system of PM.

This issue is distinct from that
surrounding the interpretation of such symbols. Are they
“term-forming operators” which turn an open formula into a
name for a function, or simply a syntactic device, a placeholder, for
indicating the variable for which a substitution can be made in an open
formula? If they are to be treated as term-forming operators, the
modern notation for \(\phi \hat{x}\) would be “\(\lambda x\phi
x\)”. The \(\lambda\)-notation has the advantage of clearly
revealing that the variable \(x\) is *bound* by the
term-forming operator \(\lambda\), which takes a predicate \(\phi\)
and yields a term \(\lambda x\phi x\) (which in some logics is a
singular term that can occur in the subject position of a sentence,
while in other logics is a complex predicative expression). Unlike
\(\lambda\)-notation, the PM notation using the circumflex cannot
indicate scope. The function expression
“\(\phi(\hat{x},\hat{y})\)” is ambiguous between
“\(\lambda x\lambda y\phi xy\)” and “\(\lambda
y\lambda x\phi xy\)”, without some further convention. Indeed,
Whitehead and Russell specified this convention for relations in
extension (on p. 200 in the introductory material of ∗21, in terms of
the order of the variables), but the ambiguity is brought out most
clearly by using \(\lambda\)-notation: the first denotes the relation
of being an \(x\) and \(y\) such that \(\phi xy\) and the second
denotes the converse relation of being a \(y\) and \(x\) such that
\(\phi xy\).
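
If the circumflex is read as a term-forming operator, the difference between the two readings \(\lambda x\lambda y\phi xy\) and \(\lambda y\lambda x\phi xy\) can be illustrated with the lambda expressions of a programming language. The Python sketch below is only an analogy: `phi` is an arbitrary two-place condition chosen for the example (here, “less than” on numbers), and nothing in PM corresponds to these particular names.

```python
def phi(x, y):
    """An arbitrary two-place condition, here 'x is less than y' (illustrative)."""
    return x < y


# Reading as λxλy φxy: the first argument supplied fills the x position.
relation = lambda x: lambda y: phi(x, y)

# Reading as λyλx φxy: the first argument supplied fills the y position,
# which yields the converse relation.
converse = lambda y: lambda x: phi(x, y)

print(relation(1)(2))  # evaluates phi(1, 2)
print(converse(1)(2))  # evaluates phi(2, 1)
```

The same two arguments, supplied in the same order, give different results under the two readings, which is exactly the information that the bare circumflex notation leaves to a further convention.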

## 5. The Missing Notation for Types and Orders

This section explains notation that is not in *Principia
Mathematica*. Except for some notation for “relative”
types in Volume II, there are famously no symbols for types in
*Principia Mathematica*! Sentences are generally to be taken as
“typically ambiguous”, and so as standing for expressions of a
whole range of types. Thus, just as there are no individual or
predicate constants, there are no particular functions of any specific
type. So not only does one not see how to symbolize the argument:

All men are mortal

Socrates is a man

Therefore, Socrates is mortal

but also there is no indication of the logical type of the function “\(\hat{x}\) is mortal”. The project of PM is to reduce mathematics to logic, and part of the view of logic behind this project is that logical truths are all completely general. The derivation of truths of mathematics from definitions and truths of logic will thus not involve any particular constants other than those introduced by definition from purely logical notions. As a result no notation is included in PM for describing those types. Those of us who wish to consider PM as a logic which can be applied must supplement it with some indication of types.

Readers should note that the explanation of types outlined below is
not going to correspond with the statements about types in the text of
PM. Alonzo Church [1976] developed a simple, rational reconstruction
of the notation for both the simple and ramified theory of types as
implied by the text of PM. (There are alternative, equivalent
notations for the theory of types.) The full theory can be seen as a
development of the *simple theory of types*.

### 5.1 Simple Types

A definition of the simple types can be given as follows:

- \(\iota\) (Greek iota) is the type for an *individual*.
- Where \(\tau_1,\ldots,\tau_n\) are any types, then \(\ulcorner(\tau_1,\ldots,\tau_n)\urcorner\) is the type of a propositional function whose arguments are of types \(\tau_1,\ldots,\tau_n\), respectively.
- \(\ulcorner\)( )\(\urcorner\) is the type of propositions.

Here are some intuitive ways to understand the definition of type.
Suppose that “Socrates” names an individual. (We are here
ignoring Russell’s considered opinion that such ordinary
individuals are in fact classes of classes of sense data, and so of a
much higher type.) Then the individual constant “Socrates”
would be of type \(\iota\). A monadic propositional function which
takes individuals as arguments is of type \((\iota)\). Suppose that
“is mortal” is a predicate expressing such a function. The
function “\(\hat{x}\) is mortal” will also be of type
\((\iota)\). A two-place or *binary* relation between
individuals is of type \((\iota,\iota)\). Thus, a relation expression
like “parent of” and the function “\(\hat{x}\) is a
parent of \(\hat{z}\)” will be of type \((\iota,\iota)\).

Propositional functions of type \((\iota)\) are often called “first order”; hence the name “first order logic” for the familiar logic where the variables only range over arguments of first order functions. A monadic function of arguments of type \(\tau\) is of type \((\tau)\), and so functions of such functions are of type \(((\tau))\). “Second order logic” will have variables for the arguments of such functions (as well as variables for individuals). Binary relations between functions of type \(\tau\) are of type \((\tau,\tau)\), and so on for relations having more than 2 arguments. Mixed types are defined by the above. A relation between an individual and a proposition (such as “\(\hat{x}\) believes that \(\hat{P}\)”) will be of type \((\iota,(\ ))\).
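
The recursive definition of simple types lends itself to a small data-structure sketch. In the Python fragment below, which is an illustration and not part of Church's or PM's apparatus, the type of individuals is represented by the string `"ι"` and the type of a propositional function by a tuple of its argument types, so that the empty tuple plays the role of the type of propositions. All names (`fn_type`, `show`, and the constants) are assumptions made for the example.

```python
IOTA = "ι"  # the type of individuals


def fn_type(*arg_types):
    """Type of a propositional function with the given argument types."""
    return tuple(arg_types)


PROPOSITION = fn_type()                 # ( ), the type of propositions
IS_MORTAL = fn_type(IOTA)               # (ι), a monadic function of individuals
PARENT_OF = fn_type(IOTA, IOTA)         # (ι,ι), a binary relation between individuals
BELIEVES = fn_type(IOTA, PROPOSITION)   # (ι,( )), individual and proposition
FN_OF_FN = fn_type(IS_MORTAL)           # ((ι)), a function of functions of individuals


def show(t):
    """Render a type in the parenthesized notation used in the text."""
    if t == IOTA:
        return IOTA
    return "(" + ",".join(show(a) for a in t) + ")"


for t in (IS_MORTAL, PARENT_OF, BELIEVES, FN_OF_FN):
    print(show(t))
```

Representing types as nested tuples keeps the mixed cases uniform: a relation between an individual and a proposition is just a pair whose second component is the empty tuple.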

### 5.2 Ramified Types

To construct a notation for the full ramified theory of types of PM,
another piece of information must be encoded in the symbols. Church
calls the resulting system one of *r-types*. The key idea of
ramified types is that any function defined using quantification over
functions of some given type has to be of a higher “order”
than those functions. To use Russell’s example:

\(\hat{x}\) has all the qualities that great generals have

is a function true of persons (i.e., individuals), and from the point
of view of *simple* type theory, it has the same simple logical
type as particular qualities of individuals (such as bravery and
decisiveness). However, in ramified type theory, the above function
will be of a higher order than those particular qualities of
individuals, since unlike those particular qualities, it involves a
quantification over those qualities. So, whereas the expression
“\(\hat{x}\) is brave” denotes a function of r-type
\((\iota)/1\), the expression “\(\hat{x}\) has all the qualities
that great generals have” will have r-type \((\iota)/2\). In
these r-types, the number after the “/” indicates the
*level* of the function. The order of the functions will be
defined and computed given the following definitions.

Church defines the r-types as follows:

- \(\iota\) (Greek iota) is the r-type for an *individual*.
- Where \(\tau_1,\ldots,\tau_m\) are any r-types, \(\ulcorner(\tau_1,\ldots,\tau_m)/n\urcorner\) is an r-type; this is the r-type of an \(m\)-ary propositional function of *level* \(n\), which has arguments of r-types \(\tau_1,\ldots,\tau_m\).

The *order* of an entity is defined as follows (here we no
longer follow Church, for he defines orders for variables, i.e.,
expressions, instead of orders for the things the variables range
over):

- the order of an individual (of r-type \(\iota)\) is 0,
- the order of a function of r-type \((\tau_1,\ldots,\tau_m)/n\) is \(n+N\), where \(N\) is the greatest of the orders of the arguments \(\tau_1,\ldots,\tau_m\).

These two definitions are supplemented with a principle which identifies the levels of particular defined functions, namely, that the level of a defined function should be one higher than the highest order entity having a name or variable that appears in the definition of that function.

To see how these definitions and principles can be used to compute the
order of the function “\(\hat{x}\) has all the qualities that
great generals have”, note that the function can be represented
as follows, where “\(x, y\)” are variables ranging over
individuals of r-type \(\iota\) (order 0),
“GreatGeneral\((y)\)” is a predicate denoting a
propositional function of r-type \((\iota)/1\) (and so of order 1),
and “\(\phi\)” is a variable ranging over propositional
functions of r-type \((\iota)/1\) (and so of order 1) such as
*great general*, *bravery*, *leadership*,
*skill*, *foresight*, etc.:

\[ (\phi)[(y)(\text{GreatGeneral}(y) \supset \phi(y)) \supset \phi(\hat{x})] \]

We first note that given the above principle, the r-type of this function is \((\iota)/2\); the level is 2 because the level of the r-type of this function has to be one higher than the highest order of any entity named (or in the range of a variable used) in the definition. In this case, the denotation of GreatGeneral, and the range of the variable “\(\phi\)”, is of order 1, and no other expression names or ranges over an entity of higher order. Thus, the level of the function named above is defined to be 2. Finally, we compute the order of the function denoted above as it was defined: the sum of the level plus the greatest of the orders of the arguments of the above function. Since the only arguments in the above function are individuals (of order 0), the order of our function is just 2.

Quantifying over functions of r-type \((\tau)/n\) of order \(k\) in a
definition of a new function yields a function of r-type
\((\tau)/n+1\), and so a function of order one higher, \(k+1\). Two
kinds of functions, then, can be of the *second order*: (1)
functions of first-order functions of individuals, of r-type
\(((\iota)/1)/1\), and (2) functions of r-type \((\iota)/2\), such as
our example “\(\hat{x}\) has all the qualities that great
generals have”. This latter will be a function true of
individuals such as Napoleon, but of a higher order than simple
functions such as “\(\hat{x}\) is brave”, which are of
r-type \((\iota)/1\).
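The order computations above can be sketched in a few lines of Python. This is only an illustration of Church's two definitions and the principle for defined functions as they are stated in this section; the class and function names (`Individual`, `RType`, `order`, `defined_level`) are our own and correspond to nothing in PM or in Church's paper, and treating an r-type with no arguments as having \(N = 0\) is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple, Union


@dataclass(frozen=True)
class Individual:
    """The r-type iota of individuals, which has order 0."""


@dataclass(frozen=True)
class RType:
    """An r-type (tau_1, ..., tau_m)/n: argument r-types plus a level n."""
    args: Tuple["Type", ...]
    level: int


Type = Union[Individual, RType]


def order(t: Type) -> int:
    """Order = level + greatest order among the arguments (0 for individuals)."""
    if isinstance(t, Individual):
        return 0
    # Assumption of this sketch: with no arguments, N is taken to be 0.
    highest = max((order(a) for a in t.args), default=0)
    return t.level + highest


def defined_level(orders_mentioned: Iterable[int]) -> int:
    """Level of a defined function: one more than the highest order of any
    entity named, or ranged over by a variable, in its definition."""
    return 1 + max(orders_mentioned)


IOTA = Individual()
brave = RType((IOTA,), 1)                      # "x is brave": (iota)/1
# "x has all the qualities that great generals have": the definition
# mentions GreatGeneral (order 1), phi (order 1) and y (order 0).
all_qualities = RType((IOTA,), defined_level([1, 1, 0]))
func_of_func = RType((brave,), 1)              # ((iota)/1)/1

print(order(brave), order(all_qualities), order(func_of_func))  # 1 2 2
```

The last line shows the two routes to the second order described in the text: `all_qualities` reaches order 2 through its level, while `func_of_func` reaches it through the order of its argument.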

Logicians today use a different notion of “order”. Today,
first-order logic is a logic with only variables for individuals.
Second-order logic is a logic with variables for both individuals and
properties of individuals. Third-order logic is a logic with variables
for individuals, properties of individuals, and properties of
properties of individuals. And so forth. By contrast, Church would
call these logics, respectively, the logic of functions of the types
\((\iota)/1\) and \((\iota,\ldots,\iota)/1\), the logic of functions
of the types \(((\iota)/1)/1\) and
\(((\iota,\ldots,\iota)/1,\ldots,(\iota,\ldots,\iota)/1)/1\), and the
logic of functions of the types \((((\iota)/1)/1)/1\) etc. (i.e., the
level-one functions of the functions of the preceding type). Given
Church’s definitions, these are logics of first-, second- and
third-order functions, respectively, thus coinciding with the modern
terminology of “\(n\)^{th}-order logic”.

## 6. Variables

As mentioned previously, there are no individual or predicate constants in the formal system of PM, only variables. The Introduction, however, makes use of the example “\(a\) standing in the relation \(R\) to \(b\)” in a discussion of atomic facts (PM, 43). Although “\(R\)” is later used as a variable that ranges over relations in extension, and “\(a,b,c,\ldots\)” are individual variables, let us temporarily add them to the system as predicate and individual constants, respectively, in order to discuss the use of variables in PM.

PM makes special use of the distinction between “real”, or
free, variables and “apparent”, or bound, variables. Since
“\(x\)” and “\(y\)” are variables, “\(xRy\)” will be an
atomic formula in our extended language, with “\(x\)” and
“\(y\)” real variables. When such formulae are combined
with the propositional connectives \(\osim\), \(\lor\), etc., the
result is a *matrix*. For example, “\(aRx \ldot {\lor}
\ldot xRy\)” would be a matrix.

As we saw earlier, there are also variables which range over functions: “\(\phi\), \(\psi\), \(\ldots,f, g\)”, etc. The expression “\(\phi x\)” thus contains two variables and stands for a proposition, in particular, the result of applying the function \(\phi\) to the individual \(x\).

Theorems are stated with real variables, which gives them a special significance with regard to the theory. For example,

\[ \tag*{∗10·1} \vdash \colon (x) \ldot \phi x \ldot {\supset} \ldot \phi y \quad\Pp \]
is a fundamental axiom of the quantificational theory of PM. In this Primitive Proposition the variables “\(\phi\)” and “\(y\)” are real (free), and the “\(x\)” is apparent (bound). As there are no constants in the system, this is the closest that PM comes to a rule of universal instantiation.

Whitehead and Russell interpret “\((x) \sdot \phi x\)” as
“the proposition which asserts *all* the values for
\(\phi \hat{x}\)” (PM 41). The use of the word “all”
has special significance within the theory of types. They present the
“vicious circle principle”, which underlies the theory of
types, as asserting that

… generally, given any set of objects such that, if we suppose the set to have a total, it will contain members which presuppose this total, then such a set cannot have a total. By saying that the set has “no total”, we mean, primarily, that no significant statement can be made about “all its members”. (PM, 37)

Specifically, then, a quantified expression, since it talks about “all” the members of a totality, must range over a specific logical type in order to observe the vicious circle principle. Thus, when interpreting a bound variable, we must assume that it ranges over a specific type of entity, and so types must be assigned to the other entities represented by expressions in the formula, in accordance with the theory of types.

A question arises, however, once one realizes that the statements of
primitive propositions and theorems in PM such as ∗10·1 are
taken to be “typically ambiguous” (i.e., ambiguous with
respect to type). These statements are actually schematic and
represent all the possible specific assertions which can be derived
from them by interpreting types appropriately. But if statements like
∗10·1 are schemata and yet have bound variables, how do we
assign types to the entities over which the bound variables range? The
answer is to first decide which type of thing the free variables in
the statement range over. For example, assuming that the variable
\(y\) in ∗10·1 ranges over individuals (of type \(\iota)\),
then the variable \(\phi\) must range over functions of type
\((\iota)/n\), for some \(n\). Then the bound variable \(x\) will also
range over individuals. If, however, we assume that the variable \(y\)
in ∗10·1 ranges over *functions* of type \((\iota)/1\),
then the variable \(\phi\) must range over functions of type
\(((\iota)/1)/m\), for some \(m\). In this case, the bound variable
\(x\) will range over functions of type \((\iota)/1\).

So \(y\) and \(\phi\) are called “real” variables in ∗10·1 not only because they are free but also because they can range over any type. Whitehead and Russell frequently say that real variables are taken to ambiguously denote “any” of their instances, while bound variables (which also ambiguously denote) range over “all” of their instances (within a legitimate totality, i.e. type).

## 7. Predicative Functions and Identity

The exclamation mark “!” following a variable for a
function and preceding the argument, as in “\(f\bang
\hat{x}\)”, “\(\phi \bang x\)”, “\(\phi\bang
\hat{x}\)”, indicates that the function is *predicative*,
that is, of the lowest order which can apply to its arguments. In
Church’s notation, this means that predicative functions are all
of the first level, with types of the form \((\ldots)/1\). As a
result, predicative functions will be of order one more than the
highest order of any of their arguments. This analysis is based on
quotations like the following, in the *Introduction* to PM:

We will define a function of one variable as predicative when it is of the next order above that of its argument, i.e., of the lowest order compatible with its having that argument. (PM, 53)

Unfortunately, in the summary of ∗12, we find “A predicative
function is one which contains no apparent variables, i.e., is a
matrix” [PM, 167]. Reconciling this statement with that
definition in the *Introduction* is a problem for scholars.

To see the shriek notation in action, consider the following definition of identity:

\[ \tag*{∗13·01} x = y \ldot {=} \colon (\phi) \colon \phi \bang x \ldot {\supset} \ldot \phi \bang y \quad\Df \]
That is, \(x\) is identical with \(y\) if and only if \(y\) has every predicative function \(\phi\) which is possessed by \(x\). (Of course the second occurrence of “=” indicates a definition, and does not independently have meaning. It is the first occurrence, relating individuals \(x\) and \(y\), which is defined.)

To see how this definition reduces to the more familiar definition of identity (on which objects are identical iff they share the same properties), we need the Axiom of Reducibility. The Axiom of Reducibility states that for any function there is an equivalent function (i.e., one true of all the same arguments) which is predicative:

\[ \tag*{∗12·1} \vdash \colon (\exists f) \colon \phi x \ldot {\equiv_x} \ldot f \bang x \quad\Pp \]

To see how this axiom implies the more familiar definition of identity, note that the more familiar definition of identity is:

\[ x = y \ldot {=} \colon (\phi) \colon \phi x \ldot {\supset} \ldot \phi y \quad\Df \]
for \(\phi\) of “any” type. (Note that this differs from
∗13·01 in that the shriek no longer appears.) Now to prove
this, assume both ∗13·01 and the Axiom of Reducibility, and
suppose, for proof by *reductio*, that \(x = y\), and \(\phi
x\), and not \(\phi y\), for some function \(\phi\) of arbitrary type.
Then, the Axiom of Reducibility ∗12·1 guarantees that there
will be a predicative function \(\psi \bang \hat{z}\), coextensive
with \(\phi\), such that \(\psi \bang x\) but not \(\psi \bang y\),
which contradicts ∗13·01.
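The steps of this *reductio* can be set out one by one. The following is only a sketch in the notation used above: the line numbering, the letter \(\chi\) for the bound variable of ∗13·01, and the informal treatment of \(\psi\) as a witness for the existential claim in line 3 are ours, not PM's.

\[ \begin{array}{lll}
1. & x = y & \text{assumption} \\
2. & \phi x \sdot \osim \phi y & \text{assumption for reductio} \\
3. & (\exists \psi) \colon \phi z \ldot {\equiv_z} \ldot \psi \bang z & \text{by ∗12·1} \\
4. & \psi \bang x \sdot \osim \psi \bang y & \text{from 2 and 3, for some such } \psi \\
5. & (\chi) \colon \chi \bang x \ldot {\supset} \ldot \chi \bang y & \text{from 1 by ∗13·01} \\
6. & \psi \bang x \ldot {\supset} \ldot \psi \bang y & \text{instance of 5} \\
7. & \psi \bang y & \text{from 4 and 6, contradicting 4}
\end{array} \]

Line 3 is the only place where the Axiom of Reducibility is used: without it, nothing guarantees that a difference between \(x\) and \(y\) with respect to a function of arbitrary order shows up as a difference with respect to some predicative function.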

## 8. Definite Descriptions

The inverted Greek letter iota “\(\atoi\)” is used in PM,
always followed by a variable, to begin a definite description.
\((\atoi x) \phi x\) is read as “the \(x\) such that \(x\) is
\(\phi\)”, or more simply, as “the \(\phi\)”. Such
expressions may occur in subject position, as in \(\psi(\atoi x) \phi
x\), read as “the \(\phi\) is \(\psi\)”. The formal part
of Russell’s famous “theory of definite
descriptions” consists of a definition of all formulas
“…\(\psi(\atoi x) \phi x\)…” in which a
description occurs. To distinguish the portion \(\psi\) from the rest
of a larger sentence (indicated by the ellipses above) in which the
expression \(\psi(\atoi x) \phi x\) occurs, the *scope* of the
description is indicated by repeating the definite description within
brackets:

\[ [(\atoi x) \phi x] \sdot \psi(\atoi x) \phi x \]

The notion of scope is meant to explain a distinction which Russell famously discusses in “On Denoting” (1905). Russell says that the sentence “The present King of France is not bald” is ambiguous between two readings: (1) the reading where it says of the present King of France that he is not bald, and (2) the reading which denies that the present King of France is bald. The former reading requires that there be a unique King of France on the list of things that are not bald, whereas the latter simply says that there is not a unique King of France that appears on the list of bald things. Russell says the latter, but not the former, can be true in a circumstance in which there is no King of France. Russell analyzes this difference as a matter of the scope of the definite description, though as we shall see, some modern logicians tend to think of this situation as a matter of the scope of the negation sign. Thus, Russell introduces a method for indicating the scope of the definite description.

To see how Russell’s method of scope works for this case, we must understand the definition which introduces definite descriptions (i.e., the inverted iota operator). Whitehead and Russell define:

\[ \tag*{∗14·01} [(\atoi x) \phi x] \sdot \psi(\atoi x) \phi x \ldot {=} \colon (\exists b) \colon \phi x \ldot {\equiv_x} \ldot x=b \colon \psi b \quad\Df \]
This kind of definition is called a *contextual definition*,
which is to be contrasted with an *explicit* definition. An
explicit definition of the definite description would have to look
something like the following:

\[ (\atoi x)(\phi x) \ldot {=} \colon \ldots \quad\Df \]

which would allow the definite description to be replaced in any context by whichever defining expression fills in the ellipsis. By contrast, ∗14·01 shows how a sentence, in which there is an occurrence of a description \((\atoi x)(\phi x)\) in a context \(\psi\), can be replaced by some other sentence (involving \(\phi\) and \(\psi\)) which is equivalent. To develop an instance of this definition, start with the following example:

Example.

The present King of France is bald.

Using \(PKFx\) to represent the propositional function of being a present King of France and \(B\) to represent the propositional function of being bald, Whitehead and Russell would represent the above claim as:

\[ [(\atoi x)(PKFx)] \sdot B(\atoi x)(PKFx) \]
which by ∗14·01 means:

\[ (\exists b) \colon PKFx \ldot {\equiv_x} \ldot x=b \colon Bb \]
In words, there is one and only one \(b\) which is a present King of France and which is bald. In modern symbols, using \(b\) non-standardly, as a variable, this becomes:

\[ (\exists b)[\forall x(PKFx \equiv x=b) \amp Bb] \]

Now we return to the example which shows how the scope of the description makes a difference:

Example.

The present King of France is not bald.

There are two options for representing this sentence.

\[ [(\atoi x)(PKFx)] \sdot \osim B(\atoi x)(PKFx) \]
and

\[ \osim [(\atoi x)(PKFx)] \sdot B(\atoi x)(PKFx) \]
In the first, the description has “wide” scope, and in the second, the description has “narrow” scope. Russell says that the description has “primary occurrence” in the former, and “secondary occurrence” in the latter. Given the definition ∗14·01, the two PM formulas immediately above become expanded into primitive notation as:

\[ \begin{align} (\exists b) \colon PKFx \equiv_x x=b \colon \osim Bb\\ \osim (\exists b) \colon PKFx \equiv_x x=b \colon Bb \end{align} \]
In modern notation these become:

\[ \begin{align} \exists x[\forall y(PKFy \equiv y=x) \amp \osim Bx]\\ \osim \exists x[\forall y(PKFy \equiv y=x) \amp Bx] \end{align} \]
The former says that there is one and only one object which is a present King of France and which is not bald; i.e., there is exactly one present King of France and he is not bald. This reading is false, given that there is no present King of France. The latter says it is not the case that there is exactly one present King of France who is bald. This reading is true.

Although Whitehead and Russell take the descriptions in these examples to be the expressions which have scope, the above readings in both expanded PM notation and in modern notation suggest why some modern logicians take the difference in readings here to be a matter of the scope of the negation sign.
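The difference between the two readings can be checked mechanically over a small finite domain. The sketch below evaluates the modern expansions given above; the function names, the two-element domain and the extensions assigned to the predicates are hypothetical choices made purely for illustration.

```python
from typing import Callable, Iterable

Predicate = Callable[[str], bool]


def the(domain: Iterable[str], phi: Predicate, psi: Predicate) -> bool:
    """Evaluate (∃b)[∀x(φx ≡ x=b) & ψb] over a finite domain."""
    items = list(domain)
    return any(
        all(phi(x) == (x == b) for x in items) and psi(b)
        for b in items
    )


def wide_scope_negation(domain, phi: Predicate, psi: Predicate) -> bool:
    """The description has primary occurrence: ∃x[∀y(φy ≡ y=x) & ∼ψx]."""
    return the(domain, phi, lambda b: not psi(b))


def narrow_scope_negation(domain, phi: Predicate, psi: Predicate) -> bool:
    """The description has secondary occurrence: ∼∃x[∀y(φy ≡ y=x) & ψx]."""
    return not the(domain, phi, psi)


# Hypothetical model: two individuals, "a" is bald, nothing is a present
# King of France.
DOMAIN = ["a", "b"]
bald = lambda x: x == "a"
no_king = lambda x: False

print(wide_scope_negation(DOMAIN, no_king, bald))    # False
print(narrow_scope_negation(DOMAIN, no_king, bald))  # True
```

In a model where exactly one individual satisfies the description, the two readings agree; they come apart only when the description is improper, which is the circumstance Russell has in mind.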

## 9. Classes

The circumflex “ˆ” over a variable preceding a formula is used to indicate a class, thus \(\hat{x} \psi x\) is the class of things \(x\) which are such that \(\psi x\). In modern notation we represent this class as \(\{x \mid \psi x\}\), which is read: the class of \(x\) which are such that \(x\) has \(\psi\). Recall that “\(\phi \hat{x}\)”, with the circumflex over a variable after the predicate variable, expresses the propositional function of being an \(x\) such that \(\phi x\). In the type theory of PM, the class \(\hat{x} \phi x\) has the same logical type as the function \(\phi \hat{x}\). This makes it appropriate to use the following contextual definition, which allows one to eliminate the class term \(\hat{x} \psi x\) from occurrences in the context \(f\):

\[ \tag*{∗20·01} f\{ \hat{z}(\psi z)\} \ldot {=} \colon (\exists \phi) \colon \phi \bang x \ldot {\equiv_x} \ldot \psi x \colon f \{ \phi\bang \hat{z}\} \quad\Df \]
or in modern notation:

\[ f\{z \mid \psi z\} =_{df} \exists \phi[\forall x(\phi x \equiv \psi x) \amp f(\lambda x \phi x)] \]
where \(\phi\) is a predicative function of \(x\).

Note that \(f\) has to be interpreted as a higher-order function which is predicated of the function \(\phi \bang \hat{z}\). In the modern notation used above, the language has to be a typed language in which \(\lambda\) expressions are allowed in argument position. As was pointed out later (Chwistek 1924, Gödel 1944, and Carnap 1947) there should be scope indicators for class expressions just as there are for definite descriptions. Chwistek, for example, proposed copying the notation for definite descriptions, thus replacing ∗20·01 with:

\[ [\hat{z}(\psi z)] \sdot f\{ \hat{z}(\psi z)\} \ldot {=} \colon (\exists \phi) \colon \phi \bang x \ldot {\equiv_x} \ldot \psi x \colon f \{ \phi\bang \hat{z} \} \]

Contemporary formalizations of set theory make use of something like these contextual definitions, when they require an “existence” theorem of the form \(\exists x\forall y(y \in x \equiv \ldots y\ldots)\), in order to justify the introduction of a singular term \(\{y \mid \ldots y\ldots \}\). (Given the law of extensionality, it follows from \(\exists x\forall y(y \in x \equiv \ldots y\ldots)\) that there is a unique such set.)

The relation of membership in classes \(\in\) is defined in PM by first defining a similar relationship between objects and propositional functions:

\[ \tag*{∗20·02} x \in (\phi\bang \hat{z}) \ldot {=} \ldot \phi \bang x \quad\Df \]
or, in modern notation:

\[ x \in \lambda z\phi z =_{df} \phi x \]

∗20·01 and ∗20·02 together are then used to define the more familiar notion of membership in a class. The formal expression “\(y \in \{ \hat{z}(\phi z)\}\)” can now be seen as a context in which the class term occurs; it is then eliminated by the contextual definition ∗20·01. (Exercise)
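One way of working the exercise is sketched below. It takes the context \(f\) of ∗20·01 to be “\(y \in \ldots\)” and renames the bound function variable of ∗20·01 to \(\psi\), since \(\phi\) already occurs in the class term; both choices are ours, made only for the sake of the illustration.

\[ \begin{array}{ll}
y \in \hat{z}(\phi z) & \\
\quad (\exists \psi) \colon \psi \bang x \ldot {\equiv_x} \ldot \phi x \colon y \in (\psi \bang \hat{z}) & \text{by ∗20·01} \\
\quad (\exists \psi) \colon \psi \bang x \ldot {\equiv_x} \ldot \phi x \colon \psi \bang y & \text{by ∗20·02}
\end{array} \]

The last line says that some predicative function coextensive with \(\phi\) is true of \(y\). Given the Axiom of Reducibility, which guarantees that there is such a predicative function, this holds just in case \(\phi y\), so that membership in the class \(\hat{z}(\phi z)\) comes to the same thing as satisfying \(\phi\).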

PM also has Greek letters for classes: \(\alpha, \beta, \gamma\), etc.
These will appear as real (free) variables, apparent (bound) variables
and in abstracts for propositional functions true of classes, as in
\(\phi \hat{\alpha}\). Only definitions of the bound Greek variables
appear in the body of the text; the others are informally defined in
the *Introduction*:
\[
\tag*{∗20·07}
(\alpha) \sdot f \alpha \ldot {=} \ldot (\phi) \sdot f \{ \hat{z}(\phi\bang z)\} \quad\Df
\]
or, in modern notation,
\[
\forall \alpha\, f\alpha =_{df} \forall \phi f\{z\mid\phi z\}
\]
where \(\phi\) is a predicative function.

Thus universally quantified class variables are defined in terms of quantifiers ranging over predicative functions. Likewise for existential quantification:

\[ \tag*{∗20·071} (\exists \alpha) \sdot f \alpha \ldot {=} \ldot (\exists \phi) \sdot f \{ \hat{z}(\phi\bang z)\} \quad\Df \]
or, in modern notation,

\[ \exists \alpha\, f\alpha =_{df} \exists \phi f\{z\mid\phi z\} \]
where \(\phi\) is a predicative function.

Expressions with a Greek variable to the left of \(\in\) are defined:

\[ \tag*{∗20·081} \alpha \in \psi\bang \hat{\alpha} \ldot {=} \ldot \psi \bang \alpha \quad\Df \]

These definitions do not cover all possible occurrences of Greek variables. In the Introduction to PM, further definitions of \(f \alpha\) and \(f \hat{\alpha}\) are proposed, but it is remarked that the definitions are in some way peculiar and they do not appear in the body of the work. The definition considered for \(f \hat{\alpha}\) is:

\[ f \hat{\alpha} \ldot {=} \ldot (\exists \psi) \sdot \hat{\phi} \bang x \equiv_x \psi \bang x \sdot f \{ \psi\bang \hat{z} \} \]
or, in modern notation,

\[ \lambda \alpha\, f\alpha =_{df} \lambda \phi f\{x \mid \phi x\} \]
That is, \(f \hat{\alpha}\) is an expression naming the function which takes a function \(\phi\) to a proposition which asserts \(f\) of the class of \(\phi\)s. (The modern notation shows that in the proposed definition of \(f \hat{\alpha}\) in PM notation, we shouldn’t expect \(\alpha\) in the definiens, since it is really a bound variable in \(f \hat{\alpha}\); similarly, we shouldn’t expect \(\phi\) in the definiendum because it is a bound variable in the definiens.) One might also expect definitions like ∗20·07 and ∗20·071 to hold for cases in which the Roman letter “\(z\)” is replaced by a Greek letter. The definitions in PM are thus not complete, but it is possible to guess at how they would be extended to cover all occurrences of Greek letters. This would complete the project of the “no-classes” theory of classes by showing how all talk of classes can be reduced to the theory of propositional functions.

## 10. Going On—Logic with Relations in Extension and Descriptive Functions

Although students of philosophy usually read no further than ∗20 in PM, this is in fact the point where the “construction” of mathematics really begins. ∗21 presents the “General Theory of Relations” (the theory of relations in extension; in contemporary logic these are treated as sets of ordered pairs, following Wiener). \(\hat{x} \hat{y} \psi(x, y)\) is the relation between \(x\) and \(y\) which obtains when \(\psi(x, y)\) is true. In modern notation we represent this as the set of ordered pairs \(\{\langle x, y \rangle \mid \psi( x, y ) \}\), which is read: the set of ordered pairs \(\langle x, y \rangle\) which are such that \(x\) bears the relation \(\psi\) to \(y\).

The following contextual definition (∗21·01) allows one to eliminate the relation term \(\hat{x} \hat{y}\psi (x, y)\) from occurrences in the context \(f\):

\[ f \{ \hat{x} \hat{y} \psi ( x, y )\} \ldot {=} \colondot (\exists \phi) \colon \phi \bang ( x, y ) \ldot {\equiv_{x,y}} \ldot \psi( x, y ) \colon f \{ \phi\bang (\hat{u}, \hat{v} )\} \quad\Df \]
or in modern notation:

\[ f \{\langle x, y \rangle \mid \psi( x, y )\} =_{df} \exists \phi[\forall xy (\phi(x, y) \equiv \psi( x, y) ) \amp f ( \lambda u \lambda v \phi(u,v))] \]
where \(\phi\) is a predicative function of \(u\) and \(v\).

*Principia* does not analyze relations (or mathematical
functions) in terms of sets of ordered pairs, but rather takes the
notion of propositional function as primitive and defines relations
and functions in terms of them. The upper case letters \({R}, {S}\)
and \({T}\), etc., are used after ∗21 to stand for these
“relations in extension”, and are distinguished from
propositional functions by being written between the arguments. Thus
it is \(\psi(x,y)\) with arguments after the propositional function
symbol, but \(xRy\). From ∗21 functions “\(\phi\) and
\(\psi\)”, etc., disappear and only relations in extension,
\({R}\), \({S}\) and \({T}\), etc., appear in the pages of
*Principia*. While propositional functions might be true of the
same objects yet not be identical, no two relations in extension are
true of the same objects. The logic of *Principia* is thus
“extensional”, from page 200 in volume I, through to the
end in Volume III.

∗22 on the “Calculus of Classes” presents the elementary
set theory of intersections, unions and the empty set which is often
all the set theory used in elementary mathematics of other sorts. The
student looking for the set theory of *Principia* to compare it
with, say, the Zermelo-Fraenkel system, will have to look at various
numbers later in the text. The Axiom of Choice is defined at ∗88 as
the “Multiplicative Axiom” and a version of the Axiom of
Infinity appears at ∗120 in Volume II as “Infin ax”. The
set theory of *Principia* comes closest to Zermelo’s
axioms of 1908 among the various familiar axiom systems, which means
that it lacks the Axiom of Foundation and Axiom of Replacement of the
now standard Zermelo-Fraenkel axioms of set theory.

∗30 on “Descriptive Functions” provides Whitehead and
Russell’s analysis of mathematical functions in terms of
relations and definite descriptions. Frege had used the notion of
function, in the mathematical sense, as a basic notion in his logical
system. Thus a Fregean “concept” is a function from
objects as arguments to one of the two “truth values” as
its values. A concept yields the value “True” for each
object to which the concept applies, and “False” for all
others. Russell, from 1904, well before the writing of
*Principia*, had preferred to analyze functions in terms of the
relation between each argument and value, and the notion of
“uniqueness”. With modern symbolism, his view would be
expressed as follows. For each function \(\lambda x f(x)\), there will
be some relation (in extension) \(R\), such that the value of the
function for an argument \(a\), that is \(f(a)\), will be the unique
individual which bears the relation \(R\) to \(a\). (Nowadays we
reduce functions to a binary relation between the argument in the
first place and value in the second place.) The result is that there
are no function symbols in *Principia*. As Whitehead and
Russell say, the familiar mathematical expressions such as
“\(\sin \pi/2\)” will be analyzed with a relation and a
definite description, as a “descriptive function”. The
“descriptive function”, \(R‘y\) (the \(R\) of
\(y)\), is defined as follows:

\[ \tag*{∗30·01} R‘y \ldot {=} \ldot (\atoi x)(xRy) \quad\Df \]

With the exception of an incomplete notation for relative types in Volume II, the reader should be able to work out all of the rest of the notation in PM using the explanations above and the definitions in the lists at the end of volume I of PM. We conclude by presenting a number of prominent examples from these later numbers below, with their intuitive meaning, location in PM, definition in PM, and a modern equivalent. (Some of these numbers are theorems rather than definitions.) Note, however, that the modern equivalent will sometimes logically differ from the original version in PM, such as by treating relations as sets of ordered pairs, etc.

For each formula number, the table gives the PM symbol, its intuitive meaning, its location in PM, the PM definition, and a modern equivalent:

| PM Symbol | Intuitive Meaning | Location | PM Definition | Modern Equivalent |
|---|---|---|---|---|
| \(\alpha \subset \beta\) | \(\alpha\) is a subset of \(\beta\) | ∗22·01 | \(x\in \alpha \ldot {\supset_x} \ldot x\in \beta\) | \(\alpha \subseteq \beta\) |
| \(\alpha \cap \beta\) | the intersection of \(\alpha\) and \(\beta\) | ∗22·02 | \(\hat{x} (x \in \alpha \sdot x \in \beta)\) | \(\alpha \cap \beta\) |
| \(\alpha \cup \beta\) | the union of \(\alpha\) and \(\beta\) | ∗22·03 | \(\hat{x} (x \in \alpha \lor x \in \beta)\) | \(\alpha \cup \beta\) |
| \(-\alpha\) | the complement of \(\alpha\) | ∗22·04 | \(\hat{x} (x\osim \in \alpha)\) [i.e., \(\hat{x} \osim (x \in \alpha)\) by ∗20·06] | \(\{x \mid x \not\in \alpha \}\) |
| \(\alpha - \beta\) | \(\alpha\) minus \(\beta\) | ∗22·05 | \(\alpha \cap -\beta\) | \(\{x \mid x\in \alpha \amp x\not\in \beta \}\) |
| \(\mathrm{V}\) | the universal class | ∗24·01 | \(\hat{x} (x = x)\) | \(\mathrm{V}\) or \(\{x \mid x = x\}\) |
| \(\Lambda\) | the empty class | ∗24·02 | \(-\mathrm{V}\) | \(\varnothing\) |
| \(R‘y\) | the \(R\) of \(y\) (a descriptive function) | ∗30·01 | \((\atoi x)(xRy)\) | \(f^{-1}(y)\), where \(f = \{\langle x,y\rangle \mid Rxy \}\) |
| \(\breve{R}\) | the converse of \(R\) | ∗31·02 | \(\hat{x} \hat{z} (zRx)\) | \(\{\langle x,z\rangle \mid Rzx\}\) |
| \(\overrightarrow{R}‘y\) | the \(R\)-predecessors of \(y\) | ∗32·01 | \(\hat{x} (xRy)\) | \(\{x \mid Rxy \}\) |
| \(\overleftarrow{R}‘x\) | the \(R\)-successors of \(x\) | ∗32·02 | \(\hat{z} (xRz)\) | \(\{z \mid Rxz \}\) |
| \(D‘R\) | the domain of \(R\) | ∗33·11 | \(\hat{x} \{ (\exists y) \sdot xRy \}\) | \(\{x \mid \exists yRxy \}\) |
| \(\backd‘R\) | the range of \(R\) | ∗33·111 | \(\hat{z} \{(\exists x) \sdot xR z \}\) | \(\{z \mid \exists x Rxz \}\) |
| \(C‘R\) | the field of \(R\) | ∗33·112 | \(\hat{x} \{(\exists y): xRy \ldot {\lor} \ldot yRx\}\) | \(\{x \mid \exists y (xRy \lor yRx)\}\) |
| \(R\mid S\) | the relative product of \(R\) and \(S\) | ∗34·01 | \(\hat{x} \hat{z} \{(\exists y) \sdot xRy \sdot ySz \}\) | \(\{\langle x,z\rangle \mid \exists y(xRy \amp ySz)\}\) |
| \(R \restriction \beta\) | the restriction of \(R\) to \(\beta\) | ∗35·02 | \(\hat{x} \hat{z}[xRz \sdot z\in \beta]\) | \(\{\langle x,z\rangle \mid z\in \beta \amp Rxz \}\) |
| \(\alpha \uparrow \beta\) | the Cartesian product of \(\alpha\) and \(\beta\) | ∗35·04 | \(\hat{x} \hat{z}[x \in \alpha \sdot z \in \beta]\) | \(\alpha \times \beta\), or \(\{\langle x,z\rangle \mid x\in \alpha \amp z\in \beta \}\) |
| \(R‘‘\beta\) | the projection of \(\beta\) by \(R\) | ∗37·01 | \(\hat{x} \{(\exists y) \sdot y\in \beta \sdot x Ry\}\) | \(\{x \mid \exists y(y\in \beta \amp Rxy)\}\) |
| \(\iota‘x\) | the singleton of \(x\) | ∗51·11 | \(\hat{z} (z = x)\) | \(\{x\}\) |
| \(R_*\) | the ancestral of \(R\) | ∗90·01 | \(\hat{x} \hat{z} \{ x \in C‘ R \colon \breve{R}‘‘\mu \subset \mu \sdot x \in \mu \ldot {\supset_{\mu}} \ldot z \in \mu \}\) | Frege’s definition: \(y\) is in all the \(R\)-hereditary classes \(x\) is in. |
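Several of the modern equivalents listed above can be tried out directly by treating a relation as a finite set of ordered pairs. The sketch below does this in Python; it follows the modern column rather than PM's own analysis (which, as noted, does not treat relations as sets of pairs), and the function names and the sample relation `father` are hypothetical. Returning `None` for an improper description is a convenience of the sketch, not Russell's account of such cases.

```python
from typing import Hashable, Optional, Set, Tuple

Pair = Tuple[Hashable, Hashable]
Relation = Set[Pair]


def converse(R: Relation) -> Relation:
    """{<x, z> | Rzx}  (∗31·02)."""
    return {(x, z) for (z, x) in R}


def domain(R: Relation) -> set:
    """{x | ∃y Rxy}  (∗33·11)."""
    return {x for (x, _) in R}


def converse_domain(R: Relation) -> set:
    """{z | ∃x Rxz}, the range of R  (∗33·111)."""
    return {z for (_, z) in R}


def field(R: Relation) -> set:
    """{x | ∃y (xRy ∨ yRx)}  (∗33·112)."""
    return domain(R) | converse_domain(R)


def relative_product(R: Relation, S: Relation) -> Relation:
    """{<x, z> | ∃y (xRy & ySz)}  (∗34·01)."""
    return {(x, z) for (x, y) in R for (w, z) in S if y == w}


def restrict(R: Relation, beta: set) -> Relation:
    """{<x, z> | z ∈ β & Rxz}  (∗35·02)."""
    return {(x, z) for (x, z) in R if z in beta}


def projection(R: Relation, beta: set) -> set:
    """{x | ∃y (y ∈ β & Rxy)}  (∗37·01)."""
    return {x for (x, y) in R if y in beta}


def descriptive(R: Relation, y: Hashable) -> Optional[Hashable]:
    """The R of y, (℩x)(xRy): the unique x with xRy, else None (∗30·01)."""
    candidates = projection(R, {y})
    return next(iter(candidates)) if len(candidates) == 1 else None


# Hypothetical relation: x is the father of y.
father = {("abe", "bob"), ("bob", "cal"), ("bob", "dan")}

print(descriptive(father, "cal"))                  # bob
print(descriptive(father, "abe"))                  # None
print(sorted(relative_product(father, father)))    # [('abe', 'cal'), ('abe', 'dan')]
print(sorted(domain(father)))                      # ['abe', 'bob']
```

Here `descriptive(father, "cal")` plays the role of “the father of cal”, which is how a term such as “\(\sin \pi/2\)” is to be read on the descriptive-function analysis: as the unique thing standing in a given relation to the argument.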

## Bibliography

- Carnap, R., 1947, *Meaning and Necessity*, Chicago: University of Chicago Press.
- Church, A., 1976, “Comparison of Russell’s Resolution of the Semantical Antinomies with That of Tarski”, *Journal of Symbolic Logic*, 41: 747–60.
- Chwistek, L., 1924, “The Theory of Constructive Types”, *Annales de la Société Polonaise de Mathématique* (*Rocznik Polskiego Towarzystwa Matematycznego*), II: 9–48.
- Feys, R. and Fitch, F.B., 1969, *Dictionary of Symbols of Mathematical Logic*, Amsterdam: North Holland.
- Gödel, K., 1944, “Russell’s Mathematical Logic”, in P.A. Schilpp, ed., *The Philosophy of Bertrand Russell*, LaSalle: Open Court, 125–153.
- Landini, G., 1998, *Russell’s Hidden Substitutional Theory*, New York and Oxford: Oxford University Press.
- Linsky, B., 1999, *Russell’s Metaphysical Logic*, Stanford: CSLI Publications.
- –––, 2009, “From Descriptive Functions to Sets of Ordered Pairs”, in *Reduction – Abstraction – Analysis*, A. Hieke and H. Leitgeb (eds.), Ontos: Munich, 259–272.
- –––, 2011, *The Evolution of Principia Mathematica: Bertrand Russell’s Manuscripts and Notes for the Second Edition*, Cambridge: Cambridge University Press.
- Russell, B., 1905, “On Denoting”, *Mind* (N.S.), 14: 530–538.
- Whitehead, A.N. and B. Russell, [PM], *Principia Mathematica*, Cambridge: Cambridge University Press, 1910–13, second edition, 1925–27.

## Other Internet Resources

- *Principia Mathematica*, reproduced in the University of Michigan Historical Math Collection.
- Russell’s “On Denoting”, from the reprint in *Logic and Knowledge* (R. Marsh, ed., 1956) of the original article in *Mind* 1905, typed into HTML by Cosma Shalizi (Center for the Study of Complex Systems, U. Michigan)

### Acknowledgments

The author would like to thank: Gregory Landini, Dick Schmitt, Franz Fritsche, Rafal Urbaniak, Adam Trybus, Pawel Manczyk and Kenneth Blackwell for corrections to this entry.