Stanford Encyclopedia of Philosophy

Common Knowledge

First published Tue Aug 28, 2001; substantive revision Fri Aug 10, 2007

A proposition A is mutual knowledge among a set of agents if each agent knows that A. Mutual knowledge by itself implies nothing about what, if any, knowledge anyone attributes to anyone else. Suppose each student arrives for a class meeting knowing that the instructor will be late. That the instructor will be late is mutual knowledge, but each student might think only she knows the instructor will be late. However, if one of the students says openly "Peter told me he will be late again," then each student knows that each student knows that the instructor will be late, each student knows that each student knows that each student knows that the instructor will be late, and so on, ad infinitum. The announcement made the mutually known fact common knowledge among the students.

Common knowledge is a phenomenon which underwrites much of social life. In order to communicate or otherwise coordinate their behavior successfully, individuals typically require mutual or common understandings or background knowledge. Indeed, if a particular interaction results in "failure", the usual explanation for this is that the agents involved did not have the common knowledge that would have resulted in success. If a married couple are separated in a department store, they stand a good chance of finding one another because their common knowledge of each others' tastes and experiences leads them each to look for the other in a part of the store both know that both would tend to frequent. Since the spouses both love cappuccino, each expects the other to go to the coffee bar, and they find one another. But in a less happy case, if a pedestrian causes a minor traffic jam by crossing against a red light, she explains her mistake as the result of her not noticing, and therefore not knowing, the status of the traffic signal that all the motorists knew. The spouses coordinate successfully given their common knowledge, while the pedestrian and the motorists miscoordinate as the result of a breakdown in common knowledge.

Given the importance of common knowledge in social interactions, it is remarkable that only quite recently have philosophers and social scientists attempted to analyze the concept. David Hume (1740) was perhaps the first to make explicit reference to the role of mutual knowledge in coordination. In his account of convention in A Treatise of Human Nature, Hume argued that a necessary condition for coordinated activity was that agents all know what behavior to expect from one another. Without the requisite mutual knowledge, Hume maintained, mutually beneficial social conventions would disappear. Much later, J. E. Littlewood (1953) presented some examples of common-knowledge-type reasoning, and Thomas Schelling (1960) and John Harsanyi (1967-1968) argued that something like common knowledge is needed to explain certain inferences people make about each other. Yet it was David Lewis (1969) who first gave an explicit analysis of common knowledge in the monograph Convention. Stephen Schiffer (1972), Robert Aumann (1976), and Gilbert Harman (1977) independently gave alternate definitions of common knowledge. Jon Barwise (1988, 1989) gave a precise formulation of Harman's intuitive account. Schiffer's analysis of common knowledge as a hierarchy of epistemic claims has become standard in the philosophical and social science literature. Lewis', Aumann's, and Barwise's accounts all imply the hierarchical account. In some contexts, Schiffer's, Aumann's, and Barwise's definitions of common knowledge are more convenient to use than Lewis' original definition. More recently, Margaret Gilbert (1989) proposed a somewhat different account of common knowledge which she argues is preferable to the standard account. Others have developed accounts of mutual knowledge, approximate common knowledge, and common belief which require less stringent assumptions than the standard account, and which serve as more plausible models of what agents know in cases where strict common knowledge seems impossible (Brandenburger and Dekel 1987, Stinchcombe 1988, Monderer and Samet 1989, Rubinstein 1992). The analysis and applications of common knowledge and related multi-agent knowledge concepts has become a lively field of research.

The purpose of this essay is to overview of some of the most important results stemming from this contemporary research. The topics reviewed in each section of this essay are as follows: Section 1 gives motivating examples which illustrate a variety of ways in which the actions of agents depend crucially upon their having, or lacking, certain common knowledge. Section 2 discusses alternative analyses of common knowledge. Section 3 reviews applications of multi-agent knowledge concepts, particularly to game theory (von Neumann and Morgenstern 1944), in which common knowledge assumptions have been found to have great importance in justifying solution concepts for mathematical games. Section 4 discusses skeptical doubts about the attainability of common knowledge. Finally, Section 5 discusses the common belief concept which result from weakening the assumptions of Lewis' account of common knowledge.


1. Motivating Examples

Most of the examples in this section are familiar in the common knowledge literature, although some of the details and interpretations presented here are new. Readers may want to ask themselves what, if any, distinctive aspects of mutual and common knowledge reasoning each example illustrates.

1.1. The Clumsy Waiter[1]

A waiter serving dinner slips, and spills gravy on a guest's white silk evening gown. The guest glares at the waiter, and the waiter declares "I'm sorry. It was my fault." Why did the waiter say that he was at fault? He knew that he was at fault, and he knew from the guest's angry expression that she knew he was at fault. However, the sorry waiter wanted assurance that the guest knew that he knew he was at fault. By saying openly that he was at fault, the waiter knew that the guest knew what he wanted her to know, namely, that he knew he was at fault. Note that the waiter's declaration established at least three levels of nested knowledge.

Certain assumptions are implicit in the preceding story. In particular, the waiter must know that the guest knows he has spoken the truth, and that she can draw the desired conclusion from what he says in this context. More fundamentally, the waiter must know that if he announces "It was my fault" to the guest, she will interpret his intended meaning correctly and will infer what his making this announcement ordinarily implies in this context. This in turn implies that the guest must know that if the waiter announces "It was my fault" in this context, then the waiter indeed knows he is at fault. Then on account of his announcement, the waiter knows that the guest knows that he knows he was at fault. The waiter's announcement was meant to generate higher-order levels of knowledge of a fact each already knew.

Just a slight strengthening of the stated assumptions results in even higher levels of nested knowledge. Suppose the waiter and the guest each know that the other can infer what he infers from the waiter's announcement. Can the guest now believe that the waiter does not know that she knows that he knows he is at fault? If the guest considers this question, she reasons that if the waiter falsely believes it is possible that she does not know that he knows he is at fault, then the waiter must believe it to be possible that she cannot infer that he knows he is at fault from his own declaration. Since she knows she can infer that the waiter knows he is at fault from his declaration, she knows that the waiter knows she can infer this, as well. Hence the waiter's announcement establishes the fourth-order knowledge claim: The guest knows that the waiter knows that she knows that he knows he is at fault. By similar, albeit lengthier, arguments, the agents can verify that corresponding knowledge claims of even higher-order must also obtain under these assumptions.

1.2 The Barbecue Problem

This is a variation of an example first published by Littlewood (1953), although he notes that his version of the example was already well-known at the time.[2] N individuals enjoy a picnic supper together which includes barbecued spareribs. At the end of the meal, k ≥ 1 of these diners have barbecue sauce on their faces. No one wants to continue the evening with a messy face. No one wants to wipe her face if it's not messy, for this would make her appear neurotic. And no one wants to take the risk of being thought rude by telling anyone else that he has barbecue sauce on his face. Since no one can see her own face, none of the messy diners makes a move to clean her face. Then the cook who served the spareribs returns with a carton of ice cream. Amused by what he sees, the cook rings the dinner bell and makes the following announcement: "At least one of you has barbecue sauce on her face. I will ring the dinner bell over and over, until anyone who is messy has wiped her face. Then I will serve dessert." For the first k − 1 rings, no one does anything. Then, at the kth ring, each of the messy individuals suddenly reaches for a napkin, and soon afterwards, the diners are all enjoying their ice cream.

How did the messy diners finally realize that their faces needed cleaning? The k = 1 case is easy, since in this case, the lone messy individual will realize he is messy immediately, since he sees that everyone else is clean. Consider the k = 2 case next. At the first ring, messy individual i1 knows that one other person, i2, is messy, but does not yet know about himself. At the second ring, i1 realizes that he must be messy, since had i2 been the only messy one, i2 would have known this after the first ring when the cook made his announcement, and would have cleaned her face then. By a symmetric argument, messy diner i2 also concludes that she is messy at the second ring, and both pick up a napkin at that time.

Let's next consider k = 3. Again at the first ring, each of the messy diners i1, i2, and i3 knows the status of the other diners, but not her own. The situation is apparently unchanged after the second ring. But on the third ring, i1 realizes that she is messy. For if i2 and i3 were the only messy ones, then they would have discovered this after the second ring by the argument of the previous paragraph. Since i1 can see that all of the diners other than i2 and i3 are clean, she concludes that she must be messy. i2 and i3 draw similar conclusions at the third ring, and all clean their faces at that time.

The general case follows by induction. Suppose that if k = j, then each of the j messy diners can determine that he is messy after j rings. Then if k = j + 1, then at the j + 1st ring, each of the j + 1 individuals will realize that he is messy. For if he were not messy, then the other j messy ones would have all realized their messiness at the jth ring and cleaned themselves then. Since no one cleaned herself after the jth ring, at the j + 1st ring each messy person will conclude that someone besides the other j messy people must also be messy, namely, himself.

The "paradox" of this argument is that for k > 1, like the case of the clumsy waiter of Example 1.1, the cook's announcement told the diners something that each already knew. Yet apparently the cook's announcement also gave the diners useful information. How could this be? By announcing a fact already known to every diner, the cook made this fact common knowledge among them, enabling each of them to eventually deduce the condition of his own face after sufficiently many rings of the bell. Note that the inductive argument the agents run through depends upon the conclusions they each draw from several counterfactual conditionals. In general, the consequences of agents' common knowledge are intimately related to how they evaluate subjunctive and counterfactual conditionals.[3]

1.3 The Farmer's Dilemma

Does meeting one's obligations to others serve one's self-interest? Plato and his successors recognized that in certain cases, the answer seems to be "No." Hobbes (1651, pp. 101-102) considers the challenge of a "Foole", who claims that it is irrational to honor an agreement made with another who has already fulfilled his part of the agreement. Noting that in this situation one has gained all the benefit of the other's compliance, the Foole contends that it would now be best for him to break the agreement, thereby saving himself the costs of compliance. Of course, if the Foole's analysis of the situation is correct, then would the other party to the agreement not anticipate the Foole's response to agreements honored, and act accordingly?

Hume (1740, pp. 520-521) takes up this question, using an example: Two neighboring farmers each expect a bumper crop of corn. Each will require his neighbor's help in harvesting his corn when it ripens, or else a substantial portion will rot in the field. Since their corn will ripen at different times, the two farmers can ensure full harvests for themselves by helping each other when their crops ripen, and both know this. Yet the farmers do not help each other. For the farmer whose corn ripens later reasons that if she were to help the other farmer, then when her corn ripens he would be in the position of Hobbes' Foole, having already benefited from her help. He would no longer have anything to gain from her, so he would not help her, sparing himself the hard labor of a second harvest. Since she cannot expect the other farmer to return her aid when the time comes, she will not help when his corn ripens first, and of course the other farmer does not help her when her corn ripens later.

The structure of Hume's Farmers' Dilemma problem can be summarized using the following tree diagram:

Figure 1.1a
Figure 1.1a
This tree is an example of a game in extensive form. At each stage i, the agent who moves can either choose Ci, which corresponds to helping or cooperating, or Di, which corresponds to not helping or defecting. The relative preferences of the two agents over the various outcomes are reflected by the ordered pairs of payoffs each receives at any particular outcome. If, for instance, Fiona chooses Ci and Alan chooses Di, then Fiona's payoff is 0, her worst payoff, and Alan's is 4, his best payoff. In a game such as the Figure 1.1.a game, agents are (Bayesian) rational if each chooses an act that maximizes her expected payoff, given what she knows.

In the Farmers' Dilemma game, following the C1,C2-path is strictly better for both farmers than following the D1,D2-path. However, Fiona chooses D1, as the result of the following simple argument: "If I were to choose C1, then Alan, who is rational and who knows the payoff structure of the game, would choose D2. I am also rational and know the payoff structure of the game. So I should choose D1." Since Fiona knows that Alan is rational and knows the game's payoffs, she concludes that she need only analyze the reduced game in the following figure:

Figure 1.1b
Figure 1.1b

In this reduced game, Fiona is certain to gain a strictly higher payoff by choosing D1 than if she chooses C1, so D1 is her unique best choice. Of course, when Fiona chooses D1, Alan, being rational, responds by choosing D2. If Fiona and Alan know: (i) that they are both rational, (ii) that they both know the payoff structure of the game, and (iii) that they both know (i) and (ii), then they both can predict what the other will do at every node of the Figure 1.1.a game, and conclude that they can rule out the D1,C2-branch of the Figure 1.1.b game and analyze just the reduced game of the following figure:

Figure 1.1c
Figure 1.1c

On account of this mutual knowledge, both know that Fiona will choose D1, and that Alan will respond with D2. Hence, the D1,D2-outcome results if the Farmers' Dilemma game is played by agents having this mutual knowledge, though it is suboptimal since both agents would fare better at the C1,C2-branch.[4] This argument, which in its essentials is Hume's argument, is an example of a standard technique for solving sequential games known as backwards induction.[5] The basic idea behind backwards induction is that the agents engaged in a sequential game deduce how each will act throughout the entire game by ruling out the acts that are not payoff-maximizing for the agents who would move last, then ruling out the acts that are not payoff-maximizing for the agents who would move next-to-last, and so on. Clearly, backwards induction arguments rely crucially upon what, if any, mutual knowledge the agents have regarding their situation, and they typically require the agents to evaluate the truth values of certain subjunctive conditionals, such as "If I (Fiona) were to choose C1, then Alan would choose D2".

1.4 The Centipede

The mutual knowledge assumptions required to construct a backwards induction solution to a game become more complex as the number of stages in the game increases. To see this, consider the sequential Centipede game depicted in the following figure:

Figure 1.2
Figure 1.2
At each stage i, the agent who moves can either choose Ri, which in the first three stages gives the other agent an opportunity to move, or Li, which ends the game.

Like the Farmers' Dilemma, this game is a commitment problem for the agents. If each agent could trust the other to choose Ri at each stage, then they would each expect to receive a payoff of 3. However, Alan chooses L1, leaving each with a payoff of only 1, as the result of the following backwards induction argument: "If node n4 were to be reached, then Fiona, (being rational) would choose L4. I, knowing this, would (being rational) choose L3 if node n3 were to be reached. Fiona, knowing this, would (being rational) choose L2 if node n2 were to be reached. Hence, I (being rational) should choose L1." To carry out this backwards induction argument, Alan implicitly assumes that: (i) he knows that Fiona knows he is rational, and (ii) he knows that Fiona knows that he knows she is rational. Put another way, for Alan to carry out the backwards induction argument, at node n1 he must know what Fiona must know at node n2 to make L2 her best response should n2 be reached. While in the Farmer's Dilemma Fiona needed only first-order knowledge of Alan's rationality and second-order knowledge of Alan's knowledge of the game to derive the backwards induction solution, in the Figure 1.2 game, for Alan to be able to derive the backwards induction solution, the agents must have third-order mutual knowledge of the game and second-order mutual knowledge of rationality, and Alan must have fourth-order knowledge of this mutual knowledge of the game and third-order knowledge of their mutual knowledge of rationality. This argument also involves several counterfactuals, since to construct it the agents must be able to evaluate conditionals of the form, "If node ni were to be reached, Alan (Fiona) would choose Li (Ri)", which for i > 1 are counterfactual, since third-order mutual knowledge of rationality implies that nodes n2, n3, and n4 are never reached.

The method of backwards induction can be applied to any sequential game of perfect information, in which the agents can observe each others' moves in turn and can recall the entire history of play. However, as the number of potential stages of play increases, the backwards induction argument evidently becomes harder to construct. This raises certain questions: (1) What precisely are the mutual or common knowledge assumptions that are required to justify the backwards induction argument for a particular sequential game? (2) As a sequential game increases in complexity, would we expect the mutual knowledge that is required for backwards induction to start to fail?

1.5 The Department Store

When a man loses his wife in a department store without any prior understanding on where to meet if they get separated, the chances are good that they will find each other. It is likely that each will think of some obvious place to meet, so obvious that each will be sure that it is "obvious" to both of them. One does not simply predict where the other will go, which is wherever the first predicts the second to predict the first to go, and so ad infinitum. Not "What would I do if I were she?" but "What would I do if I were she wondering what she would do if she were wondering what I would do if I were she … ?"
Thomas Schelling, The Strategy of Conflict

Schelling's department store problem is an example of a pure coordination problem, that is, an interaction problem in which the interests of the agents coincide perfectly. Schelling (1960) and Lewis (1969), who were the first to make explicit the role common knowledge plays in social coordination, were also among the first to argue that coordination problems can be modeled using the analytic vocabulary of game theory. A very simple example of such a coordination problem is given in the next figure:

Figure 1.3
Figure 1.3

The matrix of Figure 1.3 is an example of a game in strategic form. At each outcome of the game, which corresponds to a cell in the matrix, the row (column) agent receives as payoff the first (second) element of the ordered pair in the corresponding cell. However, in strategic form games, each agent chooses without first being able to observe the choices of any other agent, so that all must choose as if they were choosing simultaneously. The Figure 1.3 game is a game of pure coordination (Lewis 1969), that is, a game in which at each outcome, each agent receives exactly the same payoff. One interpretation of this game is that Schelling's spouses, Liz and Robert, are searching for each other in the department store with four floors, and they find each other if they go to the same floor. Four outcomes at which the spouses coordinate correspond to the strategy profiles (sjsj), 1 ≤ j ≤ 4, of the Figure 1.3 game. These four profiles are strict Nash equilibria (Nash 1950, 1951) of the game, that is, each agent has a decisive reason to follow her end of one of these strategy profiles provided that the other also follows this profile.[6]

The difficulty the agents face is trying to select an equilibrium to follow. For suppose that Robert hopes to coordinate with Liz on a particular equilibrium of the game, say (s2s2). Robert reasons as follows: "Since there are several strict equilibria we might follow, I should follow my end of (s2s2) if, and only if, I have sufficiently high expectations that Liz will follow her end of (s2s2). But I can only have sufficiently high expectations that Liz will follow (s2s2 ) if she has sufficiently high expectations that I will follow (s2s2). For her to have such expectations, Liz must have sufficiently high (second-order) expectations that I have sufficiently high expectations that she will follow (s2s2), for if Liz doesn't have these (second-order) expectations, then she will believe I don't have sufficient reason to follow (s2s2) and may therefore deviate from (s2, s2) herself. So I need to have sufficiently high (third-order) expectations that Liz has sufficiently high (second-order) expectations that I have sufficiently high expectations that she will follow (s2s2 ). But this implies that Liz must have sufficiently high (fourth-order) expectations that I have sufficiently high (third-order) expectations that Liz has sufficiently high (second-order) expectations that I have sufficiently high expectations that she will follow (s2s2), for if she doesn't, then she will believe I don't have sufficient reason to follow (s2s2), and then she won't, either. Which involves me in fifth-order expectations regarding Liz, which involves her in sixth-order expectations regarding me, and so on." What would suffice for Robert, and Liz, to have decisive reason to follow (s2s2) is that they each know that the other knows that … that the other will follow (s2, s2) for any number of levels of knowledge, which is to say that between Liz and Robert it is common knowledge that they will follow (s2, s2). If agents follow a strict equilibrium in a pure coordination game as a consequence of their having common knowledge of the game, their rationality and their intentions to follow this equilibrium, and no other, then the agents are said to be following a Lewis-convention (Lewis 1969).

Lewis' theory of convention applies to a more general class of games than pure coordination games, but pure coordination games already model a variety of important social interactions. In particular, Lewis models conventions of language as equilibrium points of a pure coordination game. The role common knowledge plays in games of pure coordination sketched above of course raises further questions: (1) Can people ever attain the common knowledge which characterizes a Lewis-convention? (2) Would less stringent epistemic assumptions suffice to justify Nash equilibrium behavior in a coordination problem?

2. Alternative Accounts of Common Knowledge

Informally, a proposition A is mutually known among a set of agents if each agent knows that A. Mutual knowledge by itself implies nothing about what, if any, knowledge anyone attributes to anyone else. Suppose each student arrives for a class meeting knowing that the instructor will be late. That the instructor will be late is mutual knowledge, but each student might think only she knows the instructor will be late. However, if one of the students says openly "Peter told me he will be late again," then the mutally known fact is now commonly known. Each student now knows that the instructor will be late, and so on, ad infinitum. The agents have common knowledge in the sense articulated informally by Schelling (1960), and more precisely by Lewis (1969) and Schiffer (1972). Schiffer uses the formal vocabulary of epistemic logic (Hintikka 1962) to state his definition of common knowledge. Schiffer's general approach was to augment a system of sentential logic with a set of knowledge operators corresponding to a set of agents, and then to define common knowledge as a hierarchy of propositions in the augmented system. Bacharach (1992) and Bicchieri (1993) adopt this approach, and develop logical theories of common knowledge which include soundness and completeness theorems. One can also develop alternate formal accounts of common knowledge in set-theoretic terms, which is the approach taken in this article.[7]

2.1 The Hierarchical Account

Monderer and Samet (1988) and Binmore and Brandenburger (1989) give a particularly elegant set-theoretic definition of common knowledge. I will review this definition here, and then show that it is logically equivalent to the ‘i knows that j knows that … k knows that A’ hierarchy that Lewis (1969) and Schiffer (1972) argue characterizes common knowledge.[8]

Some preliminary notions must be stated first. Following C. I. Lewis (1943-1944) and Carnap (1947), propositions are formally subsets of a set Ω of state descriptions or possible worlds. One can think of the elements of Ω as representing Leibniz's possible worlds or Wittgenstein's possible states of affairs. Some results in the common knowledge literature presuppose that Ω is of finite cardinality. If this admittedly unrealistic assumption is needed in any context, this will be explicitly stated in this essay, and otherwise one may assume that Ω may be either a finite or an infinite set. A distinguished actual world ωα is an element of Ω. A proposition A ⊆ Ω obtains (or is true) if the actual world ωαA. In general, we say that A obtains at a world ω ∈ Ω if ω ∈ A. What an agent i knows about the possible worlds is stated formally in terms of a knowledge operator Ki. Given a proposition A ⊆ Ω, Ki(A) denotes a new proposition, corresponding to the set of possible worlds at which agent i knows that A obtains. Ki(A) is read as ‘i knows (that) A (is the case)’. The knowledge operator Ki satisfies certain axioms, including:

K1:   Ki(A) ⊆ A

K2:   Ω ⊆ Ki(Ω)

K3:   Ki(k Ak)   =   k Ki(Ak)

K4:   Ki(A) ⊆ KiKi(A)[9]

K5:   −Ki(A) ⊆ KiKi(A)

In words, K1 says that if i knows A, then A must be the case. K2 says that i knows that some possible world in Ω occurs no matter which possible world ω occurs. K3 says that i knows a conjunction if, and only if, i knows each conjunct. K4 is a reflection axiom, sometimes also presented as the axiom of transparency, which says that if i knows A, then i knows that she knows A. Finally, K5 says that if the agent does not know an event, then she knows that she does not know. This axiom is presented as the axiom of negative introspection, or as the axiom of wisdom (since the agents possess Socratic wisdom, knowing that they do not know.) Note that by K3, if AB then Ki(A) ⊆ Ki(B), by K1 and K2, Ki(Ω) = Ω, and by K1 and K4, Ki(A) = KiKi(A). Any system of knowledge satisfying K1 – K5 corresponds to the modal system S5, while any system satisying K1 – K4 corresponds to S4 (Kripke 1963). If one drops the K1 axiom and retains the others, the resulting system would give a formal account of what an agent believes, but does not necessarily know.

A useful notion in the formal analysis of knowledge is that of a possibility set. An agent i's possibility set at a state of the world Ω is the smallest set of possible worlds that i thinks could be the case if ω is the actual world. More precisely,

Definition 2.1
Agent i's possibility set calligraphic-Hi(ω) at ω ∈ Ω is defined as
calligraphic-Hi(ω) ≡ { E | ω ∈ Ki(E) }
The collection of sets
calligraphic-Hi   =   ω∈Ω calligraphic-Hi(ω)
is i's private information system.

Since in words, calligraphic-Hi(ω) is the intersection of all propositions which i knows at ω, calligraphic-Hi(ω) is the smallest proposition in Ω that i knows at ω. Put another way, calligraphic-Hi(ω) is the most specific information that i has about the possible world ω. The intuition behind assigning agents private information systems is that while an agent i may not be able to perceive or comprehend every last detail of the world in which i lives, i does know certain facts about that world. The elements of i's information system represent what i knows immediately at a possible world. We also have the following:

Proposition 2.2
Ki(A) = { ω | calligraphic-Hi(ω) ⊆ A }

In many formal analyses of knowledge in the literature, possibility sets are taken as primitive and Proposition 2.2 is given as the definition of knowledge. If one adopts this viewpoint, then the axioms K1 - K5 follow as consequences of the definition of knowledge. In many applications, the agents' possibility sets are assumed to partition[10] the set, in which case calligraphic-Hi is called i's private information partition. Notice that if axioms K1 - K5 hold, then the possibility sets of each agent always partition the state set, and vice versa.

To illustrate the idea of possibility sets, let us return to the Barbecue Problem described in Example 1.2. Suppose there are three diners: Cathy, Jennifer and Mark. Then there are 8 relevant states of the world, summarized by Table 2.1:

Table 2.1
      Table 2.1

Each diner knows the condition of the other diners' faces, but not her own. Suppose the cook makes no announcement, after all. Then none of the diners knows the true state of the world whatever ω ∈ Ω the actual world turns out to be, but they do know a priori that certain propositions are true at various states of the world. For instance, Cathy's information system before any announcement is made is depicted in Figure 2.1a:

missing text, please inform
Figure 2.1a

In this case, Cathy's information system is a partition calligraphic-H1 of Ω defined by

calligraphic-H1 = {HCC, HCM, HMC, HMM}

where

HCC = {ω1, ω2} (i.e., Jennifer and Mark are both clean)

HCM = {ω4, ω6} (i.e., Jennifer is clean and Mark is messy)

HMC = {ω3, ω5} (i.e., Jennifer is messy and Mark is clean)

HMM = {ω7, ω8} (i.e., Jennifer and Mark are both messy)

Cathy knows immediately which cell calligraphic-H1(ω) in her partition is the case at any state of the world, but does not know which is the true state at any ω ∈ Ω.

If we add in the assumption stated in Example 1.2 that if there is at least one messy diner, then the cook announces the fact, then Cathy's information partition is depicted by Figure 2.1b:

missing text, please inform
Figure 2.1b

In this case, Cathy's information system is a partition calligraphic-H1 of Ω defined by

calligraphic-H1 = {HCCC, HMCC, HCM, HMC, HMM}

where

HCCC = 1} (i.e., Jennifer, Mark, and I are all clean)
HMCC = 2} (i.e., Jennifer and Mark are clean and I am messy)
HCM = 4, ω6} (i.e., Jennifer is clean and Mark is messy)
HMC = 3, ω5} (i.e., Jennifer is messy and Mark is clean)
HMM = 7, ω8} (i.e., Jennifer and Mark are both messy)

In this case, Cathy's information partition is a refinement of the partition she has when there is no announcement, for in this case, then Cathy knows a priori that if ω1 is the case there will be no announcement and will know immediately that she is clean, and Cathy knows a priori that if ω2 is the case, then she will know immediately from the cook's announcement that she is messy.

Similarly, if the cook makes an announcement only if he sees at least two messy diners, Cathy's possibility set is the one represented in fig. 2.1c:

missing text, please inform
Figure 2.1c

Cathy's information partition is now defined by

calligraphic-H1 = {HCC, HCMC, HCCM, HMMC, HMCM, HMM}

where

HCC = 1, ω2} (i.e., Jennifer and Mark are both clean)
HCMC = 3} (i.e., Mark and I are clean, Jennifer is messy)
HCCM = 4} (i.e., Jennifer and I are clean, Mark is messy)
HCCM = 5} (i.e., Jennifer and I are messy, Mark is clean)
HCCM = 6} (i.e., Mark and I are messy, Jennifer is clean)
HMM = 7, ω8} (i.e., Jennifer and Mark are both messy)

In this case, Cathy knows a priori that if ω3 obtains there will be no announcement, and similarly for ω4. Thus, she will be able to distinguish these states from ω5 and ω6, respectively.

As mentioned earlier in this subsection, the assumption that agents' possibility sets partition the state space depends on the modeler's choice of specific axioms for the knowledge operators. For example, if we drop axiom K5 (preserving the validity of K1 - K4) the agent's possibility sets need not partition the space set (follow the link for an example. For more details and applications, cf. Samet 1990.) It was conjectured (cf. Geanakoplos 1989) that lack of negative introspection (i.e. systems without K5) would allow to incorporate unforeseen contingencies in the epistemic model, by representing the agents' unawareness of certain events (i.e. the case in which the agent does not know that an event occurs and also does not know that she does not know that.) It was later shown by Dekel et al. (1998) that standard models are not suitable to represent agents' unawareness. An original non-standard model to represent unawareness is provided in Heifetz et al. (2006). For a comprehensive bibliography on modeling unawareness and applications of the notion, cf. the external links at the end on this entry.

We can now define mutual and common knowledge as follows:

Definition 2.3
Let a set Ω of possible worlds together with a set of agents N be given.

1. The proposition that A is (first level or first order) mutual knowledge for the agents of N, K1N(A), is the set defined by

K1N(A) ≡ iN Ki(A).

2. The proposition that A is mth level (or mth order) mutual knowledge among the agents of N, KmN(A), is defined recursively as the set

KmN(A) ≡ iN Ki(Km−1 N(A)).

3. The proposition that A is common knowledge among the agents of N, K*N(A), is defined as the set

K*N(A) ≡

m=1
KmN(A).

As a consequence of Proposition 2.2, the agents' private information systems determine an a priori structure of propositions over the space of possible worlds regarding what they can know, including what mutual and common knowledge they potentially have. The world ω ∈ Ω which obtains determines a posteriori what individual, mutual and common knowledge agents in fact have. Hence, one can read ω ∈ Ki(A) as ‘i knows A at (possible world) ω’, ω ∈ KmN(A) as ‘A is mth level mutual knowledge for the agents of N at ω’, and so on. If ω obtains, then one can conclude that i does know A, that A is mth level mutual knowledge, and so on. Common knowledge of a proposition E implies common knowledge of all that E implies, as is shown in the following:

Proposition 2.4
If ω ∈ K*N(E) and EF, then ω ∈ K*N(F).

Proof.

Note that (KmN(E)) m≥1 is a decreasing sequence of events, in the sense that Km+1N(E) ⊆ KmN(E), for all m ≥ 1. It is also easy to check that if everyone knows E, then E must be true, that is, K1N(E) ⊆ E. If Ω is assumed to be finite, then if E is common knowledge at ω, this implies that there must be a finite m such that

KmN(E) =

n=1
KnN(E).

The following result relates the set-theoretic definition of common knowledge to the hierarchy of ‘i knows that j knows that … knows A’ statements.

Proposition 2.5
ω ∈ KmN(A) iff
(1) For all agents i1, i2, … , imN, ω ∈ Ki1Ki2Kim(A)
Hence, ω ∈ K*N(A) iff (1) is the case for each m ≥ 1.

Proof.

The condition that ω ∈ Ki1Ki2Kim(A) for all m ≥ 1 and all i1, i2, … , imN is Schiffer's definition of common knowledge, and is often used as the definition of common knowledge in the literature.

2.2 Lewis' Account

Lewis is credited with the idea of characterizing common knowledge as a hierarchy of ‘i knows that j knows that … knows that A’ propositions. However, Lewis is aware of the difficulties that such an infinitary definition raises. On the one hand, there is the question of whether it is possible to reduce the infinity inherent in the hierarchical account into a workable finite definition. On the other hand, there is the issue that finite agents cannot entertain the infinite amount of epistemic states which is necessary for common knowledge to obtain. Lewis tackles both problems, yet his presentation is informal and occasionally lacking in detail. It is probably for this reason that Aumann is often credited with presenting the first finitary method of generating the common knowledge hierarchy (Aumann 1976). Recently, Cubitt and Sugden (2003) have argued that Aumann's and Lewis' accounts of common knowledge are radically different and irreconcilable.

First, it is important to see that, although Lewis introduced the technical term ‘common knowledge,’ his analysis is about belief, rather than knowledge. Indeed, Lewis offers his solution to the second problem mentioned above by introducing a distinction between actual belief and reason to believe. Reasons to believe are interpreted as potential beliefs of agents, so that the infinite hierarchy of epistemic states becomes harmless, consisting in an infinity of potential belief states. The solution to the second problem is given by providing a finite set of conditions that, if met, generate the infinite series of reasons to believe. Such conditions taken together represent Lewis' official definition of common knowledge. Notice that it would be more appropriate to speak of ‘common reason to believe,’ or, at least, of ‘common belief.’ Lewis himself later acknowledges that "[t]hat term [common knowledge] was unfortunate, since there is no assurance that it will be knowledge, or even that it will be true." Cf. (Lewis 1978, p. 44, n.13) The reader should be warned that, with Lewis and many others, we will at times abuse terminology and speak of ‘common knowledge’ tout court. Although conceptually persuasive, Lewis distinction between reasons to believe and actual beliefs remains informal, and it is debatable whether a mathematically precise account of Lewis' characterization of common knowledge can be given in the framework introduced by (Aumann 1976). Following (Vanderschraaf 1998) we give the details of such an account here, and show that Lewis' analysis does result in the common knowledge hierarchy following from a finite set of axioms. It is however debated whether a possible worlds approach can properly render the subtelties of Lewis' characterization. Cubitt and Sugden (2003), for example, abandon the possible worlds framework altogether and propose a different formal interpretation of Lewis in which, among other elements, the distinction between reasons to believe and actual belief is taken in account. An attempt to reconcile the two positions can be found in (Sillari 2005), where Lewis' characterization is formalized in a richer possible worlds semantic framework where the distinction between reasons to believe and actual believe is represented.

Lewis presents his account of common knowledge on pp. 52-57 of Convention. Lewis does not specify what account of knowledge is needed for common knowledge. As it turns out, Lewis' account is satisfactory for any formal account of knowledge in which the knowledge operators Ki, iN, satisfy K1, K2, and K3. A crucial assumption in Lewis' analysis of common knowledge is that agents know they share the same "rationality, inductive standards and background information" (Lewis 1969, p. 53) with respect to a state of affairs A′, that is, if an agent can draw any conclusion from A′, she knows that all can do likewise. This idea is made precise in the following:

Definition 2.6
Given a set of agents N and a proposition A′ ⊆ Ω, the agents of N are symmetric reasoners with respect to A′ (or A-symmetric reasoners) iff, for each i, jN and for any proposition E ⊆ Ω, if Ki(A′) ⊆ Ki(E) and Ki(A′) ⊆ KiKj( A′), then Ki(A′) ⊆ KiKj( E).[11]

The definiens says that for each agent i, if i can infer from A′ that E is the case and that everyone knows that A′ is the case, then i can also infer that everyone knows that E is the case.

Definition 2.7
A proposition E is Lewis-common knowledge at ω ∈ Ω among the agents of a set N = {1, … , n} iff there is a proposition A* such that ω ∈ A*, the agents of N are A*-symmetric reasoners, and for every iN,
L1:   ω ∈ Ki(A*)

L2:   Ki(A*) ⊆ Ki(jN Kj(A*))

L3:   Ki(A*) ⊆ Ki(E)

A* is a basis for the agents' common knowledge. L*N (E) denotes the proposition defined by L1 - L3 for a set N of A*-symmetric reasoners, so we can say that E is Lewis-common knowledge for the agents of N iff ω ∈ L*N(E).

In words, L1 says that i knows A* at ω. L2 says that if i knows that A* obtains, then i knows that everyone knows that A* obtains. This axiom is meant to capture the idea that common knowledge is based upon a proposition A* that is publicly known, as is the case when agents hear a public announcement. If the agents' knowledge is represented by partitions, then a typical basis for the agents' common knowledge would be an element calligraphic-M(ω) in the meet[12] of their partitions. L3 says that i can infer from A* that E.

A human agent obviously cannot work her way mentally through an infinite mutual knowledge hierarchy. Lewis argues that this is not a problem for his analysis of common knowledge, since the mutual knowledge claims of a common knowledge hierarchy for a chain of logical consequences, not a series of steps in anyone's actual reasoning. Lewis uses an example to show how his definition of common knowledge generates the first few levels of mutual knowledge. In fact, Lewis' definition implies the entire common knowledge hierarchy, as is shown in the following result.

Proposition 2.8
L*N(E) ⊆ K*N (E), that is, Lewis-common knowledge of E implies common knowledge of E.

Proof.

As mentioned above, it has recently come into question whether a formal rendition of Lewis' definition as the one given above adeguately represents all facets of Lewis' approach. Cubitt and Sugden (2003) argue that it does not, their critique hinging on two specific features of Lewis' analysis that are lost in a possible world framework. First, as we mentioned at the onset of this subsection, in a possible worlds account, the distinction between reasons to believe and actual belief is lost. To avoid that, Cubitt and Sugden concern themselves with operators representing reasons to believe rather than knowledge. The other feature of Lewis' analysis missing in the possible worlds rendition of it is the 3-place relation of indication used by Lewis. The definition of indication can be found at pp. 52-53 of Convention:

Definition 2.9
A state of affairs A indicates E to agent i (A indi E) if and only if, if i had reason to believe that A held, i would thereby have reason to believe that E

The wording of Lewis' definition and the use he makes of the indication relation in the definitory clauses for common knowledge, suggest that Lewis is careful to distinguish indication and material implication. Cubitt and Sugden (2003) incorporate such distinction in their formal reconstruction. Paired with their interpretation of "i has reason to believe x" as "x is yielded by some logic of reasoning that i endorses," we have that, if A indi x, then i's reason to believe A provides i with reason to believe x as well. Given that Lewis does want to endow agents with deductive reasoning, (Cubitt and Sugden 2003) list the following axioms, claiming that they capture the desired properties of indication. (To simplify the exposition, we overlook the distinction between state of affairs and propositions, which is not essential here.) For all agents i, j,

CS1:   (Ri AA indi x) → Ri x.

CS2:   (A entails B) → A indi B

CS3:   (A indi xA indi y) → A indi (xy)

CS4:   (A indi BB indi x) → A indi x

CS5:   ((A indi Rj B) ∧ Ri(B indj x)) → A indi Rj x

The first axioms captures the intuition behind indication. It says that if an agent has reason to believe that A holds, then, if A indicates x to her, she has reason to believe x as well. CS2 says that indication incorporates material implication. CS3 says that if two propositions x and y are indicated to an agent by a proposition A, then A indicates to her also the conjunction of x and y. The next axiom states that indication is transitive. CS5 says that if a proposition A indicates to i that agent j has reason to believe B, and i has reason to believe that B indicates x to j, then A indicates to i also that j has reason to believe x.

Armed with these axioms, it is possible to give the following definition.

Definition 2.10
In any given population P a proposition A is a reflexive common indicator that x if and only if, for all i, jP and all propositions x, y, the following four conditions hold:

RCI1:   ARi A

RCI2:   A indi Rj A

RCI3:   A indi x

RCI4:   A indjRi(A indj y)

Clauses RCI1-RCI3 above render L1-L3 of definition 2.7 above in the formal language that underlies axioms CS1-CS5; while RCI4 affirms (cf. definition 2.6 above) that agents are symmetric reasoners, i.e. that if a proposition indicates another proposition to a certain agent, then it does so to all agents in the population.

The following proposition shows that RCI1-RCI4 are sufficient conditions for ‘common reason to believe’ to arise:

Proposition 2.11
If A holds, and if A is a common reflexive indicator in the population P that x, then there is common reason to believe in P that x.

Proof.

Proposition 2.11 shows that a finite definition of ‘common reason to believe’ can be given. A straighforward extension to ‘common belief’ is easily obtained once we say that an agent i reasons faultlessy if she believes everything she has reason to believe. The following proposition characterizes then common belief in the approach of Cubitt and Sugden:
Proposition 2.12
Consider a population P and a proposition A such that (i) A is a reflexive common indicator in P that x and (ii) A is a reflexive common indicator in P that each member of P reasons faultlessy. Suppose A holds and each agent in P reasons faultlessly. Then there is common (actual) belief in P that x.

Proof.

Is it possible to take formally in account the insights of Lewis' definition of common knowledge without abandoning the possible world framework? (Sillari 2005) puts forth an attempt to give a postive answer to that question by articulating in a possible world semantics the distinction between actual belief and reason to believe. As in (Cubitt and Sugden 2003), the basic epistemic operator represents reasons to believe. The idea is then to impose an awareness structure over possible worlds. Simply put, an awareness structure associates to each agent, for every possible world, a set of events of which the agent is said to be aware. An agent entertains an actual belief that a certain event occurs if and only if she has reason to believe that the event occurs and such event is in her awareness set at the world under consideration.

2.3 Aumann's Account

Aumann (1976) gives a different characterization of common knowledge which gives another simple algorithm for determining what information is commonly known. Aumann's original account assumes that the each agent's possibility set forms a private information partition of the space Ω of possible worlds. Aumann shows that a proposition C is common knowledge if, and only if, C contains a cell of the meet of the agents' partitions. One way to compute the meet calligraphic-M of the partitions calligraphic-Hi, iN is to use the idea of "reachability".

Definition 2.13
A state ω′ ∈ Ω is reachable from ω ∈ Ω iff there exists a sequence ω=ω0, ω1, ω2, … , ωm=ω′ such that for each k ∈ {0,1, … , m−1}, there exists an agent ikN such that calligraphic-Hikk) = calligraphic-Hikk+1).

In words, ω′ is reachable from ω if there exists a sequence or "chain" of states from ω to ω′ such that two consecutive states are in the same cell of some agent's information partition. To illustrate the idea of reachability, let us return to the modified Barbecue Problem in which Cathy, Jennifer and Mark receive no announcement. Their information partitions are all depicted in Figure 2.1d:

missing text, please inform
Figure 2.1d

One can understand the importance of the notion of reachability in the following way: If ω′ is reachable from ω, then if ω obtains then some agent can reason that some other agent thinks that ω′ is possible. Looking at Figure 2.1d, if ω = ω1 occurs, then Cathy (who knows only that {ω1, ω2} has occurred) knows that Jennifer thinks that ω5 might have occurred (even though Cathy knows that ω5 did not occur). So Cathy cannot rule out the possibility that Jennifer thinks that Mark thinks that that ω8 might have occurred. And Cathy cannot rule out the possibility that Jennifer thinks that Mark thinks that Cathy believes that ω7 is possible. In this sense, ω7 is reachable from ω1. The chain of states which establishes this is ω1, ω 2, ω5, ω8, ω7, since calligraphic-H11) = calligraphic-H12), calligraphic-H22) = calligraphic-H25), calligraphic-H35) = calligraphic-H38), and calligraphic-H18) = calligraphic-H17). Note that one can show similarly that in this example any state is reachable from any other state. This example also illustrates the following immediate result:

Proposition 2.14
ω′ is reachable from ω iff there is a sequence i1, i2, … , imN such that
(1) ω′ ∈ calligraphic-Him( … (calligraphic-Hi2(calligraphic-Hi1(ω))))

One can read (1) as: ‘At ω, i1 thinks that i2 thinks that … , im thinks that ω′ is possible.’

We now have:

Lemma 2.15
ω′ ∈ calligraphic-M(ω) iff ω′ is reachable from ω.

Proof.

and
Lemma 2.16
calligraphic-M(ω) is common knowledge for the agents of N at ω.

Proof.

and
Proposition 2.17 (Aumann 1976)
Let calligraphic-M be the meet of the agents' partitions calligraphic-Hi for each iN. A proposition E ⊆ Ω is common knowledge for the agents of N at ω iff calligraphic-M(ω) ⊆ E. (In Aumann (1976), E is defined to be common knowledge at ω iff calligraphic-M(ω) ⊆ E.)

Proof.

If E = K1N(E), then E is a public event (Milgrom 1981) or a common truism (Binmore and Brandenburger 1989). Clearly, a common truism is common knowledge whenever it occurs, since in this case E = K1N(E) = K2N(E) = … , so E = K*N(E). The proof of Proposition 2.17 shows that the common truisms are precisely the elements of calligraphic-M and unions of elements of calligraphic-M, so any commonly known event is the consequence of a common truism.

2.4 Barwise's Account

Barwise (1988) proposes another definition of common knowledge that avoids explicit reference to the hierarchy of ‘i knows that j knows that … knows that A’ propositions. Barwise's analysis builds upon an informal proposal by Harman (1977). Consider the situation of the guest and clumsy waiter in Example 1 when he announces that he was at fault. They are now in a setting where they have heard the waiter's announcement and know that they are in the setting. Harman adopts the circularity in this characterization of the setting as fundamental, and propses a definition of common knowledge in terms of this circularity. Barwise's formal analysis gives a precise formulation of Harman's intuitive analysis of common knowledge as a fixed point. Given a function f, A is a fixed point of f if f(A)=A. Now note that

K1N ( E

m=1
KmN ( E ) )
=
K1N ( E ) ∩ K1N (

m=1
KmN ( E ) )
=
K1N ( E ) ∩ (

m=1
K1N ( KmN ( E ) ) )
=
K1N ( E ) ∩ (

m=2
KmN ( E ) )
=


m=1
KmN ( E )
So we have established that K*N (E) is a fixed point of the function fE defined by fE(X) = K1N (EX). fE has other fixed points. For instance, any contradiction BBc = ø is a fixed point of fE.[13] Note also that if AB, then EAEB and so
fE(A) = K1N (EA) ⊆ K1N (EB) = fE(B)
that is, fE is monotone. (We saw that K1N is also monotone in the proof of Proposition 2.4.) Barwise's analysis of common knowledge can be developed using the following result from set theory:
Proposition
A monotone function f has a unique fixed point C such that if B is a fixed point of f, then BC. C is the greatest fixed point of f.
This proposition establishes that fE has a greatest fixed point, which characterizes common knowledge in Barwise's account. As Barwise himself observes, the fixed point analysis of common knowledge is closely related to Aumann's partition account. This is easy to see when one compares the fixed point analysis to the notion of common truisms that Aumann's account generates. Some authors regard the fixed point analysis as an alternate formulation of Aumann's analysis. Barwise's fixed point analysis of common knowledge is favored by those who are especially interested in the applications of common knowledge to problems in logic, while the hierarchical and the partition accounts are favored by those who wish to apply common knowledge in social philosophy and social science. When knowledge operators satisfy the axioms (K1)-(K4), the Barwise account of common knowledge is equivalent to the hierarchical account.
Proposition 2.18
Let C*N be the greatest fixed point of fE. Then C*N(E) = K*N(E). ( In Barwise (1988, 1989), E is defined to be common knowledge at ω iff ω ∈ C*N(E).)

Proof.

Barwise argues that in fact the fixed point analysis is more flexible and consequently more general than the hierachical account. This may surprise readers in light of Proposition 2.18, which shows that Barwise's fixed point definition is equivalent to the hierarchical account. Indeed, while Barwise (1988, 1989) proves a result showing that the fixed point account implies the hierarchical account and gives examples that satisfy the common knowledge hierarchy but fail to be fixed points, a number of authors who have written after Barwise have given various proofs of the equivalence of the two definitions, as was shown in Proposition 2.18. In fact, there is not a true controversy, at least with respect to the analytical results. Barwise's fixed point account is indeed equivalent to the hierarchical and the partition accounts given the account of knowledge characterized by (K1)-(K4) that most practitioners accept. Barwise does not make explicit which axioms of (K1)-(K4) he accepts, but he wishes to analyze a weaker notion of knowledge that is not closed under logical implication, and so he is committed to rejecting (K3). By doing so, Barwise is able to prove the nonequivalence between the fixed point and the hierarchical account he claims. But Barwise's result comes at a price most analysts are not willing to pay. To formulate his results given his very weak conception of knowledge, Barwise must use non-well-founded set theory (Aczel 1988) in order to allow him to make the necessary circular definitions. As we have seen in this section, when one adopts the conventional analysis of knowledge that satisfies (K1)-(K4), the equivalence of the hierarchical and the fixed point accounts follows without the need to introduce non-well-founded set-theoretic concepts.

2.5 Gilbert's Account

Gilbert (1989, Chapter 3) presents an alternative account of common knowledge, which is meant to be more intuitively plausible than Lewis' and Aumann's accounts. Gilbert gives a highly detailed description of the circumstances under which agents have common knowledge.

Definition 2.19
A set of agents N are in a common knowledge situation calligraphic-S(A) with respect to a proposition A if, and only if, ω ∈ A and for each iN,
G1: i is epistemically normal, in the sense that i has normal perceptual organs which are functioning normally and has normal reasoning capacity.[14]
G2: i has the concepts needed to fulfill the other conditions.
G3: i perceives the other agents of N.
G4: i perceives that G1 and G2 are the case.
G5: i perceives that the state of affairs described by A is the case.
G6: i perceives that all the agents of N perceive that A is the case.

Gilbert's definition appears to contain some redundancy, since presumably an agent would not perceive A unless A is the case. Gilbert is evidently trying to give a more explicit account of single agent knowledge than Lewis and Aumann give. For Gilbert, agent i knows that a proposition E is the case if, and only if, ω ∈ E, that is, E is true, and either i perceives that the state of affairs E describes obtains or i can infer E as a consequence of other propositions i knows, given sufficient inferential capacity.

Like Lewis, Gilbert recognizes that human agents do not in fact have unlimited inferential capacity. To generate the infinite hierarchy of mutual knowledge, Gilbert introduces the device of an agent's smooth-reasoner counterpart. The smooth-reasoner counterpart i′ of an agent i is an agent that draws every logical conclusion from every fact that i knows. Gilbert stipulates that i′ does not have any of the constrains on time, memory, or reasoning ability that i might have, so i′ can literally think through the infinitely many levels of a common knowledge hierarchy.

Definition 2.20
If a set of agents N are in a common knowledge situation calligraphic-SN(A) with respect to A, then the corresponding set N′ of their smooth-reasoner counterparts is in a parallel situation calligraphic-SN(A) if, and only if, for each i′ ∈ N,
G1′: i′ can perceive anything that the counterpart i can perceive.
G2′: G2 - G6 obtain for i′ with respect to A and N′, same as for the counterpart i with respect to A and N.
G3′: i′ perceives that all the agents of N′ are smooth-reasoners.

From this definition we get the following immediate consequence:

Proposition 2.21
If a set of smooth-reasoner counterparts to a set N of agents are in a situation calligraphic-SN(A) parallel to a common knowledge situation calligraphic-SN(A) of N, then
for all mthe natural numbers and for any i1′, … , im′, Ki1Ki2Kim(A).
Consequently, KmN(A) for any mthe natural numbers.

Gilbert argues that, given calligraphic-SN(A), the smooth-reasoner counterparts of the agents of N actually satisfy a much stronger condition, namely mutual knowledge KαN(A) to the level of any ordinal number α, finite or infinite. When this stronger condition is satisfied, the proposition A is said to be open* to the agents of N. With the concept of open*-ness, Gilbert gives her definition of common knowledge.

Definition 2.22
A proposition E ⊆ Ω is Gilbert-common knowledge among the agents of a set N = {1, … ,n}, if and only if,
G1*: E is open* to the agents of N.
G2*: For every iN, Ki(G1*).
GN*(E) denotes the proposition defined by G1* and G2* for a set N of A*-symmetric reasoners, so we can say that E is Lewis-common knowledge for the agents of N iff ω ∈ GN*(E).

One might think that an immediate corollary to Gilbert's definition is that Gilbert-common knowledge implies the hierarchical common knowledge of Proposition 2.5. However, this claim follows only on the assumption that an agent knows all of the propositions that her smooth-reasoner counterpart reasons through. Gilbert does not explicitly endorse this position, although she correctly observes that Lewis and Aumann are committed to something like it.[15] Gilbert maintains that her account of common knowledge expresses our intuitions with respect to common knowledge better than Lewis' and Aumann's accounts, since the notion of open*-ness presumably makes explicit that when a proposition is common knowledge, it is "out in the open", so to speak.

3. Applications of Mutual and Common Knowledge

Readers primarily interested in philosophical applications of common knowledge may want to focus on the No Disagreement Theorem and Convention subsections. Readers interested in applications of common knowledge in game theory may continue with the Strategic Form Games, and Games of Perfect Information subsections.

3.1 The "No Disagreement" Theorem

Aumann (1976) originally used his definition of common knowledge to prove a celebrated result that says that in a certain sense, agents cannot "agree to disagree" about their beliefs, formalized as probability distributions, if they start with common prior beliefs. Since agents in a community often hold different opinions and know they do so, one might attribute such differences to the agents' having different private information. Aumann's surprising result is that even if agents condition their beliefs on private information, mere common knowledge of their conditioned beliefs and a common prior probability distribution implies that their beliefs cannot be different, after all!

Proposition 3.1
Let Ω be a finite set of states of the world. Suppose that
  1. Agents i and j have a common prior probability distribution μ(·) over the events of Ω such that μ(ω) > 0, for each ω ∈ Ω, and
  2. It is common knowledge at ω that i's posterior probability of event E is qi(E) and that j's posterior probability of E is qj(E).
Then qi(E) = qj(E).

Proof.
[Note that in the proof of this proposition, and in the sequel, μ(·|B) denotes conditional probability; that is, given μ(B)>0, μ(A|B) = μ(AB)/μ(B).]

In a later article, Aumann (1987) argues that the assumptions that Ω is finite and that μ(ω) > 0 for each ω ∈ Ω reflect the idea that agents only regard as "really" possible a finite collection of salient worlds to which they assign positive probability, so that one can drop the states with probability 0 from the description of the state space. Aumann also notes that this result implicitly assumes that the agents have common knowledge of their partitions, since a description of each possible world includes a description of the agents' possibility sets. And of course, this result depends crucially upon (i), which is known as the common prior assumption (CPA).

Aumann's "no disagreement" theorem has been generalized in a number of ways in the literature (McKelvey and Page 1986, Monderer and Samet 1989, Geanakoplos 1994). However, all of these "no disagreement" results raise the same philosophical puzzle raised by Aumann's original result: How are we to explain differences in belief? Aumann's result leaves us with two options: (1) admit that at some level, common knowledge of the agents' beliefs or how they form their beliefs fails, or (2) deny the CPA. For instance, agents in the real world often do not express their opinions probabilistically. If one agent announces‘I believe that E is the case’ while another announces‘I doubt that E is the case’, then they might attribute their divergent opinions to a lack of common knowledge of each other's true posteriors for E. Even if agents do assign precise posterior probabilities to an event, Aumann shows that if they have merely first-order mutual knowledge of the posteriors, they can "agree to disagree". Suppose the following all hold:

Ω = {ω1, ω2, ω3, ω4},
calligraphic-H1 = {{ω12}, {ω34}}
calligraphic-H2 = {{ω123}, {ω4}}
μ(ωi) = 1/4
Then if E = {ω14}, then at ω1, we have:
q1(E) = μ(E | {ω12}) = 1/2, and

q2(E) = μ(E | {ω123}) = 1/3

Moreover, at ω = ω1, Agent 1 knows that calligraphic-H2(ω) = {ω123}, so she knows that q2(E) = 1/3. Agent 2 knows at ω1 that either calligraphic-H1(ω) = {ω12} or calligraphic-H1(ω) = {ω34}, so either way he knows that q1(E) = 1/2. Hence the agents' posteriors are mutually known, and yet they are unequal. The reason for this is that the posteriors are not common knowledge. For Agent 2 does not know what Agent 1 thinks q2(E) is, since if ω = ω3, which is consistent with what Agent 2 knows, then Agent 1 will believe that q2(E) = 1/3 with probability 1/2 (if ω = ω3) and q2(E) = 1 with probability 1/2 (if ω = ω4).

Aumann's result could fail if the agents' partitions are not common knowledge. For suppose in the example just given, the agents do not know each other's partitions. Then at ω = ω1, if their posteriors are common knowledge, then Agent 1, who knows that ω ∈ {ω12}, can explain Agent 2's posterior as the result of Agent 2 having observed either {ω123}, {ω124}, {ω134} or {ω234}. Still another way Aumann's result might fail is if agents do not have common knowledge that they update their beliefs by Bayesian conditionalization. Then clearly, agents can explain divergent opinions as the result of others having modified their beliefs in the "wrong" way. However, there are cases in which none of these explanations will seem convincing. For instance, odds makers sometimes publicly announce different probabilities for an event, such as a particular winner of a prize at a forthcoming Academy Awards presentation, and they will know that none of them have any private information regarding the event. In cases such as this, the agents have common knowledge that they all have the same information structure and common knowledge of their posteriors. And knowing that they are all competent odds makers, they have common knowledge that they update by Bayesian conditionalization. Still, the odds makers' beliefs violate the conclusion of Aumann's result. More generally, denying the requisite common knowledge seems a rather ad hoc move. For instance, to deny that agents have common knowledge of information structures is simply to deny that agents can all infer the same conclusions regarding possible worlds as Aumann defines them. To deny that agents have common knowledge that they update their beliefs by Bayesian conditionalization is to assert that some believe that some might be updating their beliefs incoherently, in the sense that their belief updating leaves them open to a Dutch book (Skyrms 1984). As just noted, these failures of agents' beliefs in each others' competence do not fail in all cases. Why should one think that such failures of common knowledge provide a general explanation for divergent beliefs?

What of the second option, that is, denying the CPA?[16] The main argument put forward in favor of the CPA is that any differences in agents' probabilities should be the result of their having different information only, that is, there is no reason to think that the different beliefs that agents have regarding the same event are the result of anything other than their having different information. However, one can reply that this argument amounts simply to a restatement of the Harsanyi Doctrine.[17] And while defenders of the Harsanyi Doctrine may be right in thinking that there is apparently no compelling reason to think that agents' priors can be different, neither is there compelling reason to think they must be the same! In any event, while the controversy over the Harsanyi Doctrine remains unresolved, we can conclude that the "no disagreement" results have interesting implications for the viability of common knowledge and the very nature of probability. Defenders of the CPA take an objectivist view of probability, and by virtue of the "no disagreement" results are evidently committed to the view that common knowledge of agents beliefs and how they are formed is a rare phenomenon in the world. Those who are prepared to deny the CPA allow for a genuinely subjectivist conception of probability. They take the view that common knowledge of agents' beliefs and how they come by them can be a commonplace phenomenon, and that differences in opinion can stem from differences in (subjective) prior probabilities.

3.2 Convention

Schelling's Department Store problem of Example 1.5 is a very simple example in which the agents "solve" their coordination problem appropriately by establishing a convention. Using the vocabulary of game theory, Lewis (1969) defines a convention as a strict coordination equilibrium of a game which agents follow on account of their common knowledge that they all prefer to follow this coordination equilibrium. A coordination equilibrium of a game is a strategy combination such that no agent is better off if any agent unilaterally deviates from this combination. As with equilibria in general, a coordination equilibrium is strict if any agent who deviates unilaterally from the equilibrium is strictly worse off. The strategic form game of Figure 1.3 summarizes Liz's and Robert's situation. The Department Store game has four Nash equilibrium outcomes in pure strategies: (s1, s1), (s2, s2), (s3, s3), and (s4, s4).[18] These four equilibria are all strict coordination equilibria. If the agents follow either of these equilibria, then they coordinate successfully. For agents to be following a Lewis-convention in this situation, they must follow one of the game's coordination equilibria. However, for Lewis to follow a coordination equilibrium is not a sufficient condition for agents to be following a convention. For suppose that Liz and Robert fail to analyze their predicament properly at all, but Liz chooses s2 and Robert chooses s2, so that they coordinate at (s2, s2) by sheer luck. Lewis does not count accidental coordination of this sort as a convention.

Suppose next that both agents are Bayesian rational, and that part of what each agent knows is the payoff structure of the Intersection game. If the agents expect each other to follow (s2, s2) and they consequently coordinate successfully, are they then following a convention? Not necessarily, contends Lewis, in a subtle argument on p. 59 of Convention. For while each knows the game and that she is rational, she might not attribute like knowledge to the other agent. If each agent believes that the other agent will follow her end of the (s2, s2) equilibrium mindlessly, then her best response is to follow her end of (s2, s2). But in this case the agents coordinated as the result of their each falsely believing that the other acts like an automaton, and Lewis thinks that any proper account of convention must require that agents have correct beliefs about one another. In particular, Lewis requires that each agent involved in a convention must have mutual expectations that each is acting with the aim of coordinating with the other, that is, that each knows that:

A1: Both are rational,
A2: Both know the payoff structure of the game, and
A3: Both intend to follow (s2, s2), and not some other strategy combination.

Suppose that the agents' beliefs are appropriately augmented so that each agent knows that A1, A2, and A3 are the case. Again they coordinate on (s2, s2). Are they following a convention this time? Still not necessarily, says Lewis. For what if it turns out that Liz thinks that Robert does not know that they are both rational? Then Liz has a false belief about Robert. Beyond this, there are two other points which Lewis does not himself raise in this argument, but which clearly support his view. First, it would be counterintuitive, at the very least, to suppose that any agent following a convention believes that he has reasoning abilities that the other agents lack. If Liz has determined that A1, A2, and A 3 are the case, then if they are following a convention she should expect that Robert has arrived at the same conclusion. Second, what could explain Liz's knowledge of A3? The most natural explanation for Liz's expectation that Robert will follow his end of (s2, s2) is that Liz knows that Robert knows that A1, A2, and A3 are the case. So convention evidently involves agents having at least second-order mutual knowledge of A1, A2, and A3, that is, Robert (Liz) must know that Liz (Robert) knows that A1, A2, and A3 are the case. But this raises the question: Can third-order mutual knowledge that A1, A 2, and A3 obtain fail? No, argues Lewis. For if Robert thought that Liz did not know that Robert knew that A1, A2, and A3 were the case, then Robert would have a false belief about Liz. The additional supporting points also kick in again: If Robert has second-order mutual knowledge that A1, A2, and A3 obtain, then he should conclude that Liz also has this second-order mutual knowledge. To conclude otherwise would require Robert to assume, counterintuitively, that he has analyzed their deliberations in this situation in a way that Liz cannot. And how did Robert get his second-order mutual knowledge of A3? The most obvious way to account for Robert's second-order mutual knowledge would be to attribute to Robert the knowledge that Liz has second-order mutual knowledge that A1, A2, and A3 are the case. So convention requires third-order mutual knowledge that A1, A2, and A 3 are the case. And the argument can be continued for any higher level of mutual knowledge.

Lewis concludes that a necessary condition for agents to be following a convention is that their preferences to follow the corresponding coordination equilibrium be common knowledge. So on Lewis' account, a convention for a set of agents is a coordination equilibrium which the agents follow on account of their common knowledge of their rationality, the payoff structure of the relevant game and that each agent follows her part of the equilibrium.

A regularity R in the behavior of members of a population P when they are agents in a recurrent situation S is a convention if and only if it is true that, and it is common knowledge in P that, in any instance of S among the members of P,
  1. everyone conforms to R;
  2. everyone expects everyone else to conform to R;
  3. everyone has approximately the same preferences regarding all possible combinations of actions;
  4. everyone prefers that everyone conform to R, on condition that at least all but one conform to R;
  5. everyone would prefer that everyone conform to R′, on condition that at least all but one conform to R′,
where R′ is some possible regularity in the behavior of members of P in S, such that no one in any instance of S among members of P could conform both to R′ and to R.
(Lewis 1969, p. 76)[19]

Lewis includes the requirement that there be an alternate coordination equilibrium R′ besides the equilibrium R that all follow in order to capture the fundamental intuition that how the agents who follow a convention behave depends crucially upon how they expect the others to behave. In the Department Store game, the (s2, s2) equilibrium is a Lewis-convention when Liz and Robert have common knowledge of A1, A2, and A3. Had their expectations been different, so either had believed that the other would not follow (s2, s2), then the outcome might have been very different.

Sugden (1986) and Vanderschraaf (1998) argue that it is not crucial to the notion of convention that the corresponding equilibrium be a coordination equilibrium. Lewis' key insight is that a convention is a pattern of mutually beneficial behavior which depends on the agents' common knowledge that all follow this pattern, and no other. Vanderschraaf gives a more general definition of convention as a strict equilibrium together with common knowledge that all follow this equilibrium and that all would have followed a different equilibrium had their beliefs about each other been different. An example of this more general kind of convention is given below in the discussion of the Figure 3.1 example.

Cubitt and Sugden (2003) provide an insightful study of Lewis' account of convention. They notice that Lewis seems to be the first author to analyze games played recurrently in a population under the assumption that such games are not necessarily played by the same agents. This differs from the traditional approach to repeated games which considers games repeatedly played by the same agents and treats them as a single, self-contained supergame. Lewis' approach differs, however, also from the framework that game theorists have subsequently used to analyze recurrent games in a population. The methods adopted in such developments were inspired from the work of theoretical biologists and focus, rather than on the role of reasoning, on "blind or adaptive mechanisms of selection" (Cubitt and Sugden 2003, p. 177). For example, Skyrms (1996) recasts Lewis' analysis of convention in the framework of evolutionary game theory, dispensing with assumptions that are central to Lewis' argument, among which that of common knowledge. Building on their formal reconstruction of Lewis' definition of common knowledge (see section 2.2 for details,) Cubitt and Sugden claim that the regularity in the behavior of a population P lying at the core of Lewis' definition of convention matches the clauses of the definition if it does function as a basis for common knolwedge of the fact that it will keep to be a regularity in the future. Inductive reasoning plays a central role in Cubitt and Sugden's account, letting them conclude that "Lewis' approach offers an alternative both to the modelling strategy of classical game theory, in which self-contained games are played by hyper-rational agents, and to that of evolutionary game theory, in which players' behaviour is the product of blind processes of selection." (cf. Cubitt and Sugden 2003, p. 204).

3.3 Strategic Form Games

Lewis formulated the notion of common knowledge as part of his general account of conventions. In the years following the publication of Convention, game theorists have recognized that any explanation of a particular pattern of play in a game depends crucially on mutual and common knowledge assumptions. More specifically, solution concepts in game theory are both motivated and justified in large part by the mutual or common knowledge the agents in the game have regarding their situation.

To establish the notation that will be used in the discussion that follows, the usual definitions of a game in strategic form, expected utility and agents' distributions over their opponents' strategies, are given here:

Definition 3.2
A game Γ is an ordered triple (N, S, u) consisting of the following elements:
  1. A finite set N = {1,2, … , n}, called the set of agents or players.
  2. For each agent kN, there is a finite set Sk = {sk1,sk2, … , sknk}, called the alternative pure strategies for agent k. The Cartesian product S = S1 × … × Sn is called the pure strategy set for the game Γ.
  3. A map u : Sthe real numbersn, called the utility or payoff function on the pure strategy set. At each strategy combination s = (s1j1, … , snjn) ∈ S, agent k's particular payoff or utility is given by the kth component of the value of u, that is, agent k's utility uk at s is determined by
    uk (s) = Ik(u(s1j1, … , snjn))
    where Ik(x) projects xthe real numbersn onto its kth component.

The subscript ‘-k’ indicates the result of removing the kth component of an n-tuple or an n-fold Cartesian product. For instance,

S-k = S1 × … × Sk−1 × Sk+1 × … × Sn

denotes the pure strategy combinations that agent k's opponents may play.

Now let us formally introduce a system of the agents' beliefs into this framework. Δk(S-k) denotes the set of probability distributions over the measurable space (S-k, gothic-Fk), where gothic-Fk denotes the Boolean algebra generated by the strategy combinations S-k. Each agent k has a probability distribution μk ∈ Δk(S-k), and this distribution determines the (Savage) expected utilities for each of k's possible acts:

E(uk(sk j)) =
A-kS-k
uk(skj, s-k) μk(s-k),   j = 1, 2, … , nk

If i is an opponent of k, then i's individual strategy si j may be characterized as a union of strategy combinations { s-k | sijs-k } ∈ gothic-Fk, and so k's marginal probability for i's strategy si j may be calculated as follows:

μk(sij) =
{ s-k | sijs-k }
μk(s-k)
μk(· | A) denotes k's conditional probability distribution given a set A, and E(· | A) denotes k's conditional expectation given μk(· | A).

Suppose first that the agents have common knowledge of the full payoff structure of the game they are engaged in and that they are all rational, and that no other information is common knowledge. In other words, each agent knows that her opponents are expected utility maximizers, but does not in general know exactly which strategies they will choose or what their probabilities for her acts are. These common knowledge assumptions are the motivational basis for the solution concept for noncooperative games known as rationalizability, introduced independently by Bernheim (1984) and Pearce (1984). Roughly speaking, a rationalizable strategy is any strategy an agent may choose without violating common knowledge of Bayesian rationality. Bernheim and Pearce argue that when only the structure of the game and the agents' Bayesian rationality are common knowledge, the game should be considered "solved" if every agent plays a rationalizable strategy. For instance, in the "Chicken" game with payoff structure defined by Figure 3.1,

missing text, please inform
Figure 3.1
if Joanna and Lizzi have common knowledge of all of the payoffs at every strategy combination, and they have common knowledge that both are Bayesian rational, then any of the four pure strategy profiles is rationalizable. For if their beliefs about each other are defined by the probabilities
α1 = μ1 (Joanna plays s1), and
α2 = μ2 (Lizzi plays s1)
then
E(ui(s1)) = 3αi + 2(1 − αi) = αi + 2
and
E(ui(s2)) = 4αi + 0(1 − αi) = 4αi,   i = 1, 2

so each agent maximizes her expected utility by playing s1 if αi + 2 ≥ 4αi or αi ≤ 2/3 and maximizes her expected utility by playing s2 if αi ≥ 2/3. If it so happens that αi > 2/3 for both agents, then both conform with Bayesian rationality by playing their respective ends of the strategy combination (s2,s2) given their beliefs, even though each would want to defect from this strategy combination were she to discover that the other is in fact going to play s2. Note that the game's pure strategy Nash equilibria, (s1, s2) and (s2, s1), are rationalizable, since it is rational for Lizzi and Joanna to conform with either equilibrium given appropriate distributions. In general, the set of a game's rationalizable strategy combinations contains the set of the game's pure strategy Nash equilibria, and this example shows that the containment can be proper.

To show that rationalizability is a nontrivial notion, consider the 2-agent game with payoff structure defined by Figure 3.2a:

missing text, please inform
Figure 3.2a

In this game, s1 and s3 strictly dominate s2 for Lizzi, so Lizzi cannot play s2 on pain of violating Bayesian rationality. Joanna knows this, so Joanna knows that the only pure strategy profiles which are possible outcomes of the game will be among the six profiles in which Lizzi does not choose s2. In effect, the 3 × 3 game is reduced to the 2 × 3 game defined by Figure 3.2b:

missing text, please inform
Figure 3.2b

In this reduced game, s2 is strictly dominated for Joanna by s1, and so Joanna will rule out playing s2. Lizzi knows this, and so she rules out strategy combinations in which Joanna plays s2. The rationalizable strategy profiles are the four profiles that remain after deleting all of the strategy combinations in which either Lizzi or Joanna play s2. In effect, common knowledge of Bayesian rationality reduces the 3 × 3 game of Figure 3.2a to the 2 × 2 game defined by Figure 3.2c:

missing text, please inform
Figure 3.2c
since Lizzi and Joanna both know that the only possible outcomes of the game are (s1, s1), (s1, s3), (s3, s1), and (s3, s3).

Rationalizability can be defined formally in several ways. A variation of Bernheim's original (1984) definition is given here.

Definition 3.3
Given that each agent kN has a probability distribution μk ∈ Δk(s-k), the system of beliefs
μ   =   (μ1, … , μn) ∈ Δ1(S-1) × … × Δn(S-n)
is Bayes concordant if and only if,
(3.i) For ik, μi(skj) > 0 ⇒ skj maximizes k's expected utility for some σk ∈ Δk(s-k),
and (3.i) is common knowledge. A pure strategy combination s = (s1j1, … , snjn) ∈ S is rationalizable if and only if the agents have a Bayes concordant system μ of beliefs and, for each agent kN,
(3.ii) E(uk(skjk)) ≥ E(uk(skik)), for ikjk.[20]

The following result shows that the common knowledge restriction on the distributions in Definition 3.1 formalizes the assumption that the agents have common knowledge of Bayesian rationality.

Proposition 3.4
In a game Γ, common knowledge of Bayesian rationality is satisfied if, and only if, (3.i) is common knowledge.

Proof.

When agents have common knowledge of the game and their Bayesian rationality only, one can predict that they will follow a rationalizable strategy profile. However, rationalizability becomes an unstable solution concept if the agents come to know more about one another. For instance, in the Chicken example above with αi > 2/3, i = 1, 2, if either agent were to discover the other agent's beliefs about her, she would have good reason not to follow the (s2,s2) profile and to revise her own beliefs regarding the other agent. If, in the other hand, it so happens that α1 = 1 and α2 = 0, so that the agents maximize expected payoff by following the (s2, s1) profile, then should the agents discover their beliefs about each other, they will still follow (<