Before seeing the workings of the KS theorem in some detail, we must clarify why it is of importance to philosophers of science. The explicit premise of HV interpretations is one of value definiteness:
(VD) All observables defined for a QM system have definite values at all times.
VD, however, is motivated by a more basic principle, an apparently innocuous realism about physical measurement which, initially, seems an indispensable tenet of natural science. This realism consists in the assumption that whatever exists in the physical world is causally independent of our measurements which serve to give us information about it. Now, since measurements of all QM observables, typically, yield more or less precise values, there is good reason to think that such values exist independently of any measurements - which leads us to assume VD. (Note that we do not need to assume here that the values are faithfully revealed by measurement, but only that they exist!) We can concretize our innocuous realism in a second assumption of noncontextuality:
(NC) If a QM system possesses a property (value of an observable), then it does so independently of any measurement context, i.e. independently of how that value is eventually measured.
This means that if a system possesses a given property, it does so independently of possessing other values pertaining to other arrangements. So, both our assumptions incorporate the basic idea of an independence of physical reality from its being measured.
The KS theorem establishes a contradiction between VD + NC and QM; thus, acceptance of QM logically forces us to renounce either VD or NC. However, the situation is more dramatic than it would initially seem. VD is the key motivating assumption of the HV programme in the sense that, if feasible, it would most naturally explain the statistical character of QM and most elegantly explain away the infamous measurement problem haunting all interpreters of QM [see the entries on quantum mechanics and measurement in quantum theory for details]. But, as we just saw, the second assumption NC is motivated by the same innocuous realism which embodies a standard of scientific rationality, and it is far from obvious what an interpretation obeying this standard only partly, i.e. endorsing only VD but rejecting NC, should look like. This complex of issues -- namely, (1) VD + NC contradict QM; (2) the conceptual difficulties of interpreting QM provide a strong motivation for VD; (3) it is not obvious how to come up with a plausible story about QM containing VD, but not NC -- is what fuels philosophical interest in the KS theorem.
The present section states some elements of the historical and
systematic background of the KS theorem. Most importantly, an argument
by von Neumann (1932), a theorem by Gleason (1957), and a critical
discussion of both plus a later argument by Bell (1966) have to be
considered. Von Neumann, in his famous 1932 book Die mathematischen
Grundlagen der Quantenmechanik, disputed the possibility of
providing QM with an HV underpinning. He gave an argument which boils
down to the following: Consider the mathematical fact that, if A and B are self-adjoint operators, then any linear combination of them (any C = αA + βB, where α, β are arbitrary real numbers) is also a self-adjoint operator. QM further dictates that, for any QM state:
(1) If A and B (represented by self-adjoint operators A and B) are observables on a system, then there also is an observable C (represented by the self-adjoint operator C defined as before) on the same system.
(2) If the expectation values of A and B are given by <A> and <B>, then C's expectation value is given by <C> = α<A> + β<B>.
Now consider A, B, C, as above, and let their values be v(A), v(B), v(C). Consider a hidden state V which determines v(A), v(B), v(C). We can then derive from V trivial expectation values which are just the possessed values themselves: <A>V = v(A), and so on.[1] Of course, these expectation values do not, in general, equal the QM ones: <A>V ≠ <A>. Von Neumann, however, requires that the hidden state conform to (2) just as the QM state does; and since the expectation values derived from V are simply the possessed values, this amounts to a constraint on the values themselves:
(3) v(C) = αv(A) + βv(B).
This, however, is impossible, in general. An example very easily shows how (3) is violated, but because of its simplicity it also shows the argument's inadequacy. (This example is not due to von Neumann himself, but to Bell![2]) Let A = σx and B = σy, the x- and y-spin components of a spin-1/2 particle, and let C = σx + σy. The possible values of σx and σy are ±1, so (3) would require that v(C) = v(σx) + v(σy) be one of -2, 0, +2; the possible values of C, however, are ±√2. So no assignment of possessed values can satisfy (3).
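To make the arithmetic of this counterexample explicit, here is a small numerical check (our sketch, not part of the original article), using the standard Pauli matrices for the spin components:

```python
import numpy as np

# Standard Pauli matrices for a spin-1/2 system
sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_y = np.array([[0, -1j], [1j, 0]], dtype=complex)

# C = A + B is again a self-adjoint operator
C = sigma_x + sigma_y

# The possible values (eigenvalues) of A and B are +/-1 ...
print(np.linalg.eigvalsh(sigma_x))  # [-1.  1.]
print(np.linalg.eigvalsh(sigma_y))  # [-1.  1.]

# ... so (3) would force v(C) = v(A) + v(B) to be -2, 0 or +2,
# yet the possible values of C are +/-sqrt(2):
print(np.linalg.eigvalsh(C))        # approximately [-1.414  1.414]
```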
The example illustrates why von Neumann's argument is unsatisfying. Nobody disputes the move from (2) to (3) for compatible observables, i.e. those which, according to QM, are jointly measurable in one arrangement. The above choice of A, B, C, however, is such that any two of them are incompatible, i.e. are not jointly observable. For these we will not want to require any HV interpretation to meet (3), but only (2). The hidden values need not conform with (3) in general, only the averages of their values in a series of tests must conform with (2). The authority of von Neumann's argument comes from the fact that requirements (1) and (2), for QM states, are consequences of the QM formalism, but this does not in itself justify extending these requirements to the hypothetical hidden states. Indeed, if (3) were unrestrictedly true, this would nicely explain, in the presence of hidden values, why (2) is. Von Neumann apparently thought that the HV proponent is committed to this explanation, but this seems an implausible restriction.
The KS theorem remedies this defect, spotted by Bell in von Neumann's argument, and thus strengthens the case against HV theories: KS assume (3) only for sets of observables {A, B, C} which are all mutually compatible, and this restricted assumption is one the HV theorist cannot reasonably deny.
A second, independent line of thought leading to the KS theorem is
provided by Gleason's theorem (Gleason 1957). The theorem states
that on a Hilbert space of dimension greater than or equal to 3, the
only possible probability measures are the measures μ(Pα) = Tr(PαW), where Pα is a projection operator, W is the statistical operator characterizing the system's actual state and Tr is the trace operation. The Pα can be understood as representing yes-no observables, i.e. questions concerning whether a QM system represented by a Hilbert space of dimension greater than or equal to 3 has a property α or not, and every possible property α is associated uniquely with a vector |α> in the Hilbert space -- so, the task is to unambiguously assign probabilities to all vectors in the space. Now, the QM measure μ is continuous, so Gleason's theorem in effect proves that
every probability assignment to all the possible properties in a
three-dimensional Hilbert space must be continuous, i.e. must
map all vectors in the space continuously into the interval [0,
1]. On the other hand, an HV theory (if characterized by VD + NC)
would imply that of every property we can say whether the system has
it or not. This yields a trivial probability function which maps all
the Pi to either 1 or 0, and, provided that values 1 and 0
both occur (which follows trivially from interpreting the numbers as
probabilities), this function must clearly be discontinuous
(cf. Redhead 1987: 28).
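As a numerical aside (not part of the original entry), the measure singled out by Gleason's theorem can be illustrated for a pure state, where Tr(PαW) reduces to the familiar Born rule |<α|ψ>|²; the state and test vector below are arbitrary choices of ours:

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 3                                   # Hilbert space dimension >= 3

# A random pure state |psi> and its statistical operator W = |psi><psi|
psi = rng.normal(size=dim) + 1j * rng.normal(size=dim)
psi /= np.linalg.norm(psi)
W = np.outer(psi, psi.conj())

# A random unit vector |alpha> and the projector P_alpha = |alpha><alpha|
alpha = rng.normal(size=dim) + 1j * rng.normal(size=dim)
alpha /= np.linalg.norm(alpha)
P_alpha = np.outer(alpha, alpha.conj())

# Gleason-type measure vs. Born rule: Tr(P_alpha W) = |<alpha|psi>|^2
prob_trace = np.trace(P_alpha @ W).real
prob_born = abs(np.vdot(alpha, psi)) ** 2
print(prob_trace, prob_born)              # the two numbers coincide
```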
The foregoing is the easiest argument against the possibility of an HV interpretation afforded by Gleason's theorem. Bell (1966: 6-8)
offers a variant with a particular twist which later is repeated as the
crucial step in the KS theorem. (This explains why some authors (like
Mermin 1990b) call the KS theorem the Bell-Kochen-Specker
theorem; they think that the decisive idea of the KS theorem is due to
Bell.[3])
He proves that the probability constraints dictate that two vectors mapped into 1 and 0, respectively, cannot be arbitrarily close, but must have a minimal angular separation, while the HV mapping, on the other hand, requires that vectors mapped into 1 and 0 come arbitrarily close to each other.
After having offered his variant of the argument against HV theories from Gleason's theorem, Bell proceeds again to criticise it. The strategy parallels the one directed against von Neumann. Bell points out that his own Gleason-type argument against arbitrary closeness of two opposite-valued points presupposes non-trivial relations between values of non-commuting observables, which are only justified, given an assumption of noncontextuality (NC). He proposes as an analysis of what went wrong that his own argument "tacitly assumed that measurement of an observable must yield the same value independently of what other measurements may be made simultaneously" (1966: 9). In opposition to von Neumann, the Gleason-type argument derives restrictions on value assignments like (3) only for sets of compatible observables; but still one and the same observable can be a member of different commuting sets, and it is essential to the arguments that the observable gets assigned the same value in both sets, i.e. that the value assignment is not sensitive to a measurement context.
The KS theorem improves on the argument from Gleason's theorem. First, the authors repeat, in effect, Bell's proof that two vectors in the Hilbert space having values 1 and 0 cannot be arbitrarily close. However, while the Gleason argument and Bell's variant assume value assignments for a continuum of vectors in the Hilbert space, KS are able to explicitly present a discrete, even finite set of observables in the space for which an HV value assignment would lead to inconsistency. Obviously, the assumptions needed for the step of establishing that two opposite-valued points cannot be arbitrarily close are still in play in KS's improvement -- especially NC is! -- so Bell's criticism of his own Gleason-type argument survives that improvement.
Despite Bell's reasoning, the KS argument is of crucial importance in the HV discussions for two reasons: (1) It involves only a finite set of discrete observables. It thus avoids a possible objection to Bell's Gleason-type argument, namely that "it is not meaningful to assume that there are a continuum number of quantum mechanical propositions [viz. experiments]" (Kochen and Specker 1967: 70/307). So the KS theorem closes a loophole which an HV proponent might spot in Bell's argument. (2) KS propose a one-particle system as a physical realization of their argument. Thus, the argument trivially involves no separability or locality assumptions. Indeed, Bell first pointed out the tacit noncontextuality premise, but did so only in passing, and then, in the final section, discussed an example of a two-particle system. Here, a possible contextuality returns as nonseparability of the two particles, but Bell does not state the connection explicitly. Nor does he point out that the issue about the possibility of HV interpretations is, at bottom, not one about (non)separability or (non)locality, but rather one about (non)contextuality.[4] (After all, Bell's own argument against HV interpretations involves separability and/or locality assumptions!) This fact, however, is clearly illustrated by KS-type arguments.
Let H be a Hilbert space of QM state vectors of dimension x ≥ 3. Let M be a set containing y observables, defined by operators on H. Then, for specific values of x and y, the following two assumptions are contradictory:
(KS1) All y members of M simultaneously have values, i.e. are unambiguously mapped onto unique real numbers (designated, for observables A, B, C, ... by v(A), v(B), v(C), ...).
(KS2) Values of observables conform to the following constraints:
(a) If A, B, C are all compatible and C = A+B, then v(C) = v(A)+v(B);
(b) if A, B, C are all compatible and C = A·B, then v(C) = v(A)·v(B).
Assumption KS1 of the theorem obviously is an equivalent of VD. Assumptions KS2 (a) and (b) are called the Sum Rule and the Product Rule, respectively, in the literature. (The reader should again note that, in opposition to von Neumann's implicit premise, these rules non-trivially relate the values of compatible observables only.) Both are consequences of a deeper principle called the functional composition principle (FUNC), which in turn is a consequence of (among other assumptions) NC. The connection between NC, FUNC, Sum Rule and Product Rule will be made explicit in §4.3.
In the original KS proof x=3 and y=117. More recently, proofs involving fewer observables have been given by (among many others) Peres (1991, 1995) for x=3 and y=33 and by Kernaghan (1994) for x=4 and y=20. The KS proof is notoriously complex, and we will only sketch it in §3.4. The Peres proof establishes the KS result in full strength, with great simplicity, and, moreover, in an intuitively accessible way, since it operates in three dimensions; we refer the reader to Peres (1995: 197-99). The Kernaghan proof establishes a contradiction in four dimensions. This is a weaker result, of course, than the KS theorem (since every contradiction in 3 dimensions is also a contradiction in higher dimensions, but not conversely). However, the proof is so much simpler that we present it for starters in §3.2. Finally, in §3.5, we explain an argument by Clifton (1993) where x=3 and y=8, and where an additional statistical assumption yields an easy and instructive KS argument.
(1) From KS2 we can derive a constraint on value assignments to projection operators, namely that for every set of projection operators P1, P2, P3, P4, corresponding to the four distinct eigenvalues q1, q2, q3, q4 of an observable Q on H4, the following holds:
(VC1′) v(P1) + v(P2) + v(P3) + v(P4) = 1, where v(Pi) = 1 or 0, for i = 1, 2, 3, 4.
(2) Although the Hilbert space mentioned in the theorem, in order to
be suited for QM, must be complex, it is enough, in order to
show the inconsistency of claims KS1 and KS2, to consider a
real Hilbert space of the same dimension. So, instead of H4 we
consider a real Hilbert space R4 and translate
VC1′
into the requirement: Within every set of orthogonal rays in R4,
exactly one is assigned the number 1 and the others 0. As usual in
the literature, we translate all this into the following colouring
problem: Within every set of orthogonal rays in R4, exactly one
must be coloured white and the others black. This, however, is
impossible, as is shown immediately by the following table (Kernaghan
1994):
1,0,0,0 | 1,0,0,0 | 1,0,0,0 | 1,0,0,0 | -1,1,1,1 | -1,1,1,1 | 1,-1,1,1 | 1,1,-1,1 | 0,1,-1,0 | 0,0,1,-1 | 1,0,1,0 |
0,1,0,0 | 0,1,0,0 | 0,0,1,0 | 0,0,0,1 | 1,-1,1,1 | 1,1,-1,1 | 1,1,-1,1 | 1,1,1,-1 | 1,0,0,-1 | 1,-1,0,0 | 0,1,0,1 |
0,0,1,0 | 0,0,1,1 | 0,1,0,1 | 0,1,1,0 | 1,1,-1,1 | 1,0,1,0 | 0,1,1,0 | 0,0,1,1 | 1,1,1,1 | 1,1,1,1 | 1,1,-1,-1 |
0,0,0,1 | 0,0,1,-1 | 0,1,0,-1 | 0,1,-1,0 | 1,1,1,-1 | 0,1,0,-1 | 1,0,0,-1 | 1,-1,0,0 | 1,-1,-1,1 | 1,1,-1,-1 | 1,-1,-1,1 |
There are 4 x 11 = 44 entries in this table. These entries are taken
from a set of 20 rays (so we allow for repeats). [Recall that to
specify a ray or line through the origin in four dimensions, it
suffices to give the four coordinates of any single point (apart from
the origin) that the line contains. For example, "1,0,0,0" denotes
the unique line containing the points with coordinates "0,0,0,0" and
"1,0,0,0", which line is, of course, just the "x-axis".] It is easy
to verify that every column in the table represents a set of four
orthogonal rays (simply calculate the dot products between
the vectors within each column --- they are always zero). Since the
number of columns is 11, we must end up with an odd number
of the table's entries coloured white. On the other hand, it can
be checked that each of the 20 rays appears either twice or four
times in the table. So any time we designate a particular one of
those rays as white, we commit ourselves to colouring an even number
of the entries white. It follows that the total number of table
entries coloured white must be even, not odd. Thus, a
colouring of the 20 ray set in accordance with
VC1′
is impossible. (Note for future reference that the first part of
the argument -- the argument for odd -- uses only
VC1′,
while the second -- the argument for even -- relies
essentially on NC, by assuming that occurrences of the same ray in
different columns get assigned the same number!)
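The two counting claims can be checked mechanically. The following short Python script (ours, not part of the original article) encodes the 11 columns of the table, verifies that each column consists of four mutually orthogonal rays, and confirms that each of the 20 distinct rays occurs an even number of times, which is all the parity argument needs:

```python
import itertools

# The 11 columns of Kernaghan's table, each a set of four rays in R^4
columns = [
    [(1,0,0,0), (0,1,0,0), (0,0,1,0), (0,0,0,1)],
    [(1,0,0,0), (0,1,0,0), (0,0,1,1), (0,0,1,-1)],
    [(1,0,0,0), (0,0,1,0), (0,1,0,1), (0,1,0,-1)],
    [(1,0,0,0), (0,0,0,1), (0,1,1,0), (0,1,-1,0)],
    [(-1,1,1,1), (1,-1,1,1), (1,1,-1,1), (1,1,1,-1)],
    [(-1,1,1,1), (1,1,-1,1), (1,0,1,0), (0,1,0,-1)],
    [(1,-1,1,1), (1,1,-1,1), (0,1,1,0), (1,0,0,-1)],
    [(1,1,-1,1), (1,1,1,-1), (0,0,1,1), (1,-1,0,0)],
    [(0,1,-1,0), (1,0,0,-1), (1,1,1,1), (1,-1,-1,1)],
    [(0,0,1,-1), (1,-1,0,0), (1,1,1,1), (1,1,-1,-1)],
    [(1,0,1,0), (0,1,0,1), (1,1,-1,-1), (1,-1,-1,1)],
]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Each column is a set of four mutually orthogonal rays
for col in columns:
    assert all(dot(u, v) == 0 for u, v in itertools.combinations(col, 2))

# Each of the 20 distinct rays occurs an even number of times (2 or 4)
counts = {}
for col in columns:
    for ray in col:
        counts[ray] = counts.get(ray, 0) + 1
assert len(counts) == 20
assert all(n in (2, 4) for n in counts.values())

# Parity clash: one white ray per column would make the number of white
# entries odd (11 columns), yet noncontextuality forces every ray to
# contribute an even number of white entries.
print("orthogonality and multiplicities check out")
```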
We consider an arbitrary operator Q on H3 with three distinct eigenvalues q1, q2, q3, its eigenvectors |q1>, |q2>, |q3>, and projection operators P1, P2, P3 projecting on the rays spanned by these vectors. Now, P1, P2, P3 are themselves observables (namely, Pi is a yes-no observable corresponding to the question Does the system have value qi for Q?). Moreover, P1, P2, P3 are mutually compatible, so we can apply the Sum Rule and Product Rule, and thereby derive a constraint on the assignment of values (Proof):
(VC1) v(P1) + v(P2) + v(P3) = 1, where v(Pi) = 1 or 0, for i = 1, 2, 3.
The arbitrary choice of an observable Q defines new observables P1, P2, P3 which, in turn, select rays in H3. So, to impose that observables P1, P2, P3 all have values means to assign numbers to rays in H3, and VC1, in particular, means that of an arbitrary triple of orthogonal rays, specified by choice of an arbitrary Q (briefly: an orthogonal triple in H3), exactly one of its rays is assigned 1, the others 0. Now, if we introduce different incompatible observables Q′, Q″, ..., they select different orthogonal triples in H3, and an assignment of values in accordance with VC1 must give exactly one ray of each of these triples the value 1 and the other two the value 0.
It should be stressed, however, that at this point there is no direct
connection between R3 and physical space. KS wish to show that for an
arbitrary QM system requiring a representation in a Hilbert space of
at least three dimensions, the ascription of values in conjunction
with condition (KS2) (Sum Rule and Product Rule) is impossible, and
in order to do this it is sufficient to consider the space R3. This
space R3, however, does not represent physical space for the quantum
system at issue. In particular, orthogonality in R3 is not to be
confused with orthogonality in physical space. This becomes obvious,
if we move to an example of a QM system sitting in physical space and
at the same time requiring a QM representation in H3, e.g. a one
particle spin-1 system measured for spin. Given one arbitrary
direction α in physical space and an operator Sα representing the observable of a spin component in direction α, H3 is spanned by the eigenvectors of Sα, namely |Sα=1>, |Sα=0>, |Sα=-1>,
which are mutually orthogonal in H3. The fact that these three
vectors corresponding to three possible results of measurement in one
spatial direction are mutually orthogonal illustrates the different
senses of orthogonality in H3 and in physical space. (The reason
lies, of course, in the structure of QM which represents different
values of an observable by different directions in H3.) Now, if
orthogonality in H3 differs from orthogonality in physical space, and
we just use R3 to prove a result about H3, then certainly
orthogonality in R3 bears no direct connection with physical
space.
KS themselves, in the abstract, proceed in exactly the same way, but they illustrate with an example that does establish a direct connection with physical space. It is important to see this connection, but also to be clear that it is produced by KS's example and is not inherent in their mathematical result. KS propose to consider a one-particle spin 1 system and the measurement of the squared components of orthogonal directions of spin in physical space Sx2, Sy2, Sz2 which are compatible (while Sx, Sy, Sz themselves are not).[5] Measurement of a squared component of spin determines its absolute magnitude, but not its direction. Here, we derive a slightly different constraint on value assignments, again using the Sum Rule and the Product Rule (Proof):
(VC2) v(Sx2) + v(Sy2) + v(Sz2) = 2, where v(Sα2) = 1 or 0, for α = x, y, z.
Now, since Sx2, Sy2, Sz2 are compatible, there is an observable O such that Sx2, Sy2, Sz2 are all functions of O. So, the choice of an arbitrary O fixes Sx2, Sy2, Sz2 and, since these latter can be directly associated with mutually orthogonal rays in H3, again fixes the choice of an orthogonal triple in H3. The resulting problem here is to assign numbers {1, 1, 0} to an orthogonal triple in H3 specified by the choice of O or, more directly, Sx2, Sy2, Sz2. This is, of course, the mirror-image of our previous problem of assigning numbers {1, 0, 0} to such a triple, and we need not consider it separately.
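A brief NumPy check (our sketch, not from the original article) makes these claims about the spin-1 observables concrete: with the standard spin-1 matrices, Sx, Sy, Sz fail to commute, their squares commute pairwise, each square has eigenvalues 0 and 1, and the squares sum to 2·I, which is what underwrites VC2:

```python
import numpy as np

s = 1 / np.sqrt(2)
# Standard spin-1 matrices (units with hbar = 1)
Sx = s * np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=complex)
Sy = s * np.array([[0, -1j, 0], [1j, 0, -1j], [0, 1j, 0]], dtype=complex)
Sz = np.array([[1, 0, 0], [0, 0, 0], [0, 0, -1]], dtype=complex)

def commutes(A, B):
    return np.allclose(A @ B, B @ A)

# The components themselves are mutually incompatible ...
assert not commutes(Sx, Sy)

# ... but their squares are pairwise compatible
Sx2, Sy2, Sz2 = Sx @ Sx, Sy @ Sy, Sz @ Sz
assert commutes(Sx2, Sy2) and commutes(Sy2, Sz2) and commutes(Sx2, Sz2)

# Each square has eigenvalues 0 and 1, and the three squares sum to 2*I,
# so on a joint eigenstate the values must be {1, 1, 0} in some order (VC2)
assert np.allclose(Sx2 + Sy2 + Sz2, 2 * np.eye(3))
print(np.linalg.eigvalsh(Sx2))  # [0. 1. 1.]
```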
However, the choice of a specific O which selects observables
Sx2, Sy2,
Sz2 at the same time selects three
orthogonal rays in physical space, namely by fixing a coordinate
system x, y, z (which defines along which orthogonal rays the squared spin components are to be measured) in physical space. So now, by choice of an observable O, there is a direct connection of directions in physical space with directions in H3: orthogonality in H3 now does
correspond to orthogonality in physical space. The same holds for R3,
if, in order to give an argument for H3, we consider
R3. Orthogonality in R3 now corresponds to orthogonality in physical
space. It is important to notice that this correspondence is not
necessary to give the argument, even if we insist that the pure
mathematical facts should be supplemented by a physical
interpretation - since we have, just before, seen an example without
any correspondence. The point is only that we can devise an
example such that there is a correspondence. In particular, we can
now follow the proof in R3 and all along imagine a system sitting in
physical space, namely a spin 1 particle, returning three values upon
measurement of three physical magnitudes, associated directly with
orthogonal directions in physical space, namely
v(Sx2),
v(Sy2),
v(Sz2), for arbitrary choices
of x, y, z. The KS proof then shows that
it is impossible (given its premises, of course) to assign to the
spin 1 particle values for all these arbitrary choices. That is, the
KS argument shows that (given the premises) a spin 1 particle cannot
possess all the properties at once which it displays in different
measurement arrangements.
Three further features which have become customary in KS arguments need to be mentioned:
(1) Obviously, we can unambiguously specify any ray in R3 through the origin by just giving one point contained in it. KS thus identify rays with points on the unit sphere E. KS do not need to refer to concrete coordinates of a certain point, since their argument is coordinate-free. We will, however, for illustration sometimes mention concrete points and then (a) use Cartesian coordinates to check orthogonality relations and (b) specify rays by points not lying on E. (Thus, e.g., the triple of points (0, 0, 1), (4, 1, 0), (1, -4, 0) is used to specify a triple of orthogonal rays.) Both usages conform with the recent literature (see e.g. Peres (1991) and Clifton (1993)).
(2) We translate the constraints (VC1) and (VC2) on value ascriptions into constraints for colouring the points. We can, operating under (VC1) colour the points white (for "1") and black (for "0"), or, operating under (VC2) colour the points white (for "0") and black (for "1"). In either case the constraints translate into the same colouring problem.
(3) KS illustrate orthogonality relations of rays by graphs which have come to be called KS diagrams. In such a diagram each ray (or point specifying a ray) is represented by a vertex. Vertices joined by a straight line represent orthogonal rays. The colouring problem then translates into the problem of colouring the vertices of the diagram white or black such that joined vertices are never both white and triangles have exactly one white vertex (a brute-force search of this kind is sketched below).
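As an illustration (ours, with hypothetical vertex labels rather than one of KS's actual diagrams), the colouring rules can be checked by brute force for any small diagram given as a list of orthogonal triples:

```python
from itertools import product, combinations

def admissible_colourings(vertices, triples):
    """Return all white/black colourings obeying the KS diagram rules:
    joined (orthogonal) vertices are never both white, and every
    complete triple has exactly one white vertex."""
    edges = {frozenset(pair) for t in triples for pair in combinations(t, 2)}
    colourings = []
    for bits in product((0, 1), repeat=len(vertices)):   # 1 = white, 0 = black
        colour = dict(zip(vertices, bits))
        if any(colour[a] and colour[b] for a, b in (tuple(e) for e in edges)):
            continue                                     # two joined whites
        if all(sum(colour[v] for v in t) == 1 for t in triples):
            colourings.append(colour)
    return colourings

# Toy example (hypothetical labels, not a KS diagram): two orthogonal
# triples sharing the vertex 'c'
vertices = ["a", "b", "c", "d", "e"]
triples = [("a", "b", "c"), ("c", "d", "e")]
for col in admissible_colourings(vertices, triples):
    print(col)
```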
(1) In the first (and decisive) step they show that two rays with opposite colours cannot be arbitrarily close. They show that the diagram Γ1 depicted in Fig. 1, which consists of ten vertices including a0 and a9, is constructible if a0 and a9 are separated by an angle θ with 0 < θ ≤ sin-1(1/3) (Proof).
What this step shows is the following: It is possible to construct this KS diagram, i.e. to specify ten rays in R3 with orthogonality relations as specified in the diagram, but only if rays a0 and a9 are closer than sin-1(1/3). Consider now (for a reductio ad absurdum) that a0 and a9 have different colours. We arbitrarily colour a0 white and a9 black. The colouring constraints then force us to colour the rest of the diagram as is done in Fig. 1, but then a5 and a6, which are orthogonal, both end up white -- which is forbidden. Hence, two points closer than sin-1(1/3) cannot have different colours. Contrapositively, two points of different colour cannot be closer than sin-1(1/3).
Figure 1: Ten-point KS graph Γ1 with inconsistent colouring.
(2) KS now construct another, quite complicated KS diagram Γ2 in the following way. They consider a realization of Γ1 for an angle θ = 18° < sin-1(1/3). Now they choose three orthogonal points p0, q0, r0 and space interlocking copies of Γ1 between them such that every instance of point a9 of one copy of Γ1 is identified with the instance of a0 of the next copy. In this way five interlocking copies of Γ1 are spaced between p0 and q0 and all five instances of a8 are identified with r0 (likewise for q0, r0, and p0, and for r0, p0, and q0). That Γ2 is constructible is borne out directly by the construction itself. Spacing out five copies of Γ1 with angles θ = 18° between instances of a0 will space out an angle of 5 x 18° = 90°, which is exactly what is required. Moreover, wandering from one copy of Γ1 to the next between, say, p0 and q0 is equivalent to a rotation of the copy about the axis through the origin and r0 of 18°, which evidently conserves the orthogonality between the points a0 and a9 of the copy and r0.
However, although Γ2 is constructible, it is not consistently colourable: by step (1), two points separated by 18° < sin-1(1/3) cannot receive different colours, so in each copy of Γ1 the points a0 and a9 must get the same colour. Chaining through the five interlocking copies, p0 and q0 must then receive the same colour, and likewise q0 and r0, and r0 and p0. But p0, q0, r0 are mutually orthogonal, so exactly one of them should be white and the other two black -- a contradiction.
Figure 2: 117-point KS graph Γ2
(From Kochen and Specker 1967, 69; by permission of the Indiana University Mathematics Journal)
If from the 15 copies of Γ1 used in the process of constructing Γ2
we subtract those points that were identified with each other, we
end up with 117 different points. So what KS have shown is that a set
of 117 observables cannot consistently be assigned values in
accordance with VC1 (or, equivalently, VC2).
Note that in the construction of Γ1, i.e. the set of 10 points forming 22 interlocking triples, all points except a9 appear in more than one triple. In Γ2 every
point appears in a multiplicity of triples. It is here that the
noncontextuality premise is crucial to the argument: We assume that
an arbitrary point keeps its value 1 or 0 as we move from one
orthogonal triple to the next (i.e. from one maximal set of
compatible observables to another).
Consider the KS diagram Γ3 depicted in Fig. 3 (Clifton 1993):
Figure 3: 8-point KS-Clifton graph Γ3 with inconsistent colouring.
Applying the colouring constraints to Γ3 yields the following constraint on value assignments:
(VC2′) If, for a spin-1 system, a certain direction x of spin in space is assigned value 0, then any other direction x′ which lies away from x by an angle cos-1(1/3) must be assigned value 1; or, in symbols: If v(Sx)=0, then v(Sx′)=1.
The argument so far has made use of the original KS conditions KS1 and KS2. We now assume, in addition, that any constraint on value assignments will show up in the measurement statistics. In particular: a value assignment dictated by a constraint entails that this assigned value with certainty is the result of any measurement respecting the constraint. Or, in symbols:
(3) If prob[v(A)=a] = 1, and v(A)=a implies v(B)=b, then prob[v(B)=b] = 1.
Despite the use of statistics, this reasoning crucially differs from von Neumann's argument. Von Neumann had argued that algebraic relations between values should transfer into the statistics of the measured values, therefore the QM constraints on these statistics should have value constraints as their exact mirror images - which reasoning leads us to derive value constraints from statistical constraints (for arbitrary observables). Here, on the contrary, we derive a value constraint independently of any statistical reasoning, and then conclude that this constraint should transfer into the measurement statistics.[7]
Now, VC2′ and the statistical condition (3) entail: If prob[v(Sx)=0] = 1, then prob[v(Sx′)=1] = 1. This, however, contradicts the statistics derived from QM for a state where prob[v(Sx)=0] = 1.[8] In fact, there is a probability of 1/17 that v(Sx′)=0. So, in a long-run test 1/17 of the spin-1 particles will violate the constraint.
1/17 may not seem a terribly impressive number, but if we accept Clifton's statistical reasoning, we have an entirely valid KS argument establishing a contradiction between an HV interpretation of QM and the very predictions of QM. Moreover, Clifton presents a slightly more complex set of 13 observables yielding, along the same lines, a statistical contradiction of 1/3.
FUNC: Let A be a self-adjoint operator associated with observable A, let f: R → R be an arbitrary function, such that f(A) is another self-adjoint operator, and let |φ> be an arbitrary state; then f(A) is associated uniquely with an observable f(A) such that:
v(f(A))^|φ> = f(v(A))^|φ>
(We introduced the state superscript above to allow for a possible dependence of values on the particular quantum state the system is prepared in.) The Sum Rule and the Product Rule are straightforward consequences of FUNC (Proof). FUNC itself is not derivable from the formalism of QM, but a statistical version of it (called STAT FUNC) is [Proof]:
STAT FUNC: Given A, f, and |φ> as defined in FUNC, then, for an arbitrary real number b:
prob[v(f(A))=b]^|φ> = prob[f(v(A))=b]^|φ>
But STAT FUNC can not only be derived from the QM formalism; it also follows from FUNC [Proof]. This can be seen as providing "a plausibility argument for FUNC" (Redhead 1987: 132): STAT FUNC is true, as a matter of the mathematics of QM. Now, if FUNC were true, we could derive STAT FUNC, and thus understand part of the QM mathematics as a consequence of FUNC.
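The content of STAT FUNC can be illustrated numerically (our sketch, not from the original article). We take A to be the spin-1 operator Sz, the function f(x) = x², and a randomly chosen state; the probability that the observable f(A) has the value 1, computed from the spectral decomposition of f(A), coincides with the probability that f applied to the value of A yields 1:

```python
import numpy as np

rng = np.random.default_rng(1)

def f(x):
    return x ** 2                          # an arbitrary real function

# Observable A = Sz of a spin-1 system, with eigenvalues 1, 0, -1
A = np.diag([1.0, 0.0, -1.0])
w, v = np.linalg.eigh(A)
fA = v @ np.diag(f(w)) @ v.conj().T        # f(A) via the spectral theorem

# An arbitrary (random) state |phi>
phi = rng.normal(size=3) + 1j * rng.normal(size=3)
phi /= np.linalg.norm(phi)

def prob(op, value, state):
    """QM probability that the observable op has the given value in state."""
    w, v = np.linalg.eigh(op)
    cols = np.isclose(w, value)
    P = v[:, cols] @ v[:, cols].conj().T   # projector onto the eigenspace
    return np.vdot(state, P @ state).real

# STAT FUNC: prob[v(f(A)) = b] = prob[f(v(A)) = b].  For b = 1, f(v(A)) = 1
# exactly when v(A) is +1 or -1:
lhs = prob(fA, 1.0, phi)
rhs = prob(A, 1.0, phi) + prob(A, -1.0, phi)
print(lhs, rhs)                            # the two numbers agree
```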
But how can we derive FUNC itself, if not from STAT FUNC? It is a direct consequence of STAT FUNC and three assumptions (two of which are familiar from the introduction):
Value Realism (VR): If there is an operationally defined real number α, associated with a self-adjoint operator A, such that the statistical algorithm of QM for A yields a real number β with β = prob(v(A)=α), then there exists an observable A with value α.
Value Definiteness (VD): All observables defined for a QM system have definite values at all times.
Noncontextuality (NC): If a QM system possesses a property (value of an observable), then it does so independently of any measurement context.
Some comments on these conditions are in order. First, we need to explain the content of VR. The statistical algorithm of QM tells us how to calculate a probability from a given state, a given observable and its value. Here we understand it as a mere mathematical device without any physical interpretation: Given a Hilbert space vector, an operator and its eigenvalues, the algorithm tells us how to calculate new numbers (which have the properties of probabilities). In addition, by operationally defined we here simply mean made up from a number which we know to denote a real property. So, VR, in effect, says that, if we have a real property, associated with a self-adjoint operator A and distributed probabilistically according to the statistical algorithm of QM for A, i.e. if there exists a real number β with β = prob(v(A)=α), then there exists an observable A with value α.
Secondly, concerning NC: A failure of NC could be understood in two ways. Either, the value of an observable might be context-dependent, although the observable itself is not; or, the value of an observable might be context-dependent, because the observable itself is. There are, however, good grounds to think that both options are equivalent. We will indeed assume that, if NC holds, this means that the observable -- and thereby also its value -- is independent of the measurement context, i.e. is independent of how it is measured. In particular, the independence from context of an observable implies that there is a 1:1 correspondence of observables and operators. This implication of NC is what we will use presently in the derivation of FUNC. Conversely, failure of NC will be construed solely as failure of the 1:1 correspondence.
From VR, VD, NC and STAT FUNC, we can derive FUNC as follows. Consider an arbitrary state of a system and an arbitrary observable Q. By VD, Q possesses a value v(Q)=a. Thus, we can form the number f(v(Q))=b for an arbitrary function f. For this number, by STAT FUNC, prob[f(v(Q))=b] = prob[v(f(Q))=b]. Hence, we have, by transforming probabilities according to STAT FUNC, created a new self-adjoint operator f(Q), and associated it with the two real numbers b and prob[f(v(Q))=b]. Thus, by VR, there is an observable corresponding to f(Q) with value b, hence f(v(Q))=v(f(Q)). By NC, that observable is unique, hence FUNC follows.
The rocks and shoals of modal interpretations are beyond the scope of this article (see the entry on modal interpretations). We just note that it is by no means clear how these interpretations can manage to always pick out the right set of observables assumed to have values. "Right set" here means that the observable actually measured must always be included (in order to avoid the measurement problem) and that the QM statistics must always be recovered. We also mention two important results which cast doubt on the feasibility of modal interpretations: First, it can be shown that either partial value definiteness collapses into total value definiteness (i.e., VD) or classical reasoning about physical properties must be abandoned (Clifton 1995). Second, it is possible to derive a kind of KS theorem even in certain modal interpretations (Bacciagaluppi 1995, Clifton 1996).
Now, in order to formulate VR we had to give a very reduced reading to the statistical algorithm, i.e. that it is a mere mathematical device for calculating numbers from vectors, operators and numbers. (What if we had done otherwise? Well, if we say: Whatever fulfills the statistical algorithm is an observable, we cannot very well suppose that an operator, in order to fulfill the algorithm, must be understood as an observable, since this would make the condition a trivial consequence of the algorithm.) This reading is very artificial and presupposes that a minimal interpretational apparatus required to make physical sense of some operators (like Q) can be withheld for others (like f(Q)).
Moreover, it seems entirely implausible to assume that some operators
- sums and products of operators that are associated with
well-defined observables - are themselves not associated with
well-defined observables, even if they mathematically inherit exact
values from their summands or factors. Put in a crude example, this
would amount to saying that to ask for a system's energy is a
well-defined question, while to ask for the square of the
system's energy is not, even if, from our answer to the first
question and trivial mathematics, we have a well-defined answer at
hand. There seems no good a priori reason to justify this
restriction. So, to make rejecting VR plausible at all, an additional
proposal is made: It is crucial to the KS argument that one and the
same operator is constructed from different maximal ones which are
incompatible: f(Q) is identical to g(P), where PQ − QP ≠ 0. We now assume that only the construction of f(Q) via Q, but not the one via P, leads to a well-defined observable.[11]
This move, however, automatically makes some observables context-sensitive. So, this way of motivating the denial of VR amounts to a kind of contextualism, which we might have had more cheaply by directly rejecting NC, without any tampering with the statistical algorithm. (This fact explains why we did not mention denial of VR as a separate option in the introduction.)
The KS argument has been presented for possessed values of a QM system - independently of considerations about measurement. Indeed, in the argument measurement was mentioned only once and in the negative - in NC. However, since now we consider the rejection of NC, we must also take into account measurement and its complications. An additional manifestation of our innocuous realism (see the introduction above) is a principle of faithful measurement (FM): QM measurement of an observable faithfully delivers the value which that observable had immediately prior to the measurement interaction. FM also is an extremely plausible presupposition of natural science. Moreover, FM entails VD (therefore we could have, using the stronger principle, given a KS argument for possible measurement results). Consider now the motivation, for the HV proponent, to reject NC. Obviously, the aim is to save other presuppositions, especially VD. Now, VD and NC are independent realist convictions, but NC and FM are not quite so independent. Indeed, we will see that rejection of NC entails the rejection of FM in one version of contextuality, and strongly suggests it in the other. (This makes more precise the somewhat cryptical remark from the introduction that it is not obvious what an interpretation endorsing the realist principle VD, but rejecting the realist principle NC, should look like. Such an interpretation would have to violate a third realist principle, i.e. FM.)
If an interpreter wanted to defend causal contextuality, this would entail abandoning FM, at least for observables of the type f(Q) (non-maximal observables): Since their values causally depend on the presence of certain measurement arrangements, these arrangements are causally necessary for the values to come about, thus the values cannot be present before the system-apparatus interaction, and FM is violated. As an advantage of causal contextualism the following might be pointed out. It does not imply that the ontological status of the physical properties involved must change, i.e. does not imply that they become relational. If the property in an object is brought about via interaction with another one, it can still be one which the object has for itself after the interaction. However, the idea of causal contextuality is sometimes discussed critically, since there is reason to think that it may be empirically inadequate (see Shimony 1984, Stairs 1992).
(a) We might think that v(f(Q)) just is not a self-sustained physical property, but one which ontologically depends on the presence of another property v(Q). (Recall that in the proof of FUNC v(f(Q)) is constructed from v(Q).) But, since the position does not reject questions about values of f(Q) in a P-measurement situation as illegitimate (because it does not trade on a notion of an observable being well-defined in one context only!), this seems to lead to new and pressing questions, to say the least. As an attempt to defend a contextualist hidden variables interpretation, this position must concede that not only does the system have, in the Q-measurement situation, a value v(Q), but also, in a P-measurement situation, it has a value v′(Q), although perhaps v′(Q) ≠ v(Q). Now, questions for values of f(Q) in this situation at least are legitimate. Does v′(Q) install another v′(f(Q)) ≠ v(f(Q))? Or does v′(Q), in opposition to v(Q), not lead to a value of f(Q) at all? Neither option seems plausible, for couldn't we, just by switching for a certain prepared system between a P- and a Q-measurement situation, either switch v(f(Q)) in and out of existence or switch between v(f(Q)) and v′(f(Q))?
(b) We might think that, in
order for f(Q) to be well-defined, one measurement arrangement rather
than the other is necessary. The idea is strongly reminiscent of
Bohr's 1935 argument against EPR, and indeed may be viewed as
the appropriate extension of Bohr's views on QM to the modern HV
discussion (see Held 1998, ch.7). In this version of ontological
contextualism the property v(f(Q)), rather than depending on
the presence of another property v(Q), is dependent on the
presence of a Q-measuring apparatus. This amounts to a holistic
position: For some properties it only makes sense to speak of them as
pertaining to the system, if that system is part of a certain
system-apparatus whole. Here, the question for values of f(Q) in a
P-measurement situation does become illegitimate, since
f(Q)'s being well-defined is tied to a Q-measurement situation.
But again reservations apply. Does the position hold that, in
opposition to f(Q), Q itself is well-defined in a P-measurement
situation? If it does not, Q hardly can have a value (since not being
well-defined was the reason to deny f(Q) a value) which means that we
are not considering an HV interpretation any longer, and that there
is no need to block the KS argument, at all. If it does, what
explains that, in the P-measurement situation, Q remains
well-defined, but f(Q) loses this status?
What becomes of FM in both versions of ontological contextualism? Well, if we remain agnostic about how the position could be made plausible, we can save FM, while, if we choose version (a) or (b) to make it plausible, we lose it. Consider first an agnostic denial of NC. FM said that every QM observable is faithfully measured. Now, contextualism splits an operator which can be constructed from two different noncommuting operators into two observables, and ontological contextualism does not try to give us a causal story which would ruin the causal independence of the measured value from the measurement interaction embodied in FM. We simply introduce a more fine-grained conception of observables, but for these new contextual observables we can still impose FM.
However, the concrete versions of ontological contextualism, by attempting to motivate the contextual feature, ruin FM. Version (a) allows f(Q) to switch on and off or to switch between different values upon the change between P- and Q-measurement situations - a flagrant violation of FM. Version (b) fares no better. It introduces an ontological dependence on the measuring arrangement, and it is hard to see what else this could be but the same causal dependence pushed into a higher, ontological key. Again, couldn't we, just by flipping back and forth the measurement arrangement, change back and forth whether f(Q) is well-defined, and thus flip v(f(Q)) in and out of existence?
Finally, we note that both types of ontological contextualism, in opposition to the causal version, do entail that system properties which we earlier thought to be intrinsic, become relational in the sense that a system can only have these properties either if it has certain others, or if it is related to a certain measurement arrangement.
(1) KS themselves describe a concrete experimental arrangement to measure Sx2, Sy2, Sz2 on a one-particle spin 1 system as functions of one maximal observable. An orthohelium atom in the lowest triplet state is placed in a small electric field E of rhombic symmetry. The three observables in question then can be measured as functions of one single observable, the perturbation Hamiltonian Hs. Hs, by the geometry of E, has three distinct possible values, measurement of which reveals which two of the observables Sx2, Sy2, Sz2 have value 1 and which one has value 0 (see Kochen and Specker 1967: 72/311). This is, of course, a proposal to realize an experiment exemplifying our above value constraint (VC2). Could we also realize a (VC1) experiment, i.e. measure a set of commuting projectors projecting on eigenstates of one maximal observable? Peres (1995: 200) answers the question in the affirmative, discusses such an experiment, and refers to Swift and Wright (1980) for details about the technical feasibility. It seems, however, that, despite being possible in principle, no such experiment has actually been carried out (see Cabello and García-Alcaine (1998) for more discussion and another experimental proposal).
(2) In conjunction with manifestations of FUNC, i.e. the Sum Rule and the Product Rule, QM yields constraints like VC1 or VC2 that contradict VD. So providing concrete physical examples that could, given the Sum Rule and the Product Rule, instantiate VC1 or VC2, as just outlined, is not enough. We must ask whether these rules themselves can be empirically supported. There was considerable discussion in the early 80s about this question --- explicitly about whether the Sum Rule is empirically testable --- and there was general agreement that it is not.[12]
The reason is the following: Recall that the derivation of FUNC established uniqueness of the new observable f(Q) only in its final step (via NC). It is this uniqueness which guarantees that one operator represents exactly one observable, such that observables (and thereby their values) in different contexts can be equated. This allows us to establish indirect connections between different incompatible observables. Without this final step, FUNC must be viewed as holding relative to different contexts, the connection is broken, and FUNC is restricted to one set of observables which are all mutually compatible. Then indeed FUNC, the Sum Rule and the Product Rule become trivial, and empirical testing in these cases would be a pointless exercise.[13] It is NC which does all the work and which deserves to be tested, via checking whether for incompatible P, Q such that f(Q)=g(P) it is true that v(f(Q))=v(g(P)). Testing this, however, is impossible, due to the impossibility of simultaneously measuring P and Q.
(3) Very recently, it has been argued that the (physically reasonable) assumption of finitely precise measurement creates a decisive loophole in the KS argument (see Meyer 1999, Kent 1999, Clifton and Kent 1999; briefly MKC).[14] Indeed, if we consider a KS argument for measured values, infinite precision is crucial to the argument in two different ways: (1) It is necessary to the argument that the measured components of one triple (or quadruple) are exactly orthogonal. (2) It is necessary (to install NC) that two measurements intended to pick out the same observable as member of two different maximal sets, pick out exactly the same direction. If we relax this assumption of infinite precision, noncontextual HV models can be constructed. In these models, it is not exactly the sets of observables specified in the KS argument (or related arguments) by points in R3, but sets specified by points with rational components (which approximate the former arbitrarily closely) that are colourable, i.e. that can consistently be assigned noncontextual values. So the argument ultimately trades on the fact that we cannot empirically distinguish between a real point and its rational approximation.
The MKC argument is hotly debated and the question whether it is relevant or even destructive to the KS argument is unsettled, so we shall just record part of the discussion. One quite obvious objection is that the original KS argument works for possessed values, not measured values, so the MKC argument, which turns on the finite precision of measurements, misses the mark. We might not be able to test observables which are exactly orthogonal or exactly alike in different tests, but it would be a strange HV interpretation that asserts that such components do not exist (see Cabello 1999). Of course, such a noncontextual HV proposal would be immune to the KS argument, but it would be forced to either deny that for every one of the continuously many directions in physical space there is an observable, or else deny that there are continuously many directions -- and neither denial seems very attractive.
In addition, the MKC argument is dissatisfying, since it exploits the finite precision of real measurements only in one of the above senses, but presupposes infinite precision in the other. MKC assume, for measured observables, that there is finite precision in the choice of different orthogonal triples, such that we cannot, in general, have exactly the same observable twice, as a member of two different triples. However, MKC still assume infinite precision, i.e. exact orthogonality, within the triple (otherwise the colouring constraints could find no application, at all). It has been claimed that this feature can be exploited to rebut the argument and to re-install contextualism (see Mermin 1999, Appleby 2000).
Finally, it can be shown that quantum probabilities vary continuously as we change directions in R3, so small imperfections of selection of observables that block the argument (but only for measured values!) in the single case will wash out in the long run (see Mermin 1999). This in itself does not constitute an argument, since in the colourable sets of observables in MKC's constructions probabilities also vary (in a sense) continuously.[15] We might, however, exploit Mermin's reasoning in the following way. Reconsider Clifton's set of eight directions (in Figure 3) leading to a colouring constraint for the outermost points which contradicts the QM statistics by a fraction of 1/17. Now, starting from the colourable subset of directions constructed by MKC, we are unable to derive the constraint for the eight points, since these eight points do not lie in that set; i.e., as we move, in the colourable subset, from one mutually orthogonal triple of rays to the next, we never hit upon exactly the same ray again, but only upon one approximating it arbitrarily closely. However, consider the following response. Assume that observables corresponding to the eight directions, though not lying in the colourable subset, exist and, according to the HV premise, all have values. Then we can derive Clifton's constraint for the outermost points. For these outermost points it is irrelevant whether, in a possible empirical test, we hit them exactly, for the Mermin argument says that, even if, in every single imperfect measurement, we only measure points nearby, we will, in the long run, better and better approximate the QM statistics for exactly the points in question - which means that we will better and better approach 1/17, while the HV assumption requires that we will better and better approach 0. (Recall also that this number can be pushed up to 1/3 by choosing a set of 13 directions!)
So, in sum it seems that, as long as we assume that there are continuously many QM observables (corresponding to the continuum of directions in physical space), statistical tests building, e.g., on the Clifton 1993 or the Cabello/García-Alcaine 1998 proposal remain entirely valid as empirical confirmations of the KS theorem. Since these statistical violations of the HV programme come about as contradictions between the joint consequences of QM, VD, VR, and NC on the one hand, and QM and experiment on the other, the experimental data still force upon us the trilemma of giving up either VD or VR or NC. As we have seen, denial of value realism in the end becomes identical to a kind of contextualism, hence we really have only two options: (1) Give up VD, either for all observables forbidden to have values in the orthodox interpretation (thus giving up the HV programme), or for a subset of these observables (as modal interpretations do). (2) Endorse a kind of contextualism. Moreover, as things presently stand, the choice between these two options seems not to be a matter of empirical testing, but one of pure philosophical argument.
Carsten Held <cheld@ruf.uni-freiburg.de>