#### Supplement to Inductive Logic

## The Effect on EQI of Partitioning the Outcome Space More Finely—Including Proof of the Nonnegativity of EQI

Given some experiment or observation (or series of them) *c*, is
there any special advantage to parsing the space of possible outcomes
*O* into more alternatives rather than fewer alternatives?
Couldn’t we do as well at evidentially evaluating hypotheses by
parsing the space of outcomes into just a few alternatives—e.g.,
one possible outcome that \(h_i\) says is very likely and \(h_j\) says
is rather unlikely, one that \(h_i\) says is rather unlikely and
\(h_j\) says is very likely, and perhaps a third outcome on which
\(h_i\) and \(h_j\) pretty much agree? The answer is
“*No!*”. Parsing the space of outcomes into a
larger number of empirically distinct possible outcomes always
provides a better measure of evidential support.

To see this intuitively, suppose some outcome description *o* can
be parsed into two distinct outcome descriptions, \(o_1\) and \(o_2\),
where *o* is equivalent to \((o_1\vee o_2)\), and suppose that
\(h_i\) differs from \(h_j\) much more on the likelihood of \(o_1\)
than on the likelihood of \(o_2\). Then, intuitively, when *o* is
found to be true, whichever of the more precise descriptions, \(o_1\)
or \(o_2\), is true should make a difference as to how strong the
comparative support for the two hypotheses turns out to be. Reporting
whichever of \(o_1\) or \(o_2\) occurs will be more informative than
simply reporting *o*. That is, if the outcome of the experiment
is only described as *o*, relevant information is lost.

It turns out that EQI measures how well possible outcomes can distinguish between hypotheses in a way that reflects the intuition that a finer partition of the possible outcomes is more informative. The numerical value of EQI is always made larger by parsing the outcome space more finely, provided that the likelihoods for outcomes in the finer parsing differ at least a bit from some of the likelihoods for outcomes of the less refined parsing. This is important for our main convergence result because in that theorem we want the average value of EQI for the whole sequence of experiments and observations to be positive, and the larger the better.
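As a quick numeric sketch of this claim (the likelihood values below are invented for illustration), merging two outcomes of a three-outcome partition into their disjunction lowers the computed EQI value, while leaving it non-negative:

```python
from math import log

def eqi(likes_i, likes_j):
    """EQI of one experiment: the sum over outcomes of
    P[o|h_i] * log(P[o|h_i] / P[o|h_j])  (a Kullback-Leibler divergence)."""
    return sum(r * log(r / s) for r, s in zip(likes_i, likes_j))

# Invented likelihoods for a three-outcome (finer) partition.
fine_i = [0.6, 0.3, 0.1]   # P[o_u | h_i.b.c]
fine_j = [0.2, 0.5, 0.3]   # P[o_u | h_j.b.c]

# Coarser partition: replace the first two outcomes by their disjunction,
# whose likelihood is the sum of theirs.
coarse_i = [fine_i[0] + fine_i[1], fine_i[2]]
coarse_j = [fine_j[0] + fine_j[1], fine_j[2]]

# The finer partition yields a strictly larger (and positive) EQI.
assert eqi(fine_i, fine_j) > eqi(coarse_i, coarse_j) > 0
```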

The following **Partition Theorem** implies the
**Nonnegativity of EQI** result as well. It shows that
each \(\EQI[c_k \pmid h_i /h_j \pmid b]\) must be non-negative; and it
will be positive *just in case*, for at least one possible
outcome \(o_{ku}\), \(P[o_{ku} \pmid h_{i}\cdot b\cdot c_{k}] \ne
P[o_{ku} \pmid h_{j}\cdot b\cdot c_{k}]\).

This theorem will also show that \(\EQI[c_k \pmid h_i /h_j \pmid b]\)
generally becomes larger whenever the outcome space is partitioned
more finely. It follows immediately that the average value of EQI for
a sequence of experiments or observations, \(\bEQI[c^n \pmid h_i /h_j
\pmid b]\), averaged over the sequence of observations
*c\(^n\)*, is non-negative, and must be positive if, for even
one of the \(c_k\) that contribute to it, at least one possible
outcome \(o_{ku}\) distinguishes between the two hypotheses by
making \(P[o_{ku} \pmid h_{i}\cdot b\cdot c_{k}] \ne
P[o_{ku} \pmid h_{j}\cdot b\cdot c_{k}]\).

**Partition Theorem**:

For any positive real numbers \(r_1, r_2, s_1, s_2\):

- (1) if \(r_1 /s_1 \gt (r_1 +r_2)/(s_1 +s_2)\), then \[(r_1 +r_2)\times\log\left[\frac{(r_1 +r_2)}{(s_1 +s_2)}\right] \lt r_1\times\log\left[\frac{r_1}{s_1}\right] + r_2\times\log\left[\frac{r_2}{s_2}\right];\]

and

- (2) if \(r_1 /s_1 = (r_1 +r_2)/(s_1 +s_2)\), then \[ r_1\times\log\left[\frac{r_1}{s_1}\right] + r_2\times\log\left[\frac{r_2}{s_2}\right] = (r_1 +r_2)\times\log\left[\frac{(r_1 +r_2)}{(s_1 +s_2)}\right].\]

To prove this theorem first notice that

\[ \begin{align} \frac{r_1}{s_1} = \frac{(r_1 +r_2)}{(s_1 +s_2)} & \ifft r_1 s_1 + r_1 s_2 = s_1 r_1 + s_1 r_2 \\ & \ifft r_1 s_2 = s_1 r_2 \\ & \ifft \frac{r_1}{s_1} = \frac{r_2}{s_2}. \end{align} \]We’ll draw on this little result immediately below. It is clearly relevant to the antecedent of case (2) of the theorem we want to prove.

We establish case (2) first. Suppose the antecedent of case (2) holds. Then, from the little result just proved, we have

\[ \begin{align} r_1 \log\left[\frac{r_1}{s_1}\right] + r_2 \log\left[\frac{r_2}{s_2}\right] & = r_1 \log\left[\frac{(r_1 +r_2)}{(s_1 +s_2)}\right] + r_2 \log\left[\frac{(r_1 +r_2)}{(s_1 +s_2)}\right] \\ & = (r_1 + r_2) \log\left[\frac{(r_1 +r_2)}{(s_1 +s_2)}\right]. \end{align} \]That establishes case (2).

To get case (1), consider the following function of *p*:

\[ f(p) = p \log\left[\frac{p}{u}\right] + (1-p) \log\left[\frac{(1-p)}{v}\right], \]where we only assume that \(u \gt 0\), \(v \gt 0\), and \(0 \lt p \lt 1\).

This function has its minimum value when \(p = u/(u+v)\). (This is
easily verified by setting the derivative of \(f(p)\) with respect to
*p* equal to 0 and solving; and it is easy to verify that the
result is a minimum rather than a maximum.)
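Spelling out the calculus, with \(f(p) = p \log[p/u] + (1-p)\log[(1-p)/v]\) and taking the logarithms to be natural (so the additive constants from differentiating cancel):

\[ f'(p) = \log\left[\frac{p}{u}\right] - \log\left[\frac{(1-p)}{v}\right] = 0 \ifft \frac{p}{u} = \frac{(1-p)}{v} \ifft p = \frac{u}{(u+v)}; \]and since \(f''(p) = 1/p + 1/(1-p) \gt 0\) for \(0 \lt p \lt 1\), this critical point is indeed a minimum.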
At this minimum, where \(p = u/(u+v)\), we have

\[ f(p) = p \log\left[\frac{1}{u+v}\right] + (1-p) \log\left[\frac{1}{u+v}\right] = -\log[u+v]. \]Thus, for all values of *p* other than \(u/(u+v)\), \(f(p) \gt -\log[u+v]\). That is, if \(p \ne u/(u+v)\),

\[ -\log[u+v] \lt p \log\left[\frac{p}{u}\right] + (1-p) \log\left[\frac{(1-p)}{v}\right]. \]Now, let \(p = r_1 /(r_1 +r_2)\), let \(u = s_1 /(r_1 +r_2)\), and let \(v = s_2 /(r_1 +r_2)\). Plugging into the previous formula, and multiplying both sides by \((r_1 +r_2)\), we get:

if

\[ \frac{r_1}{(r_1 +r_2)} \ne \frac{s_1}{(s_1 +s_2)}\](i.e., equivalently, if \(r_1 /s_1 \ne (r_1 +r_2)/(s_1 +s_2)\)),

then

\[\begin{multline} \log\left[\frac{(r_1 +r_2)}{(s_1 +s_2)}\right] \\ \lt \left[\frac{r_1}{(r_1 +r_2)}\right] \log\left[\frac{r_1}{s_1}\right] + \left(1-\left[\frac{r_1}{(r_1 +r_2)}\right]\right) \log\left[\frac{r_2}{s_2}\right] \end{multline}\](i.e., equivalently, \( (r_1 +r_2) \log\left[\frac{(r_1 +r_2)}{(s_1 +s_2)}\right] \lt r_1 \log\left[\frac{r_1}{s_1}\right] + r_2 \log\left[\frac{r_2}{s_2}\right]. \))

Thus, from the two equivalents, we’ve proved case (1) (indeed, in slightly stronger form, since the strict inequality holds whenever the two ratios merely differ):

if

\[ \frac{r_1}{s_1} \ne \frac{(r_1 +r_2)}{(s_1 +s_2)},\]then

\[ (r_1 +r_2) \log\left[\frac{(r_1 +r_2)}{(s_1 +s_2)}\right] \lt r_1 \log\left[\frac{r_1}{s_1}\right] + r_2 \log\left[\frac{r_2}{s_2}\right]. \]This completes the proof of the theorem.
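The theorem lends itself to a quick numeric sanity check. The following Python sketch (with invented numbers) exhibits a strict-inequality instance, where \(r_1/s_1 \ne (r_1+r_2)/(s_1+s_2)\), and an equality instance for case (2), where \(r_1/s_1 = r_2/s_2\):

```python
from math import log

def merged(r1, r2, s1, s2):
    """Left-hand side: the contribution of the merged outcome (o1 v o2)."""
    return (r1 + r2) * log((r1 + r2) / (s1 + s2))

def split(r1, r2, s1, s2):
    """Right-hand side: the contributions of the two refined outcomes."""
    return r1 * log(r1 / s1) + r2 * log(r2 / s2)

# Ratios differ (0.6/0.2 = 3 vs 0.9/0.7), so merging strictly loses value.
assert merged(0.6, 0.3, 0.2, 0.5) < split(0.6, 0.3, 0.2, 0.5)

# Case (2): r1/s1 = r2/s2 (= 2/3), so splitting adds nothing.
assert abs(merged(0.4, 0.2, 0.6, 0.3) - split(0.4, 0.2, 0.6, 0.3)) < 1e-12
```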

To apply this result to \(\EQI[c_k \pmid h_i /h_j \pmid b]\) recall that

\[ \begin{multline} \EQI[c_k \pmid h_i /h_j \pmid b] = \sum_{\{u: P[o_{ku} \pmid h_{j}\cdot b\cdot c_{k}] \gt 0\}} \log\left[\frac{P[o_{ku} \pmid h_{i}\cdot b\cdot c_k]}{P[o_{ku} \pmid h_j\cdot b\cdot c_{k}]}\right]\\ \times P[o_{ku} \pmid h_{i}\cdot b\cdot c_{k}]. \end{multline} \]
Suppose \(c_k\) has *m* alternative outcomes \(o_{ku}\) on which
both \(P[o_{ku} \pmid h_{i}\cdot b\cdot c_{k}] \gt 0\) and
\(P[o_{ku} \pmid h_{j}\cdot b\cdot c_{k}] \gt 0\).

Let’s label their likelihoods relative to \(h_i\) (i.e., their likelihoods \(P[o_{ku} \pmid h_{i}\cdot b\cdot c_{k}]\)) as \(r_1\), \(r_2\), …, \(r_{m}\). And let’s label their likelihoods relative to \(h_j\) as \(s_1\), \(s_2\), …, \(s_m\). In terms of this notation,

\[ \EQI[c_k \pmid h_i /h_j \pmid b] = \sum^{m}_{u = 1} r_u\times\log \left[\frac{r_u}{s_u}\right]. \]Notice also that

\[(r_1 +r_2 +r_3 +\ldots +r_m) = 1\]and

\[(s_1 +s_2 +s_3 +\ldots +s_m) = 1.\]Now, think of \(\EQI[c_k \pmid h_i /h_j \pmid b]\) as generated by applying the theorem in successive steps:

\[ \begin{align} 0 & = 1\times \log\left[\frac{1}{1}\right] \\ & = (r_1 +r_2 +r_3 +\ldots +r_{m})\times\log \left[\frac{(r_1 +r_2 +r_3 +\ldots +r_{m})}{(s_1 +s_2 +s_3 +\ldots +s_{m})}\right] \\ & \le r_1\times\log \left[\frac{r_1}{s_1}\right] + (r_2 +r_3 +\ldots +r_{m})\times \log\left[\frac{(r_2 +r_3 +\ldots +r_{m})}{(s_2 +s_3 +\ldots +s_{m})}\right] \\ & \le r_1\times\log \left[\frac{r_1}{s_1}\right] + r_2\times\log \left[\frac{r_2}{s_2}\right] \\ &\qquad + (r_3 +\ldots +r_{m}) \times\log \left[\frac{(r_3 +\ldots +r_{m})}{(s_3 +\ldots +s_{m})}\right] \\ & \le \ldots \\ & \le \sum^{m}_{u = 1} r_u\times\log \left[\frac{r_u}{s_u}\right] \\ & = \EQI[c_k \pmid h_i /h_j \pmid b]. \end{align} \]
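This successive-step chain can also be checked numerically. In the following Python sketch (with invented likelihood values, each vector summing to 1), peeling one outcome at a time off the remaining disjunction yields a non-decreasing sequence of values, starting at \(1\times\log[1/1] = 0\) and ending at the full EQI sum:

```python
from math import log

def term(r, s):
    """One outcome's contribution: r * log(r/s)."""
    return r * log(r / s)

# Invented likelihood vectors; each sums to 1.
rs = [0.5, 0.3, 0.15, 0.05]   # P[o_ku | h_i.b.c_k]
ss = [0.2, 0.4, 0.1, 0.3]     # P[o_ku | h_j.b.c_k]

# Peel one outcome at a time off the remaining disjunction, as in the
# chain above.  steps[0] is 1*log(1/1) = 0; steps[-1] is the full EQI sum.
steps = []
for k in range(len(rs)):
    head = sum(term(r, s) for r, s in zip(rs[:k], ss[:k]))
    tail = term(sum(rs[k:]), sum(ss[k:]))
    steps.append(head + tail)

assert steps == sorted(steps)   # each refinement step only adds value
assert abs(steps[0]) < 1e-9     # starts at (essentially) zero
assert steps[-1] > 0            # ends at a positive EQI
```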
The theorem also says that *at each step* equality holds just
in case

\[ \frac{r_u}{s_u} = \frac{(r_{u}+r_{u+1}+\ldots +r_{m})}{(s_{u}+s_{u+1}+\ldots +s_{m})}, \]which itself holds just in case

\[ \frac{r_u}{s_u} = \frac{(r_{u+1}+\ldots +r_{m})}{(s_{u+1}+\ldots +s_{m})}. \]So,

\[ \EQI[c_k \pmid h_i /h_j \pmid b] = 0 \]just in case

\[ \begin{align} 1 &= \frac{(r_1 +r_2 +r_3 +\ldots +r_{m})}{(s_1 +s_2 +s_3 +\ldots +s_{m})} \\ &= \frac{r_1}{s_1} \\ &= \frac{(r_2 +r_3 +\ldots +r_{m})}{(s_2 +s_3 +\ldots +s_m)} \\ &= \frac{r_2}{s_2} \\ &= \frac{(r_3 +\ldots +r_{m})}{(s_3 +\ldots +s_{m})} \\ &= r_3 /s_3 \\ &= \ldots \\ &= \frac{r_{m}}{s_{m}}. \\ \end{align} \]That is,

\[ \EQI[c_k \pmid h_i /h_{j} \pmid b] = 0 \]just in case, for all \(o_{ku}\) such that \(P[o_{ku} \pmid h_j\cdot b\cdot c_k] \gt 0\) and \(P[o_{ku} \pmid h_i\cdot b\cdot c_k] \gt 0\),

\[ \frac{P[o_{ku} \pmid h_{i}\cdot b\cdot c_{k}]}{P[o_{ku} \pmid h_j\cdot b\cdot c_k]} = 1. \]Otherwise,

\[ \EQI[c_k \pmid h_i /h_j \pmid b] \gt 0; \]and for each successive step in partitioning the outcome space to generate \(\EQI[c_k \pmid h_i /h_j \pmid b]\), if

\[ r_u /s_u \ne \frac{(r_{u}+r_{u+1}+\ldots +r_{m})}{(s_u +s_{u+1}+\ldots +s_{m})}, \]we have the strict inequality:

\[ \begin{multline} (r_{u}+r_{u+1}+\ldots +r_{m}) \times \log\left[\frac{(r_{u}+r_{u+1}+\ldots +r_{m})}{(s_u +s_{u+1}+\ldots +s_{m})}\right] \\ \lt r_u\times\log \left[\frac{r_u}{s_u}\right] + (r_{u+1}+\ldots +r_{m})\times\log \left[\frac{(r_{u+1}+\ldots +r_{m})}{(s_{u+1}+\ldots +s_{m})}\right]. \end{multline} \]
So each such division of \((o_{ku}\vee o_{ku+1}\vee \ldots \vee
o_{km})\) into two separate statements, \(o_{ku}\) and \((o_{ku+1}\vee
\ldots \vee o_{km})\), adds a strictly positive contribution to the
size of \(\EQI[c_k \pmid h_i /h_j \pmid b]\) *just when*

\[ \frac{r_u}{s_u} \ne \frac{(r_{u}+r_{u+1}+\ldots +r_{m})}{(s_u +s_{u+1}+\ldots +s_{m})}. \]