Supplement to Inductive Logic
Likelihood Ratios, Likelihoodism, and the Law of Likelihood
The versions of Bayes’ Theorem provided by Equations 9–11 show that for probabilistic inductive logic the influence of empirical evidence (of the kind for which hypotheses express likelihoods) is completely captured by the ratios of likelihoods,
\[\frac{P[e^n \pmid h_{j}\cdot b\cdot c^{n}]}{P[e^n \pmid h_{i}\cdot b\cdot c^{n}]}.\]The evidence \((c^{n}\cdot e^{n})\) influences the posterior probabilities in no other way. So, the following “Law” is a consequence of the logic of probabilistic support functions.
General Law of Likelihood:
Given any pair of incompatible hypotheses \(h_i\) and \(h_j\),
whenever the likelihoods \(P_{\alpha}[e^n \pmid h_{j}\cdot b\cdot
c^{n}]\) and \(P_{\alpha}[e^n \pmid h_{i} \cdot b\cdot c^{n}]\) are
defined, the evidence \((c^{n}\cdot e^{n})\) supports \(h_i\) over
\(h_j\), given b, if and only if
\[P_{\alpha}[e^n \pmid h_{i}\cdot b\cdot c^{n}] > P_{\alpha}[e^n \pmid h_{j}\cdot b\cdot c^{n}].\]
The ratio of likelihoods
\[\frac{P_{\alpha}[e^n \pmid h_{i}\cdot b\cdot c^{n}]}{P_{\alpha}[e^n \pmid h_{j}\cdot b\cdot c^{n}]}\]measures the strength of the evidence for \(h_i\) over \(h_j\) given b.
Two features of this law require some explanation. As stated, the General Law of Likelihood does not presuppose that likelihoods of form \(P_{\alpha}[e^n \pmid h_{j}\cdot b\cdot c^{n}]\) and \(P_{\alpha}[e^n \pmid h_{i}\cdot b\cdot c^{n}]\) are always defined. This qualification is introduced to accommodate a conception of evidential support called Likelihoodism, which is especially influential among statisticians. Also, the likelihoods in the law are expressed with the subscript \(\alpha\) attached to indicate that the law holds for each inductive support function \(P_{\alpha}\), even when the values of the likelihoods are not objective or agreed on by all agents in a given scientific community. These two features of the law are closely related, as we will see.
Each probabilistic support function satisfies the axioms in Section 2. According to these axioms the conditional probability of one sentence on another is always defined. So, in the context of the inductive logic of support functions the likelihoods are always defined, and the qualifying clause about this in the General Law of Likelihood is automatically satisfied. For inductive support functions, all of the versions of Bayes’ theorem (Equations 8–11) continue to hold even when the likelihoods are not objective or intersubjectively agreed on by the scientific community. In many scientific contexts there will be general agreement on the values of likelihoods; but whenever such agreement fails the subscripts \(\alpha , \beta\), etc. must remain attached to the support function likelihoods to indicate this. Even so, the General Law of Likelihood continues to hold for each support function.
There is a view, or family of views, called likelihoodism that maintains that the inductive logician or statistician should only be concerned with whether the evidence provides increased or decreased support for one hypothesis over another, and only in cases where this evaluation is based on the ratios of completely objective likelihoods. (Prominent likelihoodists include Edwards (1972) and Royall (1997); also see Forster & Sober (2004) and Fitelson (2007).) When the likelihoods involved are objective, the ratios
\[\frac{P[e^n \pmid h_{j}\cdot b\cdot c^{n}]}{P[e^n \pmid h_{i}\cdot b\cdot c^{n}]}\]provide a pure, objective measure of how strongly the evidence supports \(h_i\) as compared to \(h_j\), a measure that is “untainted” by prior plausibility considerations. According to likelihoodists, only this kind of pure measure is scientifically appropriate for the assessment of how evidence impacts hypotheses. (It should be noted that the classical school of statistics, associated with R.A. Fisher (1922) and with Neyman and Pearson (1967), rejects both the Bayesian logic of evidential support and the claim about the nature of evidential support expressed by the General Law of Likelihood.)
Likelihoodists maintain that it is not appropriate for statisticians to incorporate assumptions about prior probabilities of hypotheses into the assessment of evidential support. It is not their place to compute recommended values of posterior probabilities for the scientific community. When the results of experiments are made public, say in scientific journals, only objective likelihoods should be reported. The evaluation of the impact of objective likelihoods on agents’ posterior probabilities depends on each agent’s individual subjective prior probability, which represents plausibility considerations that have nothing to do with the evidence. So, likelihoodists suggest, posterior probabilities should be left for individuals to compute, if they desire to do so.
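The division of labor described here can be illustrated with the ratio form of Bayes' theorem, on which the posterior odds of one hypothesis against another equal the likelihood ratio multiplied by the prior odds. The following Python sketch is only an illustration. The function name `posterior_odds`, the reported likelihood ratio of 8, and the three agents' prior odds are hypothetical values chosen for the example, not results from any experiment.

```python
def posterior_odds(prior_odds, likelihood_ratio):
    """Ratio form of Bayes' theorem: odds of h_i against h_j after the evidence."""
    return likelihood_ratio * prior_odds


# Hypothetical published likelihood ratio P[e | h_i.b.c] / P[e | h_j.b.c].
reported_ratio = 8.0

# Hypothetical prior odds of h_i against h_j for three different agents.
prior_odds_by_agent = {"skeptic": 0.05, "neutral": 1.0, "enthusiast": 4.0}

# Each agent combines the same reported ratio with a private prior.
posterior_by_agent = {
    agent: posterior_odds(prior, reported_ratio)
    for agent, prior in prior_odds_by_agent.items()
}

for agent, odds in posterior_by_agent.items():
    print(f"{agent}: posterior odds {odds:.2f}")
```

The same reported ratio yields different posterior odds for each agent, which is why, on the likelihoodist view, only the ratio itself belongs in the published report.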
The conditional probabilities between most pairs of sentences fail to be objectively defined in a way that suits likelihoodists. So, for likelihoodists, the general logic of evidential support functions (captured by the axioms in Section 2 and the forms of Bayes’ theorem discussed above) cannot represent an objective logic of evidential support for hypotheses. Because they eschew the logic of support functions, likelihoodists do not have Bayes’ theorem available, and so cannot derive the Law of Likelihood from it. Rather, they must state the Law of Likelihood as an axiom of their particular version of inductive logic, an axiom that applies only when the likelihoods have well-defined objective values.
Likelihoodists tend to have a very strict conception of what it takes for likelihoods to be well-defined. They consider a likelihood to be well-defined only when it is (what we referred to earlier as) a direct inference likelihood, i.e., only when either (1) the hypothesis (together with background and experimental conditions) logically entails the data, or (2) the hypothesis (together with background) logically entails an explicit simple statistical hypothesis that (together with experimental conditions) specifies precise statistical probabilities for each of the events that make up the evidence.
Likelihoodists contrast simple statistical hypotheses with composite statistical hypotheses, which only entail vague, or imprecise, or directional claims about the statistical probabilities of evidential events. Whereas a simple statistical hypothesis might say, for example, “the chance of heads on tosses of the coin is precisely .65”, a composite statistical hypothesis might say, “the chance of heads on tosses is either .65 or .75”, or it may be a directional hypothesis that says, “the chance of heads on tosses is greater than .65”. Likelihoodists maintain that composite hypotheses are not an appropriate basis for well-defined likelihoods. Such hypotheses represent a kind of disjunction of simple statistical hypotheses. The directional hypothesis, for instance, is essentially a disjunction of the various simple statistical hypotheses that assign specific values above .65 to the chances of heads on tosses. Likelihoods based on such hypotheses are not appropriately objective by the lights of the likelihoodist because they must in effect depend on factors that represent the degree to which the composite hypothesis supports each of the simple statistical hypotheses that it encompasses; and likelihoodists consider such factors too subjective to be permitted in a logic that should permit only objective likelihoods.[24]
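The dependence of composite likelihoods on such weighting factors can be made concrete with a small Python sketch. It takes the composite hypothesis “the chance of heads is either .65 or .75” and computes its likelihood as a weighted average of the likelihoods of its two simple components. The evidence (13 heads in 20 tosses), the two weightings, and the function names are assumptions introduced purely for illustration, and the tosses are assumed to be independent.

```python
from math import comb


def binomial_likelihood(p, heads, tosses):
    """Likelihood of `heads` heads in `tosses` independent tosses when the chance is p."""
    return comb(tosses, heads) * p**heads * (1 - p) ** (tosses - heads)


def composite_likelihood(weights, heads, tosses):
    """Weighted average over simple hypotheses; `weights` maps each chance p to a weight.

    The weights are assumed to be non-negative and to sum to 1.
    """
    return sum(w * binomial_likelihood(p, heads, tosses) for p, w in weights.items())


heads, tosses = 13, 20  # hypothetical evidence

# Two hypothetical ways of weighting the components .65 and .75.
even = composite_likelihood({0.65: 0.5, 0.75: 0.5}, heads, tosses)
skewed = composite_likelihood({0.65: 0.9, 0.75: 0.1}, heads, tosses)

print(f"even weights:   {even:.4f}")
print(f"skewed weights: {skewed:.4f}")
```

The two weightings give different likelihoods for the very same evidence, so the composite hypothesis by itself does not fix a single likelihood value.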
Taking all of this into account, the version of the Law of Likelihood appropriate to likelihoodists may be stated as follows.
Special Law of Likelihood:
Given a pair of incompatible hypotheses \(h_i\) and \(h_j\) that imply
simple statistical models regarding outcomes \(e^n\) given \((b\cdot
c^n)\), the likelihoods \(P[e^n \pmid h_{j}\cdot b\cdot c^{n}]\) and
\(P[e^n \pmid h_{i}\cdot b\cdot c^{n}]\) are well defined. For such
likelihoods, the evidence \((c^{n}\cdot e^{n})\) supports \(h_i\) over
\(h_j\), given b, if and only if
\[P[e^n \pmid h_{i}\cdot b\cdot c^{n}] > P[e^n \pmid h_{j}\cdot b\cdot c^{n}].\]
The ratio of likelihoods
\[\frac{P[e^n \pmid h_{i}\cdot b\cdot c^{n}]}{P[e^n \pmid h_{j}\cdot b\cdot c^{n}]} \]measures the strength of the evidence for \(h_i\) over \(h_j\) given b.
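As a minimal sketch of how the Special Law of Likelihood applies, consider two simple statistical hypotheses about a coin: \(h_i\) says the chance of heads is .65 and \(h_j\) says it is .5. The Python code below computes the likelihood ratio for a hypothetical outcome of 14 heads in 20 independent tosses. The outcome, the choice of .5 for \(h_j\), and the function names are assumptions made for the example.

```python
from math import comb


def binomial_likelihood(p, heads, tosses):
    """P[e | h.b.c] when h says the chance of heads on each independent toss is p."""
    return comb(tosses, heads) * p**heads * (1 - p) ** (tosses - heads)


def likelihood_ratio(p_i, p_j, heads, tosses):
    """Ratio of the likelihood on h_i (chance p_i) to the likelihood on h_j (chance p_j)."""
    return binomial_likelihood(p_i, heads, tosses) / binomial_likelihood(p_j, heads, tosses)


# Hypothetical evidence: 14 heads in 20 tosses.
ratio = likelihood_ratio(0.65, 0.5, heads=14, tosses=20)

if ratio > 1:
    print(f"Evidence supports h_i over h_j; ratio = {ratio:.3f}")
else:
    print(f"Evidence does not support h_i over h_j; ratio = {ratio:.3f}")
```

Because both hypotheses are simple, each likelihood is fixed by the hypothesis and the experimental conditions alone, with no appeal to prior probabilities.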
Notice that when either version of the Law of Likelihood holds, the absolute size of a likelihood is irrelevant to the strength of the evidence. All that matters is the relative size of the likelihoods for one hypothesis as compared to another. That is, let \(c_1\) and \(c_2\) be the conditions for two distinct experiments having outcomes \(e_1\) and \(e_2\), respectively. Suppose that \(e_1\) is 1000 times more likely on \(h_i\) (given \(b\cdot c_1)\) than is \(e_2\) on \(h_i\) (given \(b\cdot c_2)\); and suppose that \(e_1\) is also 1000 times more likely on \(h_j\) (given \(b\cdot c_1)\) than is \(e_2\) on \(h_j\) (given \(b\cdot c_2)\)—i.e., suppose that
\[P_{\alpha}[e_1 \pmid h_i\cdot b\cdot c_1] = 1000 \times P_{\alpha}[e_2 \pmid h_i\cdot b\cdot c_2],\]and
\[P_{\alpha}[e_1 \pmid h_j\cdot b\cdot c_1] = 1000 \times P_{\alpha}[e_2 \pmid h_j\cdot b\cdot c_2].\]Which piece of evidence, \((c_1\cdot e_1)\) or \((c_2\cdot e_2)\), is stronger evidence with regard to the comparison of \(h_i\) to \(h_j\)? The Law of Likelihood implies both are equally strong. All that matters evidentially are the ratios of the likelihoods, and they are the same:
\[ \frac{P_{\alpha}[e_1 \pmid h_i\cdot b\cdot c_1]}{P_{\alpha}[e_1 \pmid h_j\cdot b\cdot c_1]} = \frac{P_{\alpha}[e_2 \pmid h_i\cdot b\cdot c_2]} {P_{\alpha}[e_2 \pmid h_j\cdot b\cdot c_2]}.\]Thus, the General Law of Likelihood implies the following principle.
General Likelihood Principle:
Suppose two different experiments or observations (or two sequences of
them) \(c_1\) and \(c_2\) produce outcomes \(e_1\) and \(e_2\),
respectively. Let \(\{ h_1, h_2 , \ldots \}\) be any set of
alternative hypotheses. If there is a constant K such that for
each hypothesis \(h_j\) from the set,
\[P_{\alpha}[e_1 \pmid h_j\cdot b\cdot c_1] = K \times P_{\alpha}[e_2 \pmid h_j\cdot b\cdot c_2],\]
then the evidential import of \((c_1\cdot e_1)\) for distinguishing among hypotheses in the set (given b) is precisely the same as the evidential import of \((c_2\cdot e_2)\).
Similarly, the Special Law of Likelihood implies a corresponding Special Likelihood Principle that applies only to hypotheses that express simple statistical models.[25]
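A standard illustration of the Likelihood Principle, not drawn from the discussion above, compares two ways of tossing a coin. In one experiment the number of tosses is fixed in advance and the heads are counted; in the other the coin is tossed until a fixed number of heads has appeared and the tosses are counted. The Python sketch below assumes independent tosses and a hypothetical record of 14 heads in 20 tosses. The function names and the three candidate chances are introduced only for the example.

```python
from math import comb, isclose


def fixed_tosses_likelihood(p, heads, tosses):
    """Experiment c_1: toss exactly `tosses` times; outcome e_1: `heads` heads."""
    return comb(tosses, heads) * p**heads * (1 - p) ** (tosses - heads)


def fixed_heads_likelihood(p, heads, tosses):
    """Experiment c_2: toss until `heads` heads appear; outcome e_2: it took `tosses` tosses."""
    return comb(tosses - 1, heads - 1) * p**heads * (1 - p) ** (tosses - heads)


hypotheses = [0.5, 0.65, 0.75]  # hypothetical simple hypotheses about the chance of heads
heads, tosses = 14, 20          # hypothetical record shared by both experiments

# For each hypothesis, the factor K relating the two likelihoods.
constants = [
    fixed_tosses_likelihood(p, heads, tosses) / fixed_heads_likelihood(p, heads, tosses)
    for p in hypotheses
]

# K is the same for every hypothesis, so all likelihood ratios agree.
same_constant = all(isclose(k, constants[0]) for k in constants)
print(f"K values: {constants}; constant across hypotheses: {same_constant}")
```

Since the factor relating the two likelihoods does not vary with the hypothesis, it cancels in every likelihood ratio, and the two experiments carry the same evidential import for distinguishing among these hypotheses.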
Throughout the remainder of this article we will not assume that likelihoods must be based on simple statistical hypotheses, as likelihoodists would have them. However, most of what will be said about likelihoods, especially the convergence result in Section 4, applies to likelihoodist likelihoods as well. We will, however, continue to suppose that likelihoods are objective in the sense that all members of the scientific community agree on their numerical values. In Section 5 we will see how even this supposition may be relaxed in scientific contexts where completely objective values for likelihoods are not realistically available.