The Legal Concept of Evidence

First published Fri Nov 13, 2015; substantive revision Fri Oct 8, 2021

The legal concept of evidence is neither static nor universal. Medieval understandings of evidence in the age of trial by ordeal would be quite alien to modern sensibilities (Ho 2003–2004) and there is no approach to evidence and proof that is shared by all legal systems of the world today. Even within Western legal traditions, there are significant differences between Anglo-American law and Continental European law (see Damaška 1973, 1975, 1992, 1994, 1997). This entry focuses on the modern concept of evidence that operates in the legal tradition to which Anglo-American law belongs.^[1] It concentrates on evidence in relation to the proof of factual claims in law.^[2]

It may seem obvious that there must be a legal concept of evidence that is distinguishable from the ordinary concept of evidence. After all, there are in law many special rules on what can or cannot be introduced as evidence in court, on how evidence is to be presented and the uses to which it may be put, on the strength or sufficiency of evidence needed to establish proof and so forth. But the law remains silent on some crucial matters. In resolving the factual disputes before the court, the jury or, at a bench trial, the judge has to rely on extra-legal principles. There have been academic attempts at systematic analysis of the operation of these principles in legal fact-finding (Wigmore 1937; Anderson, Schum, and Twining 2009). These principles, so it is claimed, are of a general nature. On the basis that the logic in “drawing inferences from evidence to test hypotheses and justify conclusions” is governed by the same principles across different disciplines (Twining and Hampsher-Monk 2003: 4), ambitious projects have been undertaken to develop a cross-disciplinary framework for the analysis of evidence (Schum 1994) and to construct an interdisciplinary “integrated science of evidence” (Dawid, Twining, and Vasilaki 2011; cf. Tillers 2008).

While evidential reasoning in law and in other contexts may share certain characteristics, there nevertheless remain aspects of the approach to evidence and proof that are distinctive to law (Rescher and Joynt 1959). Section 1 (“conceptions of evidence”) identifies different meanings of evidence in legal discourse. When lawyers talk about evidence, what is it that they are referring to? What is it that they have in mind? Section 2 (“conditions for receiving evidence”) approaches the concept of legal evidence from the angle of what counts as evidence in law. What are the conditions that the law imposes and must be met for something to be received by the court as evidence? Section 3 (“strength of evidence”) shifts the attention to the stage where the evidence has already been received by the court. Here the focus is on how the court weighs the evidence in reaching the verdict. In this connection, three properties of evidence will be discussed: probative value, sufficiency, and degree of completeness.

1. Conceptions of Evidence: What does Evidence Refer to in Law?
2. Conditions for Receiving Evidence: What Counts as Evidence in Law?
3. Strength of Evidence
Bibliography
Academic Tools
Other Internet Resources
Related Entries

1. Conceptions of Evidence: What does Evidence Refer to in Law?

Stephen (1872: 3–4, 6–7) long ago noted that legal usage of the term “evidence” is ambiguous. It sometimes refers to that which is adduced by a party at the trial as a means of establishing factual claims. (“Adducing evidence” is the legal term for presenting or producing evidence in court for the purpose of establishing proof.) This meaning of evidence is reflected in the definitional section of the Indian Evidence Act (Stephen 1872: 149).^[3] When lawyers use the term “evidence” in this way, they have in mind what epistemologists would think of as “objects of sensory evidence” (Haack 2004: 48). Evidence, in this sense, is divided conventionally into three main categories:^[4] oral evidence (the testimony given in court by witnesses), documentary evidence (documents produced for inspection by the court), and “real evidence”; the first two are self-explanatory and the third captures things other than documents such as a knife allegedly used in committing a crime.

The term “evidence” can, secondly, refer to a proposition of fact that is established by evidence in the first sense.^[5] This is sometimes called an “evidential fact”. That the accused was at or about the scene of the crime at the relevant time is evidence in the second sense of his possible involvement in the crime. But the accused’s presence must be proved by producing evidence in the first sense. For instance, the prosecution may call a witness to appear before the court and get him to testify that he saw the accused in the vicinity of the crime at the relevant time. Success in proving the presence of the accused (the evidential fact) will depend on the fact-finder’s assessment of the veracity of the witness and the reliability of his testimony. (The fact-finder is the person or body responsible for ascertaining where the truth lies on disputed questions of fact and in whom the power to decide on the verdict vests. The fact-finder is also called “trier of fact” or “judge of fact”. Fact-finding is the task of the jury or, for certain types of cases and in countries without a jury system, the judge.) Sometimes the evidential fact is directly accessible to the fact-finder. If the alleged knife used in committing the crime in question (a form of “real evidence”) is produced in court, the fact-finder can see for himself the shape of the knife; he does not need to learn of it through the testimony of an intermediary.

A third conception of evidence is an elaboration or extension of the second. On this conception, evidence is relational. A factual proposition (in Latin, factum probans) is evidence in the third sense only if it can serve as a premise for drawing an inference (directly or indirectly) to a matter that is material to the case (factum probandum) (see section 2.2 below for the concept of materiality). The fact that the accused’s fingerprints were found in a room where something was stolen is evidence in the present sense because one can infer from this that he was in the room, and his presence in the room is evidence of his possible involvement in the theft. On the other hand, the fact that the accused’s favorite color is blue would, in the absence of highly unusual circumstances, be rejected as evidence of his guilt: ordinarily, what a person’s favorite color happens to be cannot serve as a premise for any reasonable inference towards his commission of a crime and, as such, it is irrelevant (see discussion of relevance in section 2.1 below). In the third sense of “evidence”, which conceives of evidence as a premise for a material inference, “irrelevant evidence” is an oxymoron: it is simply not evidence. Hence, this statement of Bentham (1825: 230):^[6]

To say that testimony is not pertinent, is to say that it is foreign to the case, has no connection with it, and does not serve to prove the fact in question; in a word, it is to say, that it is not evidence.

There can be evidence in the first sense without evidence in the second or third sense. To pursue the fingerprint illustration, suppose the match was reported in court by an expert witness and it emerges during his cross-examination that his testimony of having found a fingerprint match was a lie. Lawyers would describe this situation as one where the “evidence” (the testimony of the expert) fails to prove the fact that it was originally produced to prove, and not as one where no “evidence” was adduced on the matter. Here “evidence” is used in the first sense (evidence as testimony) and the testimony remains in the court’s record whether it is believed or not. But lawyers would also say that, in the circumstances, there is no “evidence” that the accused was in the room, assuming that there was nothing apart from the discredited expert testimony of a fingerprint match to establish his presence there. Here, the expert’s testimony is shown to be false and fails to establish that the accused’s fingerprints were found in the room, and there is no (other) factual basis for believing that he was in the room. The factual premise from which an inference is sought to be drawn towards the accused’s guilt is not established.

Fourthly, the conditions for something to be received (or, in technical terms, “admitted”) as evidence at the trial are sometimes included in the legal concept of evidence. (These conditions are discussed in section 2 below.) On this conception, legal evidence is that which counts as evidence in law. Something may ordinarily be treated as evidence and yet be rejected by the court. Hearsay is often cited as an example. It is pointed out that reliance on hearsay is a commonplace in ordinary life. We frequently rely on hearsay in forming our factual beliefs. In contrast, “hearsay is not evidence” in legal proceedings (Stephen 1872: 4–5). As a general rule, the court will not rely on hearsay as a premise for an inference towards the truth of what is asserted. It will not allow a witness to testify in court that another person X (who is not brought before the court) said that p on a certain occasion (an out-of-court statement) for the purpose of proving that p.

In summary, at least four possible conceptions of legal evidence are in currency: as an object of sensory evidence, as a proposition of fact, as an inferential premise and as that which counts as evidence in law. The sense in which the term “evidence” is being used is seldom made explicit in legal discourse although the intended meaning will often be clear from the context.

2. Conditions for Receiving Evidence: What Counts as Evidence in Law?

This section picks up on the fourth conception of evidence. To recall, something will be accepted by the court as evidence—it is, to use Montrose’s term, receivable as evidence in legal proceedings—only if three basic conditions are satisfied: relevance, materiality, and admissibility (Montrose 1954). These three conditions of receivability are discussed in turn below.

2.1 Relevance

2.1.1 Legal Significance of Relevance

The concept of relevance plays a pivotal role in legal fact-finding. Thayer (1898: 266, 530) articulates its significance in terms of two foundational principles of the law of evidence: first, without exception, nothing which is not relevant may be received as evidence by the court and secondly, subject to many exceptions and qualifications, whatever is relevant is receivable as evidence by the court. Thayer’s view has been influential and finds expression in sources of law, for example, in Rule 402 of the Federal Rules of Evidence in the United States.^[7] Thayer claims, and it is now widely accepted, that relevance is a “logical” and not a legal concept; in section 2.1.3, we will examine this claim and the dissent expressed by Wigmore. Leaving aside the dissenting view for the moment, we will turn first to consider possible conceptions of relevance in the conventional sense of logical relevance.

2.1.2 Conceptions of Logical Relevance

Evidence may be adduced in legal proceedings to prove a fact only if the fact is relevant. Relevance is a relational concept. No fact is relevant in itself; it is relevant only in relation to another fact. The term “probable” is often used to describe this relation. We see two instances of this in the following well-known definitions. According to Stephen (1886: 2, emphasis added):

The word “relevant” means that any two facts to which it is applied are so related to each other that according to the common course of events one either taken by itself or in connection with other facts proves or renders probable the past, present, or future existence or non-existence of the other.

The second definition is contained in the United States’ Federal Rule of Evidence 401 which (in its restyled version) states that evidence is relevant if “it has a tendency to make a fact more or less probable than it would be without the evidence” (emphasis added). The word “probable” in these and other standard definitions is sometimes construed as carrying the mathematical meaning of probability.^[8] In a leading article, Lempert gave this example to show how relevance turns on the likelihood ratio. The prosecution produces evidence that the perpetrator’s blood found at the scene of the crime is type A. The accused has the same blood type. Suppose fifty percent of the suspect population has type A blood. If the accused is in fact guilty, the probability that the blood found at the scene will be type A is 1.0. But if he is in fact innocent, the probability of finding type A blood at the scene is 0.5—that is, it matches the background probability of type A blood from the suspect population. The likelihood ratio is the ratio of the first probability to the second—1.0:0.5 or, more simply, 2:1. Evidence is considered relevant so long as the likelihood ratio is other than 1:1 (Lempert 1977). If the ratio is 1:1, that means that the probability of the evidence is the same whether the accused is guilty or innocent.
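The arithmetic of the blood-type example can be sketched in a few lines of code. This is only an illustration of the likelihood ratio test as described above; the function names `likelihood_ratio` and `is_relevant` are hypothetical and do not come from Lempert or from any legal source.

```python
from fractions import Fraction


def likelihood_ratio(p_if_guilty, p_if_innocent):
    """Ratio of P(evidence | guilty) to P(evidence | innocent)."""
    if p_if_innocent == 0:
        raise ValueError("P(evidence | innocent) must be non-zero")
    return Fraction(p_if_guilty) / Fraction(p_if_innocent)


def is_relevant(ratio):
    """On the likelihood ratio reading, evidence is relevant unless the ratio is 1:1."""
    return ratio != 1


# Blood-type example from the text: type A blood is certain to be found if the
# accused is guilty, and is found at the 50% background rate if he is innocent.
blood_lr = likelihood_ratio(Fraction(1), Fraction(1, 2))
print(blood_lr, is_relevant(blood_lr))  # 2 True
```

Exact fractions are used instead of floating-point numbers so that the comparison with 1:1 is not affected by rounding.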

The conventional view is that relevance in law is a binary concept: evidence is either relevant or it is not. So long as the likelihood ratio is other than 1:1, the evidence is considered relevant.^[9] However, the further the likelihood ratio deviates from 1:1, the higher the so-called probative value of the evidence (that is, on one interpretation of probative value). We will take a closer look at probative value in section 3.1 below.
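One way to see why a ratio further from 1:1 is said to carry more probative value is the odds form of Bayes' theorem. This is a standard result of probability theory rather than something stated in the legal definitions quoted above, and the symbols E (the evidence), G (guilt) and I (innocence) are introduced here only for illustration.

```latex
\[
\underbrace{\frac{P(G \mid E)}{P(I \mid E)}}_{\text{posterior odds}}
=
\underbrace{\frac{P(E \mid G)}{P(E \mid I)}}_{\text{likelihood ratio}}
\times
\underbrace{\frac{P(G)}{P(I)}}_{\text{prior odds}}
\]
```

With the blood-type figures, the likelihood ratio is 1.0/0.5 = 2, so whatever the odds of guilt were before the evidence was received, they double. A ratio of exactly 1 leaves the odds unchanged, which is why such evidence counts as irrelevant on this reading.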

While the likelihood ratio may be useful as a heuristic device in analysing evidential reasoning, it is controversial whether it correctly captures the concept of relevance. In the first place, it is unclear that the term “probable” in the standard definitions of relevance was ever intended as a reference to mathematical probability. Some have argued that relevance should be understood broadly such that any evidence would count as relevant so long as it provides some reason in support of the conclusion that a proposition of fact material to the case is true or false (Pardo 2013: 576–577).

The mathematical conception of relevance has been disputed. At a trial, it is very common for the opposing sides to present competing accounts of events that share certain features. To use Allen’s example, the fact that the accused drove to a particular town on a particular day and time is consistent with the prosecution’s case that he was driving there to commit a murder and also with the defence’s case that he was driving there to visit his mother. This fact, being a common feature of both sides’ explanations of the material events, is as consistent with the hypothesis of guilt as with the hypothesis of innocence. On the likelihood ratio conception of relevance, this fact should be irrelevant and hence evidence of it should not be allowed to be adduced. But in such cases, the court will let the evidence in (Park et al. 2010: 10). The mathematical theory of relevance cannot account for this. (For critical discussion of this claim, see section 4.2 of the entry on legal probabilism.) It is argued that an alternative theory of relevance better fits legal practice and is thus to be preferred. On an explanatory conception of relevance, evidence is relevant if it is explained by or provides a reason for believing the particular explanation of the material events offered by the side adducing the evidence, and it remains relevant even where, as in our example, the evidence also supports or forms part of the explanation offered by the opponent (Pardo and Allen 2008: 241–2; Pardo 2013: 600).
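The objection drawn from Allen's example can be restated in the notation of the likelihood ratio. As an illustrative assumption (the discussion above gives no figures), let E be the fact that the accused drove to the town, and suppose that both the prosecution's account H_p and the defence's account H_d entail E.

```latex
\[
P(E \mid H_p) = P(E \mid H_d) = 1
\quad\Longrightarrow\quad
\frac{P(E \mid H_p)}{P(E \mid H_d)} = 1
\]
```

On the likelihood ratio test this 1:1 ratio marks E as irrelevant, although courts admit such evidence. That mismatch is what the explanatory conception is meant to avoid.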

One possible response to the above challenge to the likelihood ratio theory of relevance is to deny that it was ever meant to be the exclusive test of relevance. Evidence is relevant if the likelihood ratio is other than 1:1. But evidence may also be relevant on other grounds, such as when it provides for a richer narrative or helps the court in understanding other evidence. It is for these reasons that witnesses are routinely allowed to give their names and parties may present diagrams, charts and floor plans (so-called “demonstrative evidence”) at the trial (McCormick 2013: 995). The admission of evidence in the scenario painted by Allen above has been explained along a similar line (Park et al. 2010: 16).

2.1.3 Logical Relevance versus Legal Relevance

The concept of relevance examined in the preceding section is commonly known as “logical relevance”. This is somewhat of a misnomer: “Relevance is not a matter of logic, but depends on matters of fact” (Haack 2004: 46). In our earlier example, the relevance of the fact that the accused has type A blood depends obviously on the state of the world. On the understanding that relevance is a probabilistic relation, it is tempting to think that in describing relevance as “logical”, one is subscribing to a logical theory of probability (cf. Franklin 2011). However, the term “logical relevance” was not originally coined with this connotation in mind. In the forensic context, “logic” is used loosely and refers to the stock of background beliefs or generalisations and the type of reasoning that judges and lawyers are fond of labelling as “commonsense” (MacCrimmon 2001–2002; Twining 2006: 334–335).

A key purpose of using the adjective “logical” is to flag the non-legal character of relevance. As Thayer (1898: 269) famously claimed, relevance “is an affair of logic and not of law.” This is not to say that relevance has no legal dimension. The law distinguishes between questions of law and questions of fact. An issue of relevance poses a question of law that is for the judge to decide and not the jury, and so far as relevance is defined in legal sources (for example, in Federal Rule of Evidence 401 mentioned above), the judge must pay heed to the legal definition. But legal definitions of relevance are invariably very broad. Relevance is said to be a logical, and non-legal, concept in the sense that in answering a question of relevance and in applying the definition of relevance, the judge has necessarily to rely on extra-legal resources and is not bound by legal precedents. Returning to Federal Rule of Evidence 401, it states generally that evidence is relevant if “it has a tendency to make a fact more or less probable than it would be without the evidence”. In deciding whether the evidence sought to be adduced does have this tendency, the judge has to look outside the law. Thayer was most insistent on this. As he put it, “[t]he law furnishes no test of relevancy. For this, it tacitly refers to logic and general experience” (Thayer 1898: 265). That the accused’s favorite color is blue is, barring extraordinary circumstances, irrelevant to the question of his intention to commit theft. It is not the law that tells us so but “logic and general experience”. On Thayer’s view, the law does not control or regulate the assessment of relevance; it assumes that judges are already in possession of the (commonsense) resources to undertake this assessment.

Wigmore adopts a different position. He argues, against Thayer, that relevance is a legal concept. There are two strands to his contention. The first is that for evidence to be relevant in law, “a generally higher degree of probative value” is required “than would be asked in ordinary reasoning”:

legal relevance denotes…something more than a minimum of probative value. Each single piece of evidence must have a plus value. (cf. Pattenden 1996–7: 373)

As Wigmore sees it, the requirement of “plus value” guards against the jury “being satisfied by matters of slight value, capable of being exaggerated by prejudice and hasty reasoning” (Wigmore 1983b: 969, cf. 1030–1031). Opponents of Wigmore acknowledge that there may be sound policy reasons for excluding evidence of low probative value. Receiving the evidence at the trial might raise a multiplicity of issues, incur too much time and expense, confuse the jurors or produce undue prejudice in their minds. When the judge excludes evidence for any of these reasons, and the judge has the discretion to do so in many countries, the evidence is excluded despite it being relevant (e.g., United States’ Federal Rule of Evidence 403). Relevance is a relation between facts and the aforesaid reasons for exclusion are extrinsic to that relation; they are grounded in considerations such as limitation of judicial resources and jury psychology. The notion of “plus value” confuses relevance with extraneous considerations (James 1941; Trautman 1952).

There is a second strand to Wigmore’s contention that relevance is a legal concept. Relevance is legal in the sense that the judge is bound by previously decided cases (“judicial precedents”) when he has to make a ruling on the relevance of a proposed item of evidence.

So long as Courts continue to declare…what their notions of logic are, just so long will there be rules of law which must be observed. (Wigmore 1983a: 691)

Wigmore cites in support the judgment of Cushing C.J. in State v LaPage where it was remarked:

[T]here are many instances in which the evidence of particular facts as bearing on particular issues has been so often the subject of discussion in courts of law, and so often ruled upon, that the united logic of a great many judges and lawyers may be said to furnish…the best evidence of what may be properly called common-sense, and thus to acquire the authority of law. (1876 57 N.H. 245 at 288 [Supreme Court, New Hampshire])

Wigmore’s position on relevance is strangely at odds with his strong stand against the judge being bound by judicial precedents in assessing the weight or credibility of evidence (Wigmore 1913). More importantly, the second strand of his argument also does not sit well with the first strand. If, as Wigmore contends, evidence must have a plus value to make it legally relevant, the court has to consider the probative value of the evidence and to weigh it against the amount of time and expense likely to be incurred in receiving the evidence, the availability of other evidence, the risk of the evidence misleading or confusing the trier of fact and so forth. Given that the assessment of plus value and, hence, legal relevance is so heavily contextual, it is difficult to see how a judicial precedent can be of much value in another case in determining a point of legal relevance (James 1941: 702).

2.2 Materiality and Facts-in-issue

We have just considered the first condition of receivability, namely, relevance. That fact A is relevant to fact B is not sufficient to make evidence of fact A receivable in court. In addition, B must be a “material” fact. The materiality of facts in a particular case is determined by the law applicable to that case. In a criminal prosecution, it depends on the law which defines the offence with which the accused is charged and at a civil trial, the law which sets out the elements of the legal claim that is being brought against the defendant (Wigmore 1983a: 15–19; Montrose 1954: 536–537).

Imagine that the accused is prosecuted for the crime of rape and the alleged victim’s behaviour (fact A) increases the probability that she had consented to have sexual intercourse with the accused (fact B). On the probabilistic theory of relevance that we have considered, A is relevant to B. Now suppose that the alleged victim is a minor. Under criminal law, it does not matter whether she had consented to the sexual intercourse. If B is of no legal consequence, the court will not allow evidence of A to be adduced for the purpose of proving B: the most obvious reason is that it is a waste of time to receive the evidence.

Not all material facts are necessarily in dispute. Suppose the plaintiff sues the defendant for breach of contract. Under the law of contract, to succeed in this action, the plaintiff must prove the following three elements: that there was a contract between the parties, that the defendant was in breach of the contract, and that the plaintiff had suffered loss as a result of that breach. The defendant may concede that there was a contract and that he was in breach of it but deny that the plaintiff had suffered any loss as a result of that breach. In such a situation, only the last of the material facts is disputed. Following Stephen’s terminology, a disputed material fact is called a “fact in issue” (Stephen 1872: 9).

The law does not allow evidence to be adduced to prove facts that are immaterial. Whether evidence may be adduced to prove a material fact may depend on whether the material fact is disputed; for instance, the requirement that it must be disputed exists under Rule 210 of the Evidence Code of California but not Rule 401 of the Federal Rules of Evidence in the United States. “Relevance” is often used in the broader sense that encompasses the concepts under discussion. Evidence is sometimes described as “irrelevant” not for the reason that no logical inference can be drawn to the proposition that is sought to be proved (in our example, A is strictly speaking relevant to B) but because that proposition is not material or not disputed (in our example, B is not material).^[10] This broader usage of the term “relevance”, though otherwise quite harmless, does not promote conceptual clarity because it runs together different concepts (see James 1941: 690–691; Trautman 1952: 386; Montrose 1954: 537).

2.3 Admissibility

2.3.1 Admissibility and Relevance

A further condition must be satisfied for evidence to be received in legal proceedings. There are legal rules that prohibit evidence from being presented at a trial even though it is relevant to a factual proposition that is material and in issue. These rules render the evidence to which they apply “inadmissible” and require the judge to “exclude” it. Two prominent examples of such rules of admissibility or rules of exclusion are the rule against hearsay evidence and the rule against character evidence. This section considers the relation between the concept of relevance and the concept of admissibility. The next section (section 2.3.2) discusses general arguments for and against exclusionary or admissibility rules.

Here, again, the terminology is imprecise. Admissibility and receivability are not clearly distinguished. It is common for irrelevant evidence, or evidence of an immaterial fact to be described as “inadmissible”. What this means is that the court will refuse to receive evidence if it is irrelevant or immaterial. But, importantly, the court also excludes evidence for reasons other than irrelevance and immateriality. For Montrose, there is merit in restricting the concept of “inadmissibility” to the exclusion of evidence based on those other reasons (Montrose 1954: 541–543). If evidence is rejected on the ground of irrelevance, it is, as Thayer (1898: 515) puts it, “the rule of reason that rejects it”; if evidence is rejected under an admissibility or exclusionary rule, the rejection is by force of law. The concepts of admissibility and materiality should also be kept apart. This is because admissibility or exclusionary rules serve purposes and rationales that are distinct from the law defining the crime or civil claim that is before the court and it is this law that determines the materiality of facts in the dispute.

Thayer (1898: 266, 530) was influential in his view that the law of evidence has no say on logical relevance and that its main business is in dealing with admissibility. If the evidence is logically irrelevant, it must for that reason be excluded. If the evidence is logically relevant, it will be received by the court unless the law—in the form of an exclusionary or admissibility rule—requires its exclusion. In this scheme, the concept of relevance and the concept of admissibility are distinct: indeed, admissibility rules presuppose the relevance of the evidence to which they apply.

Stephen appears to hold a different view, one in which the concept of admissibility is apparently absorbed by the concept of relevance. Take, for example, Stephen’s analysis of the rule that in general no evidence may be adduced to prove “statements as to facts made by persons not called as witnesses”, in short, hearsay (Stephen 1872: 122). As a general rule, no evidence may be given of hearsay because the law prohibits it. The question then arises as to the rationale for this prohibition. Stephen’s answer to this question is often taken to be that hearsay is not “relevant” and he is criticised for failing to see the difference between relevance and admissibility (Whitworth 1881: 3; Thayer 1898: 266–268; Pollock 1876, 1899; Wigmore 1983a: §12). His critics point out that hearsay has or can have probative value and evidence of hearsay is excluded despite or regardless of its relevance. On the generalisation that there is no smoke without fire, the fact that a person claimed that p in a statement made out-of-court does or can have a bearing on the probability that p, and p may be (logically relevant to) a material fact in the dispute.

Interestingly, Stephen seemed to have conceded as much. He acknowledged that a policeman or a lawyer engaged in preparing a case would be negligent if he were to shut his ears to hearsay. Hearsay is one of those facts that are “apparently relevant but not really so” (Stephen 1872: 122; see also Stephen 1886: xi). In claiming that hearsay is irrelevant, Stephen appears to be merely stating the effect of the law: the law requires that hearsay be treated as irrelevant. He offered a variety of justifications for excluding hearsay evidence: its admissibility would “present a great temptation to indolent judges to be satisfied with second-hand reports” and “open a wide door to fraud”, with the result that “[e]veryone would be at the mercy of people who might tell a lie, and whose evidence could neither be tested nor contradicted” (Stephen 1872: 124–125). For his detractors, these are reasons of policy and fairness and it disserves clarity to sneak such considerations into the concept of relevance.

Although there is force to the criticism that Stephen had unhelpfully conflated admissibility and relevance (understood as logical relevance), something can perhaps be said in his defence. Exclusionary rules or rules of admissibility (at any rate, many of them) are more accurately seen as excluding forms of reasoning rather than prohibiting proof of certain types of facts (McNamara 1986). This is certainly true of the hearsay rule. On one authoritative definition of the rule (decision of the Privy Council in Subramaniam v PP, (1956) 1 Weekly Law Reports 965), what it prohibits is the use of a hearsay statement to prove the truth of the facts asserted therein.^[11] The objection is to the drawing of the inference that p from X’s out-of-court statement that p where X is not available to be examined in court. But the court will allow the evidence of X’s hearsay statement to be admitted (that is, it will allow proof of the statement) where the purpose of adducing the evidence is to persuade the court that X did make the statement and this fact is relevant for some other purpose. For instance, it may be relevant as to the state of mind of the person hearing the statement, and his state of mind may be material to his defence of having acted under duress. Hence, two writers have commented that “there is no such thing as hearsay evidence, only hearsay uses” (Roberts and Zuckerman 2010: 385).

Other admissibility rules are also more accurately seen as targeted at forms of reasoning and not types of facts. In the United States, Federal Rule of Evidence 404(a)(1) bars the use of evidence of a person’s character “to prove that on a particular occasion the person acted in accordance with the character” and Federal Rule of Evidence 404(b)(1) provides that evidence of a crime or wrong

is not admissible to prove a person’s character in order to show that on a particular occasion the person acted in accordance with the character.

It is doubtful that evidence of a person’s character and past behaviour can have no probabilistic bearing on his behaviour on a particular occasion; on a probabilistic conception of relevance, it is difficult to see why the evidence is not relevant. Even so, there may be policy, moral or other reasons for the law to prohibit certain uses of character evidence. In declaring a fact as irrelevant for a particular purpose, we are not necessarily saying or implying anything about probability. We may be expressing a normative judgment. For policy, moral or other reasons, the law takes the position that hearsay or the accused’s character or previous misconduct must not be used as the premise for a particular line of reasoning. The line of reasoning might be morally objectionable (“give a dog a bad name and hang him for it”) or it might be unfair to permit the drawing of the inference when the opponent was not given a fair opportunity to challenge it (as in the hearsay situation) (Ho 2008: chs. 5, 6). If we take a normative conception of relevance instead of a logical or probabilistic one, it is not an abuse of language to describe inadmissible evidence as irrelevant if what is meant is that the evidence ought not to be taken into account in a certain way.

2.3.2 Admissibility or Exclusionary Rules

On one historical account, admissibility or exclusionary rules are the product of the jury system where citizens untrained in assessing evidence sit as judges of fact. These rules came about because it was thought necessary to keep away from inexperienced jurors certain types of evidence that may mislead or be mishandled by them—for instance, evidence to which they are likely to give too much weight or that carries the risk of creating unfair prejudice in their minds (Thayer 1898; Wigmore 1935: 4–5). Epistemic paternalism is supposedly at play (Leiter 1997: 814–5; Allen and Leiter 2001: 1502). Subscription to this theory has generated pressure for the abolition of exclusionary rules with the decline of the jury system and the replacement of lay persons with professional judges as triers of fact. There is doubt as to the historical accuracy of this account; at any rate, it does not appear capable of explaining the growth of all exclusionary rules (Morgan 1936–37; Nance 1988: 278–294).

Even if the theory is right, it does not necessarily follow that exclusionary rules should be abolished once the jury system is removed. Judges may be as susceptible to the same cognitive and other failings as the jury and there may be the additional risk that judges may over-estimate their own cognitive and intellectual abilities in their professional domain. Hence, there remains a need for the constraints of legal rules (Schauer 2006: 185–193). But the efficacy of these rules in a non-jury system is questionable. The procedural reality is that judges will have to be exposed to the evidence in order to decide on its admissibility. Since a judge cannot realistically be expected to erase the evidence from his mind once he has decided to exclude it, there seems little point in excluding the evidence; we might as well let the evidence in and allow the judge to give the evidence the probative value that it deserves (Mnookin 2006; Damaška 2006; cf. Ho 2008: 44–46).

Bentham was a strong critic of exclusionary rules. He was much in favour of “freedom of proof” understood as free access to information and the absence of formal rules that restrict such access (Twining 2006: 232, n 65). The direct object of legal procedure is the “rectitude of decision”, by which he means the correct application of substantive law to true findings of facts. The exclusion of relevant evidence—evidence capable of casting light on the truth—is detrimental to this end. Hence, no relevant evidence should be excluded; the only exceptions he would allow are where the evidence is superfluous or its production would involve preponderant delay, expense or vexation (Bentham 1827: Book IX; Bentham 1825: Book VII; Twining 1985: ch. 2). Bentham’s argument has been challenged on various fronts. It is said that he overvalued the pursuit of truth, undervalued procedural fairness and procedural rights, and placed too much faith in officials, underestimating the risk of abuse when they are given discretion unfettered by rules (Twining 1985: 70–71).

Even if we agree with Bentham that rectitude of decision is the aim of legal procedure and that achieving accuracy in fact-finding is necessary to attain this aim, it is not obvious that a rule-based approach to admissibility will undermine this aim in the long run. Schauer has defended exclusionary rules of evidence along a rule-consequentialist line. Having the triers of fact follow rules on certain matters instead of allowing them the discretion to exercise judgment on a case-by-case basis may produce the greatest number of favourable outcomes in the aggregate. It is in the nature of a formal rule that it has to be followed even when doing so might not serve the background reason for the rule. If hearsay evidence is thought to be generally unreliable, the interest of accuracy may be better served overall to require such evidence to be excluded without regard to its reliability in individual cases. Given the imperfection of human reason and our suspicion about the reasoning ability of the fact-finder, allowing decisions to be taken individually on the reliability and admissibility of hearsay evidence might over time produce a larger proportion of misjudgements than on the rule-based approach (Schauer 2006: 180–185; Schauer 2008). However, this argument is based on a large assumption about the likely effects of having exclusionary rules and not having them, and there is no strong empirical basis for thinking that the consequences are or will be as alleged (Goldman 1999: 292–295; Laudan 2006: 121–122).

Other supporters of exclusionary rules build their arguments on a wide range of different considerations. The literature is too vast to enter into details. Here is a brief mention of some arguments. On one theory, some exclusionary rules are devices that serve as incentives for lawyers to produce the epistemically best evidence that is reasonably available (Nance 1988, 2016: 195–201). For example, if lawyers are not allowed to rely on second-hand (hearsay) evidence, they will be forced to seek out better (first-hand) evidence. On another theory, exclusionary rules allocate the risks of error. Again, consider hearsay. The problem with allowing a party to rely on hearsay evidence is that the opponent has no opportunity to cross-examine the original maker of the statement and is thus deprived of an important means of attacking the reliability of the evidence. Exclusionary rules in general insulate the party against whom the evidence is sought to be adduced from the risks of error that the evidence, if admitted, would have introduced. The distribution of such risks is said to be a political decision that should not be left to the discretion of individual fact-finders (Stein 2005; cf. Redmayne 2006 and Nance 2007a: 154–164). It has also been argued that the hearsay rule and the accompanying right to confront witnesses promote the public acceptance and stability of legal verdicts. If the court relies on direct evidence, it can claim superior access to the facts (having heard from the horse’s mouth, so to speak) and this also reduces the risk of new information emerging after the trial to discredit the inference that was drawn from the hearsay evidence (the original maker of the statement might turn up after the trial to deny the truth of the statement that was attributed to him) (Nesson 1985: 1372–1375; cf. Park 1986; Goldman 1999: 282; Goldman 2005: 166–167).

3. Strength of Evidence

The decision whether to allow a party to adduce a particular item of evidence is one that the judge has to make and arises in the course of a trial. Section 2 above dealt with the conditions that must be satisfied for a witness’s testimony, a document or an object to be received as evidence. At the end of the trial, the fact-finder must consider all the evidence that has been presented and reach a verdict. Although verdict deliberation is sometimes subjected to various forms of control through legal devices such as presumptions and corroboration rules, such control is limited and the fact-finder is expected to exercise personal judgment in the evaluation of evidence (Damaška 2019). Having heard or seen the evidence, the fact-finder now has to evaluate or ‘weigh’ it in reaching the verdict. Weight can refer to any of the following three properties of evidence: (a) the probative value of individual items of evidence, (b) the sufficiency of the whole body of evidence adduced at the trial in meeting the standard of proof, or (c) the relative completeness of this body of evidence. The first two aspects of weight are familiar to legal practitioners but the third has been confined to academic discussions. These three ideas are discussed in the same order below.

3.1 Probative Value of Specific Items of Evidence

In reaching the verdict, the trier of fact has to assess the probative value of the individual items of evidence which have been received at the trial. The concept of probative value can also play a role at the prior stage (which was the focus in section 2) where the judge has to make a ruling on whether to receive the evidence in the first place. In many legal systems, if the judge finds the probative value of a proposed item of evidence to be low and substantially outweighed by countervailing considerations, such as the risk of causing unfair prejudice or confusion, the judge can refuse to let the jury hear or see the evidence (see, e.g., Rule 403 of the United States’ Federal Rules of Evidence).

The concept of probative value (or, as it is also called, probative force) is related to the concept of relevance. Section 2.1.2 above introduced and examined the claim that the likelihood ratio is the measure of relevance. To recapitulate, the likelihood of an item of evidence, E (in our previous example, the likelihood of a blood type match) given a hypothesis H (that the accused is in fact guilty) is compared with the likelihood of E given the negation of H (that the accused is in fact innocent). Prior to the introduction of E, one may have formed some belief about H based on other evidence that one already has. This prior belief does not affect the likelihood ratio since its computation is based on the alternative assumptions that H is true and that H is false (Kaye 1986a; Kaye and Koehler 2003; cf. Davis and Follette 2002 and 2003). Rulings on relevance are made by the judge when objections of irrelevance are raised in the course of the trial. The relevance of an item of evidence is supposedly assessed on its own, without consideration of other evidence, and, indeed, much of the other evidence may have yet to be presented at the point when the judge rules on the relevance of a particular item of evidence (Mnookin 2013: 1544–5).^[12]

Probative value, as with relevance, has been explained in terms of the likelihood ratio (for detailed examples, see Nance and Morris 2002; Finkelstein and Levin 2003). It was noted earlier that evidence is either relevant or not, and, on the prevailing understanding, it is relevant so long as the likelihood ratio deviates from 1:1. But evidence can be more or less probative depending on the value of the likelihood ratio. In our earlier example, the probative value of a blood type match was 1.0:0.5 (or 2:1) as 50% of the suspect population had the same blood type as the accused. But suppose the blood type is less common and only 25% of the suspect population has it. The probative value of the evidence is now 1.0:0.25 (or 4:1). In both cases, the evidence is relevant; but the probative value is greater in the latter than in the former scenario. It is tempting to describe probative value as the degree of relevance but this would be misleading as relevance in law is a binary concept.
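
The two blood type figures can be reproduced with a short calculation. The sketch below is only an illustration of the first view: the function names are hypothetical, and it assumes, as the example does, that a match is certain if the accused is guilty.

```python
def likelihood_ratio(p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Return P(E | H) / P(E | not-H)."""
    if p_e_given_not_h <= 0:
        raise ValueError("P(E | not-H) must be positive")
    return p_e_given_h / p_e_given_not_h


def is_relevant(lr: float) -> bool:
    """Relevance is binary: any ratio that deviates from 1:1 counts."""
    return lr != 1.0


# Assumption taken from the example: P(match | guilty) = 1.0.
# P(match | innocent) is the share of the suspect population with the type.
lr_common = likelihood_ratio(1.0, 0.50)  # 2:1
lr_rare = likelihood_ratio(1.0, 0.25)    # 4:1

# Both items are relevant; the rarer blood type is the more probative.
both_relevant = is_relevant(lr_common) and is_relevant(lr_rare)
```

The binary test and the graded ratio are kept as separate functions to mirror the point that probative value is not simply a degree of relevance.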

There is a second way of thinking about probative value. On the second view, but not on the first, the probative value of an item of evidence is assessed contextually. The probative value of E may be low given one state of the other evidence and substantial given a different body of other evidence (Friedman 1986; Friedman and Park 2003; cf. Davis and Follette 2002, 2003). Where the other evidence shows that a woman had died from falling down an escalator at a mall while she was out shopping, her husband’s history of spousal battery is unlikely to have any probative value in proving that he was responsible for her death. But where the other evidence shows that the wife had died of injuries in the matrimonial home, and the question is whether the injuries were sustained from an accidental fall from the stairs or inflicted by the husband, the same evidence of spousal battery will now have significant probative value.

On the second view, the probative value of an item of evidence (E) is not measured simply by the likelihood ratio as it is on the first view. Probative value is understood as the degree to which E increases (or decreases) the probability of the proposition or hypothesis (H) in support of (or against) which E is led. The probative value of E is measured by the difference between the probability of H given E (the posterior probability) and the probability of H absent E (the prior probability) (Friedman 1986; James 1941: 699).

Probative value of \(E = P(H | E) - P(H)\)

\(P(H | E)\) (the posterior probability) is derived by applying Bayes’ theorem. In its odds form, the theorem multiplies the prior odds on H by the likelihood ratio to give the posterior odds, which can then be converted back into a probability (see discussion in section 3.2.2 below). On the present view, while the likelihood ratio does not itself measure the probative value of E, it is nevertheless a crucial component in the assessment.
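
A brief sketch may help to show why, on the second view, the same likelihood ratio yields different probative values against different bodies of other evidence. It uses the odds form of Bayes’ theorem; the function names and the three prior probabilities are hypothetical and chosen only for illustration.

```python
def posterior(prior: float, lr: float) -> float:
    """Odds form of Bayes' theorem: posterior odds = prior odds * LR."""
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    prior_odds = prior / (1.0 - prior)
    post_odds = prior_odds * lr
    return post_odds / (1.0 + post_odds)


def probative_value(prior: float, lr: float) -> float:
    """Second view: P(H | E) - P(H)."""
    return posterior(prior, lr) - prior


LR = 4.0  # the 4:1 ratio from the rarer blood type example

# Hypothetical priors standing in for weak, balanced and strong other evidence.
values = {prior: probative_value(prior, LR) for prior in (0.01, 0.5, 0.9)}

# With a prior of 0.5 the posterior is 0.8, so the probative value is about 0.3.
# At the two extreme priors the same evidence shifts the probability far less.
```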

A major difficulty with both of the mathematical conceptions of probative value that we have just examined is that for most evidence, obtaining the figures necessary for computing the likelihood ratio is problematic (Allen 1991: 380). Exceptionally, quantitative base rates data exist, as in our blood type example. Where objective data is unavailable, the fact-finder has to draw on background experience and knowledge to come up with subjective values. In our blood type example, a critical factor in computing the likelihood ratio was the percentage of the “suspect population” who had the same blood type as the accused. “Reference class” is the general statistical term for the role that the suspect population plays in this analysis. How should the reference class of “suspect population” be defined? Should we look at the population of the country as a whole or of the town or the street where the alleged murder occurred? What if it occurred at an international airport where most of the people around are foreign visitors? Or what if it is shown that both the accused and the victim were at the time of the alleged murder inmates of the same prison? Should we then take the prison population as the reference class? The distribution of blood types may differ according to which reference class is selected. Sceptics of mathematical modelling of probative value emphasize that data from different reference classes will have different explanatory power and the choice of the reference class is open to, and should be subjected to, contextual argument and requires the exercise of judgment; there is no a priori way of determining the correct reference class. (On the reference class problem in legal factfinding, see, in addition to references cited in the rest of this section, Colyvan, Regan, and Ferson 2001; Tillers 2005; Allen and Roberts 2007.)
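
The sensitivity of the likelihood ratio to the choice of reference class can be made concrete with invented figures. In the sketch below every frequency is hypothetical and none is drawn from any actual data; the only point is that the same match produces a different ratio under each candidate class.

```python
# Hypothetical share of each candidate reference class that has the
# accused's blood type. These figures are invented for illustration.
blood_type_share = {
    "country": 0.50,
    "town": 0.40,
    "prison population": 0.25,
}


def likelihood_ratio(share: float) -> float:
    """P(match | guilty), assumed to be 1, over P(match | innocent) = share."""
    if not 0.0 < share <= 1.0:
        raise ValueError("share must be in (0, 1]")
    return 1.0 / share


# One ratio per reference class: the arithmetic alone cannot say which is right.
ratios = {name: likelihood_ratio(s) for name, s in blood_type_share.items()}
```

Nothing in the calculation selects among the three results; that choice remains a matter of argument and judgment.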

Some writers have proposed quantifiable ways of selecting, or assisting in the selection of, the appropriate reference class. On one suggestion, the court does not have to search for the optimal reference class. A general characteristic of an adversarial system of trial is that the judge plays a passive role; it is up to the parties to come up with the arguments on which they want to rely and to produce evidence in support of their respective arguments. This adversarial setting makes the reference class problem more manageable as the court need only decide which of the reference classes relied upon by the parties is to be preferred. And this can be done by applying one of a variety of technical criteria that statisticians have developed for comparing and selecting statistical models (Cheng 2009). Another suggestion is to use the statistical method of “feature selection” instead. The ideal reference class is defined by the intersection of all relevant features of the case, and a feature is relevant if it is correlated to the matter under enquiry (Franklin 2010, 2011: 559–561). For instance, if the amount of drug likely to be smuggled is reasonably believed to co-vary with the airport through which it is smuggled, the country of origin and the time period, and there is no evidence that any other feature is relevant on which data is available, the ideal reference class is the class of drug smugglers passing through that airport originating from that country and during that time period. Both suggestions have self-acknowledged limitations: not least, they depend on the availability of suitable data. Also, as Franklin stresses, while statistical methods “have advice to offer on how courts should judge quantitative evidence”, they do so “in a way that supplements normal intuitive legal argument rather than replacing it by a formula” (Franklin 2010: 22).

The reference class problem is not confined to the probabilistic assessment of the probative value of individual items of evidence. It is a general difficulty with a mathematical approach to legal proof. In particular, the same problem arises on a probabilistic interpretation of the standard of proof when the court has to determine whether the standard is met based on all the evidence adduced in the case. This topic is explored in section 3.2 below but it is convenient at this juncture to illustrate how the reference class problem can also arise in this connection. Let it be that the plaintiff sues Blue Bus Company to recover compensation for injuries sustained in an accident. The plaintiff testifies, and the court believes on the basis of his testimony, that he was run down by a recklessly driven bus. Unfortunately, it was dark at the time and he cannot tell whether the bus belonged to Blue Bus Company. Assume further that there is also evidence which establishes that Blue Bus Company owns 75% of the buses in the town where the accident occurred and the remaining 25% is owned by Red Bus Company. No other evidence is presented. To use the data as the basis for inferring that there is 0.75 probability that the bus involved in the accident was owned by Blue Bus Company would seem to privilege the reference class of “buses operating in the town” over other possible reference classes such as “buses plying the street where the accident occurred” or “buses operating at the time in question” (Allen and Pardo 2007a: 109). Different reference classes may produce very different likelihood ratios. It is crucial how the reference class is chosen and this is ultimately a matter of argument and judgment. Any choice of reference class (other than the class that shares every feature of the particular incident, which is, in effect, the unique incident itself) is in principle contestable.

Critics of the mathematization of legal proof raise this point as an example of inherent limitations to the mathematical modelling of probative value (Allen and Pardo 2007a).^[13] Allen and Pardo propose an alternative, the explanatory theory of legal proof. They claim that this theory has the advantage of avoiding the reference class problem because it does not attempt to quantify probative value (Pardo 2005: 374–383; Pardo and Allen 2008: 261, 263; Pardo 2013: 600–601). Suppose a man is accused of killing his wife. Evidence is produced of his extra-marital affair. The unique probative value of the accused’s infidelity cannot be mathematically computed from statistical base rates of infidelity and uxoricides (husbands murdering wives). In assessing its probative value, the court should look instead at how strongly the evidence of infidelity supports the explanation of the material events put forward by the side adducing the evidence and how strongly it challenges the explanation offered by the opponent. For instance, the prosecution may be producing the evidence to buttress its case that the accused wanted to get rid of his wife so that he could marry his mistress, and the defence may be advancing the alternative theory that the couple was unusual in that they condoned extra-marital affairs and had never let it affect their loving relationship. How much probative value the evidence of infidelity has depends on the strength of the explanatory connections between it and the competing hypotheses, and this is not something that can be quantified.

But the disagreement in this debate is not as wide as it might appear. The critics concede that formal models for evaluating evidence in law may be useful. What they object to is

scholarship arguing … that such models establish the correct or accurate probative value of evidence, and thus implying that any deviations from such models lead to inaccurate or irrational outcomes. (Allen and Pardo 2007b: 308)

On the other side, it is acknowledged that there are limits to mathematical formalisation of evidential reasoning in law (Franklin 2012: 238–9) and that context, argument and judgment do play a role in identifying the reference class (Nance 2007b).

3.2 Sufficiency of Evidence and the Standards of Proof

3.2.1 Mathematical Probability and the Standards of Proof

In section 3.1 above, we concentrated on the weight of evidence in the sense of probative value of individual items of evidence. The concept of weight can also apply to the total body of evidence presented at the trial; here “weight” is commonly referred to as the “sufficiency of evidence”.^[14] The law assigns the legal burden of proof between parties to a dispute. For instance, at a criminal trial, the accused is presumed innocent and the burden is on the prosecution to prove that he is guilty as charged. To secure a conviction, the body of evidence presented at the trial must be sufficient to meet the standard of proof. Putting this generally, a verdict will be given in favour of the side bearing the legal burden of proof only if, having considered all of the evidence, the fact-finder is satisfied that the applicable standard of proof is met. The standard of proof has been given different interpretations.

On one interpretation, the standard of proof is a probabilistic threshold. In civil cases, the standard is the “balance of probabilities” or, as it is more popularly called in the United States, the “preponderance of evidence”. The plaintiff will satisfy this standard and succeed in his claim only if there is, on all the evidence adduced in the case, more than 0.5 probability of his claim being true. At criminal trials, the standard for a guilty verdict is “proof beyond a reasonable doubt”. Here the probabilistic threshold is thought to be much higher than 0.5 but courts have eschewed any attempt at authoritative quantification. Typically, a notional value, such as 0.9 or 0.95, is assumed by writers for the sake of discussion. For the prosecution to secure a guilty verdict, the evidence adduced at the trial must establish the criminal charge to a degree of probability that crosses this threshold. Where, as in the United States, there is an intermediate standard of “clear and convincing evidence” which is reserved for special cases, the probabilistic threshold is said to lie somewhere between 0.5 and the threshold for proof beyond reasonable doubt.

Kaplan was among the first to employ decision theory to develop a framework for setting the probabilistic threshold that represents the standard of proof. Since the attention in this area of the law tends to be on the avoidance of errors and their undesirable consequences, he finds it convenient to focus on disutility rather than utility. The expected disutility of an outcome is the product of the disutility (broadly, the social costs) of that outcome and the probability of that outcome. Only two options are generally available to the court: in criminal cases, it must either convict or acquit the accused and in civil cases, it has to give judgment either for the plaintiff or for the defendant. At a criminal trial, the decision should be made to convict where the expected disutility of a decision to acquit is greater than the expected disutility of a decision to convict. This is so as to minimize the expected disutilities. To put this in the form of an equation:

\[ P\cdot\textit{Dag} > (1-P)\textit{Dci} \]

P is the probability that the accused is guilty on the basis of all the evidence adduced in the case, Dag is the disutility of acquitting a guilty person and Dci is the disutility of convicting an innocent person. A similar analysis applies to civil cases: the defendant should be found liable where the expected disutility of finding him not liable when he is in fact liable exceeds the expected disutility of finding him liable when he is in fact not liable.

On this approach, a person should be convicted of a crime only where P is greater than:

\[ \frac{1}{1+\frac{\textit{Dag}}{\textit{Dci}}} \]
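
This threshold follows from the inequality above by simple rearrangement, on the assumption that both disutilities are positive:

\[ P\cdot\textit{Dag} > (1-P)\textit{Dci} \;\Longleftrightarrow\; P(\textit{Dag}+\textit{Dci}) > \textit{Dci} \;\Longleftrightarrow\; P > \frac{\textit{Dci}}{\textit{Dag}+\textit{Dci}} = \frac{1}{1+\frac{\textit{Dag}}{\textit{Dci}}} \]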

The same formula applies in civil cases except that the two disutilities (Dag and Dci) will have to be replaced by their civil equivalents (framed in terms of the disutility of awarding the judgment to a plaintiff who in fact does not deserve it and disutility of awarding the judgment to a defendant who in fact does not deserve it). On this formula, the crucial determinant of the standard of proof is the ratio of the two disutilities. In the civil context, the disutility of an error in one direction is deemed equal to the disutility of an error in the other direction. Hence, a probability of liability of greater than 0.5 would suffice for a decision to enter judgment against the defendant (see Redmayne 1996: 171). The situation is different at a criminal trial. Dci, the disutility of convicting an innocent person is considered far greater than Dag, the disutility of acquitting a guilty person.^[15] Hence, the probability threshold for a conviction should be much higher than 0.5 (Kaplan 1968: 1071–1073; see also Cullison 1969).
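
The dependence of the threshold on the ratio of the two disutilities can be illustrated numerically. The following is a minimal sketch: the function name and the disutility values are hypothetical, and only their ratio matters.

```python
def conviction_threshold(d_ag: float, d_ci: float) -> float:
    """Return 1 / (1 + Dag/Dci), the probability that P must exceed.

    d_ag: disutility of acquitting a guilty person (Dag)
    d_ci: disutility of convicting an innocent person (Dci)
    """
    if d_ag <= 0 or d_ci <= 0:
        raise ValueError("disutilities must be positive")
    return 1.0 / (1.0 + d_ag / d_ci)


# Equal disutilities, as assumed in the civil context: threshold of 0.5.
civil = conviction_threshold(1.0, 1.0)

# Hypothetical criminal weightings: a wrongful conviction treated as
# 10 or 19 times worse than a wrongful acquittal.
criminal_10 = conviction_threshold(1.0, 10.0)  # 10/11, roughly 0.909
criminal_19 = conviction_threshold(1.0, 19.0)  # 19/20, that is 0.95
```

On these assumed figures, the notional thresholds of about 0.9 and 0.95 correspond to treating a wrongful conviction as roughly ten and nineteen times worse than a wrongful acquittal.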

An objection to this analysis is that it is incomplete. It is not enough to compare the costs of erroneous verdicts. The utility of an accurate conviction and the utility of an accurate acquittal should also be considered and factored into the equation (Lillquist 2002: 108).^[16] This results in the following modification of the formula for setting the standard of proof:

\[ \frac{1}{1+\frac{\textit{Ucg}-\textit{Uag}}{\textit{Uai}-\textit{Uci}}} \]

Ucg is the utility of convicting the guilty, Uag is the utility of acquitting the guilty, Uai is the utility of acquitting the innocent and Uci the utility of convicting the innocent.
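
The modified formula can be sketched in the same way. The function name and all utility values below are hypothetical. The first call shows that the formula reduces to the simple version when accurate verdicts are assigned zero utility and each error is assigned the negative of its disutility.

```python
def modified_threshold(u_cg: float, u_ag: float, u_ai: float, u_ci: float) -> float:
    """Return 1 / (1 + (Ucg - Uag) / (Uai - Uci))."""
    gain_guilty = u_cg - u_ag    # convicting rather than acquitting the guilty
    gain_innocent = u_ai - u_ci  # acquitting rather than convicting the innocent
    if gain_guilty <= 0 or gain_innocent <= 0:
        raise ValueError("accurate verdicts must be preferred to erroneous ones")
    return 1.0 / (1.0 + gain_guilty / gain_innocent)


# Special case: Ucg = Uai = 0, Uag = -Dag = -1, Uci = -Dci = -10.
# This gives 1 / (1 + 1/10), the simple threshold for those disutilities.
simple_case = modified_threshold(0.0, -1.0, 0.0, -10.0)

# Hypothetical positive utilities for accurate verdicts move the threshold:
# here (5 - (-1)) / (5 - (-10)) = 0.4, so the threshold is 1 / 1.4.
with_benefits = modified_threshold(5.0, -1.0, 5.0, -10.0)
```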

Since the relevant utilities depend on the individual circumstances, such as the seriousness of the crime and the severity of the punishment, the decision-theoretic account of the standard of proof would seem, on both the simple and the modified version, to lead to the conclusion that the probabilistic threshold should vary from case to case (Lillquist 2002; Bartels 1981; Laudan and Saunders 2009; Ribeiro 2019). In other words, the standard of proof should be a flexible or floating one. This view is perceived to be problematic.

First, it falls short descriptively. The law requires the court to apply a fixed standard of proof for all cases within the relevant category. In theory, all criminal cases are governed by the same high standard and all civil cases are governed by the same lower standard. That said, it is unclear whether factfinders in reality adhere strictly to a fixed standard of proof (see Kaplow 2012: 805–809).

The argument is better interpreted as a normative argument—as advancing the claim about what the law ought to be and not what it is. The standard of proof ought to vary from case to case. But this proposal faces a second objection. For convenience, the objection will be elaborated in the criminal setting; in principle, civil litigants have the same two rights that we shall identify. According to Dworkin (1981), moral harm arises as an objective moral fact when a person is erroneously convicted of a crime. Moral harm is distinguished from the bare harm (in the form of pain, frustration, deprivation of liberty and so forth) that is suffered by a wrongfully convicted and punished person. While accused persons have the right not to be convicted if innocent, they do not have the right to the most accurate procedure possible for ascertaining their guilt or innocence. However, they do have the right that a certain weight or importance be attached to the risk of moral harm in the design of procedural and evidential rules that affect the level of accuracy. Accused persons have the further right to a consistent weighting of the importance of moral harm and this further right stems from their right to equal concern and respect. Dworkin’s theory carries an implication bearing on the present debate. It is arguable that to adopt a floating standard of proof would offend the second right insofar as it means treating accused persons differently with respect to the evaluation of the importance of avoiding moral harm. This difference in treatment is reflected in the different level of the risk of moral harm to which they are exposed.

There is a third objection to a floating standard of proof. Picinali (2013) sees fact-finding as a theoretical exercise that engages the question of what to believe about the disputed facts. What counts as “reasonable” for the purposes of applying the standard of proof beyond reasonable doubt is accordingly a matter for theoretical as opposed to practical reasoning. Briefly, theoretical reasoning is concerned with what to believe whereas practical reasoning is about what to do. Only reasons for belief are germane in theoretical reasoning. While considerations that bear on the assessment of utility and disutility provide reasons for action, they are not reasons for believing in the accused’s guilt. Decision theory cannot therefore be used to support a variable application of the standard of proof beyond reasonable doubt.

The third criticism of a flexible standard of proof does not directly challenge the decision-theoretic analysis of the standard of proof. On that analysis, it would seem that the maximisation of expected utility is the criterion for selecting the appropriate probabilistic threshold to apply and it plays no further role in deciding whether that threshold, once selected, is met on the evidence adduced in the particular case. It is not incompatible with the decision-theoretic analysis to insist that the question of whether the selected threshold is met should be governed wholly by epistemic considerations. However, it is arguable that what counts as good or strong enough theoretical reason for judging, and hence believing, that something is true is dependent on the context, such as what is at stake in believing that it is true. More is at stake at a trial involving the death penalty than in a case of petty shop-lifting; accordingly, there should be stronger epistemic justification for a finding of guilt in the first than in the second case. Philosophical literature on epistemic contextualism and on interest-relative accounts of knowledge and justified belief has been drawn upon to support a variant standard of proof (Ho 2008: ch. 4; see also Amaya 2015: 525–531).^[17]

The premise of the third criticism is that the trier of fact has to make a finding on a disputed factual proposition based on his belief in the proposition. This is contentious. Beliefs are involuntary; we cannot believe something by simply deciding to believe it. The dominant view is that beliefs are context-independent; at any given moment, we cannot believe something in one context and not believe it in another. On the other hand, legal fact-finding involves choice and decision making and it is dependent on the context; for example, evidence that is strong enough to justify a finding of fact in a civil case may not be strong enough to justify the same finding in a criminal case where the standard of proof is higher. It has been argued that the fact-finder has to base his findings not on what he believes but what he accepts (Cohen 1991, 1992: 117–125, Beltrán 2006; cf. Picinali 2013: 868–869). Belief and acceptance are propositional attitudes: they are different attitudes that one can have in relation to a proposition. As Cohen (1992: 4) explains:

to accept that p is to have or adopt a policy of deeming, positing or postulating that p—i.e. of including that proposition or rule among one’s premises for deciding what to do or think in a particular context.

3.2.2 Objections to Using Mathematical Probability to Interpret Standards of Proof

Understanding standards of proof in terms of mathematical probabilities is controversial. It is said to raise a number of paradoxes (Cohen 1977; Allen 1986, 1991; Allen and Leiter 2001; Redmayne 2008). Let us return to our previous example. The defendant, Blue Bus Company, owns 75% of the buses in the town where the plaintiff was injured by a recklessly driven bus and the remaining 25% is owned by Red Bus Company. No other evidence is presented. Leaving aside the reference class problem discussed above, there is a 0.75 probability that the accident was caused by a bus owned by the defendant. On the probabilistic interpretation of the applicable standard of proof (that is, the balance of probabilities), the evidence should be sufficient to justify a verdict in the plaintiff’s favour. But most lawyers would agree that the evidence is insufficient. Another familiar hypothetical scenario is set in the criminal context (Nesson 1979: 1192–1193). Twenty-five prisoners are exercising in a prison yard. Twenty-four of them suddenly set upon a guard and kill him. The remaining prisoner refuses to participate. We cannot in the ensuing confusion identify the prisoner who refrained from the attack. Subsequently, one prisoner is selected randomly and prosecuted for the murder of the guard. Those are the only facts presented at the trial. The applicable standard is proof beyond a reasonable doubt. Assume that the probabilistic threshold of this standard is 0.95. On the statistical evidence, there is a probability of 0.96 that the defendant is criminally liable.^[18] Despite the statistical probability of liability exceeding the threshold, it is widely agreed that the defendant must be acquitted. In both of the examples just described, why is the evidence insufficient and what does this say about legal standards of proof?
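The arithmetic behind the two hypotheticals can be set out in a short sketch. It is offered only as an illustration: the function name `meets_threshold` is hypothetical, the 0.5 figure is an assumed numeric reading of the balance of probabilities, and the 0.95 figure is the threshold stipulated in the prison yard example.

```python
from fractions import Fraction


def meets_threshold(probability, threshold):
    """Return True if the probability strictly exceeds the threshold."""
    return probability > threshold


# Bus example: the defendant owns 75% of the buses in town.
bus_probability = Fraction(75, 100)
civil_threshold = Fraction(1, 2)  # assumed reading of "balance of probabilities"

# Prison yard example: 24 of the 25 prisoners took part in the attack.
prisoner_probability = Fraction(24, 25)
criminal_threshold = Fraction(95, 100)  # threshold stipulated in the example

print(float(bus_probability), meets_threshold(bus_probability, civil_threshold))
# 0.75 True
print(float(prisoner_probability), meets_threshold(prisoner_probability, criminal_threshold))
# 0.96 True
```

On a purely probabilistic reading both thresholds are crossed, which is what makes the intuition that the evidence is insufficient in need of explanation.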

Various attempts have been made to find the answers (for surveys of these attempts, see Enoch and Fisher 2015: 565–571; Redmayne 2008, Ho 2008: 135–143, 168–170; Gardiner 2019b; section 6 of the entry on legal probabilism). It has been argued that meeting a legal standard of proof is not merely or fundamentally a matter of adducing evidence to establish a mathematical probability of liability beyond a certain level. Standards of proof should be interpreted in epistemic rather than probabilistic terms. According to one interpretation, the evidence is sufficient to satisfy a standard of proof only if it is capable of justifying full or outright belief in the material facts that constitute legal liability and bare statistical evidence, as in our examples, cannot justify such a belief. (Nelkin 2021; Smith 2018; Buchak 2014; Ho 2008: 89–99.) On Smith’s account, the statistical evidence in our two examples fails to justify belief in the proposition that the defendant is liable because the evidence does not normically support that proposition. Evidence normically supports a proposition just in case the situation in which the evidence is true and the proposition is false is less normal, in the sense of requiring more explanation, than the situation in which the evidence and the proposition are both true. Where all that we have is statistical evidence, it could just so happen that the material proposition is false (it could just so happen that the accident-causing bus was red or that the accused was the one who refused to join in the murder), so no further explanation is needed where the proposition is false than where it is true (Smith 2018).

On a different epistemic interpretation, the evidence is sufficient to meet a legal standard of proof, and a finding of legal liability is permissible, only if the factfinder can gain knowledge of the defendant’s liability—to be precise, of the material facts establishing such liability—from the evidence (Duff et al. 2007: 87–91; Pardo 2010; for a critical overview of knowledge-centered accounts, see Gardiner forthcoming). High probability of liability alone will not suffice. On more subtle knowledge-centered theories, the standards of proof are met only if, on the available evidence, there is a sufficiently high probability that the fact finder knows that the defendant is liable (Littlejohn 2020 and 2021; Blome-Tillmann 2017), or only if the fact finder’s credence in the defendant’s liability exceeds the relevant legal threshold and the credence constitutes knowledge (Moss 2018). It is further claimed that the relevant knowledge necessary for a finding of liability cannot be obtained from statistical evidence alone (Littlejohn 2020 and 2021; Blome-Tillmann 2017; Moss 2018 and forthcoming). According to Thomson, this is because the statistical evidence (to take our first example, the 75% ownership of blue buses) is not causally connected with the fact sought to be proved and cannot guarantee the truth of the relevant belief (that the bus which caused the accident was blue) (Thomson 1986). An alternative argument is that knowledge requires the ruling out of all relevant alternatives and, to take our prison scenario, there is no evidence that addresses the possibility that the defendant was the one who refrained from joining in the attack or the possibility that the defendant is less likely to be guilty than an arbitrary prisoner in the yard. (See Moss forthcoming; Moss 2018: 213. Gardiner 2019a adapts the relevant alternatives framework to model legal standards of proof in a non-mathematical way while eschewing a knowledge account of those standards.) 
Another possible explanation for the failure to know relies on the notion of sensitivity. The belief that the defendant is liable is not sensitive to the truth where it is based on bare statistical evidence; in the bus example, evidence of the market share of buses remains the same whether it is true or not that a blue bus caused the accident (cf. Enoch, Spectre, and Fisher 2012; Enoch and Fisher 2015; Enoch and Spectre 2019 – while suggesting that the lack of knowledge has generally to do with the insensitivity of the belief, the authors deny that knowledge should matter to the imposition of legal liability). Yet another explanation is that it is unsafe to find a person liable on bare statistical evidence. Though safety is sometimes treated as a condition of knowledge (in that knowledge requires a true belief that is safe), one can treat safety as a condition for finding the defendant liable without also taking the position that the finding must be based on knowledge of liability. Safety is commonly understood in terms of whether a belief formed on the same basis would be true in close possible worlds. Roughly, a finding of liability is unsafe where it can easily be wrong in the sense that little in the actual world needs to change for it to be wrong. Whether the requirement of safety can explain why judgment should not be entered against the defendant in our two hypothetical cases would depend on whether it can easily happen that the accident-causing bus was red or that the accused is innocent. (See Pritchard 2015 and 2018; Pardo 2018; cf. Gardiner 2020.) While theorizing of standards of proof in epistemic terms has gathered pace in recent years, it is criticised for relying on unrealistic hypotheticals that fail to attend to the actual operation of legal systems and for making impossible epistemological demands (Allen 2020).

There is another paradox in the mathematical interpretation of the standard of proof. This is the “conjunction paradox”. To succeed in a civil claim (or a criminal prosecution), the plaintiff (or the prosecution) will have to prove the material facts, or “elements”, that constitute the civil claim (or criminal charge) that is before the court (see discussion of “materiality” in section 2.2 above). Imagine a claim under the law of negligence that rests on two elements: a breach of duty of care by the defendant (element A) and causation of harm to the plaintiff (element B). To win the case, the plaintiff is legally required to prove A and B. For the sake of simplicity, let A and B be mutually independent events. Suppose the evidence establishes A to a probability of 0.6 and B to a probability of 0.7. On the mathematical interpretation of the civil standard of proof, the plaintiff should succeed in his claim since the probability with respect to each of the elements exceeds 0.5. However, according to the multiplication rule of conventional probability calculus, the probability that A and B are both true is the product of their respective probabilities; in this example, it is only 0.42 (obtained by multiplying 0.6 by 0.7). Thus, the overall probability is greater that the defendant deserves to win than that the plaintiff deserves to win, and yet the verdict is awarded in favour of the plaintiff.
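The paradox can be reproduced in a few lines. The sketch below is illustrative only: the helper `conjunction_probability` and the constant `CIVIL_THRESHOLD` are hypothetical names, and the code assumes, as the example does, that the elements are mutually independent.

```python
from math import prod


def conjunction_probability(element_probabilities):
    """Probability that every element is true, assuming independence."""
    return prod(element_probabilities)


CIVIL_THRESHOLD = 0.5  # assumed numeric reading of the civil standard

# Element A: breach of duty of care; element B: causation of harm.
elements = {"A": 0.6, "B": 0.7}

each_element_passes = all(p > CIVIL_THRESHOLD for p in elements.values())
joint_probability = conjunction_probability(elements.values())
case_as_whole_passes = joint_probability > CIVIL_THRESHOLD

print(each_element_passes)          # True
print(round(joint_probability, 2))  # 0.42
print(case_as_whole_passes)         # False
```

Each element clears the threshold taken singly, yet their conjunction does not.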

One way of avoiding the conjunction paradox is to take the position that it should not be enough for each element to cross the probabilistic threshold; the plaintiff (or the prosecution) should win only if the probability of the plaintiff’s (or prosecution’s) case as a whole exceeds the applicable probabilistic threshold. So, in our example, the plaintiff should lose since the overall probability is below 0.5. But this suggested solution is unsatisfactory. The required level of overall probability would then turn on how many elements the civil claim or criminal charge happens to have. The greater the number of elements, the higher the level of probability to which, on average, each of them must be proved. This is thought to be arbitrary and hence objectionable. As two commentators noted, the legal definition of theft contains more elements than that for murder. Criminal law is not the same in all countries. We may take the following as a convenient approximation of what the law is in some countries: murder is (1) an act that caused the death of a person (2) that was done with the intention of causing the death, and to constitute theft, there must be (1) an intention to take property, (2) dishonesty in taking the property, (3) removal of the property from the possession of another person, and (4) lack of consent by that person. Since the offence of theft contains twice the number of elements as compared to murder, the individual elements for theft would have to be proved to a much higher level of probability (in order for the probability of their conjunction to cross the overall threshold) than the individual elements for the much more serious crime of murder (Allen and Leiter 2001: 1504–5). This is intuitively unacceptable.
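The dependence of the per-element burden on the number of elements can be made concrete with a rough calculation. The sketch rests on simplifying assumptions that are not part of the legal doctrine: the elements are treated as independent, each is proved to the same probability, and the overall threshold is taken to be the 0.95 figure assumed earlier for proof beyond reasonable doubt. The function name `per_element_threshold` is hypothetical.

```python
def per_element_threshold(overall_threshold, n_elements):
    """Probability each element must reach so that the product of
    n independent, equally probable elements equals the overall threshold."""
    return overall_threshold ** (1.0 / n_elements)


OVERALL_THRESHOLD = 0.95  # assumed threshold for proof beyond reasonable doubt

murder_elements = 2  # as in the approximate definition given in the text
theft_elements = 4

print(round(per_element_threshold(OVERALL_THRESHOLD, murder_elements), 3))  # 0.975
print(round(per_element_threshold(OVERALL_THRESHOLD, theft_elements), 3))   # 0.987
```

Under these assumptions each element of theft must be proved to a higher probability than each element of murder, which is the result that Allen and Leiter find objectionable.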

Another proposal for resolving the conjunction paradox is to move away from thinking of the standard of proof as a quantified threshold of absolute probability and to construe it, instead, as a probability ratio. The fact-finder has to compare the probability of the evidence adduced at the trial under the plaintiff’s theory of the case with the probability of the evidence under the defendant’s theory of the case (the two need not add to 1), and award the verdict to the side with a higher probability (Cheng 2013). One criticism of this interpretation of the standard of proof is that it ignores, and does not provide a basis for ignoring, the margin by which one probability exceeds the other, and the difference in probability may vary significantly for different elements of the case (Allen and Stein 2013: 598).

There is a deeper problem with the probabilistic conception of the standard of proof. There does not seem to be a satisfactory interpretation of probability that suits the forensic context. The only plausible candidate is the subjective meaning of probability according to which probability is construed as the strength of belief. The evidence is sufficient to satisfy the legal standard of proof on a disputed question of fact (for example, it is sufficient to justify the positive finding of fact that the accused killed the victim) only if the fact-finder, having considered the evidence, forms a sufficiently strong belief that the accused killed the victim. Guidance on how to process evidence and form beliefs can be found in a mathematical theorem known as Bayes’ theorem; it is the method by which an ideal rational fact-finder would revise or update his beliefs in the light of new evidence.^[19] To return to our earlier hypothetical scenario, suppose the fact-finder initially believes that the odds of the accused being guilty are 1:1 (“prior odds”) or, putting this differently, that there is a 0.5 probability of guilt. The fact-finder then receives evidence that blood of type A was found at the scene of the crime and that the accused has type A blood. Fifty percent of the population has this blood type. On the Bayesian approach, the posterior odds are calculated by multiplying the prior odds (1:1) by the likelihood ratio (which, as we saw in section 2.1.2 above, is 2:1). The fact-finder’s belief in the odds of guilt should now be revised to 2:1; the probability of guilt is now increased to 0.67 (Lempert 1977).
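The blood type calculation uses the odds form of Bayes’ theorem, which can be sketched as follows. The function names are hypothetical. The likelihood ratio of 2:1 is taken from the text; the decomposition of that ratio into a probability of 1.0 that the blood types match if the accused is guilty and 0.5 if he is innocent is an assumption made here for illustration.

```python
def posterior_odds(prior_odds, likelihood_ratio):
    """Odds form of Bayes' theorem: posterior odds = prior odds x likelihood ratio."""
    return prior_odds * likelihood_ratio


def odds_to_probability(odds):
    """Convert odds in favour (e.g. 2.0 for 2:1) into a probability."""
    return odds / (1.0 + odds)


prior = 1.0  # prior odds of guilt of 1:1

# Assumed decomposition of the 2:1 likelihood ratio:
p_match_if_guilty = 1.0    # the accused left the blood, so the types match
p_match_if_innocent = 0.5  # half of the population has type A blood
likelihood_ratio = p_match_if_guilty / p_match_if_innocent

posterior = posterior_odds(prior, likelihood_ratio)

print(posterior)                                 # 2.0
print(round(odds_to_probability(posterior), 2))  # 0.67
```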

The subjectivist Bayesian theory of legal fact-finding has come under attack (see generally Amaya 2015: 82–93; Pardo 2013: 591). First, as we already saw in section 3.1, ascertainment of the likelihood ratios is highly problematic. Secondly, the Bayesian theory is not sensitive to the weight of evidence which, roughly put, is the amount of evidence that is available. This criticism and the concept of weight are further explored in section 3.3.

Thirdly, while Bayes’ theorem offers a method for updating probabilities in the light of new evidence, it is silent on what the initial probability should be. In a trial setting, the initial probability cannot be set at zero since this means certainty in the innocence of the accused. No new evidence can then make any difference; whatever the likelihood ratio of the evidence, multiplying it by zero (the prior probability) will still end up with a posterior probability of zero. On the other hand, starting with a non-zero initial probability is also problematic. This is especially so in a criminal case. To start a trial with some probability of guilt is to have the fact-finder harbouring some initial belief that the accused is guilty and this is not easy to reconcile with the presumption of innocence. (Tribe 1971: 1368–1372; cf. Posner 1999: 1514, suggesting starting the trial with prior odds of 50:50, criticized by Friedman 2000. The problem of fixing the prior probability is said to disappear if we base fact-finding simply on likelihood ratios: Sullivan 2019: 45–59.)
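The point that a zero prior is immune to revision can be checked directly. The following is a minimal sketch in which the function name `update_probability` is hypothetical and the likelihood ratios are arbitrary illustrative figures.

```python
def update_probability(prior, likelihood_ratio):
    """Posterior probability of guilt from a prior probability and a
    likelihood ratio, using Bayes' theorem."""
    weighted_prior = likelihood_ratio * prior
    return weighted_prior / (weighted_prior + (1.0 - prior))


# With a prior of zero, no likelihood ratio can move the posterior.
for likelihood_ratio in (2.0, 1_000.0, 1_000_000.0):
    print(likelihood_ratio, update_probability(0.0, likelihood_ratio))
    # the posterior is 0.0 in every case

# With a prior of 0.5, the same evidence does move the posterior.
print(round(update_probability(0.5, 2.0), 2))  # 0.67
```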

Fourthly, we have thus far relied for ease of illustration on highly simplified—and therefore unrealistic—examples. In real cases, there are normally multiple and dependent items of evidence and the probabilities of all possible conjunctions of these items, which are numerous, will have to be computed. These computations are far too complex to be undertaken by human beings (Callen 1982: 10–15). The impossibility of complying with the Bayesian model undermines its prescriptive value.

Fifthly, according to Haack, the Bayesian theory has it the wrong way round. What matters is not the strength of the fact-finder’s belief itself. The standard of proof should be understood instead in terms of what it is reasonable for the fact-finder to believe in the light of the evidence presented, and this is a matter of the degree to which the belief is warranted by the evidence. Evidence is legally sufficient where it warrants the contested factual claim to the degree required by law. Whether a factual claim is warranted by the evidence turns on how strongly the evidence supports the claim, on how independently secure the evidence is, and on how much of the relevant evidence is available to the fact-finder (that is, the comprehensiveness of the evidence—see further discussion in section 3.3 below). Haack is against identifying degrees of warrant with mathematical probabilities. Degrees of warrant do not conform to the axioms of the standard probability calculus. For instance, where the evidence is weak, neither p nor not-p may be warranted; in contrast, the probability of p and the probability of not-p must add up to 1. Further, where the probability of p and the probability of q are both less than 1, the probability of p and q, being the product of the probability of p and the probability of q, is less than the probability of either. On the other hand, the degree of warrant for the conjunction of p and q may be higher than the warrant for either.^[20] (See Haack 2004, 2008a,b, 2012, 2014 for the legal application of her general theory of epistemology. For her general theory of epistemology, see Haack 1993: ch. 4; Haack 2009: ch. 4; Haack 2003: ch. 3.)

Sixthly, research in experimental psychology suggests that fact-finders do not evaluate pieces of evidence one-by-one and in the unidirectional manner required under the mathematical model (Amaya 2015: 114–5). A holistic approach is taken instead where the discrete items of evidence are integrated into large cognitive structures (variously labelled as “mental models”, “stories”, “narratives” and “theories of the case”), and they are assessed globally against the legal definition of the crime or civil claim that is in dispute (Pennington and Hastie 1991, 1993; Pardo 2000). The reasoning does not progress linearly from evidence to a conclusion; it is bi-directional, going forward and backward: as the fact-finder’s consideration of the evidence inclines him towards a particular verdict, his leaning towards that conclusion will often produce a revision of his original perception and his assessment of the evidence (Simon 2004, 2011).

The holistic nature of evidential reasoning as revealed by these studies has inspired alternative theories that are of a non-mathematical nature. One alternative, already mentioned, is the “explanatory” or “relative plausibility” theory advanced by Allen together with Pardo and other collaborators (Allen 1986, 1991, 1994; Pardo 2000; Allen and Leiter 2001; Allen and Jehl 2003; Pardo and Allen 2008; Allen and Pardo 2019; cf. Nance 2001, Friedman 2001).^[21] They contend that fact-finders do not reason in the fashion portrayed by the Bayesian model. Instead, they engage in generating explanations or hypotheses on the available evidence by a process of abductive reasoning or drawing “inferences to the best explanation”, and these competing explanations or hypotheses are compared in the light of the evidence.^[22] The comparison is not of a hypothesis with the negation of that hypothesis, where the probability of a hypothesis is compared with the probability of its negation. Instead, the comparison is of one hypothesis with one or more particular alternative hypotheses as advocated by a party or as constructed by the fact-finder himself. On this approach, the plausibility of X, the factual account of the case that establishes the accused’s guilt or defendant’s liability, is compared with the plausibility of a hypothesis Y, a specific alternative account that points to the accused’s innocence or the defendant’s non-liability, and there may be more than one such specific alternative account.

On this theory, the evidence is sufficient to satisfy the preponderance of proof standard when the best-available hypothesis that explains the evidence and the underlying events includes all of the elements of the claim. Thus, in a negligence case, the best-available hypothesis would have to include a breach of duty of care by the defendant and causation of harm to the plaintiff as these are the elements that must be proved to succeed in the legal claim. For the intermediate “clear-and-convincing” standard of proof, the best-available explanation must be substantially better than the alternatives. To establish the standard of proof beyond reasonable doubt, there must be a plausible explanation of the evidence that includes all of the elements of the crime and, in addition, there must be no plausible explanation that is consistent with innocence (Pardo and Allen 2008: 238–240; Pardo 2013: 603–604).

The relative plausibility theory itself is perceived to have a number of shortcomings.^[23] First, the theory portrays the assessment of plausibility as an exercise of judgment that involves employment of various criteria such as coherence, consistency, simplicity, consilience, and more. However, the theory is sketchy on the meaning of plausibility and the criteria for evaluating plausibility are left largely unanalyzed.^[24]

A second criticism of the relative plausibility theory is that, despite the purported utilisation of “inference to the best explanation” reasoning, the verdict is not controlled by the best explanation. For instance, even if the prosecution’s hypothesis is better than the defence’s hypothesis, neither may be very good. In these circumstances, the court must reject the prosecution’s hypothesis even though it is the best of alternatives (Laudan 2007). One suggested mitigation of this criticism is to place some demand on the epistemic effort that the trier of fact must take (for example, by being sufficiently diligent and thorough) in constructing the set of hypotheses from which the best is to be chosen (Amaya 2009: 155).

The third criticism is targeted at holistic theories of evidential reasoning in general and not specifically at the relative plausibility theory. While it may be descriptively true that fact-finders decide verdicts by holistic evaluation of the plausibility of competing explanations, hypotheses, narratives or factual theories that are generated from the evidence, such forms of reasoning may conceal bias and prejudice that stand greater chances of exposure under a systematic approach such as Bayesian analysis (Twining 2006: 319; Simon 2004, 2011; Griffin 2013). A hypothesis constructed by the fact-finder may be shaped subconsciously by a prejudicial generalisation or background belief about the accused based on a certain feature, say, his race or sexual history. Individuating this feature and subjecting it to Bayesian scrutiny has the desirable effect of putting the generalisation or background belief under the spotlight and forcing the fact-finder to confront the problem of prejudice.

3.3 The Weight of Evidence as the Degree of Evidential Completeness

A third idea of evidential weight is prompted by this insight from Keynes (1921: 71):

As the relevant evidence at our disposal increases, the magnitude of the probability of the argument may either decrease or increase, according as the new knowledge strengthens the unfavourable or the favourable evidence; but something seems to have increased in either case,—we have a more substantial basis upon which to rest our conclusion. I express this by saying that an accession of new evidence increases the weight of an argument. New evidence will sometimes decrease the probability of an argument, but it will always increase its “weight”.

This idea of evidential weight has been applied by some legal scholars in assessing the sufficiency of evidence in satisfying legal standards of proof.^[25] At its simplest, we may think of weight in the context of legal fact-finding as the amount of evidence before the court. Weight is distinguishable from probability. The weight of evidence may be high and the mathematical probability low, as in the situation where the prosecution adduces a great deal of evidence tending to incriminate the accused but the defence has an unshakeable alibi (Cohen 1986: 641). Conversely, the state of evidence adduced in a case might establish a sufficient degree of probability, high enough to cross the supposed threshold of proof on the mathematical conception of the standard of proof, and yet lack adequate weight. In the much-discussed gate-crasher’s paradox, the only available evidence shows that the defendant was one of a thousand spectators at a rodeo show and that only four hundred and ninety-nine tickets were issued. The defendant is sued by the show organiser for gate-crashing. The mathematical probability that the defendant was a gate-crasher is 0.501 and this meets the probabilistic threshold for civil liability. But, according to the negation principle of mathematical probability, there is a probability of 0.499 that the defendant did pay for his entrance. In these circumstances, it is intuitively unjust to find him liable (Cohen 1977: 75). A possible explanation for not finding him liable is that the evidence is too flimsy or of insufficient weight.

Proponents of the mathematical conception of the standard of proof have stood their ground even while acknowledging that weight has a role to play in the Bayesian analysis of probative value and the sufficiency of evidence. If a party does not produce relevant evidence that is in his possession, resulting in the court facing an evidential deficiency, it may draw an adverse inference against him when computing the posterior probability (Kaye 1986b: 667; Friedman 1997). One criticism of this approach is that, in the absence of information about the missing evidence, the drawing of the adverse inference is open to the objection of arbitrariness (Nance 2008: 274). A further objection is that the management of parties’ conduct relating to evidence preservation and presentation should be left to judges and not to the jury. What a judge may do to optimize evidential weight is to impose a burden of producing evidence on a party and to make the party suffer an adverse finding of fact if he fails to produce the evidence. This will serve as an incentive for the party to act in a manner that promotes the interest in evidential completeness (Nance 2008, 2010, 2016).

Cohen suggests that the standard of proof should be conceived entirely as a matter of evidential weight which, on his theory, is a matter of the number of tests or challenges to which a factual hypothesis is subjected in court. He offers an account of legal fact-finding in terms of an account of inductive probability that was inspired by the work of writers such as Francis Bacon and J.S. Mill. Inductive probability operates differently from the classical calculus of probability. It is based on inductive support for the common-sense generalisation that licenses the drawing of the relevant inference. Inductive support for a generalisation is graded according to the number of tests that it has passed, or, putting this in another way, by the degree of its resistance to falsification by relevant variables. The inductive probability of an argument is equal to the reliability grade of the inductive support for the generalisation which covers the argument.

Proof beyond reasonable doubt represents the maximum level of inductive probability. The prosecution may try to persuade the court to infer that the accused was guilty of burglary by producing evidence to establish that he was found in the vicinity of the victim’s house late at night with the stolen object on him. This inference is licensed by the generalisation that normally if a stranger is found immediately after a burglary in possession of the stolen object, he intentionally removed it himself. The defence may try to defeat the inference by showing that the generalisation does not apply in the particular case, for example, by presenting evidence to show that the accused had found the object on the street. The prosecution’s hypothesis is now challenged or put to the test. As a counter-move, it may produce evidence to establish that the object could not have been lying in the street as alleged. If the generalisations on which the prosecution’s case rests survive challenges by the defence at every possible point, then guilt is proved beyond reasonable doubt.^[26] The same reasoning structure applies in the civil context except that in a civil case, the plaintiff succeeds in proof on the preponderance of evidence so long as the conclusion to be proved by him is more inductively probable than its negation. (Cohen 1977, 1986; cf. Schum 1979.)^[27]

Cohen’s theory seems to require that each test to which a hypothesis is put can be unequivocally and objectively resolved. But usually this is not the case. In our example, we may not be entirely convinced that the accused found or did not find the object on the street, and our evaluation would involve the exercise of judgment that is no less subjective than the sort of judgments required when applying the standard probabilistic conception of proof (Nance 2008: 275–6; Schum 1994: 261).

Bibliography

Abimbola, A., 2001, “Abductive Reasoning in Law: Taxonomy and Inference to the Best Explanation”, Cardozo Law Review, 22: 1683–1689.
Aitken, C., P. Roberts, and G. Jackson, 2010, Fundamentals of Probability and Statistical Evidence in Criminal Proceedings: Guidance for Judges, Lawyers, Forensic Scientists and Expert Witnesses, London: Royal Statistical Society. [Aitken, Roberts, and Jackson 2010 available online]
Allen, R., 1986, “A Reconceptualization of Civil Trials”, Boston University Law Review, 66: 401–437.
–––, 1991, “The Nature of Juridical Proof”, Cardozo Law Review, 13: 373–422.
–––, 1992, “The Myth of Conditional Relevancy”, Loyola of Los Angeles Law Review, 25: 871–884.
–––, 1994, “Factual Ambiguity and a Theory of Evidence”, Northwestern University Law Review, 88: 604–640.
–––, 2020, “Naturalized Epistemology and the Law of Evidence Revisited”, Quaestio Facti: International Journal on Evidential Reasoning, 2: 1–32.
Allen, R. and S. Jehl, 2003, “Burdens of Persuasion in Civil Cases: Algorithms v. Explanations”, Michigan State Law Review, 4: 893–944.
Allen, R. and B. Leiter, 2001, “Naturalized Epistemology and the Law of Evidence”, Virginia Law Review, 87: 1491–1550.
Allen, R. and M. Pardo, 2007a, “The Problematic Value of Mathematical Models of Evidence”, Journal of Legal Studies, 36: 107–140.
–––, 2007b, “Probability, Explanation and Inference: a Reply”, International Journal of Evidence and Proof, 11: 307–317.
–––, 2019, “Relative Plausibility and its Critics”, International Journal of Evidence and Proof, 23: 5–59.
Allen, R. and P. Roberts (eds.), 2007, International Journal of Evidence and Proof (Special Issue on the Reference Class Problem), vol. 11, no.4.
Allen, R. and A. Stein, 2013, “Evidence, Probability and the Burden of Proof”, Arizona Law Review, 55: 557–602.
Amaya, A., 2008, “Justification, Coherence, and Epistemic Responsibility in Legal Fact-finding”, Episteme, 5: 306–319.
–––, 2009, “Inference to the Best Explanation”, in Legal Evidence and Proof: Statistics, Stories and Logic, H. Kaptein, H. Prakken, and B. Verheij (eds.), Burlington: Ashgate, pp. 135–159.
–––, 2011, “Legal Justification by Optimal Coherence”, Ratio Juris, 24: 304–329.
–––, 2013, “Coherence, Evidence, and Legal Proof”, Legal Theory, 19: 1–43.
–––, 2015, The Tapestry of Reason: An Inquiry into the Nature of Coherence and its Role in Legal Argument, Oxford and Portland: Hart.
Anderson, T., D. Schum, and W. Twining, 2009, Analysis of Evidence, Cambridge: Cambridge University Press, 3^rd edition.
Ball, V., 1980, “The Myth of Conditional Relevancy”, Georgia Law Review, 14: 435–469.
Bartels, R., 1981, “Punishment and the Burden of Proof in Criminal Cases: A Modest Proposal”, Iowa Law Review, 66: 899–930.
Beltrán, J., 2006, “Legal Proof and Fact Finders’ Beliefs”, Legal Theory, 12: 293–314.
Bentham, J., 1825, A Treatise on Judicial Evidence, M. Dumont (ed.), London: Paget.
–––, 1827, Rationale of Judicial Evidence, Specially Applied to English Practice, J. Mill (ed.), London: Hunt and Clarke.
Blackstone, W., 1770, Commentaries on the Laws of England, vol. 4, Dublin.
Blome-Tillmann, M., 2017, “‘More Likely Than Not’ – Knowledge First and the Role of Bare Statistical Evidence in Courts of Law”, in Knowledge First: Approaches in Epistemology and Mind, J. Carter, E. Gordon, and B. Jarvis (eds.), Oxford: Oxford University Press, pp. 278–292.
Buchak, L., 2014, “Belief, Credence, and Norms”, Philosophical Studies, 169: 285–311.
Callen, C., 1982, “Notes on a Grand Illusion: Some Limits on the Use of Bayesian Theory in Evidence Law”, Indiana Law Journal, 57: 1–44.
Cheng, E., 2009, “A Practical Solution to the Reference Class Problem”, Columbia Law Review, 109: 2081–2105.
–––, 2013, “Reconceptualizing the Burden of Proof”, Yale Law Journal, 122: 1254–1279.
Cohen, L., 1977, The Probable and the Provable, Oxford: Oxford University Press.
–––, 1986, “The Role of Evidential Weight in Criminal Proof”, Boston University Law Review, 66: 635–649.
–––, 1991, “Should a Jury Say What It Believes or What It Accepts?”, Cardozo Law Review, 13: 465–483.
–––, 1992, An Essay on Belief and Acceptance, Oxford: Clarendon Press.
Colyvan, M., H. Regan, and S. Ferson, 2001, “Is it a Crime to Belong to a Reference Class?”, Journal of Political Philosophy, 9: 168–181.
Cullison, A., 1969, “Probability Analysis of Judicial Fact-finding: A Preliminary Outline of the Subjective Approach”, Toledo Law Review, 1: 538–598.
Damaška, M., 1973, “Evidentiary Barriers to Conviction and Two Models of Criminal Procedure: A Comparative Study”, University of Pennsylvania Law Review, 121: 506–589.
–––, 1975, “Presentation of Evidence and Factfinding Precision”, University of Pennsylvania Law Review, 123: 1083–1105.
–––, 1992, “Of Hearsay and Its Analogues”, Minnesota Law Review, 76: 425–458.
–––, 1994, “Propensity Evidence in Continental Legal Systems”, Chicago-Kent Law Review, 70: 55–67.
–––, 1997, Evidence Law Adrift, New Haven: Yale University Press.
–––, 2006, “The Jury and the Law of Evidence: Real and Imagined Interconnections”, Law, Probability and Risk, 5: 255–265.
–––, 2019, Evaluation of Evidence: Pre-modern and Modern Approaches, Cambridge: Cambridge University Press.
Davis, D. and W. Follette, 2002, “Rethinking the Probative Value of Evidence: Base Rates, Intuitive Profiling and the ‘Postdiction’ of Behavior”, Law and Human Behavior, 26: 133–158.
–––, 2003, “Toward an Empirical Approach to Evidentiary Ruling”, Law and Human Behavior, 27: 661–684.
Dawid, P., W. Twining, and M. Vasilaki, 2011, Evidence, Inference and Enquiry, Oxford: Oxford University Press for the British Academy.
Duff, A., et al., 2007, The Trial on Trial (Volume 3: Towards a Normative Theory of the Criminal Trial), Oxford: Hart.
Dworkin, R., 1981, “Principle, Policy, Procedure”, in Crime, Proof and Punishment, Essays in Memory of Sir Rupert Cross, C. Tapper (ed.), London: Butterworths, pp. 193–225.
Eggleston, R., 1983, Evidence, Probability and Proof, London: Weidenfeld & Nicolson, 2^nd edition.
Enoch, D., L. Spectre, and T. Fisher, 2012, “Statistical Evidence, Sensitivity, and the Legal Value of Knowledge”, Philosophy and Public Affairs, 40(3): 197–224.
Enoch, D. and L. Spectre, 2019, “Sensitivity, Safety, and the Law: a Reply to Pardo”, Legal Theory, 25: 178–199.
Enoch, D. and T. Fisher, 2015, “Sense and ‘Sensitivity’: Epistemic and Instrumental Approaches to Statistical Evidence”, Stanford Law Review, 67: 557–611.
Finkelstein, M. and B. Levin, 2003, “On the Probative Value of Evidence from a Screening Search”, Jurimetrics, 43: 265–290.
Franklin, J., 2010, “Feature Selection Methods for Solving the Reference Class Problem: Comment on Edward K. Cheng, ‘A Practical Solution to the Reference Class Problem’”, Columbia Law Review Sidebar, 110: 12–23.
–––, 2011, “The Objective Bayesian Conceptualisation of Proof and Reference Class Problems”, Sydney Law Review, 33: 545–561.
–––, 2012, “Discussion Paper: How much of Commonsense and Legal Reasoning is Formalizable? A Review of Conceptual Obstacles”, Law, Probability and Risk, 11: 225–245.
Friedman, R., 1986, “A Close Look at Probative Value”, Boston University Law Review, 66: 733–759.
–––, 1994, “Conditional Probative Value: Neoclassicism Without Myth”, Michigan Law Review, 93: 439–484.
–––, 1997, “Dealing with Evidential Deficiency”, Cardozo Law Review, 18: 1961–1986.
–––, 2000, “A Presumption of Innocence, Not of Even Odds”, Stanford Law Review, 52: 873–887.
–––, 2001, “‘E’ is for Eclectic: Multiple Perspectives on Evidence”, Virginia Law Review, 87: 2029–2054.
Friedman, R. and R. Park, 2003, “Sometimes What Everybody Thinks They Know Is True”, Law and Human Behavior, 27: 629–644.
Gardiner, G., 2019a, “The Reasonable and the Relevant: Legal Standards of Proof”, Philosophy and Public Affairs, 47: 288–318.
–––, 2019b, “Legal Burdens of Proof and Statistical Evidence”, in The Routledge Handbook of Applied Epistemology, D. Coady and J. Chase (eds.), Oxford: Routledge.
–––, 2020, “Profiling and Proof: Are Statistics Safe?”, Philosophy, 95: 161–183.
–––, forthcoming, “Legal Evidence and Knowledge”, in M. Lasonen-Aarnio and C. Littlejohn (eds.), The Routledge Handbook of the Philosophy of Evidence, Oxford: Routledge.
Goldman, A., 1999, Knowledge in a Social World, Oxford: Oxford University Press.
–––, 2002, “Quasi-Objective Bayesianism and Legal Evidence”, Jurimetrics, 42: 237–260.
–––, 2005, “Legal Evidence” in The Blackwell Guide to the Philosophy of Law and Legal Theory, M. Golding and W. Edmundson (eds.), Malden, MA: Blackwell, pp. 163–175.
Griffin, L., 2013, “Narrative, Truth, and Trial”, Georgetown Law Journal, 101: 281–335.
Haack, S., 1993, Evidence and Inquiry: Towards Reconstruction in Epistemology, Oxford: Blackwell.
–––, 2003, “Clues to the Puzzle of Scientific Evidence: a More-So Story” in S. Haack, Defending Science: Within Reasons, New York: Prometheus, pp. 57–91.
–––, 2004, “Epistemology Legalized: or, Truth, Justice and the American Way”, American Journal of Jurisprudence, 49: 43–61.
–––, 2008a, “Proving Causation: The Holism of Warrant and the Atomism of Daubert”, Journal of Health and Biomedical Law, 4: 253–289.
–––, 2008b, “Warrant, Causation, and the Atomism of Evidence Law”, Episteme, 5: 253–266.
–––, 2009, Evidence and Inquiry: A Pragmatist Reconstruction of Epistemology, New York: Prometheus (expanded edition of Haack 1993).
–––, 2012, “The Embedded Epistemologist: Dispatches from the Legal Front”, Ratio Juris, 25: 206–235.
–––, 2014, “Legal Probabilism: An Epistemological Dissent” in S. Haack, Evidence Matters: Science, Proof, and Truth in the Law, Cambridge: Cambridge University Press, pp. 47–77.
Ho, H.L., 2003–2004, “The Legitimacy of Medieval Proof”, Journal of Law and Religion, 19: 259–298.
–––, 2008, A Philosophy of Evidence Law: Justice in the Search for Truth, Oxford: Oxford University Press.
Jackson, J. and S. Doran, 2010, “Evidence” in A Companion to Philosophy of Law and Legal Theory, 2^nd edition, D. Patterson (ed.), Malden, MA: Wiley-Blackwell, pp. 177–187.
James, G., 1941, “Relevancy, Probability and the Law”, California Law Review, 29: 689–705.
Josephson, J., 2001, “On the Proof Dynamics of Inference to the Best Explanation”, Cardozo Law Review, 22: 1621–1643.
Kaplan, J., 1968, “Decision Theory and the Fact-finding Process”, Stanford Law Review, 20: 1065–1092.
Kaplow, L., 2012, “Burden of Proof”, Yale Law Journal, 121: 738–859.
Kaye, D., 1986a, “Quantifying Probative Value”, Boston University Law Review, 66: 761–766.
–––, 1986b, “Do We Need a Calculus of Weight to Understand Proof Beyond Reasonable Doubt?”, Boston University Law Review, 66: 657–672.
Kaye, D. and J. Koehler, 2003, “The Misquantification of Probative Value”, Law and Human Behavior, 27: 645–659.
Keynes, J., 1921, A Treatise on Probability, London: Macmillan.
Laudan, L., 2006, Truth, Error, and Criminal Law: An Essay in Legal Epistemology, Cambridge: Cambridge University Press.
–––, 2007, “Strange Bedfellows: Inference to the Best Explanation and the Criminal Standard of Proof”, International Journal of Evidence and Proof, 11: 292–306.
Laudan, L. and H. Saunders, 2009, “Re-Thinking the Criminal Standard of Proof: Seeking Consensus about the Utilities of Trial Outcomes”, International Commentary on Evidence, 7(2), article 1 (online journal).
Lawson, G., 2017, Evidence of the Law: Proving Legal Claims, Chicago: University of Chicago Press.
Leiter, B., 1997, “Why Even Good Philosophy of Science Would Not Make for Good Philosophy of Evidence”, Brigham Young University Law Review, 803–819.
Lempert, R., 1977, “Modeling Relevance”, Michigan Law Review, 75: 1021–1057.
Lillquist, E., 2002, “Recasting Reasonable Doubt: Decision Theory and the Virtues of Variability”, University of California Davis Law Review, 36: 85–197.
Littlejohn, C., 2020, “Truth, Knowledge, and the Standard of Proof in Criminal Law”, Synthese, 197: 5253–5286.
–––, 2021, “Justified Belief and Just Conviction” in The Social Epistemology of Legal Trials, Z. Hoskins and J. Robson (eds.), New York: Routledge, pp. 106–123.
MacCrimmon, M., 2001–2002, “What is ‘Common’ about Common Sense?: Cautionary Tales for Travelers Crossing Disciplinary Boundaries”, Cardozo Law Review, 22: 1433–1460.
McCormick, C., 2013, McCormick on Evidence, K. Broun et al. (eds.), St. Paul, Minnesota: Thomson Reuters/WestLaw, 7^th edition.
McNamara, P., 1986, “The Canons of Evidence: Rules of Exclusion or Rules of Use?”, Adelaide Law Review, 10: 341–364.
Mnookin, J., 2006, “Bifurcation and the Law of Evidence”, University of Pennsylvania Law Review PENNumbra, 155: 134–145.
–––, 2013, “Atomism, Holism, and the Judicial Assessment of Evidence”, University of California at Los Angeles Law Review, 60: 1524–1585.
Montrose, J., 1954, “Basic Concepts of the Law of Evidence”, Law Quarterly Review, 70: 527–555.
Morgan, E., 1929, “Functions of Judge and Jury in the Determination of Preliminary Questions of Fact”, Harvard Law Review, 43: 165–191.
–––, 1936–37, “The Jury and the Exclusionary Rules of Evidence”, University of Chicago Law Review, 4: 247–258.
Moss, S., 2018, Probabilistic Knowledge, Oxford: Oxford University Press.
–––, forthcoming, “Knowledge and Legal Proof” in Oxford Studies in Epistemology (Volume 7), T. Gendler and J. Hawthorne (eds.), Oxford: Oxford University Press.
Nance, D., 1988, “The Best Evidence Principle”, Iowa Law Review, 73: 227–297.
–––, 1990, “Conditional Relevance Reinterpreted”, Boston University Law Review, 70: 447–507.
–––, 2001, “Naturalized Epistemology and the Critique of Evidence Theory”, Virginia Law Review, 87: 1551–1618.
–––, 2007a, “Allocating the Risk of Error”, Legal Theory, 13: 129–164.
–––, 2007b, “The Reference Class Problem and Mathematical Models of Inference”, International Journal of Evidence and Proof, 11: 259–273.
–––, 2008, “The Weights of Evidence”, Episteme, 5: 267–281.
–––, 2010, “Adverse Inferences About Adverse Inferences: Restructuring Juridical Roles for Responding to Evidence Tampering by Parties to Litigation”, Boston University Law Review, 90: 1089–1146.
–––, 2016, The Burdens of Proof – Discriminatory Power, Weight of Evidence and Tenacity of Belief, Cambridge: Cambridge University Press.
Nance, D. and S. Morris, 2002, “An Empirical Assessment of Presentation Formats for Trace Evidence with a Relatively Large and Quantifiable Random Match Probability”, Jurimetrics, 42: 403–447.
Nelkin, D., 2021, “Rational Belief and Statistical Evidence — Blame, Bias and the Law” in Lotteries, Knowledge, and Rational Belief, I. Douven (ed.), Cambridge: Cambridge University Press, pp. 6–27.
Nesson, C., 1979, “Reasonable Doubt and Permissive Inferences: the Value of Complexity”, Harvard Law Review, 92: 1187–1225.
–––, 1985, “The Evidence or the Event? On Judicial Proof and the Acceptability of Verdicts”, Harvard Law Review, 98: 1357–1392.
Pardo, M., 2000, “Juridical Proof, Evidence, and Pragmatic Meaning: Toward Evidentiary Holism”, Northwestern University Law Review, 95: 399–442.
–––, 2005, “The Field of Evidence and the Field of Knowledge”, Law and Philosophy, 24: 321–392.
–––, 2007, “The Political Morality of Evidence Law”, International Commentary on Evidence, 5(2), essay 1 (online journal).
–––, 2010, “The Gettier Problem and Legal Proof”, Legal Theory, 16: 37–57.
–––, 2013, “The Nature and Purpose of Evidence Theory”, Vanderbilt Law Review, 66: 547–613.
–––, 2018, “Safety vs. Sensitivity: Possible Worlds and the Law of Evidence”, Legal Theory, 24: 50–75.
Pardo, M.S. and R.J. Allen, 2008, “Juridical Proof and the Best Explanation”, Law and Philosophy, 27: 223–268.
Park, R., 1986, “The Hearsay Rule and the Stability of Verdicts: A Response to Professor Nesson”, Minnesota Law Review, 70: 1057–1072.
Park, R. et al., 2010, “Bayes Wars Redivivus: An Exchange”, International Commentary on Evidence, 8(1), article 1 (online journal).
Pattenden, R., 1996–7, “The Discretionary Exclusion of Relevant Evidence in English Civil Proceedings”, International Journal of Evidence and Proof, 1: 361–385.
Pennington, N. and R. Hastie, 1991, “A Cognitive Model of Juror Decision Making: The Story Model”, Cardozo Law Review, 13: 519–557.
–––, 1993, “The Story Model for Juror Decision-making” in Inside the Juror: The Psychology of Juror Decision Making, R. Hastie (ed.), Cambridge: Cambridge University Press, pp. 192–221.
Picinali, F., 2013, “Two Meanings of ‘Reasonableness’: Dispelling the ‘Floating’ Reasonable Doubt”, Modern Law Review, 76: 845–875.
Pollock, F., 1876, “Stephen’s Digest of the Law of Evidence”, The Fortnightly Review, 20: 383–394.
–––, 1899, “Review of A Preliminary Treatise on Evidence at the Common Law by James Bradley Thayer”, Law Quarterly Review, 15: 86–87.
Posner, R., 1999, “An Economic Approach to the Law of Evidence”, Stanford Law Review, 51: 1477–1546.
Pritchard, D., 2015, “Risk”, Metaphilosophy, 46: 436–461.
–––, 2018, “Legal Risk, Legal Evidence and the Arithmetic of Criminal Justice”, Jurisprudence, 9: 108–119.
Rescher, N. and C. Joynt, 1959, “Evidence in History and in the Law”, Journal of Philosophy, 56: 561–578.
Redmayne, M., 1996, “Standards of Proof in Civil Litigation”, Modern Law Review, 62: 167–195.
–––, 2006, “The Structure of Evidence Law”, Oxford Journal of Legal Studies, 26: 805–822.
–––, 2008, “Exploring the Proof Paradoxes”, Legal Theory, 14: 281–309.
Ribeiro, G., 2019, “The Case for Varying Standards of Proof”, San Diego Law Review, 56: 161–219.
Roberts, P. and C. Aitken, 2014, The Logic of Forensic Proof: Inferential Reasoning in Criminal Evidence and Forensic Science, London: Royal Statistical Society. [Roberts and Aitken 2014 available online]
Roberts, P. and A. Zuckerman, 2010, Criminal Evidence, Oxford: Oxford University Press, 2^nd edition.
Robertson, B. and G. Vignaux, 1995, Interpreting Evidence: Evaluating Forensic Science in the Courtroom, Chichester: John Wiley.
Schauer, F., 2006, “On the Supposed Jury-Dependence of Evidence Law”, University of Pennsylvania Law Review, 155: 165–202.
–––, 2008, “In Defense of Rule-Based Evidence Law: And Epistemology Too”, Episteme, 5: 295–305.
Schum, D., 1979, “A Review of a Case Against Blaise Pascal and His Heirs”, Michigan Law Review, 77: 446–483.
–––, 1994, The Evidential Foundations of Probabilistic Reasoning, New York: John Wiley & Sons.
–––, 1998, “Legal Evidence and Inference” in Routledge Encyclopedia of Philosophy, E. Craig (ed.), London: Routledge, pp. 500–506.
–––, 2001, “Species of Abductive Reasoning in Fact Investigation in Law”, Cardozo Law Review, 22: 1645–1681.
Simon, D., 2004, “A Third View of the Black Box: Cognitive Coherence in Legal Decision Making”, University of Chicago Law Review, 71: 511–586.
–––, 2011, “Limited Diagnosticity of Criminal Trials”, Vanderbilt Law Review, 64: 143–223.
Smith, M., 2018, “When Does Evidence Suffice for Conviction?”, Mind, 127: 1193–1218.
Stein, A., 2005, Foundations of Evidence Law, Oxford: Oxford University Press.
Stephen, J., 1872, The Indian Evidence Act, with an Introduction on the Principles of Judicial Evidence, Calcutta: Thacker, Spink & Co.
–––, 1886, A Digest of the Law of Evidence, London: William Clowes & Sons, 5^th edition.
Sullivan, S., 2019, “A Likelihood Story: The Theory of Legal Fact-finding”, University of Colorado Law Review, 90: 1–66.
Thayer, J., 1898, A Preliminary Treatise on Evidence at the Common Law, Boston: Little, Brown & Co.
Thomson, J., 1986, “Liability and Individualized Evidence”, Law and Contemporary Problems, 49(3): 199–219.
Tillers, P., 2005, “If Wishes were Horses: Discursive Comments on Attempts to Prevent Individuals from Being Unfairly Burdened by their Reference Classes”, Law, Probability and Risk, 4: 33–39.
–––, 2008, “Are there Universal Principles or Forms of Evidential Inference? Of Inference Networks and Onto-Epistemology” in Crime, Procedure and Evidence in a Comparative and International Context, J. Jackson, M. Langer, and P. Tillers (eds.), Oxford: Hart, pp. 179–198.
Tillers, P. and E. Green (eds.), 1988, Probability and Inference in the Law of Evidence: The Limits and Uses of Bayesianism, Dordrecht: Kluwer.
Trautman, H., 1952, “Logical or Legal Relevancy: A Conflict in Theory”, Vanderbilt Law Review, 5: 385–413.
Tribe, L., 1971, “Trial by Mathematics: Precision and Ritual in the Legal Process”, Harvard Law Review, 84: 1329–1393.
Twining, W., 1985, Theories of Evidence: Bentham and Wigmore, London: Weidenfeld and Nicolson.
–––, 2006, Rethinking Evidence: Exploratory Essays, Cambridge: Cambridge University Press, 2^nd edition.
Twining, W. and I. Hampsher-Monk, 2003, Evidence and Inference in History and Law: Interdisciplinary Dialogues, Evanston, Illinois: Northwestern University Press.
Whitworth, G., 1881, The Theory of Relevancy for the Purpose of Judicial Evidence, Bombay: Thacker & Co.
Wigmore, J., 1913, “Review of A Treatise on Facts, or the Weight and Value of Evidence by Charles C. Moore”, Illinois Law Review, 3: 477–478.
–––, 1935, A Students’ Textbook of the Law of Evidence, Brooklyn: Foundation Press.
–––, 1937, Science of Judicial Proof, as Given by Logic, Psychology, and General Experience and Illustrated in Judicial Trials, Boston: Little, Brown and Co.
–––, 1983a, Evidence in Trials at Common Law, vol. 1, P. Tillers (ed.), Boston: Little, Brown and Co.
–––, 1983b, Evidence in Trials at Common Law, vol. 1A, P. Tillers (ed.), Boston: Little, Brown and Co.
Wills, W., 1852, An Essay on the Principles of Circumstantial Evidence, Philadelphia: T & J W Johnson, reprint from the third London edition.

Other Internet Resources

Legal Information Institute, at Cornell Law School. This site makes available the full text of the Federal Rules of Evidence with commentaries by the Advisory Committee on Rules.
Statistics and the Law, page at the Royal Statistical Society.
