Scientific Explanation
Issues concerning scientific explanation have been a focus of philosophical attention from Pre-Socratic times through the modern period. However, modern discussion really begins with the development of the Deductive-Nomological (DN) model. This model has had many advocates (including Popper 1959, Braithwaite 1953, Gardiner 1959, Nagel 1961) but unquestionably the most detailed and influential statement is due to Carl Hempel (1942, 1965a, and Hempel & Oppenheim 1948). These papers and the reaction to them have structured subsequent discussion concerning scientific explanation to an extraordinary degree. After some general remarks by way of background and orientation (Section 1), this entry describes the DN model and its extensions, and then turns to some well-known objections (Section 2). It next describes a variety of subsequent attempts to develop alternative models of explanation, including Wesley Salmon’s Statistical Relevance (Section 3) and Causal Mechanical (Section 4) models, Unificationist models due to Michael Friedman and Philip Kitcher (Section 5), and Pragmatic theories found in the work of van Fraassen (Section 6). Section 7 provides a summary and discusses directions for future work. This article thus discusses treatments of scientific explanation up to the end of the twentieth century.
- 1. Background and Introduction
- 2. The DN Model
- 3. The SR Model
- 4. The Causal Mechanical Model
- 5. A Unificationist Account of Explanation
- 6. Pragmatic Theories of Explanation
- 7. Conclusions, Open Issues, and Future Directions
- Bibliography
1. Background and Introduction
As will become apparent, “scientific explanation” is a topic that raises a number of interrelated issues. Some background orientation will be useful before turning to the details of competing models. A presupposition of most recent discussion has been that science sometimes provides explanations (rather than something that falls short of explanation—e.g., “mere description”) and that the task of a “theory” or “model” of scientific explanation is to characterize the structure of such explanations. It is thus assumed that there is (at some suitably abstract and general level of description) a single kind or form of explanation that is “scientific”. In fact, the notion of “scientific explanation” suggests at least two contrasts—first, a contrast between those “explanations” that are characteristic of “science” and those explanations that are not, and, second, a contrast between “explanation” and something else. However, with respect to the first contrast, much of the recent philosophical literature assumes that there is substantial continuity between explanations found in science and some forms of explanation found in ordinary, non-scientific contexts. It is further assumed that it is the task of a theory of explanation to capture what is common to both scientific and some ordinary, non-scientific forms of explanation. These assumptions help to explain (what may otherwise strike the reader as curious) why, as this entry will illustrate, discussions of scientific explanation often move back and forth between examples drawn from bona-fide science (e.g., explanations of the trajectories of the planets that appeal to Newtonian mechanics) and more homey examples (e.g., the tipping over of inkwells).
With respect to the second contrast, most models of explanation assume that it is possible for a set of claims to be true, accurate, supported by evidence, and so on and yet unexplanatory. For example, all of the accounts of scientific explanation described below would agree that an account of the appearance of a particular species of bird of the sort found in a bird guidebook is, however accurate, not an explanation of anything of interest to biologists (such as the development, characteristic features, or behavior of that species). Instead, such an account is “merely descriptive”. However, different models of explanation provide different accounts of what the contrast between the explanatory and merely descriptive consists in.
A related point is that, while most theorists of scientific explanation have proposed models that are intended to cover at least some cases of explanation that we would not think of as part of science, they have nonetheless assumed some implicit restriction on the kinds of explanation they have sought to reconstruct. It has often been noted that the word “explanation” is used in a wide variety of ways in ordinary English: we speak of explaining the meaning of a word, explaining how to bake a pie, explaining why one made a certain decision (where this is to offer a justification) and so on. Although the various models discussed below have sometimes been criticized for their failure to capture all of these forms of “explanation” (see, e.g., Scriven 1959), it is clear that they were never intended to do this. Instead, their intended explicandum is, very roughly, explanations of why things happen, where the “things” in question can be either particular events or something more general (e.g., regularities or repeatable patterns in nature). Paradigms of this sort of explanation include: the explanation for the advance in the perihelion of Mercury provided by General Relativity, the explanation of the extinction of the dinosaurs in terms of the impact of a large asteroid at the end of the Cretaceous period, the explanation provided by the police for why a traffic accident occurred (the driver was speeding and there was ice on the road), and the standard explanation provided in economics textbooks for why monopolies will, in comparison with firms in perfectly competitive markets, raise prices and reduce output.
Finally, a few words about the broader epistemological and methodological background to the models described below. Many philosophers think of concepts like “explanation”, “law”, “cause”, and “support for counterfactuals” as part of an interrelated family of concepts that are “modal” in character. For familiar “empiricist” reasons, Hempel and many other early defenders of the DN model regarded these concepts as not well understood, at least prior to analysis. It was assumed that it would be “circular” to explain one concept from this family in terms of others from the same family and that they must instead be explicated in terms of other concepts from outside the modal family—concepts that more obviously satisfied (what were taken to be) empiricist standards of intelligibility and testability. For example, in Hempel’s version of the DN model, the notion of a “law” plays a key role in explicating the concept of “explanation”, and his assumption is that laws are just regularities that meet certain further conditions that are also acceptable to empiricists. As we shall see, these empiricist standards (and an accompanying unwillingness to employ modal concepts as primitives) have continued to play a central role in the models of explanation developed subsequent to the DN model.
A related issue has to do with whether all scientific explanations are causal and if not, what distinguishes causal from non-causal explanations. Hempel recognized both causal and non-causal forms of explanation but held that both were captured by the DN model; in his view, causal explanations are simply DN explanations that cite causal laws (which he regarded as a proper subset of all laws). Many but not all of the accounts discussed below in effect assume that many of the problems with the DN model can be traced to its commitment to an inadequate account of causation, and thus that getting clearer about causal notions would lead to more adequate accounts of explanation. By contrast, a substantial amount of recent discussion of explanation has moved away from this focus on causation and instead explores the possibility of non-causal forms of explanation.[1]
Suggested Readings: Salmon (1989) is a superb critical survey of all the models of scientific explanation discussed in this entry. Kitcher and Salmon (1989), Pitt (1988), and Ruben (1993) are anthologies that contain a number of influential articles.
2. The DN Model
2.1 The Basic Idea
According to the Deductive-Nomological Model, a scientific explanation consists of two major “constituents”: an explanandum, which is a sentence “describing the phenomenon to be explained” and an explanans, “the class of those sentences which are adduced to account for the phenomenon” (Hempel & Oppenheim 1948 [1965: 247]). For the explanans to successfully explain the explanandum several conditions must be met. First, “the explanandum must be a logical consequence of the explanans” and “the sentences constituting the explanans must be true” (Hempel & Oppenheim 1948 [1965: 248]). That is, the explanation should take the form of a sound deductive argument in which the explanandum follows as a conclusion from the premises in the explanans. This is the “deductive” component of the model. Second, the explanans must contain at least one “law of nature” and this must be an essential premise in the derivation in the sense that the derivation of the explanandum would not be valid if this premise were removed. This is the “nomological” component of the model; “nomological” is a philosophical term of art which, suppressing some niceties, means (roughly) “lawful”. In its most general formulation, the DN model is meant to apply both to the explanation of “general regularities” or “laws” such as (to use Hempel and Oppenheim’s examples) why light conforms to the law of refraction and also to the explanation of particular events, conceived as occurring at a particular time and place, such as the bent appearance of the partially submerged oars of a rowboat on a particular occasion of viewing. As an additional illustration of a DN explanation of a particular event, consider a derivation of the position of Mars at some future time from Newton’s laws of motion, the Newtonian inverse square law governing gravity, and information about the mass of the sun, the mass of Mars and the present position and velocity of each.
In this derivation the various Newtonian laws figure as essential premises and they are used, in conjunction with appropriate information about initial conditions (the masses of Mars and the sun and so on), to derive the explanandum (the future position of Mars) via a deductively valid argument. The DN criteria are thus satisfied.
2.2 The Role of Laws in the DN Model
The notion of a sound deductive argument is (arguably) relatively clear (or at least something that can be regarded as antecedently understood from the point of view of characterizing scientific explanation). But what about the other major component of the DN model—that of a law of nature? The basic intuition that guides the DN model goes something like this: Within the class of true generalizations, we may distinguish between those that are only “accidentally true” and those that are “laws”. To use Hempel’s examples, the generalization
- (1) All members of the Greensbury School Board for 1964 are bald
is, if true, only accidentally so. In contrast,
- (2) All gases expand when heated under constant pressure
is a law. Thus, according to the DN model, the latter generalization can be used, in conjunction with information that some particular sample of gas has been heated under constant pressure, to explain why it has expanded. By contrast, the former generalization (1), in conjunction with the information that a particular person n is a member of the 1964 Greensbury School Board, cannot be used to explain why n is bald.
While this example may seem clear enough, what exactly is it that distinguishes true accidental generalizations from laws? This has been the subject of a great deal of philosophical discussion, most of which must be beyond the scope of this entry.[2] For reasons explained in Section 1, Hempel assumes that an adequate account must explain the notion of law in terms of notions that lie outside the modal family.[3] He considers (1965b) a number of familiar proposals having this character[4] and finds them all wanting, remarking that the problem of characterizing the notion of law has proved “highly recalcitrant” (1965b: 338). It seems fair to say, however, that his underlying assumption is that, at bottom, laws are just exceptionless generalizations describing regularities that meet certain additional distinguishing conditions that he is not at present able to formulate. In subsequent decades, a variety of criteria for lawhood have been proposed. Of these the so-called best systems analysis (Lewis 1973) is probably the most popular, but no single account has won general acceptance. Finding an adequate characterization of lawhood is thus an ongoing issue for the DN model.
One point at which this issue is particularly pressing concerns the explanatory status of the so-called special sciences—biology, psychology, economics and so on. These sciences are full of generalizations that appear to play an explanatory role and yet fail to satisfy many of the standard criteria for lawfulness. For example, although Mendel’s law of segregation (M) (which states that in sexually reproducing organisms each of the two alternative forms (alleles) of a gene specifying a trait at a locus in a given organism has 0.5 probability of ending up in a gamete) is widely used in models in evolutionary biology, it has a number of exceptions, such as meiotic drive. A similar point holds for the principles of rational choice theory (such as the generalization that preferences are transitive) which figure centrally in economics. Other widely used generalizations in the special sciences have very narrow scope in comparison with paradigmatic laws, hold only over restricted spatio-temporal regions, and lack explicit theoretical integration.
There is considerable disagreement over whether such generalizations are laws. Some philosophers (e.g., Woodward 2000) suggest that such generalizations satisfy too few of the standard criteria to count as laws but can nevertheless figure in explanations; if so, it apparently follows that we must abandon the DN requirement that all explanations must appeal to laws. Others (e.g., Mitchell 1997), emphasizing different criteria for lawfulness, conclude instead that generalizations like (M) are laws and hence no threat to the requirement that explanations must invoke laws. In the absence of a more principled account of laws, it is hard to evaluate these competing claims and hence hard to assess the implications of the DN model for the special sciences. At the very least, providing such an account is an important item of unfinished business for advocates of the DN model.
2.3 Inductive Statistical Explanation
The DN model is meant to capture explanation via deduction from deterministic laws and this raises the obvious question of the explanatory status of statistical laws. Do such laws explain at all and if so, what do they explain, and under what conditions? Hempel (1965b) distinguishes two varieties of statistical explanation. The first of these, deductive-statistical (DS) explanation, involves the deduction of “a narrower statistical uniformity” from a more general set of premises, at least one of which involves a more general statistical law. Since DS explanation involves deduction of the explanandum from a law, it conforms to the same general pattern as the DN explanation of regularities. However, in addition to DS explanation, Hempel also recognizes a distinctive sort of statistical explanation, which he calls inductive-statistical or IS explanation, involving the subsumption of individual events (like the recovery of a particular person from streptococcus infection) under (what he regards as) statistical laws (such as a law specifying the probability of recovery, given that penicillin has been taken).
While the explanandum of a DN or DS explanation can be deduced from the explanans, one cannot deduce that some particular individual, John Jones, has recovered from the above statistical law and the information that he has taken penicillin. At most what can be deduced from this information is that recovery is more or less probable. In IS explanation, the relation between explanans and explanandum is, in Hempel’s words, “inductive,” rather than deductive—hence the name inductive-statistical explanation. The details of Hempel’s account are complex, but the underlying idea is roughly this: an IS explanation will be good or successful to the extent that its explanans confers high probability on its explanandum outcome.
Thus if it is a statistical law that the probability of recovery from streptococcus, given that one has taken penicillin, is high, and Jones has taken penicillin and recovered, this information can be used to provide an IS explanation of Jones’s recovery. However, if the probability of recovery is low (e.g., less than 0.5), given that Jones has taken penicillin, then, even if Jones recovers, we cannot use this information to provide an IS explanation of his recovery.
2.4 Motivation for the DN Model: Nomic Expectability and a Regularity Account of Causation
Why suppose that all (or even some) explanations have a DN or IS structure? There are two ideas which play a central motivating role in Hempel’s (1965b) discussion. The first connects the information provided by a DN argument with a certain conception of what it is to achieve understanding of why something happens—it appeals to an idea about the object or point of giving an explanation. Hempel writes
… a DN explanation answers the question “Why did the explanandum-phenomenon occur?” by showing that the phenomenon resulted from certain particular circumstances, specified in \(C_1,C_2,\ldots,C_k\), in accordance with the laws \(L_1,L_2,\ldots,L_r\). By pointing this out, the argument shows that, given the particular circumstances and the laws in question, the occurrence of the phenomenon was to be expected; and it is in this sense that the explanation enables us to understand why the phenomenon occurred. (1965b: 337, italics in original)
One can think of IS explanation as involving a natural generalization of this idea. While an IS explanation does not show that the explanandum-phenomenon was to be expected with certainty, it does the next best thing: it shows that the explanandum-phenomenon is at least to be expected with high probability and in this way provides understanding. Stated more generally, both the DN and IS models share the common idea that, as Salmon (1989) puts it,
the essence of scientific explanation can be described as nomic expectability—that is expectability on the basis of lawful connections. (1989: 57)
The second main motivation for the DN/IS model has to do with the role of causal claims in scientific explanation. There is considerable disagreement among philosophers about whether all explanations in science and in ordinary life are causal and also disagreement about what the distinction (if any) between causal and non-causal explanations consists in. Nonetheless, virtually everyone, including Hempel, agrees that many scientific explanations cite information about causes. However, Hempel, along with most other early advocates of the DN model, is unwilling to take the notion of causation as primitive in the theory of explanation; that is, he is unwilling to simply say that X figures in an explanation of Y if and only if X causes Y. Instead, adherents of the DN model have generally looked for an account of causation that satisfies the empiricist requirements described in Section 1. In particular, advocates of the DN model have generally accepted a broadly Humean or regularity theory of causation, according to which (very roughly) all causal claims imply the existence of some corresponding regularity (a “law”) linking cause to effect. This is then taken to show that all causal explanations “imply,” perhaps only “implicitly,” that such a law/regularity exists and hence that laws are “involved” in all such explanations, just as the DN model claims.
To illustrate this line of argument, consider
- (3) The impact of my knee on the desk caused the tipping over of the inkwell.
(3) is a so-called singular causal explanation, advanced by Michael Scriven (1962) as a counterexample to the claim that the DN model describes necessary conditions for successful explanation. According to Scriven, (3) explains the tipping over of the inkwell even though no law or generalization figures explicitly in (3) and (3) appears to consist of a single sentence, rather than a deductive argument. Hempel’s response (1965b: 360ff) is that the occurrence of “caused” in (3) should not be left unanalyzed or taken as explanatory just as it stands. Instead (3) should be understood as “implicitly” or “tacitly” claiming there is a “law” or regularity linking knee impacts to tipping over of inkwells. According to Hempel, it is the implicit claim that some such law holds that “distinguishes” (3) from “a mere sequential narrative” in which the spilling is said to follow the impact but without any claim of causal connection—a narrative that (Hempel thinks) would clearly not be explanatory. This linking law is the nomological premise in the DN argument that, according to Hempel, is “implicitly” asserted by (3).
The basic idea is thus that a proper explication of the role of causal claims in explanation leads via a Humean or regularity theory of causation, to the conclusion that, at least ideally, explanations should satisfy the DN/IS model. Let us call this line of argument the “hidden structure” argument in recognition of the role it assigns to a hidden (or at least non-explicit) DN structure that is claimed to be associated with (3).
At this point a comment is in order regarding a feature of this proposal that may seem puzzling. The boundaries of the category “scientific explanation” are far from clear, but while (3) is arguably an explanation, it is not what one usually thinks of as “science”; instead it is a claim from “ordinary life” or “common sense”. This raises the question of why adherents of the DN/IS model don’t simply respond to the alleged counterexample (3) by denying that it is an instance of the category “scientific explanation”, that is, by claiming that the DN/IS model is not an attempt to reconstruct the structure of explanations like (3) but is rather only meant to apply to explanations that are properly regarded as “scientific”. The fact that this response is not often adopted by advocates of the DN model is an indication of the extent to which, as noted in Section 1, it is implicitly assumed in most discussions of scientific explanation that there are important similarities or continuities in structure between explanations like (3) and explanations that are more obviously scientific, and that these similarities should be captured by some common account that applies to both. Indeed, it is a striking feature not just of Hempel (1965b) but of many other treatments of scientific explanation that much of the discussion in fact focuses on “ordinary life” singular causal explanations similar to (3), the tacit assumption being that conclusions about the structure of such explanations have fairly direct implications for understanding explanation in science.
2.5 Explanatory Understanding and Nomic Expectability: Counterexamples to Sufficiency
As explained above, examples like (3) are potential counterexamples to the claim that the DN model provides necessary conditions for explanation. There are also a number of well-known counterexamples to the claim that the DN model provides sufficient conditions for successful scientific explanation. Here are two illustrations.
Explanatory Asymmetries. There are many cases in which a derivation of an explanandum E from a law L and initial conditions I seems explanatory but a “backward” derivation of I from E and the same law L does not seem explanatory, even though the latter, like the former, appears to meet the criteria for successful DN explanation. For example, one can derive the length s of the shadow cast by a flagpole from the height h of the pole and the angle θ of the sun above the horizon and laws about the rectilinear propagation of light. This derivation meets the DN criteria and seems explanatory. On the other hand, the following derivation from the same laws also meets the DN criteria but does not seem explanatory:
- (4) One can derive the height h of a flagpole from the length s of its shadow and the angle θ of the sun above the horizon.
Examples like this suggest that at least some explanations possess directional or asymmetric features to which the DN model is insensitive.
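To make the symmetry concrete, here is a minimal worked version of the two derivations. It assumes a vertical pole on level ground, an idealization the example leaves implicit, and the numerical values are hypothetical. Both derivations rest on the same trigonometric relation between \(h\), \(s\) and \(\theta\):
\[ \tan\theta = \frac{h}{s}, \qquad\text{so}\qquad s = \frac{h}{\tan\theta} \quad\text{and}\quad h = s\tan\theta. \]
With \(h = 10\) m and \(\theta = 45^{\circ}\), the first equation gives \(s = 10\) m; fed back into the second, \(s = 10\) m and \(\theta = 45^{\circ}\) give \(h = 10\) m. The two derivations are equally valid deductions from the same law and differ only in which quantity is treated as given, which is why the DN criteria alone do not separate them.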
Explanatory Irrelevancies. A derivation can satisfy the DN criteria and yet be a defective explanation because it contains irrelevancies besides those associated with the directional features of explanation. Consider an example due to Wesley Salmon (1971a: 34):
- (5)
  - (L) All males who take birth control pills regularly fail to get pregnant
  - (K) John Jones is a male who has been taking birth control pills regularly
  - (E) John Jones fails to get pregnant
It is arguable that (L) meets the criteria for lawfulness imposed by Hempel and many other writers. (If one wants to deny that (L) is a law one needs some principled, generally accepted basis for this judgment and, as explained above, it is unclear what this basis is.) Moreover, (5) is certainly a sound deductive argument in which (L) occurs as an essential premise. Nonetheless, most people judge that (L) and (K) are no explanation of (E). There are many other similar illustrations. For example (Kyburg 1965), it is presumably a law (or at least an exceptionless, counterfactual supporting generalization) that all samples of table salt that have been hexed by being touched with the wand of a witch dissolve when placed in water. One may use this generalization as a premise in a DN derivation which has as its conclusion that some particular hexed sample of salt has dissolved in water. But again the hexing is irrelevant to the dissolving and such a derivation is no explanation.
One obvious diagnosis of the difficulties posed by examples like (4) and (5) focuses on the role of causation in explanation. According to this analysis, to explain an outcome we must cite its causes and (4) and (5) fail to do this. As Salmon (1989: 47) puts it,
a flagpole of a certain height causes a shadow of a given length and thereby explains the length of the shadow.
By contrast,
the shadow does not cause the flagpole and consequently cannot explain its height.
Similarly, taking birth control pills does not cause Jones’s failure to get pregnant and this is why (5) fails to be an acceptable explanation. On this analysis, what (4) and (5) show is that a derivation can satisfy the DN criteria and yet fail to identify the causes of an explanandum; when this happens the derivation will fail to be explanatory.
As explained above, advocates of the DN model would not regard this diagnosis as very illuminating, unless accompanied by some account of causation that does not simply take this notion as primitive. (Salmon in fact provides such an account, which we will consider in Section 4.) We should note, however, that an apparent lesson of (4) and (5) is that the regularity account of causation favored by DN theorists is at best incomplete: the occurrence of c, e, and the existence of some regularity or law linking them (or x’s having property P and x’s having property Q and some law linking these) is not a sufficient condition for the truth of the claim that c caused e or that x’s having P is causally or explanatorily relevant to x’s having Q. More generally, if the counterexamples (4) and (5) are accepted, it follows that the DN model fails to state sufficient conditions for explanation. Explaining an outcome isn’t just a matter of showing that it is nomically expectable.
There are two possible reactions one might have to this observation. One is that the idea that explanation is a matter of nomic expectability is correct as far as it goes, but that something more is required as well. According to this assessment, the DN/IS model does state a necessary condition for successful explanation and, moreover, a condition that is a non-redundant part of a set of conditions that are jointly sufficient for explanation. However, some other, independent feature, X (which will account for the directional features of explanation and ensure the kind of explanatory relevance that is apparently missing in the birth control example) must be added to the DN model to achieve a successful account of explanation. The idea is thus that Nomic Expectability + X = Explanation. Something like this idea is endorsed by the unificationist models of explanation developed by Friedman (1974) and Kitcher (1989), which are discussed in Section 5 below.
A second, more radical possible conclusion is that the DN account of the goal or rationale of explanation is mistaken in some much more fundamental way and that the DN model does not even state necessary conditions for successful explanation. As noted above, unless the hidden structure argument is accepted, this conclusion is strongly suggested by examples like (3) (“The impact of my knee caused the tipping over of the inkwell”) which appear to involve explanation without the explicit citing of a law or a deductive structure.
Suggested Readings. The most authoritative and comprehensive statement of the DN and IS models is probably Hempel (1965b). This is reprinted in Hempel 1965a, along with a number of other papers that touch on various aspects of the problem of scientific explanation. In addition to the references cited in this section, Salmon (1989: 46ff.) describes a number of well-known counterexamples to the DN/IS models and discusses their significance.
3. The SR Model
3.1 The Basic Idea
Much of the subsequent literature on explanation has been motivated by attempts to capture the features of causal or explanatory relevance that appear to be left out of examples like (4) and (5), typically within the empiricist constraints described above. Wesley Salmon’s statistical relevance (or SR) model (Salmon 1971a) is a very influential attempt to capture these features in terms of the notion of statistical relevance or conditional dependence relationships. Given some class or population \(A\), an attribute \(C\) will be statistically relevant to another attribute \(B\) if and only if \(P(B\pmid A.C) \ne P(B\pmid A)\)—that is, if and only if the probability of \(B\) conditional on \(A\) and \(C\) is different from the probability of \(B\) conditional on \(A\) alone. The intuition underlying the SR model is that statistically relevant properties (or information about statistically relevant relationships) are explanatory and statistically irrelevant properties are not. In other words, the notion of a property making a difference for an explanandum is unpacked in terms of statistical relevance relationships.
As an illustration, suppose that in the birth control pills example (5) the original population T includes both sexes. Then
\[\begin{align} P(\text{Pregnancy} \pmid T.&\text{Male.Takes birth control pills})\\ & {} = P(\text{Pregnancy} \pmid T.\text{Male})\\ & {} = 0 \end{align} \]
while
\[ \begin{align} P(\text{Pregnancy} \pmid T.&\text{Female.Takes birth control pills})\\ &{} \ne P(\text{Pregnancy} \pmid T.\text{Female}) \end{align} \]
assuming that not all women in the population take birth control pills. In other words, if you are a male in this population, taking birth control pills is statistically irrelevant to whether you become pregnant, while if you are a female it is relevant. Thus taking birth control pills is explanatorily irrelevant to pregnancy among males but not among females.
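The relevance test in this example can be sketched in a few lines of Python. This is a minimal illustration, not part of Salmon's presentation: the population counts, the record layout, and the function names `prob_pregnant` and `statistically_relevant` are all assumptions made up for the sketch.

```python
from fractions import Fraction

# Hypothetical toy population T. Each record is (sex, takes_pills, pregnant).
# The counts are invented purely for illustration.
POPULATION = (
    [("male", True, False)] * 10
    + [("male", False, False)] * 40
    + [("female", True, False)] * 19
    + [("female", True, True)] * 1
    + [("female", False, False)] * 20
    + [("female", False, True)] * 10
)


def prob_pregnant(records, sex, takes_pills=None):
    """Relative frequency of pregnancy within T.sex, or within
    T.sex.pills when takes_pills is given (True or False)."""
    cell = [r for r in records
            if r[0] == sex and (takes_pills is None or r[1] == takes_pills)]
    return Fraction(sum(r[2] for r in cell), len(cell))


def statistically_relevant(records, sex):
    """Taking pills is relevant to pregnancy within T.sex
    if and only if P(B | A.C) differs from P(B | A)."""
    return prob_pregnant(records, sex, True) != prob_pregnant(records, sex)


for sex in ("male", "female"):
    print(sex,
          prob_pregnant(POPULATION, sex, True),
          prob_pregnant(POPULATION, sex),
          statistically_relevant(POPULATION, sex))
```

Exact fractions are used so that the equality test is not disturbed by floating-point rounding. In this toy population the two probabilities coincide for males and differ for females, matching the pattern described in the text.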
To characterize the SR model more precisely we need the notion of a homogeneous partition. A homogeneous partition of \(A\) is a set of subclasses or cells \(C_i\) of \(A\) that are mutually exclusive and exhaustive, where \(P(B\pmid A.C_i) \ne P(B\pmid A.C_j)\) for all \(C_i \ne C_j\) and where no further statistically relevant partition of any of the cells \(A.C_i\) can be made with respect to \(B\); that is, there are no additional attributes \(D_k\) in \(A\) such that
\[P(B\pmid A.C_i) \ne P(B\pmid A.C_i.D_k).\]
On the SR model, an explanation of why some member x of the class characterized by attribute \(A\) has attribute \(B\) consists of the following information:
- (i) The prior probability of \(B\) within \(A\): \(P(B\pmid A) = p\).
- (ii) A homogeneous partition of \(A\) with respect to \(B\), \((A.C_1,\ldots ,A.C_n)\), together with the probability of \(B\) within each cell of the partition: \(P(B\pmid A.C_i) = p_i\), and
- (iii) The cell of the partition to which x belongs.
To employ one of Salmon’s examples, suppose we want to construct an SR explanation of why x, who has a strep infection (\(S\)), recovers quickly (\(Q\)). Let \(T\) (\({-T}\)) indicate that x is (is not) treated with penicillin, and \(R\) (\({-R}\)) that x has (does not have) a penicillin-resistant strain. Assume for the sake of argument that no other factors are relevant to quick recovery. There are four possible combinations of these properties: \(T.R,\) \(-T.R,\) \(T.{-R},\) \({-T}.{-R},\) but let us assume that
\[ \begin{align} P(Q\pmid S.T.R) & = P(Q\pmid S.{-T}.R)\\ & = P(Q\pmid S.{-T}.{-R})\\ & \ne P(Q\pmid S.{T}.{-R})\\ \end{align} \]That is, the probability of quick recovery, given that one has strep, is the same for those who have the resistant strain regardless of whether or not they are treated and also the same for those who have not been treated. By contrast, the probability of recovery is different (presumably greater) among those with strep who have been treated and do not have the resistant strain.
In this case
\[[S.(T.R \lor {-T}.R \lor {-R}.{-T})], [S.T.{-R}]\]is a homogeneous partition of S with respect to Q. The SR explanation of x’s recovery will consist of a statement of the probability of quick recovery among all those with strep ((i) above), a statement of the probability of recovery in each of the two cells of the above partition ((ii) above), and the cell to which x belongs, which is \(S.T.R\) ((iii) above). Intuitively, the idea is that this information tells us about the relevance of each of the possible combinations of the properties T and R to quick recovery among those with strep and is explanatory for just this reason.
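The construction of the partition in the strep example can be sketched in Python. This is a minimal illustration under stated assumptions: the probability values are invented (only their pattern of equalities follows the text), the names `coarsest_partition` and `cell_of` are hypothetical, and the sketch takes for granted, as the text does, that no further factors are relevant to quick recovery.

```python
from fractions import Fraction
from collections import defaultdict

# Assumed probabilities of quick recovery Q among strep patients S, indexed by
# (treated, resistant). The numbers are invented; only the pattern matters:
# three combinations share one value and the treated, non-resistant one differs.
P_QUICK = {
    (True, True): Fraction(1, 10),    # S.T.R
    (False, True): Fraction(1, 10),   # S.-T.R
    (False, False): Fraction(1, 10),  # S.-T.-R
    (True, False): Fraction(9, 10),   # S.T.-R
}

def coarsest_partition(probs):
    """Merge combinations with equal probability so that distinct cells differ."""
    cells = defaultdict(list)
    for combo, p in probs.items():
        cells[p].append(combo)
    return {frozenset(combos): p for p, combos in cells.items()}

def cell_of(partition, combo):
    """Return the cell, and its probability, to which an individual belongs."""
    for cell, p in partition.items():
        if combo in cell:
            return cell, p
    raise KeyError(combo)

partition = coarsest_partition(P_QUICK)
cell, p = cell_of(partition, (True, True))  # an x who is treated and resistant
print(len(partition), sorted(cell), p)
```

Under these assumptions the partition has two cells, and an individual with \(S.T.R\) falls in the merged cell shared with \(S.{-T}.R\) and \(S.{-T}.{-R}\). The prior probability in item (i) would additionally require the relative sizes of the cells, which the sketch does not model.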
3.2 The SR Model and Low Probability Events
The SR model has a number of distinctive features that have generated substantial discussion. First, note that according to the SR model, and in contrast to the DN/IS model, an explanation is not an argument—either in the sense of a deductively valid argument in which the explanandum follows as a conclusion from the explanans or in the sense of a so-called inductive argument in which the explanandum follows with high probability from the explanans, as in the case of IS explanation. Instead, an explanation is an assembly of information that is statistically relevant to an explanandum. Salmon argues (and takes the birth control example (5) to illustrate) that the criteria that a good argument must satisfy (e.g., criteria that insure deductive soundness or some inductive analogue) are simply different from those a good explanation must satisfy. Among other things, as Salmon puts it, “irrelevancies [are] harmless in arguments but fatal in explanations” (1989: 102). As explained above, in associating successful explanation with the provision of information about statistical relevance relationships, the SR model attempts to accommodate this observation.
A second, closely related point is that the SR model departs from the IS model in abandoning the idea that a statistical explanation of an outcome must provide information from which it follows that the outcome occurred with high probability. As the reader may check, the statement of the SR model above imposes no such high probability requirement; instead, even very unlikely outcomes will be explained as long as the criteria for SR explanation are met. Suppose that, in the above example, the probability of quick recovery from strep, given treatment and the presence of a non-resistant strain, is rather low (e.g., 0.2). Nonetheless, if the criteria (i)–(iii) above (a homogeneous partition with correct probability values for each cell in the partition) are satisfied, we may use this information to explain why x, who had a non-resistant strain of strep and was treated, recovered quickly. Indeed, according to the SR model, we may explain why some x which is A is B, even if the conditional probability of B given A and the cell \(C_i\) to which x belongs \((p_i = P(B\pmid A.C_i))\) is less than the prior probability \((p = P(B\pmid A))\) of B in A. For example, if the prior probability of quick recovery among all those with any form of strep is 0.5 and the probability of quick recovery of those with a resistant strain who are untreated is 0.1, we may nonetheless explain why y, who meets these last conditions \(({-T}.R),\) recovered quickly (assuming he did) by citing the cell to which he belongs, the probability of recovery given that he falls in this cell, and the other sort of information described above. More generally, what matters on the SR model is not whether the value of the probability of the explanandum-outcome is high or low (or even high or low in comparison with its prior probability) but rather whether the putative explanans cites all and only statistically relevant factors and whether the probabilities it invokes are correct.
One consequence of this, which Salmon endorses while acknowledging that many will regard it as unintuitive, is that on the SR model, the same explanans E may explain both an explanandum M and explananda that are inconsistent with M, such as \(-M\). For example, the same explanans will explain both why a subject with strep and certain other properties (e.g., T and \(-R\)) recovers quickly, if he does, and also why he does not recover if he does not. By contrast, on the DN or IS models, if E explains M, E cannot also explain \(-M\).
This judgment that, contrary to the IS model, the value that a candidate explanans assigns to an explanandum-outcome should not matter for the goodness of the explanation, is motivated as follows: When an outcome is the result of a genuinely indeterministic process we understand both high probability and low probability outcomes (the latter of which of course will sometimes occur) equally well: in both cases, once an SR model has been constructed, there are no additional factors that distinguish the two outcomes.
3.3 What Do Statistical Theories Explain? What Sorts of Examples are Accounts of Statistical Explanation Intended to Capture?
Stepping back from the issues (such as the status of the high probability requirement) that have dominated discussions of statistical explanation, there are several more general issues that deserve mention. One is that these models have been applied to a range of examples that seem prima facie to be quite different, including quantum mechanical examples (e.g., radioactive decay) but also more ordinary examples such as recovery from disease and (to use an example of Salmon’s) causes of juvenile delinquency. Radioactive decay is a process that is usually taken to be irreducibly indeterministic and hence is the sort of thing that Salmon’s objective homogeneity requirement is designed to capture. By contrast, although the evidence for many models of juvenile delinquency comes from population-level statistics, these models do not assume that delinquency is the outcome of an irreducibly indeterministic process (and the models themselves are very far from satisfying an objective homogeneity requirement). Indeed, taken literally, standard causal models assume the opposite: the assumed equations are deterministic with the stochastic element supplied by an “error term”. This raises the question of whether it is sensible to look for a single model that captures all of these examples.
A second, more radical assessment focuses on the question of whether it is appropriate to think of the sorts of statistical theories and hypotheses that Hempel and Salmon discuss as explaining individual events or outcomes at all. For example, why not instead take quantum mechanics to explain (i) the probabilities with which individual outcomes like decay events occur but not (ii) those individual outcomes themselves? If we adopt (i), the relationship between a quantum mechanical model and such explananda will be deductive and thus subsumable under whatever model of deductive explanation we favor. This raises the question of what additional work is accomplished by models of statistical explanation of either the IS or SR sort.
3.4 Causation and Statistical Relevance Relationships
Putting aside the issues raised in the previous section, the SR model embodies several generic assumptions of ongoing philosophical interest. In particular the model assumes that (i) explanations must cite causal relationships and that (ii) causal relationships are fully captured by statistical relevance (or conditional dependence and independence) relationships. While (i) is a matter of current controversy, (ii) is clearly false. As a substantial body of work[5] has made clear, causal relationships are greatly underdetermined by statistical relevance relationships, even given additional assumptions. For example, a structure in which B is a common cause of the joint effects A and C implies (assuming the Causal Markov assumption, which is the appropriate generalization of Salmon’s assumptions connecting causation and probability) the same statistical relevance relations as a chain structure in which A causes B which causes C.
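The underdetermination claim can be checked on a small case. The Python sketch below builds exact joint distributions over three binary variables for a chain (A causes B causes C) and for a common-cause structure (B causes both A and C). All parameter values and function names are assumptions chosen for illustration; the factorizations are the ones the Causal Markov assumption licenses for each structure.

```python
from fractions import Fraction as F
from itertools import product

def bern(p, value):
    """Probability that a binary variable with P(1) = p takes `value`."""
    return p if value else 1 - p

def chain(p_a, p_b_given_a, p_c_given_b):
    """Joint distribution over (a, b, c) for the chain A -> B -> C."""
    return {
        (a, b, c): bern(p_a, a) * bern(p_b_given_a[a], b) * bern(p_c_given_b[b], c)
        for a, b, c in product((0, 1), repeat=3)
    }

def common_cause(p_b, p_a_given_b, p_c_given_b):
    """Joint distribution over (a, b, c) for the fork A <- B -> C."""
    return {
        (a, b, c): bern(p_b, b) * bern(p_a_given_b[b], a) * bern(p_c_given_b[b], c)
        for a, b, c in product((0, 1), repeat=3)
    }

def prob(dist, event):
    return sum(p for outcome, p in dist.items() if event(outcome))

def cond(dist, event, given):
    return prob(dist, lambda o: event(o) and given(o)) / prob(dist, given)

def a_relevant_to_c(dist):
    """Is P(C | A) different from P(C)?"""
    c_true = lambda o: o[2] == 1
    return cond(dist, c_true, lambda o: o[0] == 1) != prob(dist, c_true)

def a_relevant_to_c_given_b(dist):
    """Is P(C | A.B) different from P(C | B) for some value of B?"""
    c_true = lambda o: o[2] == 1
    return any(
        cond(dist, c_true, lambda o, b=b: o[0] == 1 and o[1] == b)
        != cond(dist, c_true, lambda o, b=b: o[1] == b)
        for b in (0, 1)
    )

# Invented parameters; the same conditional tables are reused in both structures.
STRENGTH = {0: F(1, 5), 1: F(4, 5)}
C_GIVEN_B = {0: F(3, 10), 1: F(7, 10)}
CHAIN = chain(F(1, 2), STRENGTH, C_GIVEN_B)
FORK = common_cause(F(1, 2), STRENGTH, C_GIVEN_B)

for name, dist in (("chain", CHAIN), ("fork", FORK)):
    print(name, a_relevant_to_c(dist), a_relevant_to_c_given_b(dist))
```

In both structures A is statistically relevant to C unconditionally and irrelevant once B is held fixed. With these particular parameters the two joint distributions are in fact identical, so no statistical relevance relation could distinguish the two causal hypotheses.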
Selected Readings. Salmon (1971a) provides a detailed statement and defense of the SR model. This essay, as well as papers by Jeffrey (1969) and Greeno (1970) which defend views broadly similar to the SR model, are collected in Salmon (1971b). Additional discussion of the model as well as a more recent characterization of “objective homogeneity” can be found in Salmon (1984). Cartwright (1979) contains some influential criticisms of the SR model. Theorems specifying the precise extent of the underdetermination of causal claims by evidence about statistical relevance relationships can be found in Spirtes, Glymour and Scheines (1993 [2000: chapter 4]). For additional discussion of “screening off” and the principle of the common cause, see the entry on Reichenbach’s Principle of the Common Cause.
4. The Causal Mechanical Model
4.1 The Basic Idea
In more recent work (especially, Salmon 1984) Salmon abandoned the attempt to characterize explanation or causal relationships in purely statistical terms. Instead, he developed a new account which he called the Causal Mechanical (CM) model of explanation—an account which is similar in both content and spirit to so-called causal process theories of causation of the sort defended by philosophers like Philip Dowe (2000). We may think of the CM model as an attempt to capture the “something more” involved in causal and explanatory relationships over and above facts about statistical relevance, again while attempting to remain within a broadly Humean framework.
The CM model employs several central ideas. A causal process is a physical process, like the movement of a baseball through space, that is characterized by the ability to transmit a mark in a continuous way. (“Continuous” generally, although perhaps not always, means “spatio-temporally continuous”.) Intuitively, a mark is some local modification to the structure of a process, for example a scuff on the surface of a baseball or a dent in an automobile fender. A process is capable of transmitting a mark if, once the mark is introduced at one spatio-temporal location, it will persist to other spatio-temporal locations even in the absence of any further interaction. In this sense the baseball will transmit the scuff mark from one location to another. Similarly, a moving automobile is a causal process because a mark in the form of a dent in a fender will be transmitted by this process from one spatio-temporal location to another. Causal processes contrast with pseudo-processes which lack the ability to transmit marks. An example is the shadow of a moving physical object. The intuitive idea is that, if we try to mark the shadow by modifying its shape at one point (for example, by altering a light source or introducing a second occluding object), this modification will not persist unless we continually intervene to maintain it as the shadow occupies successive spatio-temporal positions. In other words, the modification will not be transmitted by the structure of the shadow itself, as it would in the case of a genuine causal process.
We should note for future reference that, as characterized by Salmon, the ability to transmit a mark is clearly a counterfactual notion, in several senses. To begin with, a process may be a causal process even if it does not in fact transmit any mark, as long as it is true that if it were appropriately marked, it would transmit the mark. Moreover, the notion of marking itself involves a counterfactual contrast—a contrast between how a process behaves when marked and how it would behave if left unmarked. Although Salmon, like Hempel, has always been suspicious of counterfactuals, his view at the time that he first introduced the CM model was that the counterfactuals involved in the characterization of mark transmission were relatively unproblematic, in part because they seemed experimentally testable in a fairly direct way. Nonetheless the reliance of the CM model, as originally formulated, on counterfactuals shows that it does not completely satisfy the Humean strictures described above. In subsequent work, described in Section 4.4 below, Salmon attempted to construct a version of the CM model that completely avoids reliance on counterfactuals.
The other major element in Salmon’s model is the notion of a causal interaction. A causal interaction involves a spatio-temporal intersection between two causal processes which modifies the structure of both—each process comes to have features it would not have had in the absence of the interaction. A collision between two cars that dents both is a paradigmatic causal interaction.
According to the CM model, an explanation of some event E will trace the causal processes and interactions leading up to E (Salmon calls this the etiological aspect of the explanation), or at least some portion of these, as well as describing the processes and interactions that make up the event itself (the constitutive aspect of explanation). In this way, the explanation shows how E “fit[s] into a causal nexus” (1984: 9). For example, when two billiard balls collide (event E), the trajectory of each of the balls is a causal process (as shown by the fact that if the balls were scratched, such marks would persist) and their collision is a causal interaction. Explaining E will involve both tracing these trajectories and noting that E involves an interaction.
4.2 The CM Model and Explanatory Relevance
As the billiard example illustrates, the CM model takes as its paradigms of causal interaction examples such as collisions in which there is “action by contact” and no spatio-temporal gaps in the transmission of causal influence. There is little doubt that explanations in which there are no such gaps (no “action at a distance”) often strike us as particularly satisfying. However, as Christopher Hitchcock shows in an illuminating paper (Hitchcock 1995), even here the CM model leaves out something important. Consider the usual elementary textbook “scientific explanation” of the motion of the balls in the above example following their collision. This explanation proceeds by deriving that motion from information about their masses and velocity before the collision, the assumption that the collision is perfectly elastic, and the law of the conservation of linear momentum. We usually think of the information conveyed by this derivation as showing that it is the mass and velocity of the balls, rather than, say, the scratches on their surface or chalk that may be transmitted by the cue stick that is explanatorily relevant to their subsequent motion. However, it is hard to see what in the CM model allows us to pick out the linear momentum of the balls, as opposed to these other features, as explanatorily relevant. Part of the difficulty is that to express such relatively fine-grained judgments of explanatory relevance (that it is linear momentum rather than chalk marks that matters) we need to talk about relationships between properties or magnitudes and it is not clear how to express such judgments purely in terms of facts about causal processes and interactions. Both the linear momentum and the chalk mark communicated to the cue ball by the cue stick are marks transmitted by the spatio-temporally continuous causal process consisting of the motion of the cue ball.
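The textbook derivation mentioned above can be sketched in a few lines of Python. The sketch assumes a one-dimensional, head-on, perfectly elastic collision, which is a simplification of the billiard case; the function name `elastic_collision_1d` and the numerical values are hypothetical. The formulas follow from conservation of linear momentum together with conservation of kinetic energy.

```python
def elastic_collision_1d(m1, v1, m2, v2):
    """Velocities after a perfectly elastic head-on collision.

    Derived from conservation of linear momentum and of kinetic energy.
    The only inputs are masses and pre-collision velocities.
    """
    total = m1 + m2
    u1 = ((m1 - m2) * v1 + 2 * m2 * v2) / total
    u2 = ((m2 - m1) * v2 + 2 * m1 * v1) / total
    return u1, u2

# Assumed values: two balls of equal mass, one initially at rest.
cue_after, object_after = elastic_collision_1d(0.17, 2.0, 0.17, 0.0)
print(cue_after, object_after)
```

The point of interest for the relevance problem is the function's signature: masses and velocities are the only arguments, and nothing about scratches or chalk marks enters the derivation, even though those too are marks transmitted by the same causal process.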
Ironically, as Hitchcock goes on to note, a similar observation may be made about the birth control pills example (5) originally devised by Salmon to illustrate the failure of the DN model to capture the notion of explanatory relevance. Spatio-temporally continuous causal processes that transmit marks as well as causal interactions are at work when male Mr. Jones ingests birth control pills—the pills dissolve, components enter his bloodstream, are metabolized or processed in some way, and so on. Similarly, spatio-temporally continuous causal processes (albeit different processes) are at work when female Ms. Jones takes birth control pills. However, the pills are irrelevant to Mr. Jones’s non-pregnancy, and relevant to Ms. Jones’s non-pregnancy. Again, it looks as though the relevance or irrelevance of the birth control pills to Mr. or Ms. Jones’s failure to become pregnant cannot be captured just by asking whether the processes leading up to these outcomes are causal processes in Salmon’s sense.
A more general way of putting the problem revealed by these examples is that those features of a process P in virtue of which it qualifies as a causal process (ability to transmit mark M) may not be the features of P that are causally or explanatorily relevant to the outcome E that we want to explain (M may be irrelevant to E with some other property R of P being the property which is causally relevant to E). So while mark transmission may well be a criterion that correctly distinguishes between causal processes and pseudo-processes, it does not, as it stands, provide the resources for distinguishing those features or properties of a causal process that are causally or explanatorily relevant to an outcome and those features that are irrelevant.
4.3 The CM Model and Complex Systems
A second set of worries has to do with the application of the CM model to systems which depart in various respects from simple physical paradigms such as the collision described above. There are a number of examples of such systems. First, there are theories like Newtonian gravitational theory which involve “action at a distance” in a physically interesting sense. Second, there are a number of examples from the literature on causation that do not involve physically interesting forms of action at a distance but which arguably involve causal interactions without intervening spatio-temporally continuous processes or transfer of energy and momentum from cause to effect. These include cases of causation by omission and causation by “double prevention” or “disconnection.”[6] In all these cases, a literal application of the CM model seems to yield the judgment that no explanation has been provided—that Newtonian gravitational theory is unexplanatory and so on. Many philosophers have been reluctant to accept this assessment.
Yet another class of examples that raise problems for the CM model involves putative explanations of the behavior of complex or “higher level” systems—explanations that do not explicitly cite spatio-temporally continuous causal processes involving transfer of energy and momentum, even though we may think that such processes are at work at a more “underlying” level. Most explanations in disciplines like biology, psychology and economics fall under this description, as do a number of straightforwardly physical explanations.
As an illustration, suppose that a mole of gas is confined to a container of volume \(V_1,\) at pressure \(P_1,\) and temperature \(T_1\). The gas is then allowed to expand isothermally into a larger container of volume \(V_2\). One standard way of explaining the behavior of the gas—its rate of diffusion and its subsequent equilibrium pressure \(P_2\)—appeals to the generalizations of phenomenological thermodynamics—e.g., the ideal gas law, Graham’s law of diffusion, and so on. Salmon appears to regard putative explanations based on at least the first of these generalizations as not explanatory because they do not trace continuous causal processes—he thinks of the individual molecules as causal processes but not the gas as a whole.[7] However, it is plainly impossible to trace the causal processes and interactions represented by each of the \(6 \times 10^{23}\) molecules making up the gas and the successive interactions (collisions) it undergoes with every other molecule. Even the usual statistical mechanical treatment, which Salmon presumably would regard as explanatory, does not attempt to do this. Instead, it makes certain general assumptions about the distribution of molecular velocities and the forces involved in molecular collisions and then uses these, in conjunction with the laws of mechanics, to derive and solve a differential equation (the Boltzmann transport equation) describing the overall behavior of the gas. This treatment abstracts radically from the details of the causal processes involving particular individual molecules and instead focuses on identifying higher level variables that aggregate over many individual causal processes and that figure in general patterns that govern the behavior of the gas.
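As a worked instance of the phenomenological derivation, assume that the gas is ideal and that the amount of gas \(n\) and the temperature \(T_1\) stay fixed during the expansion, with \(R\) the gas constant (these symbols are introduced here for illustration). The ideal gas law then gives the equilibrium pressure directly:

\[ P_1 V_1 = nRT_1 = P_2 V_2, \quad\text{so}\quad P_2 = P_1\,\frac{V_1}{V_2}. \]

Nothing in this derivation refers to the trajectory of any individual molecule.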
This example raises a number of questions. Just what does the CM model require in the case of complex systems in which we cannot trace individual causal processes, at least at a fine-grained level? How exactly does the causal mechanical model avoid the (disastrous) conclusion that any successful explanation of the behavior of the gas must trace the trajectories of individual molecules? Does the statistical mechanical explanation described above successfully trace causal processes and interactions or specify a causal mechanism in the sense demanded by the CM model, and if so, what exactly does tracing causal processes and interactions involve or amount to in connection with such a system? A fully adequate development of the CM model needs to address such questions.
There is another aspect of this example that is worthy of comment. Suppose that a particular sample of gas expands in a way that meets the conditions described above and that it is somehow possible to provide an account that traces each of the individual molecular trajectories of its component molecules. Such an account would nonetheless leave out information that seems explanatorily relevant. This information has to do with the fact that there are a large number of alternative trajectories besides the actual trajectories followed by the component molecules on this particular occasion that would lead to the same final pressure \(P_2\) which is what we want to explain. Modal information of this sort (both about what would happen if the molecules followed different trajectories consistent with the initial conditions \(P_1\), \(V_1\) and \(T_1\) and about what would happen if they instead followed trajectories consistent with different macroscopic initial conditions) is better captured by a treatment in terms of upper level thermodynamic variables.
4.4 More Recent Developments
In more recent work (e.g., Salmon 1994), prompted in part by a desire to avoid certain counterexamples advanced by Philip Kitcher (1989) to his characterization of mark transmission, Salmon attempted to fashion a theory of causal explanation that completely avoids any appeal to counterfactuals. In this new theory, which is influenced by the conserved quantity theory of causation of Dowe (2000), Salmon defined a causal process as a process that transmits a non-zero amount of a conserved quantity at each moment in its history. Conserved quantities are quantities so characterized in physics: linear momentum, angular momentum, charge, and so on. A causal interaction is an intersection of world lines associated with causal processes involving exchange of a conserved quantity. Finally, a process transmits a conserved quantity from A to B if it possesses that quantity at every stage without any interactions that involve an exchange of that quantity in the half-open interval \((A,B]\).[8]
One may doubt that this new theory really avoids reliance on counterfactuals, but an even more fundamental difficulty is that it still does not adequately deal with the problem of causal or explanatory relevance described above. That is, we still face the problem that the feature that makes a process causal (transmission of some conserved quantity or other) may tell us little about which features of the process are causally or explanatorily relevant to the outcome we want to explain. For example, a moving billiard ball will transmit many conserved quantities (linear momentum, angular momentum, charge etc.) and many of these may be exchanged during a collision with another ball. What is it that entitles us to single out the linear momentum of the balls, rather than these other conserved quantities as the property that is causally relevant to their subsequent motion? In cases in which there appear to be no conservation laws governing the explanatorily relevant property (i.e., cases in which the explanatorily relevant variables are not conserved quantities) this difficulty seems even more acute. Properties like “having ingested birth control pills,” “being pregnant”, or “being a sample of hexed salt” do not themselves figure in conservation laws. While one may say that both birth control pills and hexed salt are causal processes because both consist, at some underlying level, of processes that unambiguously involve the transmission of conserved quantities like mass and charge, this observation does not by itself tell us what, if anything, about these underlying processes is relevant to pregnancy or dissolution in water.
In a more recent paper (Salmon 1997), Salmon conceded this point. He agreed that the notion of a causal process cannot by itself capture the notion of causal and explanatory relevance. He suggested, however, that this notion can be adequately captured by appealing to the notion of a causal process and information about statistical relevance relationships (that is, information about conditional and unconditional (in)dependence relationships), with the latter capturing the element of causal or explanatory dependence that was missing from his previous account:
I would now say that (1) statistical relevance relations, in the absence of information about connecting causal processes, lack explanatory import and that (2) connecting causal processes, in the absence of statistical relevance relations, also lack explanatory import. (1997: 476)
This suggestion is not developed in any detail in Salmon’s paper, and it is not easy to see how it can be made to work. We noted above that statistical relevance relationships often greatly underdetermine the causal relationships among a set of variables. What reason is there to suppose that appealing to the notion of a causal process, in Salmon’s sense, will always or even usually remove this indeterminacy? We also noted that the notion of a causal process cannot capture fine-grained notions of relevance between properties, that there can be causal relevance between properties instances of which (at least at the level of description at which they are characterized) are not linked by spatio-temporally continuous processes or transference of conserved quantities, and that properties can be so linked without being causally relevant (recall the chalk mark that is transmitted from one billiard ball to another). As long as it is possible (and why should it not be?) for different causal claims to imply the same facts about statistical relevance relationships and for these claims to differ in ways that cannot be fully cashed out in terms of Salmon’s notions of causal processes and interactions, this new proposal will fail as well.
Selected Readings: Salmon (1984) provides a detailed statement of the Causal Mechanical model, as originally formulated. Salmon (1994 and 1997) provide a restatement of the model and respond to criticisms. For discussion and criticism of the CM model, see Kitcher (1989, especially pages 461ff), Woodward (1989), and Hitchcock (1995).
5. A Unificationist Account of Explanation
5.1 The Basic Idea
In unificationist accounts of explanation developed by philosophers, scientific explanation is a matter of providing a unified account of a range of different phenomena.[9] This idea is unquestionably intuitively appealing. Successful unification may exhibit connections or relationships between phenomena previously thought to be unrelated and this seems to be something that we expect good explanations to do. Moreover, theory unification has clearly played an important role in science. Paradigmatic examples include Newton’s unification of terrestrial and celestial theories of motion and Maxwell’s unification of electricity and magnetism. The key question, however, is whether (and which) intuitive notions of unification can be made more precise in a way that allows us to recover the features that we think that good explanations should possess.
Michael Friedman (1974) is an important early attempt to do this. Friedman’s formulation of the unificationist idea was subsequently shown to suffer from various technical problems (Kitcher 1976) and subsequent development of the unificationist treatment of explanation has been most closely associated with Philip Kitcher (especially Kitcher 1989).
Let us begin by introducing some of Kitcher’s technical vocabulary. A schematic sentence is a sentence in which some of the nonlogical vocabulary has been replaced by dummy letters. To use Kitcher’s examples, the sentence “Organisms homozygous for the sickling allele develop sickle cell anemia” is associated with a number of schematic sentences including “Organisms homozygous for A develop P” and “For all X if X is O and A then X is P”. Filling instructions are directions that specify how to fill in the dummy letters in schematic sentences. For example, filling instructions might tell us to replace A with the name of an allele and P with the name of a phenotypic trait in the first of the above schematic sentences. Schematic arguments are sequences of schematic sentences. Classifications describe which sentences in schematic arguments are premises and conclusions and what rules of inference are used. An argument pattern is an ordered triple consisting of a schematic argument, a set or sets of filling instructions, one for each term of the schematic argument, and a classification of the schematic argument. The more restrictions an argument pattern imposes on the arguments that instantiate it, the more stringent it is said to be.
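Kitcher's vocabulary lends itself to a small data-structure sketch. The Python below is an illustration only: it represents a schematic sentence as a template with dummy letters and filling instructions as per-letter restrictions. The `allowed` vocabularies and the function name `instantiate` are invented for the example, and classifications (premises, conclusions, inference rules) are not modeled.

```python
from string import Template

# A schematic sentence: nonlogical vocabulary replaced by dummy letters A and P.
SCHEMATIC_SENTENCE = Template("Organisms homozygous for $A develop $P")

# Hypothetical filling instructions: each dummy letter is paired with the kind
# of expression that may replace it. The tiny vocabularies are invented.
FILLING_INSTRUCTIONS = {
    "A": {"kind": "name of an allele", "allowed": {"the sickling allele"}},
    "P": {"kind": "name of a phenotypic trait", "allowed": {"sickle cell anemia"}},
}

def instantiate(schema, instructions, fillers):
    """Fill the dummy letters, rejecting fillers the instructions do not permit."""
    for letter, rule in instructions.items():
        if fillers[letter] not in rule["allowed"]:
            raise ValueError(f"{fillers[letter]!r} is not a {rule['kind']}")
    return schema.substitute(fillers)

sentence = instantiate(
    SCHEMATIC_SENTENCE,
    FILLING_INSTRUCTIONS,
    {"A": "the sickling allele", "P": "sickle cell anemia"},
)
print(sentence)
```

On this picture, a more stringent pattern corresponds to tighter restrictions on permissible fillers, so that fewer arguments count as instantiating it.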
Roughly speaking, Kitcher’s guiding idea is that explanation is a matter of deriving descriptions of many different phenomena by using as few and as stringent argument patterns as possible over and over again: the fewer the patterns used, the more stringent they are, and the greater the range of different conclusions derived, the more unified our explanations. Kitcher summarizes this view as follows:
Science advances our understanding of nature by showing us how to derive descriptions of many phenomena, using the same pattern of derivation again and again, and in demonstrating this, it teaches us how to reduce the number of facts we have to accept as ultimate. (Kitcher 1989: 432)
Kitcher does not propose a completely general theory of how the various considerations he describes—number of conclusions, number of patterns and stringency of patterns—are to be traded-off against one another, but does suggest that it often will be clear enough what these considerations imply about the evaluation of particular candidate explanations. His basic strategy is to attempt to show that the derivations we regard as good or acceptable explanations are instances of patterns that, taken together, score better according to the criteria just described than the patterns instantiated by what we regard as defective explanations. Following Kitcher, let us define the explanatory store \(E(K)\) as the set of argument patterns that maximally unifies K, the set of beliefs accepted at a particular time in science. Showing that a particular derivation is a good or acceptable explanation is then a matter of showing that it belongs to the explanatory store.
5.2 Illustrations of the Unificationist Model
As an illustration, consider Kitcher’s treatment of the problem of explanatory asymmetries (recall Section 2.5). Our present explanatory practices (call these P) are committed to the idea that derivations of a flagpole’s height from the length of its shadow are not explanatory. Kitcher compares P with an alternative systemization in which such derivations are regarded as explanatory. According to Kitcher, P includes the use of a single “origin and development” (OD) pattern of explanation, according to which the dimensions of objects (artifacts, mountains, stars, organisms, etc.) are traced to “the conditions under which the object originated and the modifications it has subsequently undergone” (1989: 485). Now consider the consequences of adding to P an additional pattern S (the shadow pattern) which permits the derivation of the dimensions of objects from facts about their shadows. Since the OD pattern already permits the derivation of all facts about the dimensions of objects, the addition of the shadow pattern S to P will increase the number of argument patterns in P and will not allow us to derive any new conclusions. On the other hand, if we were to drop OD from P and replace it with the shadow pattern, we would have no net change in the number of patterns in P, but would be able to derive far fewer conclusions than we would with OD, since many objects do not have shadows (or enough shadows) from which to derive all of their dimensions. Thus OD belongs to the explanatory store, and the shadow pattern does not. Kitcher’s treatment of other familiar problem cases is similar. For example, he claims that explanations that contain irrelevancies (such as Salmon’s birth control pills case) are less unifying than competing explanations that do not contain such irrelevancies.
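The comparison in the flagpole case can be made concrete with a small sketch. It is only an illustration under stated assumptions: the representation of a systemization as a mapping from pattern names to the conclusions they yield, the function names, and the toy facts are all hypothetical, and stringency is left out entirely. Since Kitcher gives no general rule for trading the criteria off against one another, the sketch checks only the simple dominance relation that the flagpole argument relies on.

```python
from typing import Dict, Set

# Hypothetical representation: pattern name -> conclusions derivable with it.
Systemization = Dict[str, Set[str]]


def conclusions(s: Systemization) -> Set[str]:
    """All conclusions derivable by at least one pattern in s."""
    return set().union(*s.values())


def dominates(a: Systemization, b: Systemization) -> bool:
    """True if a derives everything b derives using no more patterns,
    and is strictly better on at least one of the two counts.
    (Stringency of patterns is ignored in this toy comparison.)"""
    ca, cb = conclusions(a), conclusions(b)
    no_worse = ca >= cb and len(a) <= len(b)
    strictly_better = ca > cb or len(a) < len(b)
    return no_worse and strictly_better


# Toy facts about dimensions (illustrative only).
dimension_facts = {
    "height of flagpole", "height of tower", "height of mountain",
    "diameter of star", "length of organism",
}
# Assumption: only some objects cast shadows that fix their dimensions.
shadow_facts = {"height of flagpole", "height of tower"}

P = {"OD": dimension_facts}                              # current practice
P_plus_S = {"OD": dimension_facts, "S": shadow_facts}    # add shadow pattern
P_swap_S = {"S": shadow_facts}                           # replace OD with S

print(dominates(P, P_plus_S))  # same conclusions, fewer patterns
print(dominates(P, P_swap_S))  # same number of patterns, more conclusions
```

On these toy inputs both comparisons come out in favor of P, mirroring the two steps of the argument: adding S costs a pattern without yielding new conclusions, and swapping OD for S keeps the pattern count fixed but loses conclusions.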
Kitcher acknowledges that there is nothing in the unificationist account per se that requires that all explanation be deductive: “there is no bar in principle to the use of non-deductive arguments in the systemization of our beliefs”. Nonetheless, “the task of comparing the unifying power of different systemizations looks even more formidable if non-deductive arguments are considered” and in part for this reason Kitcher endorses the view that “in a certain sense, all explanation is deductive” (1989: 448).
What is the role of causation on this account? Kitcher claims that “the ‘because’ of causation is always derivative from the ‘because’ of explanation” (1989: 477). That is, our causal judgments simply reflect the explanatory relationships that fall out of our (or our intellectual ancestors’) attempts to construct unified theories of nature. There is no independent causal order over and above this which our explanations must capture. Like many other philosophers, Kitcher takes very seriously (even if in the end he perhaps does not fully endorse) standard empiricist or Humean worries about the epistemic accessibility and intelligibility of causal claims. On this view, taking causal, counterfactual or other notions belonging to the same family as primitive in the theory of explanation is problematic. Kitcher believes that it is a virtue of his theory that it does not do this. Instead, Kitcher proposes to begin with the notion of explanatory unification, characterized in terms of constraints on deductive systemizations, where these constraints can be specified in a quite general way that is independent of causal or counterfactual notions, and then show how the causal claims we accept derive from our efforts at unification.
5.3 The Illustrations Criticized
As remarked at the beginning of this section, the idea that explanation is connected in some way to unification is intuitively appealing. Nonetheless Kitcher’s particular way of cashing out this connection has been subject to criticism. In connection with Kitcher’s treatment of explanatory asymmetries, consider, following Barnes (1992), a time-symmetric theory like Newtonian mechanics, applied to a closed system like the solar system. Call derivations of the state of motion of planets at some future time t from information about their present positions (at time \(t_0\)), masses, and velocities, the forces incident on them at \(t_0\), and the laws of mechanics predictive. Now contrast such derivations with retrodictive derivations in which the present motions of the planets are derived from information about their future velocities and positions at t, the forces operative at t, and so on. It looks as though there will be just as many retrodictive derivations as predictive derivations, and each will require premises of exactly the same general sort—information about positions, velocities, masses, etc. and the same laws. Thus the pattern or patterns instantiated by the retrodictive derivations look(s) exactly as unified as the pattern or patterns associated with the predictive derivations. However, we ordinarily think of the predictive derivations and not the retrodictive derivations as explanatory and the present state of the planets as the cause of their future state and not vice-versa.
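The formal symmetry between predictive and retrodictive derivations can be illustrated numerically. The following is a minimal sketch, not anything drawn from Barnes or Kitcher: it assumes a toy system of one planet around a fixed central mass, units in which \(GM = 1\), and a velocity Verlet integrator (chosen here because a step with \(-dt\) exactly undoes a step with \(dt\) in exact arithmetic). All function and variable names are hypothetical.

```python
def accel(x, y, gm=1.0):
    """Inverse-square acceleration toward a fixed central mass at the origin."""
    r3 = (x * x + y * y) ** 1.5
    return -gm * x / r3, -gm * y / r3


def step(state, dt):
    """One velocity Verlet step; dt may be negative (retrodiction)."""
    x, y, vx, vy = state
    ax, ay = accel(x, y)
    vx_half = vx + 0.5 * dt * ax
    vy_half = vy + 0.5 * dt * ay
    x_new = x + dt * vx_half
    y_new = y + dt * vy_half
    ax_new, ay_new = accel(x_new, y_new)
    return (x_new, y_new,
            vx_half + 0.5 * dt * ax_new,
            vy_half + 0.5 * dt * ay_new)


def evolve(state, dt, n_steps):
    """Apply the same law of motion n_steps times."""
    for _ in range(n_steps):
        state = step(state, dt)
    return state


# State at t_0: position (1, 0), velocity (0, 1), i.e. a circular orbit.
present = (1.0, 0.0, 0.0, 1.0)

# Predictive derivation: state at t from the state at t_0.
future = evolve(present, 0.01, 500)

# Retrodictive derivation: state at t_0 from the state at t, same law.
recovered = evolve(future, -0.01, 500)

error = max(abs(a - b) for a, b in zip(present, recovered))
print(error)  # small; limited only by floating-point round-off
```

Both derivations use the same function `evolve` with premises of the same kind (positions, velocities, and the force law), which is the sense in which the two patterns look equally unified.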
One possible response to this example is to bite the bullet and to argue that from the point of view of fundamental physics, there really is no difference in the explanatory import of the retrodictive and predictive derivations, and that it is a virtue, not a defect, of the unificationist approach that it reproduces this judgment. Whatever might be said in favor of this response, it is not Kitcher’s. His claim is that our ordinary judgments about causal asymmetries can be derived from the unificationist account. The example just described casts doubt on this claim. More generally, it casts doubt on Kitcher’s contention that one can begin with the notion of explanatory unification, understood in a way that does not presuppose causal notions, and use it to derive the content of causal judgments.
Selected Readings: The most detailed statement of Kitcher’s position can be found in Kitcher (1989). Salmon (1989: 94ff.) contains a critical discussion of Friedman’s version of the unificationist account of explanation but ends by advocating a “rapprochement” between unificationist approaches and Salmon’s own causal mechanical model. Woodward (2003) contains additional criticisms of Kitcher’s version of unificationism.
6. Pragmatic Theories of Explanation
6.1 Introduction
Despite their many differences, the accounts of Hempel (focusing now on just the DN rather than the IS model), Salmon, Kitcher, and others discussed above, largely share a common overall conception of what the project of constructing a theory of explanation should involve and (to a considerable extent) what criteria such a theory should satisfy if it is to be successful. “Pragmatic” theories of explanation depart from this consensus in important respects. Let us say that a theory of explanation contains “pragmatic” elements if (i) according to the theory, those elements require irreducible reference to facts about the interests, beliefs or other features of the psychology of those providing or receiving the explanation and/or (ii) irreducible reference to the “context” in which the explanation occurs. (For what this means, see below.) Although the writers discussed above agree that pragmatic elements play some role in the activity of giving and receiving explanations, they assume that there is a non-pragmatic core to the notion of explanation, which it is the central task of a theory of explanation to capture. That is, it is assumed that this core notion can be specified without reference to psychological features of explainers or their audiences and with reference to non-contextual features that are sufficiently general, abstract and “structural,” in the sense that they hold across a range of explanations with different contents and across a range of different contexts. Relations like deductive entailment or statistical relevance are examples of candidates for such “structural” relationships. In addition, these writers see the goal of a theory of explanation, as capturing the notion of a correct explanation, as opposed to the notion of an explanation’s being considered explanatory by a particular audience or not, a matter which presumably depends on such considerations as whether the audience understands the terms in which the explanation is framed. 
(In this sense, “correctness” requires (at least) that the explanans be true or “well-confirmed” and that the explanans stands in the right relationship to the explanandum.) Finally, as noted in the Introduction to this entry, writers in this tradition have not had the goal of capturing all the various ways in which the word “explanation” is used in ordinary English. They have instead focused on a much more restricted class of examples in which what is of interest is (something like) explaining “why” some outcome or general phenomenon occurred, as opposed to explaining, e.g., the meaning of a word or how to solve a differential equation. The motivation for this restriction is simply the judgment that an interesting and non-trivial theory is more likely to emerge if it is restricted in scope in this way. For ease of reference, let us call this the “traditional” conception of the task of a theory of explanation.
Some or all of these assumptions and goals are rejected in pragmatic accounts of explanation. Early contributors to this approach include Michael Scriven (e.g., 1962) and Sylvan Bromberger (e.g., 1966), with more systematic statements, due to van Fraassen (1980) and Achinstein (1983), appearing in the 1980s. Since it is not always clear just what the points of disagreement are between pragmatic and traditional accounts, some orienting remarks about this will be useful before turning to details. Defenders of pragmatic approaches to explanation typically stress the point that whether provision of a certain body of information to some audience produces understanding or is illuminating for that audience depends on the background knowledge of the audience members and on other factors having to do with the local context. For example, an explanation of the deflection of starlight by the sun that appeals to the field equations of General Relativity may be highly illuminating to a trained physicist but unintelligible to a layperson because of their background. Factors of this sort are grouped together as “pragmatic” and their influence is taken to illustrate at least one way in which pragmatic considerations enter into the notion of explanation.
Taken in itself, the observation just described seems completely uncontroversial and not in conflict with traditional approaches to explanation. Indeed, as remarked above, writers like Hempel and Salmon explicitly agree that explanation has a pragmatic dimension in the sense just described; in fact, Hempel invokes the role of pragmatic factors at a number of points to address prima-facie counterexamples to the DN model. This suggests that, often at least, what is distinctive about pragmatic approaches to explanation is not just the bare idea that explanation has a “pragmatic dimension” but rather the much stronger claim that the traditional project of constructing a model of explanation pursued by Hempel and others has so far been unsuccessful (and perhaps is bound to be unsuccessful) and that this is so because pragmatic or contextual factors play a central and ineliminable role in explanation in a way that resists incorporation into models of the traditional sort. On this view, much of what is distinctive about pragmatic accounts is their opposition to traditional accounts and their diagnosis of why such accounts fail: they fail because they omit pragmatic or contextual elements. It will be important to keep this point in mind in what follows because there is a certain tendency among advocates of pragmatic theories to argue as though the superiority of their approach is established simply by the observation that explanation has a pragmatic dimension; instead it seems more appropriate to think that the real issue is whether traditional approaches are inadequate in principle because of their neglect of the pragmatic dimension of explanation.
A second issue concerns an important ambiguity in the notion of “pragmatic”. On one natural understanding of this notion, a pragmatic consideration is one that has to do with utility or usefulness in the service of some goal connected to human interests, where these interests are in some relevant sense “practical”. Call this notion “pragmatic1”. On this construal, Hempel’s DN model might be correctly characterized as a pragmatic1 theory since it links explanatory information closely to the provision of information that is useful for purposes of prediction and prediction certainly qualifies as a pragmatic goal. For similar reasons, Woodward’s (2003) theory of explanation might also be counted as a pragmatic1 theory since it connects explanation with the provision of information that is useful for manipulation and control—unquestionably useful goals. As these examples suggest, models of explanation that aspire to traditional goals can be pragmatic1 theories.
In the context of theories of explanation, however, the label “pragmatic” is usually intended to suggest a somewhat different set of associations. In particular, as noted above, “pragmatic” is typically used to characterize considerations having to do with facts about the psychology (interests, beliefs etc.) of those involved in providing or receiving explanations and/or to characterize considerations involving the local context, often with the suggestion that both sets of considerations may vary in complex and idiosyncratic ways that resist incorporation into the sort of general theory sought by traditional models.[10] Call this set of associations “pragmatic2”. Neither Hempel’s nor Woodward’s theory is pragmatic2. In particular, as the example of the DN model illustrates, the fact that a theory is pragmatic1 in the sense that it appeals to facts about goals generally shared by human beings (such as prediction) to help to motivate a model of explanation does not preclude attempting to construct models of explanation satisfying traditional goals and does not require commitment to the idea that explanation must be understood as a pragmatic2 notion. We need to be careful to distinguish these two different ways of thinking about the “pragmatic” dimension of explanation.
Finally, as emphasized above, a concern with the pragmatics of explanation naturally connects with an interest in the “psychology” of explanation and this in turn suggests the relevance of empirical studies of the sorts of information that various subjects (ordinary folks, scientists) find explanatory, treat as providing “understanding”, the distinctions subjects make among explanations and so on. Although there is a growing literature in this area, the most prominent philosophical advocates of pragmatic approaches to explanation have so far tended not to make use of it. In this connection, it is worth pointing out that this psychological literature goes well beyond the truisms found in philosophical discussion about different people finding different sorts of information explanatory depending on their interests. In particular, psychologists have been very interested in exploring general features or structural patterns present in information that various subjects find explanatory. For example, Lombrozo (2010) finds evidence that subjects prefer explanations that appeal to relationships that are relatively stable (in the sense of continuing to hold across changing circumstances)[11] and Lien and Cheng (2000) present evidence that in cases in which the explanandum \(E\) has a single candidate cause \(C\), subjects prefer levels of explanation/causal description that maximize \(\Delta p = \Pr(E \mid C) - \Pr(E \mid \text{not-}C)\).
Notice that in both cases these are relationships or patterns of the sort that traditional accounts of explanation attempt to capture. As these examples bring out, there is no necessary incompatibility between the project of trying to formulate an account of explanation that satisfies traditional goals and an interest in the psychology of explanation. It may be that subjects find certain sorts of information explanatory or understanding-producing because certain structural features of the sort that traditional accounts attempt to characterize are present in that information—indeed this is what the Lombrozo and Lien and Cheng papers suggest. Thus, we should distinguish the project of investigating the empirical psychology of explanation (which can be pursued with a variety of different commitments about how to best theorize about explanation) from the more specific claim that the characterization of what it is for an explanatory relationship to hold between explanans and explanandum must be given in “psychologistic” terms, in the sense that this requires irreducible reference to psychological facts about particular audiences such as the vagaries of what they happen to be interested in.
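To see how the contrast \(\Delta p\) can single out one level of description, consider a small worked sketch. The counts below are invented purely for illustration and are not data from Lien and Cheng (2000); the labels “narrow”, “intermediate” and “broad” are hypothetical names for three ways of describing the same candidate cause over the same 40 cases, 20 of which exhibit \(E\).

```python
def delta_p(e_and_c, c_total, e_and_not_c, not_c_total):
    """Delta p = Pr(E | C) - Pr(E | not-C), estimated from raw counts."""
    return e_and_c / c_total - e_and_not_c / not_c_total


# Hypothetical counts: (E and C, all C, E and not-C, all not-C).
# Each description partitions the same 40 cases, 20 of which show E.
levels = {
    "narrow":       (9, 10, 11, 30),   # C picks out few cases
    "intermediate": (18, 20, 2, 20),   # C tracks E closely
    "broad":        (20, 30, 0, 10),   # C picks out many cases
}

scores = {name: delta_p(*counts) for name, counts in levels.items()}
best = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
print("preferred level on this measure:", best)
```

With these made-up counts the intermediate description has the largest contrast: the narrow description leaves many occurrences of \(E\) outside \(C\), while the broad one includes many cases of \(C\) without \(E\). The point of the sketch is only that \(\Delta p\) is a structural feature of the information, computable without reference to any particular audience’s interests.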
6.2 Constructive Empiricism and the Pragmatic Theory of Explanation
One of the most influential recent pragmatic accounts of explanation is associated with constructive empiricism.[12] This is the thesis, defended by Bas van Fraassen in his 1980 book, The Scientific Image, that the aim of science (or at least “pure” science) is the construction of theories that are “empirically adequate” (that is, that yield a true or correct description of observables) and not, as scientific realists suppose, theories that aim to tell literally true stories about unobservables. Relatedly, “acceptance” of a theory involves only the belief that it is empirically adequate (van Fraassen 1980: 12). van Fraassen’s account of explanation, which is laid out in several articles and, most fully, in Chapter Six of his book, is meant to fit with this overall conception of science: it is a conception according to which explanation per se is not an epistemic aim of “pure” science (empirical adequacy is the only such aim), but rather a “pragmatic” virtue, having to do with the “application” of science. (Note that the application of science is arguably a matter of pragmatics1. However, the idea that explanation has to do with the application of science is used to motivate the adoption of a pragmatic2 theory of explanation. We thus have an elision of the two notions of “pragmatic” distinguished above.) According to van Fraassen, because explanation is a merely pragmatic virtue, a concern with explanation is not something that can require scientists to move beyond belief in the empirical adequacy of their theories to belief in the literal truth of claims about unobservable entities.
According to van Fraassen, explanations are answers to questions and getting clear about the logic of questions is central to constructing a theory of explanation. Questions can take many different forms, but when the question of interest is a “why” question, explanatory queries will typically take the following form: a query about why some explanandum \(P_k\) rather than any one of the members of a contrast class \(X\) (a set of possible alternatives to \(P_k\)) obtained. In addition, some “relevance relation” \(R\) is assumed by the question. An answer \(A\) to this question will take the form “\(P_k\) in contrast to (the rest of) \(X\) because \(A\), where \(A\) bears the relevance relation \(R\) to \([P_k,X]\)”. To use van Fraassen’s example, consider “Why is this conductor warped?” Depending on the context, the intended contrast might have to do with, e.g., why this particular conductor is warped in contrast to some other conductor that is unwarped. Alternatively, it might have to do with why this particular conductor is warped now when it was previously unwarped. The relevance relation \(R\) similarly depends on the context and the information which the questioner is interested in obtaining. For example, \(R\) might involve causal information (the question might be a request for what caused the warping) but it also might have to do with information about function, if the context was one in which it is assumed that the shape of the conductor plays some functional role in a power station which the questioner wants to know about. Thus “context” enters into the explanation both by playing a role in specifying the contrast class \(X\) and the relevance relation \(R\). van Fraassen describes various rules for the “evaluation” of answers.
For example, \(P_k\) and \(A\) must be true, the other members of the contrast class must not be true, \(A\) must “favor” (raise the conditional probability of) \(P_k\) against alternatives, and \(A\) must compare favorably with other answers to the same question, a condition which itself has several aspects including, for example, whether \(A\) favors the topic more than these other answers and whether \(A\) is screened off by other answers. However, he also makes it clear (as the example above suggests) that a variety of different relevance relations may be appropriate depending on context and that the evaluation of answers also depends on context. Moreover, he explicitly denies that there is anything distinctive about the category of scientific explanation that has to do with its structure or form—instead, a scientific explanation is simply an explanation that makes use of information that is (or at least, is treated as) grounded in a “scientific” theory.
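The structure just described can be sketched as a small data type. This is an illustrative rendering under loud assumptions, not van Fraassen’s own formalism: the class and function names are hypothetical, the relevance relation \(R\) is reduced to a mere label, the propositions and probabilities are invented, and “favoring” is read in the simplest possible way, as raising the probability of the topic.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Set


@dataclass(frozen=True)
class WhyQuestion:
    topic: str                      # P_k, the explanandum
    contrast_class: FrozenSet[str]  # X, taken here to contain the topic
    relevance: str                  # label standing in for R


def presupposition_holds(q: WhyQuestion, true_props: Set[str]) -> bool:
    """The topic is true and every other member of X is false."""
    others = q.contrast_class - {q.topic}
    return q.topic in true_props and not (others & true_props)


def favors(q: WhyQuestion, prior: Dict[str, float],
           given_answer: Dict[str, float]) -> bool:
    """Simplified 'favoring': the answer raises the topic's probability."""
    return given_answer[q.topic] > prior[q.topic]


warped = "this conductor is warped now"

# Same surface question, two contexts, two different why-questions.
q_other = WhyQuestion(
    topic=warped,
    contrast_class=frozenset({warped, "the other conductor is warped now"}),
    relevance="causal",
)
q_time = WhyQuestion(
    topic=warped,
    contrast_class=frozenset({warped, "this conductor is still unwarped"}),
    relevance="functional",
)

facts = {warped}
prior = {warped: 0.2}                 # invented numbers
given_answer = {warped: 0.9}          # Pr(topic | A), also invented

print(q_other == q_time)                        # False: context differs
print(presupposition_holds(q_time, facts))      # True
print(favors(q_time, prior, given_answer))      # True
```

The sketch makes visible where context enters on this account: nothing in the sentence “Why is this conductor warped?” fixes `contrast_class` or `relevance`, so both must be supplied from outside before any answer can be evaluated.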
Van Fraassen sums up his view of explanation (and gestures at his grounds for rejecting traditional objectivist approaches) as follows:
The discussion of explanation went wrong at the very beginning when explanation was conceived of as a relation like description: a relation between a theory and a fact. Really, it is a three-term relation between theory, fact, and context. No wonder that no single relation between theory and fact ever managed to fit more than a few examples! Being an explanation is essentially relative for an explanation is an answer… it is evaluated vis-à-vis a question, which is a request for information. But exactly… what is requested differs from context to context. (1980: 156)
Van Fraassen begins his chapter on explanation with a brief story that provides a good point of entry into how he intends his account to work. Recall from Section 2.5 that a well-known counterexample to the DN model involves the claim that one can explain the length S of the shadow cast by a flagpole in terms of the height H of the flagpole but that (supposedly) one cannot explain H in terms of S, despite the fact that one can construct a DN derivation from S to H. This is commonly taken to show that the DN model has left out some factor having to do with the directional or asymmetric features of explanation—e.g., perhaps an asymmetry in the relation between cause and effect that ought to be incorporated into one’s model of explanation. In van Fraassen’s story, a straightforward causal explanation of the usual sort of S in terms of H (although the object in question is a tower rather than a flagpole) is first offered. Then a second explanation, according to which the height of the tower is “explained” by the fact that it was designed to cast a shadow of a certain length is advanced. Presumably the moral we are to draw is that as the context and perhaps the relevance relation R are varied, both
\[ H \text{ explains } S \]
and
\[ S \text{ explains } H \]
are acceptable (legitimate, appropriate etc.) explanations. Moreover, since these variations in context and relevance relation turn on variations in what is of interest to the explainer and his audience, we are further encouraged to conclude that explanatory asymmetries have their source in psychological facts about people’s interests and background beliefs, rather than in, say, some asymmetry that exists in nature independently of these. Pragmatists about explanation think that a similar conclusion holds for other features of the explanatory relevance relation that philosophers have tried to characterize in terms of traditional models of explanation.
One obvious response to this claim, made by several critics (e.g., Kitcher & Salmon 1987: 317), is that the example does not really involve a case in which, depending on context, H causally explains S and S causally explains H. Instead, although H does causally explain S, it is (something like) the desire for a shadow of length S (rather than S itself) that explains (or at least causally explains) the height (or the choice of height) for the tower. Or, if one prefers, in the latter case we are given something like a functional explanation (but not a causal explanation) for the height of the tower, in the sense that we are told what the intended function of that choice of height is. On either of these diagnoses, this will not be a case in which whether H provides a causal explanation of S or whether instead S provides a causal explanation of H shifts depending on factors having to do with the interests of the speaker or audience or other contextual factors. If so, the story about the tower does not show that the asymmetry present in the flagpole example must be accounted for in terms of pragmatic factors. It may be accounted for in some other way. In fact, although discussion must be beyond the scope of this entry, a number of possible candidates for such a non-pragmatic account of causal asymmetries have been proposed, both in philosophy and outside of it (for example, in the machine learning literature[13]).
A much more general criticism has been advanced against van Fraassen’s version of a pragmatic theory by Kitcher and Salmon (1987). Basically, their complaint is that the relevance relation R in van Fraassen’s account is completely unconstrained, with the (in their view, obviously unacceptable) consequence that for any pair of true propositions P and A, answer A is relevant to P via some relevance relation and thus “explains” P. For example, according to Salmon and Kitcher, we might define a relationship of “astral influence” \(R^*\), meeting van Fraassen’s criteria for being a relevance relation, such that the time t of a person’s death is explained in terms of \(R^*\) and the position of various heavenly bodies at t. Here it may seem that van Fraassen has a ready response. As noted above, on van Fraassen’s view, background knowledge and, in the case of scientific explanation, current scientific knowledge, helps to determine which are the acceptable relevance relations and acceptable answers to the questions posed in requests for explanation—such knowledge and the expectations that go along with it are part of the relevant context when one asks for an explanation of time of death. Obviously, astral influence is not an acceptable or legitimate relevance relation according to modern science—hence appeal to such a relation is not countenanced as explanatory by van Fraassen’s theory. More generally it might be argued that available scientific knowledge will provide constraints on the relevance relations and answers that exclude the “anything goes” worry raised by Salmon and Kitcher—at least insofar as the context is one in which a “scientific explanation” is sought.
While this response may seem plausible enough as far as it goes, it does bring out the extent to which much of the work of distinguishing the explanatory from the non-explanatory in van Fraassen’s account comes from a very general appeal to what is accepted as legitimate background information in current science. Put differently, this raises the worry that once one moves beyond van Fraassen’s formal machinery concerning questions and answers (which van Fraassen himself acknowledges is relatively unconstraining), one is left with an account according to which a scientific explanation is simply any explanation employing claims from current science and a currently scientifically approved relevance relation. Even if otherwise unexceptionable, this proposal is, if not exactly trivial, at least rather deflationary—it provides much less than many have hoped for from a theory of explanation. In particular, in cases (of which there are many examples) in which there is an ongoing argument or dispute in some area of science not about whether some proposed theory or model is true but rather about whether it explains some phenomenon, it is not easy to see how the proposal even purports to provide guidance. On the other hand, the obvious rejoinder that might be made on van Fraassen’s behalf is that no more ambitious treatment that would satisfy the expectations associated with more traditional accounts of explanation (including a demarcation of candidate explanations into those that are “correct” and “incorrect”) is possible—a theory like van Fraassen’s is as good as it gets. If there is no defensible theory of explanation embodying a non-trivially constraining relevance relation, it cannot be a good criticism of van Fraassen’s theory that he fails to provide this.
A final point that is suggested by van Fraassen’s theory is this: In considering pragmatic theories, it matters a great deal exactly where the “pragmatic” elements are claimed to enter into the account of explanation. One point at which such considerations seem clearly to enter is in the selection or characterization of what an audience wants explained. This is reflected in van Fraassen’s theory in the choice of a \(P_k\) and an associated contrast class \(X\). Obviously, whether we are looking for an explanation of why, say, this particular conductor is now bent when it was previously straight or whether instead we want to know why this conductor is bent while some other conductor is straight is a matter that depends on our interests. However, this particular sort of “interest relativity” (and associated phenomena having to do with the role of contrastive focus in the characterization of explananda, which really just serve to specify more exactly which particular explananda we want explained) seems something that can be readily acknowledged by traditional theories.[14] After all, it is not a threat to the DN or other models with similar traditional aspirations that one audience may be interested in an explanation of the photoelectric effect but not the deflection of starlight by the sun and another audience may have the opposite interests. What would be a threat to the DN and similar models would be an argument that once some explanandum E is fully specified, whether explanans M explains E (that is, whether there is an explanatory relation between M and E) is itself “interest-relative”. It is natural to interpret van Fraassen as making this latter claim, both in connection with explanatory asymmetries and more generally.
Selected Readings. van Fraassen (1980, especially Chapter Six) and Achinstein (1983) are classic statements of pragmatic approaches to explanation. These pragmatic accounts are discussed and criticized in Salmon (1989). van Fraassen’s account is also discussed in Kitcher and Salmon (1987). De Regt and Dieks (2005) is a recent defense of what the authors describe as a “contextual” account of scientific understanding and which engages with some of the themes in the “pragmatics of explanation” literature.
7. Conclusions, Open Issues, and Future Directions
What can we conclude from this recounting of some of the more prominent recent attempts to construct models of scientific explanation? What important issues remain open and what are the most promising directions for future work? Of course, any effort at stock-taking will reflect a particular point of view, but with this caveat in mind, several observations seem plausible, even if not completely uncontroversial.
7.1 The Role of Causation
One issue concerns the role of causal information in scientific explanation. All of the traditional models considered above attempt to capture causal explanations, although some attempt to capture non-causal explanations as well. It is a natural thought (endorsed by many) that many of the difficulties faced by the models described above derive at least in part from their reliance on inadequate treatments of causation.[15] The problems of explanatory asymmetries and explanatory irrelevance described in Section 2.5 seem to show that the holding of a law between C and E is not sufficient for C to cause E; hence not a sufficient condition for C to figure in an explanation of E. If the argument of Section 3.3 is correct, a fundamental problem with the SR model is that statistical relevance information is insufficient to fully capture causal information in the sense that different causal structures can be consistent with the same information about statistical relevance relationships. Similarly, the CM model faces the difficulty that information about causal processes and interactions is also insufficient to fully capture causal relevance relations and that there is a range of cases in which causal relationships hold between C and E (and hence in which C figures in an explanation of E) although there is no connecting causal process between C and E. Finally, a fundamental problem with unificationist models is that the content of our causal judgments does not seem to fall out of our efforts at unification, at least when unification is understood along the lines advocated by Kitcher. For example, as discussed above, considerations having to do with unification do not by themselves explain why it is appropriate to explain effects in terms of their causes rather than vice-versa.
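The point about the SR model can be illustrated with a minimal sketch. The variables \(X\), \(Y\), and \(Z\) are hypothetical and are not drawn from the examples discussed earlier, and the sketch assumes the standard association between causal structures and factorizations of a joint probability distribution (the Causal Markov condition). Suppose that in a causal chain \(X \rightarrow Y \rightarrow Z\) the joint distribution factors as

\[
P(X, Y, Z) = P(X)\, P(Y \mid X)\, P(Z \mid Y).
\]

By the definition of conditional probability, \(P(X)\, P(Y \mid X) = P(Y)\, P(X \mid Y)\), so the very same distribution can be rewritten as

\[
P(X, Y, Z) = P(Y)\, P(X \mid Y)\, P(Z \mid Y),
\]

which is the factorization associated with a common cause structure \(X \leftarrow Y \rightarrow Z\). On this sketch both structures imply the same pattern of statistical relevance: \(X\) may be relevant to \(Z\) unconditionally, but \(Y\) screens \(X\) off from \(Z\), so that \(P(Z \mid X, Y) = P(Z \mid Y)\). Statistical relevance relations alone therefore do not settle whether \(X\) causes \(Z\) by way of \(Y\) or whether \(X\) and \(Z\) are joint effects of \(Y\).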
These observations suggest that, insofar as we are interested in causal forms of scientific explanation, progress may require more attention to the notion of causation and a more thorough-going integration of discussions of explanation with the burgeoning literature on causation, both within and outside of philosophy.[16] A number of steps in this direction have been taken (cf. Woodward 2003).
Does this mean that a focus on causation should entirely replace the project of developing models of explanation or that philosophers should stop talking about explanation and instead talk just about causation? Despite the apparent centrality of causation to many explanations, it is arguable that completely subsuming the latter into the former loses connections with some important issues. For one thing, causal claims themselves seem to vary greatly in the extent to which they are explanatorily deep or illuminating. Causal claims found in Newtonian mechanics seem deeper or more satisfying from the point of view of explanation than causal claims of “the rock broke the window” variety. It is usually supposed that such differences are connected to other features: for example, to how general, stable, and coherent with background knowledge a causal claim is. However, notions like “generality” are vague and not all forms of generality seem to be connected to explanatory goodness. So even if one focuses only on causal explanation, there remains the important project of trying to understand better what sorts of distinctions among causal claims matter for goodness in explanation. To the extent this is so, the kinds of concerns that have animated traditional treatments of explanation don’t seem to be entirely subsumable into standard accounts of causation, which have tended to focus largely on the project of distinguishing causal from non-causal relationships rather than on the features that make causal relationships “good” for purposes of explanation.
Another important question has to do with whether there are forms of why-explanation that are non-causal. If so, how important are these in science and what is their structure? Hempel seems to have thought of causal explanations simply as those DN explanations that appeal to causal laws, which he regarded as a proper subset of all laws. Thus on his view, causal and non-causal explanations share a common structure. Kitcher’s unificationist model was also intended to apply to both causal and non-causal explanations, such as unifying argument patterns in linguistics. More recently, there has been a great upsurge of interest in whether there are non-causal forms of explanation, with some claiming they are ubiquitous in science (e.g., Lange 2017, Reutlinger & Saatsi 2018). If there are such explanations, this raises the issue of what distinguishes them from causal explanations and whether there is some overarching theory that subsumes both causal and non-causal explanations.
7.2 A Single Model of Explanation?
As noted above, one way in which the attempt to develop a single general model of explanation might fail is that we might conclude that there are causal and non-causal forms of explanation that have little in common. But even putting this possibility aside, another possibility is that explanation differs across different areas of science in a way that precludes the development of a single, general model. It is, after all, uncontroversial that explanatory practice—what is accepted as an explanation, how explanatory goals interact with others, what sort of explanatory information is thought to be achievable, discoverable, testable etc.—varies in significant ways across different disciplines. Nonetheless, all of the models of explanation surveyed above are “universalist” in aspiration—they claim that a single, “one size” model of explanation fits all areas of inquiry in so far as these have a legitimate claim to explain. Although the extreme position that explanation in biology or history has nothing interesting in common with explanation in physics seems unappealing (and in any case has attracted little support), it seems reasonable to expect that more effort will be devoted in the future to developing models of explanation that are more sensitive to disciplinary differences. Ideally, such models would reveal commonalities across disciplines but they should also enable us to see why explanatory practice varies as it does across different disciplines and the significance of such variation. For example, as noted above, biologists, in contrast to physicists, often describe their explanatory goals as the discovery of mechanisms rather than the discovery of laws. Although it is conceivable that this difference is purely terminological, it is also worth exploring the possibility that there is a distinctive story to be told about what a mechanism is, as this notion is understood by biologists, and how information about mechanisms contributes to explanation.
A closely related point is that at least some of the models described above impose requirements on explanation that may be satisfiable in some domains of inquiry but are either unachievable (in any practically interesting sense) in other domains or, to the extent that they may be achievable, bear no discernible relationship to generally accepted goals of inquiry in those domains. For example, we noted above that many scientists and philosophers hold that there are few if any laws to be discovered in biology and the social and behavioral sciences. If so, models of explanation that assign a central role to laws may not be very illuminating regarding how explanation works in these disciplines. As another example, even if we suppose that the partition into objectively homogeneous reference classes recommended by the SR model is an achievable goal in connection with certain quantum mechanical phenomena, it may be that (as suggested above) it is simply not a goal that can be achieved in a non-trivial way in economics and sociology, disciplines in which causal inference from statistics also figures prominently. In such disciplines, it may be that additional statistically relevant partitions of any population or subpopulation of interest will virtually always be possible, so that the activity of finding such partitions is limited only by the costs of gathering additional information. A similar assessment may hold for most applications of the CM model to the social sciences.
Bibliography
- Achinstein, Peter, 1983, The Nature of Explanation, New York: Oxford University Press.
- Barnes, Eric, 1992, “Explanatory Unification and the Problem of Asymmetry”, Philosophy of Science, 59(4): 558–571. doi:10.1086/289695
- Braithwaite, R. B., 1953, Scientific Explanation: A Study of the Function of Theory, Probability and Law in Science, Cambridge: Cambridge University Press.
- Bromberger, Sylvain, 1966, “Why-Questions”, in Mind and Cosmos: Essays in Contemporary Science and Philosophy, Robert G. Colodny (ed.), Pittsburgh, PA: University of Pittsburgh Press, 86–111.
- Cartwright, Nancy, 1979, “Causal Laws and Effective Strategies”, Noûs, 13(4): 419–437. doi:10.2307/2215337
- –––, 1983, How the Laws of Physics Lie, Oxford: Clarendon Press.
- De Regt, Henk W. and Dennis Dieks, 2005, “A Contextual Approach to Scientific Understanding”, Synthese, 144(1): 137–170. doi:10.1007/s11229-005-5000-4
- Dowe, Phil, 2000, Physical Causation, Cambridge: Cambridge University Press. doi:10.1017/CBO9780511570650
- Earman, John, 1986, A Primer on Determinism, Dordrecht: Reidel.
- Friedman, Michael, 1974, “Explanation and Scientific Understanding”, The Journal of Philosophy, 71(1): 5–19. doi:10.2307/2024924
- Gardiner, Patrick L., 1959, The Nature of Historical Explanation, Oxford: Oxford University Press.
- Goodman, Nelson, 1955, Fact, Fiction, and Forecast, Cambridge, MA: Harvard University Press.
- Greeno, James G., 1970, “Evaluation of Statistical Hypotheses Using Information Transmitted”, Philosophy of Science, 37(2): 279–294. Reprinted in Salmon, 1971b: 89–104. doi:10.1086/288301
- Hall, Ned, 2004, “Two Concepts of Causation”, in Causation and Counterfactuals, John Collins, Ned Hall, and L. A. Paul (eds), Cambridge, MA: MIT Press, 225–276.
- Hausman, Daniel M., 1998, Causal Asymmetries, Cambridge: Cambridge University Press. doi:10.1017/CBO9780511663710
- Hempel, Carl G., 1965a, Aspects of Scientific Explanation and Other Essays in the Philosophy of Science, New York: Free Press.
- –––, 1965b, “Aspects of Scientific Explanation”, in Hempel 1965a: 331–496.
- –––, 1942 [1965], “The Function of General Laws in History”, Journal of Philosophy, 39(2): 35–48; reprinted with slight modifications in Hempel 1965a: 231–244. doi:10.2307/2017635
- Hempel, Carl G. and Paul Oppenheim, 1948 [1965], “Studies in the Logic of Explanation”, Philosophy of Science, 15(2): 135–175. Reprinted in Hempel 1965a: 245–290. doi:10.1086/286983
- Hitchcock, Christopher Read, 1995, “Discussion: Salmon on Explanatory Relevance”, Philosophy of Science, 62(2): 304–320. doi:10.1086/289858
- Janzing, Dominik, Joris Mooij, Kun Zhang, Jan Lemeire, Jakob Zscheischler, Povilas Daniušis, Bastian Steudel, and Bernhard Schölkopf, 2012, “Information-Geometric Approach to Inferring Causal Directions”, Artificial Intelligence, 182–183: 1–31. doi:10.1016/j.artint.2012.01.002
- Jeffrey, Richard C., 1969, “Statistical Explanation vs. Statistical Inference”, in Essays in Honor of Carl G. Hempel, Nicholas Rescher (ed.), Dordrecht: D. Reidel, 104–113. Reprinted in Salmon 1971b: 19–28. doi:10.1007/978-94-017-1466-2_6
- Kitcher, Philip, 1976, “Explanation, Conjunction, and Unification”, The Journal of Philosophy, 73(8): 207–212. doi:10.2307/2025559
- –––, 1989, “Explanatory Unification and the Causal Structure of the World”, in Kitcher and Salmon 1989: 410–505.
- Kitcher, Philip and Wesley Salmon, 1987, “Van Fraassen on Explanation”, The Journal of Philosophy, 84(6): 315–330. doi:10.2307/2026782
- ––– (eds), 1989, Scientific Explanation (Minnesota Studies in the Philosophy of Science, Volume 13), Minneapolis, MN: University of Minnesota Press.
- Kyburg, Henry E., 1965, “Discussion: Salmon’s Paper”, Philosophy of Science, 32(2): 147–151. doi:10.1086/288034
- Lange, Marc, 2017, Because Without Cause: Non-causal Explanations in Science and Mathematics, Oxford: Oxford University Press.
- Lewis, David K., 1973, Counterfactuals, Cambridge, MA: Harvard University Press.
- –––, 1986, Philosophical Papers (Volume II), Oxford: Oxford University Press. doi:10.1093/0195036468.001.0001
- –––, 2000, “Causation as Influence”, The Journal of Philosophy, 97(4): 182–197. doi:10.2307/2678389
- Lien, Yunnwen and Patricia W. Cheng, 2000, “Distinguishing Genuine from Spurious Causes: A Coherence Hypothesis”, Cognitive Psychology, 40(2): 87–137. doi:10.1006/cogp.1999.0724
- Lombrozo, Tania, 2010, “Causal–Explanatory Pluralism: How Intentions, Functions, and Mechanisms Influence Causal Ascriptions”, Cognitive Psychology, 61(4): 303–332. doi:10.1016/j.cogpsych.2010.05.002
- Mitchell, Sandra D., 1997, “Pragmatic Laws”, Philosophy of Science, 64(Supplement PSA 1996): S468–S479. doi:10.1086/392623
- Nagel, Ernest, 1961, The Structure of Science: Problems in the Logic of Scientific Explanation, New York: Harcourt, Brace and World.
- Pearl, Judea, 2000, Causality: Models, Reasoning and Inference, Cambridge: Cambridge University Press.
- Pitt, Joseph C. (ed.), 1988, Theories of Explanation, New York: Oxford University Press.
- Popper, Karl, 1959, The Logic of Scientific Discovery, London: Hutchinson.
- Reutlinger, Alexander and Juha Saatsi (eds.), 2018, Explanation Beyond Causation: Philosophical Perspectives on Non-Causal Explanations, Oxford: Oxford University Press. doi:10.1093/oso/9780198777946.001.0001
- Ruben, David-Hillel (ed.), 1993, Explanation (Oxford Readings in Philosophy), Oxford: Oxford University Press.
- Salmon, Wesley C., 1971a, “Statistical Explanation”, in Salmon 1971b: 29–87.
- ––– (ed.), 1971b, Statistical Explanation and Statistical Relevance, Pittsburgh, PA: University of Pittsburgh Press.
- –––, 1984, Scientific Explanation and the Causal Structure of the World, Princeton, NJ: Princeton University Press.
- –––, 1989, “Four Decades of Scientific Explanation”, in Kitcher and Salmon 1989: 3–219. Reprinted as a separate monograph, Minneapolis, MN: University of Minnesota Press, 1989. Page numbers are from the monograph.
- –––, 1994, “Causality without Counterfactuals”, Philosophy of Science, 61(2): 297–312. doi:10.1086/289801
- –––, 1997, “Causality and Explanation: A Reply to Two Critiques”, Philosophy of Science, 64(3): 461–477. doi:10.1086/392561
- Schaffer, Jonathan, 2000, “Causation by Disconnection”, Philosophy of Science, 67(2): 285–300. doi:10.1086/392776
- Scriven, Michael, 1959, “Truisms as the Grounds of Historical Explanations”, in Theories of History: Readings from Classical and Contemporary Sources, Patrick Gardiner (ed.), Glencoe, IL: The Free Press, 443–475.
- –––, 1962, “Explanations, Predictions, and Laws”, in Scientific Explanation, Space, and Time (Minnesota Studies in the Philosophy of Science: Vol. 3), Herbert Feigl and Grover Maxwell (eds), Minneapolis: University of Minnesota Press, 170–230.
- Spirtes, Peter, Clark Glymour, and Richard Scheines, 1993 [2000], Causation, Prediction and Search, New York: Springer-Verlag. Second Edition, Cambridge, MA: MIT Press, 2000.
- van Fraassen, Bas C., 1980, The Scientific Image, Oxford: Oxford University Press. doi:10.1093/0198244274.001.0001
- –––, 1989, Laws and Symmetry, Oxford: Oxford University Press. doi:10.1093/0198248601.001.0001
- Whewell, William, 1840, The Philosophy of the Inductive Sciences, Founded upon their History, two volumes, London: John W. Parker.
- Woodward, James, 1989, “The Causal/Mechanical Model of Explanation”, in Kitcher and Salmon 1989: 357–383.
- –––, 2000, “Explanation and Invariance in the Special Sciences”, The British Journal for the Philosophy of Science, 51(2): 197–254. doi:10.1093/bjps/51.2.197
- –––, 2002, “What Is a Mechanism?: A Counterfactual Account”, Philosophy of Science, 69: S366–S377.
- –––, 2003, Making Things Happen: A Theory of Causal Explanation, Oxford: Oxford University Press. doi:10.1093/0195155270.001.0001
- –––, 2006, “Sensitive and Insensitive Causation”, The Philosophical Review, 115(1): 1–50. doi:10.1215/00318108-2005-001
Other Internet Resources
- “Theories of Explanation”, by G. Randolph Mayes (CSU/Sacramento), in the Internet Encyclopedia of Philosophy (edited by J. Fieser)