First published Thu Jul 14, 2005; substantive revision Fri Apr 10, 2020

Philosophy of perception has typically centered on colors, as did the metaphysics of mind when discussing the mind-dependence of secondary qualities. Possibly, the philosophical privilege of the visible just reflects the cognitive privilege of the visible, since vision is considered to account for most useful sensory information gathering.

What makes sounds worthy of philosophical analysis is that they are not only an important element of the perceptual scene but are also philosophically idiosyncratic in many intriguing ways; in particular, their temporal and spatial unfolding, as presented in perception, has interesting metaphysical and epistemological aspects. There is, however, an advantage to this neglect. Many philosophical aspects of sound and sound perception are not idiosyncratic and indeed make for general issues in philosophy of perception. Hence in this article we will take advantage of the many discussions that have used other sensory features such as colors as a paradigm of a sensory feature. For instance, we shall not rehearse the discussion about the subjectivity of secondary qualities, as the example of sounds does not seem to introduce new philosophically interesting elements that could challenge generalizations obtained, say, from the example of colors.

The main issues which are on the table concern the nature of sounds. Sounds enter the content of auditory perception. But what are they? Are sounds individuals? Are they events? Are they properties of sounding objects? If they are events, what type of event are they? What is the relation between sounds and sounding objects? Temporal and causal features of sounds will be important in deciding these and related questions. However, it turns out that a fruitful way to organize these issues deals with the spatial properties of sounds.

Indeed, the various philosophical pronouncements about the nature of sounds can be rather neatly classified according to the spatial status each of them assigns to sounds. Where are sounds? Are they anywhere? The main relevant families of answers include proximal, medial, distal, and aspatial theories. Proximal theories would claim that sounds are where the hearer is. Medial theories—exemplified by mainstream acoustics—locate sounds in the medium between the resonating object and the hearer. Distal theories consider sounds to be located at the resonating object. Finally, aspatial theories deny spatial relevance to sounds. There are significant variants of each of these. Sound theories can also be classified according to other dimensions, such as the metaphysical status they accord to sounds (for instance, as occurring events as opposed to properties or dispositions). We shall see some of the interactions between these different accounts. For a discussion that is more focused on perception, see the entry on auditory perception.

1. Proximal Theories of Sounds

Proximal theories of sounds construe sounds as located at or beneath the bodily surface of the hearer. We distinguish two main strands.

1.1 Sounds as Sensations

Modern philosophical accounts of sounds informed by psychology of perception construe sounds as “sensations”, or states/properties of hearers. Consider,

It seems…reasonable to suggest that the sounds directly perceived are sensations of some sort produced in the observer when the sound waves strike the ear. (Maclachlan 1989: 26)

The sound-as-sensations theory is justified by some facts about auditory experience. People report hearing voices and bells even when no one is speaking and no bell is ringing. Various examples of subjective sounds are documented under the label of tinnitus. In an anechoic chamber, most subjects experience subjective buzzings and whistlings. Some subjects undergo pathological tinnitus when they hear sounds that disrupt their normal auditory capabilities. When the Russian composer Shostakovich turned his head to one side, he was subject to a flow of melodies (Sacks 2008). Tinnitus and other subjective auditory phenomena have different causes, which can be related to the mechanical properties of the inner ear or to more central features. The objects of these experiences are naturally and spontaneously categorized as sounds. If sounds are simply defined as the objects of audition, then they are easily identified with the qualitative aspects of auditory perception. Various strands of indirect realism in perception would make this view mandatory. According to them, it is by hearing the immediate, proximal items that we hear some distal events or objects. In such a case sounds would be defined as the immediate objects of auditory perception.

1.2 Sounds as Proximal Stimuli

A bit more peripheral, albeit still proximal, is a position defended by O’Shaughnessy (2000), who claims that the sound heard is where the hearer is:

…while the sound originates at a distance and we can hear that it is coming from a direction and even place, and while there is no auditory experience of hearing that the sound is where we are, the sound that we hear is nonetheless where we are (2000: 447).

This leaves open the possibility that an unheard sound be located away from the hearer.

Support for this position comes from the following example. If I hear the noise of a motorcycle far away, the physical event at my ear is qualitatively different from the physical event that is produced at the motorcycle’s place. If I were close to the motorcycle, the physical event at my ear would be completely different from, and would not correspond to, that of a motorcycle driving far away. A party upstairs does not sound the way it sounds to participants in the party. However, these arguments can be resisted on two counts. First, one should distinguish between source and informational channel. Consider a visual analogue. From the fact that reflected light is the only light that hits the eye, it does not follow that one does not see reflecting surfaces or is not aware of incident light. Reflected light contains information about the reflecting object and the illumination, which is unpacked by the brain. In the case of sound, distance, echoes, reverberations, and filtering affect the informational channel in a way that informs about the position of the source. Second, the examples used to support the proximal theory are unable to account for the perceived constancy of auditory features of distant objects. The motorcycle and the party can be judged to be very noisy, even if the physical events at the hearer are faint because of the distance from the respective sources. We do possess a notion of distal volume of a sound.

The locatedness of the sound at the hearer’s position entails that there are as many sounds as there are actual (or potential) hearers around. An alternative account would hold that one and the same sound is present which is, however, multiply located.

These examples, and the related difficulties, thus suggest that a major shortcoming of proximal theories is that they do not locate sounds where an untutored description of what is perceived suggests they are. This means that if sounds were inner sensations, or mechanical events at the ear, we would be almost always mistaken in our aural perceptions, at least on important aspects of the sound. In turn, this amounts to accounting for auditory perception in terms of a massive error theory. We shall see that proximal theories are not alone in endorsing it.

2. Medial Theories of Sounds

Medial theories construe sounds as features of the medium in which a sounding object and a hearer are immersed. The identification of sounds with sound waves is the major example of medial theories.

2.1 Sounds as Events or Properties of the Medium

When speaking about voice in his treatise De Anima (On the Soul), Aristotle wrote that sound is a “certain movement of air” (De Anima II.8 420b12) but, even though he claimed that sound and motion are tightly connected, he did not seem to identify them (Pasnau 2000: 32). The natural scientists of the seventeenth century refined the intuition that sound is a movement of air into the wave theory of sounds, which appeared to be an obvious competitor for the quality or sensation (proximal) view. Galileo registered that

sounds are made and heard by us when…the air…is ruffled…and moves certain cartilages of a tympanum in our ear.…high tones are produced by frequent waves and low tones by sparse ones. (1623 [1957: 276])

Descartes joined in and in his Passions of the Soul considered that what we actually hear are not the objects themselves, but some “movements coming from them” (1649: XXIII). Indeed, around 1636, Mersenne measured the speed of propagation of sound waves.

Both Galileo and Descartes were aware that the medial account was revisionary relative to a common sense view of sounds, or at least as revisionary as is the sensation view. Sounds for the wave view or the sensation view are not what we unreflectively take them to be on the basis of the content of auditory perception. (Indeed, Galileo himself endorsed both a proximal theory of sounds as sensations and a medial theory, thereby possibly originating a dualist account.) At the same time Galileo and Descartes, as well as other modern philosophers, were not particularly keen on detailing the phenomenological content of auditory perception.

2.2 Sounds as Waves

The wave account is, of course, endorsed by modern acoustics. Sounds are construed as mechanical vibrations transmitted by an elastic medium. They are thus described as longitudinal waves, defined by their frequency and amplitude. A vibrating object (the sound source, such as a moving vocal cord or a vibrating tuning fork) creates a disturbance in the surrounding medium (say, air, or water). Each particle of the medium is set in back-and-forth motion at a given frequency and with a given amplitude, and the motion propagates to neighboring particles at the same frequency, undergoing an energy loss that entails a decrease in amplitude. Seen macroscopically, the propagation of sound is the propagation of a compression in the medium followed by a depression, that is, the propagation of a wave. The behavior of each particle is described by a sinusoid that maps the cyclical pattern of compressions and depressions against time. A cycle is the complete path of the sinusoid from crest to crest, at the end of which the particle is back to its starting position. Amplitude is the distance between crests and troughs in the sinusoid, period is the distance between a crest and its successor, and frequency is the number of cycles per time unit.
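
The sinusoidal description of a single particle's motion can be sketched numerically. What follows is only an illustration under stated assumptions: the function names and the sample values (a 440 Hz tone with unit amplitude) are hypothetical, the tone is idealized as undamped, and the amplitude parameter denotes the maximum displacement from the rest position, so that the distance between the highest and lowest points of the sinusoid is twice that value.

```python
import math

def particle_displacement(peak_amplitude, frequency_hz, time_s):
    """Displacement of one particle of the medium at a given time.

    Assumes an idealized, undamped pure tone: x(t) = A * sin(2 * pi * f * t).
    """
    return peak_amplitude * math.sin(2 * math.pi * frequency_hz * time_s)

def period_s(frequency_hz):
    """Duration of one full cycle, the reciprocal of the frequency."""
    return 1.0 / frequency_hz

# Illustrative values (assumptions): a 440 Hz tone with unit peak amplitude.
frequency = 440.0
amplitude = 1.0
cycle = period_s(frequency)

# The particle starts at rest position, reaches its maximum displacement
# a quarter of a cycle later, and is back at rest after one full cycle.
at_start = particle_displacement(amplitude, frequency, 0.0)
at_quarter_cycle = particle_displacement(amplitude, frequency, cycle / 4)
at_full_cycle = particle_displacement(amplitude, frequency, cycle)
```

On this sketch, raising the frequency shortens the period, and raising the amplitude parameter stretches the sinusoid vertically without changing the timing of the cycle.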

Contemporary philosophers of perception of the physicalist strand tend to align themselves with the wave theory (Nudds 2010b, 2018; Kalderon 2017; Meadows 2018). Perkins thus summarizes the view: “…the sound we hear is identical with the train of airwaves that stretches from the distant sounding object to our ear” (1983: 168). And indeed, the physicalist account of sounds seems to make a good claim to successful reduction of key auditory phenomena based on the identification of sounds with sound waves in a medium in which a sounding object (and possibly a hearer) is present.

2.3 Assessing the Wave Theory

2.3.1 Arguments for the wave theory

Many perceptual properties of sounds are neatly explained by the presence of strong correlations with properties of waves, in particular pitch and intensity (i.e., volume).

  • The felt quality of high pitch is correlated with high frequencies; low pitch is correlated with low frequencies;
  • High volumes are correlated with high amplitudes; low volumes are correlated with low amplitudes;
  • The directionality of sounds (the fact that they appear to be “in a direction”) is related to the fact that the hearer is located on a propagation line from the source.
  • Even more accomplished is the explanation of particular auditory effects, such as the Doppler effect, whereby the speed of a sounding object in motion contributes to a change in the sound’s heard pitch (so that the whistle of an engine passing by is heard to drop in pitch as it travels past us).
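
The Doppler explanation in the last item can be made concrete with the standard textbook relation for a source moving directly toward or away from a stationary hearer. This is a hedged sketch rather than a full acoustic model: the function name, the 440 Hz whistle, the 30 m/s engine speed, and the approximate value of 343 m/s for the speed of sound in air are all illustrative assumptions.

```python
def observed_frequency(source_hz, source_speed_toward_hearer, speed_of_sound=343.0):
    """Frequency reaching a stationary hearer from a moving source.

    Standard relation for a source moving along the line to the hearer:
    f_heard = f_source * c / (c - v), with v > 0 when the source approaches
    and v < 0 when it recedes. Valid only for |v| < c.
    """
    if abs(source_speed_toward_hearer) >= speed_of_sound:
        raise ValueError("source speed must be below the speed of sound")
    return source_hz * speed_of_sound / (speed_of_sound - source_speed_toward_hearer)

# Illustrative values (assumptions): a 440 Hz whistle on an engine at 30 m/s.
whistle_hz = 440.0
approaching_hz = observed_frequency(whistle_hz, 30.0)   # engine coming toward us
receding_hz = observed_frequency(whistle_hz, -30.0)     # engine moving away
```

Under these assumptions the frequency at the hearer is higher than the emitted one while the engine approaches and lower once it recedes, which matches the heard drop in pitch as the engine passes.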

Interestingly enough, the reduction of sounds to waves in a medium is arguably more successful than the corresponding attempt to reduce colors to properties of electromagnetic waves. The latter attempt is affected by some major problems, such as the existence of non-spectral colors like purple, or the fact that some spectral monochromatic colors such as orange are seen as being composed colors.

2.3.2 Arguments against the wave theory

However, for all its merits, the medial identification of sounds with sound waves raises some objections and leaves some matters wanting.

For instance, there are metameric sounds (as there are metamers among colors), that is, sounds that feel identical to the ear although the corresponding medial properties are different. There is no one-to-one psychophysical correlation between auditory content and sounds as waves (Churchland 2007: 222). Moreover, ultrasounds (above 20,000 Hz) and infrasounds (below 20 Hz) have the same physical nature as sounds, being mechanical vibrations transmitted by an elastic medium, but they are not audible (as infrared and ultraviolet are not visible in the domain of colors): do they count as sounds? It further appears that the relationship between sound and sounding object remains underspecified. Do sound waves depend upon sounding objects in the sense in which we usually think sounds (as auditory events) do?

Most importantly, as happened with proximal theories, medial theories do not locate sounds where an untutored conception of auditory perception suggests they are. If sounds were sound waves, we would be almost always mistaken in our aural perceptions on important aspects, which fact, once more, amounts to accounting for auditory perception in terms of a massive error theory.

What is the nature of sounds under the wave theory? Relevant to our purposes, there are two main metaphysical conceptions about waves, in both cases construed as individuals. Either (a) waves are considered to be of the same nature as processes (temporally extended entities, with temporal parts), or (b) they are taken as individuals of a peculiar kind. In case (a), it may be argued that they do not move, for arguably processes do not move (Dretske 1967), but rather have phases (temporal parts) located at different spatial regions. Although we do say that the party moved from John’s room to Mary’s room, the party was never fully in either room. Objects like people (John, Mary, Sue, Lynn), on the other hand, were completely in John’s room first, and completely in Mary’s room later. Objects and people moved, as movement occurs whenever a whole entity is fully first in a place and then in another. The party did not move, but part of it was in John’s room and part of it in Mary’s at different times.

If waves were processes like parties, one would have to construe sound waves’ “movement” in terms of the presence of different phases of the wave at different spatial locations. However, one does not hear a sound wave’s phase as being at a particular location, in particular at any of the locations between the source and oneself. Sound waves do not appear to be perceptually located where sounds are. A wave-process has some starting phases in the object, and some end-phases around the perceiver: but perception locates sounds wholly where the object is.

In case (b), (sound) waves are different from processes and are peculiar just because waves move, as individual substances do. But again, if sound waves move, the corresponding sounds are not generally heard as moving. (Contrast this case with seeing a sea wave’s phase at a particular place.) Sound waves propagate in all or most directions from a sounding object, but the corresponding sounds are not actually heard as propagating in any direction: the only moving sounds are the sounds emitted by a moving source. It follows that if sounds were sound waves in this sense, we would not be hearing them as they are.

Consider an analogy with light. In the realm of vision, the closest analogue to sounds are the activities of light sources. We perceive these activities, and we perceive them as located where their sources are: the emission of a light bulb out there, the glowing of candlelight over the table, the irradiation of the sun at the horizon; each with its own respective color. Do we perceive light itself, as opposed to the events of light emission at light sources? Clearly, light travels from the source to our eye—if it didn’t, we would not perceive the source. But what we perceive is the emission event, and not the light. An irrelevant element of disanalogy is related to the temporal unfolding of sounds and light events respectively. Typically in the environment light is emitted continuously, whereas sounds occur episodically and have short lives. If most light sources were intermittent as in piezoelectric lighters, or if most objects were buzzing all the time, this element of disanalogy would be less salient (Pasnau 2007).

In some conditions it looks as if we perceive light rays, e.g., in a dusty room. But what is actually perceived is a set of particles of water or of dust that intercept light. In order for a light ray to be visible, it would have to send information to our eye without the mediation of interposed matter. Coming back to sounds, the argument based on the constraint of fidelity to the content of auditory perception is thus, in a compact form: in order for sound waves to be audible, they would have to transmit audible information that reaches our ears. They don’t. So we do not hear them. But we do hear sounds. Hence sounds are not (medial) sound waves.

2.3.3 Arguments for or against the wave theory reassessed

Some other remarks against the identification of sounds with sound waves in the medium between the object and the perceiver apply independently of the metaphysical construal of sound waves as processes or sui generis individuals.

Consider first the fact that sounds are sometimes loud and sometimes soft; as we have seen, in the wave theory this feature is correlated with amplitude of the sound waves. However, the location of the sound plays a role in establishing the correlation. A loud sound, which is heard as being far from us, is different from a soft sound, which is heard as being close to us. The spectra of the two sounds are different. If you want a vivid example, consider what happens if you amplify the sound of a person who is speaking low. You do not have the impression of a person who speaks loud. You have, on the other hand, the impression of having come very close to a low-speaking giant.

In some conditions, however, it can be difficult for us to tell a loud sound in the distance from a soft sound nearby, because in the part of space closer to our ears the sound waves that reach us and correspond to the two sounds can have the same amplitude. (This only holds for an ideal case of a long lasting sine wave with no sharp attack. This type of sound is practically distance-proof because having a single component, the only possible variation is in amplitude, and no other spectral differences arise.)

Indeed, the fact that the two sine waves have the same amplitude when they are close to the ear accounts for the indistinguishability of the two sounds. Nevertheless, the two sounds, even if indistinguishable, are distinct, whereas the two sound waves are not. A way to put the difficulty is as follows: a loud sinusoid heard at a distance is still a loud sinusoid; but the corresponding sound waves decrease in amplitude. As we have seen discussing proximal theories, we can make perfect sense of the notion of distal volume of sounds, even in the case in which we cannot distinguish a loud distant sound from a soft sound in the vicinity. Something (decrease in amplitude) happens to sound waves which does not happen to sounds (the distal volume does not change). Hence it can be argued that on this account too sounds are distinct from sound waves.
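
The indistinguishability case can be illustrated with a small numerical sketch. It assumes an idealized point source radiating into free space with no absorption or reflection, so that pressure amplitude falls off in inverse proportion to distance; the function name, the arbitrary units, and the particular source strengths and distances are hypothetical choices made only for illustration.

```python
import math

def amplitude_at_distance(amplitude_at_one_metre, distance_m):
    """Pressure amplitude at a given distance from an idealized point source.

    Assumes free-field spherical spreading with no absorption or reflection,
    so amplitude is inversely proportional to distance.
    """
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return amplitude_at_one_metre / distance_m

# Illustrative values (assumptions, arbitrary units): the "distal volume"
# of each source is represented by its amplitude one metre from the source.
loud_source_level = 10.0
soft_source_level = 0.1

loud_far = amplitude_at_distance(loud_source_level, 100.0)  # loud tone, 100 m away
soft_near = amplitude_at_distance(soft_source_level, 1.0)   # soft tone, 1 m away

# The two waves have the same amplitude at the ear although the sources differ.
same_at_ear = math.isclose(loud_far, soft_near)
```

In this sketch the quantity that varies with distance is the amplitude at the ear, while the value attached to each source stays fixed, which mirrors the contrast drawn in the text between what happens to sound waves and what happens to sounds.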

As a matter of fact, one may want to distinguish two possible lines of argument here:

  1. A phenomenological argument, claiming that sometimes we hear a constant sound (constant in intensity, for instance), even though the source is moving away from us. Here different sound waves can correspond to one and the same sound.
  2. A metaphysical argument, holding that sometimes the sound waves corresponding to two sounds are the same while the sounds are clearly different: a loud sound in the distance, or a faint sound close to us. Here the very same sound waves can correspond to distinct sounds. The latter argument is not lethal, but is a useful reminder that there is no fully developed theory of sounds as medial sound waves as yet. Should the sound be identified with the sound waves when they reach the ears, or with the whole train of sound waves from the source to our ears, or with the sphere filled with sound waves and centered on the source?

The thesis that sounds are sound waves is also often motivated by the argument from vacuums. Surely, it is argued, sounds cannot exist in a vacuum. As Hylas says in Berkeley’s Three Dialogues Between Hylas and Philonous,

a bell struck in the exhausted receiver of an air-pump, sends forth no sound. The air therefore must be thought the subject of sound. (1713 [1975: 171–172])

But is the claim that there are no sounds in vacuums really obvious? On pain of question begging, it cannot be made to follow from any particular metaphysics of sounds. In order to assess it on its own merits, consider once more the analogy between, on the one hand, sounds and air, and, on the other hand, emission events at light sources and light. Air is the medium of auditory perception, and light is the medium of visual perception. The reasoning now is that just as things can sensibly be taken to have colors in the dark, they can sensibly be taken to produce sounds in a vacuum.

In the above arguments an important role is played by the following requirement: A theory of sounds should be true to the phenomenological content of auditory perception. It seems quite reasonable to require that as sounds are the objects of hearing, whatever they are should be somewhat revealed in hearing. In point of fact, there are two ways in which the fidelity to auditory content requirement can enter the arguments. In a strong version (O’Callaghan 2009) it may be held that no theory of sounds “should make the fact of location perception a wholesale illusion” (2009: 29); hence, as sounds are represented as located, it would follow that they are correctly so represented. It may be argued that this principle is too strong because it is unjustifiably specific. Why should location, among all possible features of sounds, be protected against the possibility of systematic illusion?

In another, weaker version (Casati & Dokic 1994) it is the representational power of auditory content that is appealed to. Auditory experience has the power to represent sounds, and it has the power to represent movement (as when one hears the sound of a moving train’s whistle). It is then natural to assume that auditory experience would be able to represent sounds as moving, if sounds were indeed moving entities by their nature. But such is not the case; hence auditory experience correctly represents them as firmly located. This construal of the requirement is compatible with the existence of systematic, though possibly selective, illusion.

The requirement of fidelity to auditory content may be challenged on a number of grounds. First, it may be challenged by opposing its rationale. Auditory content may well be massively illusory, and this could be just the price to pay for any realist account of sounds.

Possibly there is here an analogue to the case of colors, in which there is room for selective illusion and for the choice between realism about location, say, and realism about hue. Most arguments for the subjectivity of colors start from the existence of strong correlations between phenomenal hue structures (such as the relation of complementarity between colors, e.g., red and green, which is manifest in phenomena such as afterimages or color-blindness) and neural structures (such as bipolar cells). These arguments then stress the absence of physical correlates for hue structure (nothing in the wavelengths corresponding to red and green hues can make one predict the subjective complementarity between red and green). They finally conclude that colors are mind-dependent. However, the arguments only establish the mind-dependence of hue structures, not of colors per se. The location of colors as “outside the brain” can still be taken for granted. Brought back to sounds, this form of selective realism would allow for subjective tonal qualities and for an external (non proximal) location of sounds.

Second, the requirement of fidelity to auditory content may be challenged by questioning the phenomenological claim that motivates it. The move would consist in suggesting that sounds do not have the strong spatial property of locatedness, but the weaker property of directionality. The distinction between two senses of “locatedness” in relation to sounds can be traced back to Malpas (1965), based on ordinary language arguments, and is echoed in Urmson (1968) and Hacker (1987: 102 ff.); cf. O’Shaughnessy (1957). Locatedness in the strong sense specifies an address for the sound, e.g., by specifying both directionality and distance from the hearer (thus including egocentric directionality as a component), or by locating sounds in allocentric space (e.g., a siren from the boat at Pier 3). According to a weaker sense of “locatedness”, sounds would only be perceived as “coming from” a certain direction, without any information about the distance they travel. Now surely this is not the general case. Although in some cases (e.g., the decrease in amplitude of a sine wave) it may be difficult or impossible to tell, say, whether what we hear is a soft sound nearby or a loud sound far away, in most cases the distinction is perfectly available to the subject, as we noticed earlier when we introduced the notion of distal volume: someone screaming in the distance is never confused with someone speaking low near your ear. Indeed the issue of the locatedness of sounds is the subject matter of specialized branches of cognitive science (Blauert 1974; Bregman 1990; Schnupp, Nelken, & King 2011; Grimshaw & Garner 2015).

Incidentally, claiming that sounds are heard in a direction rather than at a location mixes up two ways of accounting for auditory experience: phenomenology on the one hand and commonsense reflections on the directional transmission of auditory information on the other hand. The commonsense picture may have been made a bit too sophisticated by exposure to some physical accounts of sounds.

Finally, the requirement of fidelity to auditory content may also be challenged by proposing a different phenomenology. For example, Kalderon (2017) argues that sound is an event that is identical to the propagation in every direction of a patterned disturbance by means of a medium, such as air or water (2017: 105). He claims, indeed, that auditory phenomenology is essentially emanative phenomenology. According to emanative phenomenology, we hear sound as an ever-expanding sphere which is the medium disturbance propagating in every direction from its source. Sound is like

an expanding ripple caused by a drop in an otherwise calm body of water, except that the sound event occurs in three dimensions, not two, and so takes the form of a sphere rather than a circle. (2017: 106)

When facing the challenge of what we hear when we say we hear birds in the garden outside or people in the corridor outside our office (which are cases in which it is clear that auditory phenomenology is distally locating sounds and not making us hear sounds as pervading the surrounding medium), Kalderon replies by saying that what we hear in these cases as distally located are sound sources, rather than sounds (2017: 115). Nevertheless, this reply raises the question of how we make the distinction between hearing sound as located and sound sources as located. Is there a way to distinguish their different location phenomenologically? Another question which might challenge the medial view based on emanative phenomenology is what exactly are the properties of sound sources which are audible and which provide us with spatial information on their location.

2.3.4 Defending the wave theory: giving up fidelity to auditory content

Wave theorists typically give up the requirement that an account of sound be faithful to auditory content (although they would not typically acknowledge this; as Pasnau 1999 has remarked, the very same textbook on sounds may simultaneously endorse a medial, a proximal and a distal theory). However, wave theorists may also try to reconcile auditory content with the wave conception. Sorensen (2008) proposes for instance that it is not a purely auditory phenomenon that we do sometimes identify and localize objects and events at their center, whenever a center is available. For instance earthquakes are localized (in the epistemic sense) at their hypocenter, although it is admitted that they are not located at their hypocenter. An earthquake is everywhere it can be felt or measured. Analogously, according to the wave theorist sounds are waves in a medium, but they are located at their center (at their origin). The wave conception considers as relatively benign the error of locating a sound (i.e., a sound wave) at its center/origin.

An important dialectical limitation of Sorensen’s suggestion is that it does not provide us with an independent argument in favor of the Wave Theory. The identification of sounds with sound waves is of course compatible with the fact that we locate sounds at a point (the sounding object’s location) which happens to be the center of expanding sound waves. However, the analogy with the localization of earthquakes breaks down at a crucial point. Of course, we can usefully identify a certain region as the center of a particular earthquake. (In general, earthquakes can be localized at their hypocenter only when we have at least a rough representation of their full extension in space.) By contrast, as Sorensen admits, the auditory system does not identify the sounding object’s location as the center of expanding sound waves. Indeed, it does not identify this location as the center of anything. (Compare the way the visual system identifies the landing area of a stone thrown in water as the center of a series of concentric, expanding waves. Nothing of the sort is available to the auditory system.) Now many entities other than sound waves are at the object’s location when we hear a sound, including events (monadic or relational, as we shall see below). Thus, facts about the apparent location of sounds do not justify the Wave Theory better than the Event Theory; quite the contrary, given our other independent considerations against the former approach.

2.3.5 Source medium and environment medium

Several among the previous remarks jointly point to the necessity of better accounting for the distinction between events in the sounding thing and events in the surrounding medium. As for sounds, this distinction is consequent upon the distinction between two kinds of medium: the source medium (that is, the stuff the thing is composed of) and the medium proper or environment medium, surrounding the sounding thing, the one in which the hearer could be immersed as well. Let us take them in turn. First, a thing is a sounding entity only insofar as the stuff or the stuffs the thing is composed of is or are vibrating. For a simple example, there are no properties of a tuning fork which account for its sounding, which are not properties of the stuff(s) the fork consists of (including shape). A more complex example is the case of the flute, in which the “sounding object” is air inside the flute. In both cases a portion of matter, the source medium, is vibrating.

But, second, do we ever happen to hear events in the medium? We actually do, but in a somewhat indirect way. Consider the visual realm. In some cases both a thing and (a part of) a medium between us and the thing are seen. This happens when we look at things through irregularly warmed air, or through moving water. In these cases, we see both the thing and the medium, the thing in an unclear way, and the medium as that which makes the thing appear in an unclear way.

But these cases are not the norm. Perceptual media are normally cognitively transparent: they are imperceptible insofar as they transmit without significant alteration information about some relevant properties of the thing perceived through them. Media become perceivable when this transmitting function is impaired by some event or disturbance occurring in them. Auditorily, this occurs in the case of the Doppler effect. The vibration of the air carries information both about the sounding object and about the effect its speed has on the medium.

The affectedness of the medium is a feature which is particularly evident, and almost pervasive, in the case of sounds, because of the relatively similar size of the phenomena involved at the source and in the medium proper. Vibrations in the sounding objects are macroscopic phenomena, of a size which is fairly comparable to the size of the sounding object itself. Therefore the interaction of these vibrations with the surrounding medium can easily be a source of misperception, for their impact on the medium brings about processes which are of the same order of magnitude as the object involved.

3. Distal Theories of Sounds

After proximal and medial theories, one should consider another candidate for the physical identification of sounds, namely distal properties, processes or events in the medium inside (or at the surface of) sounding objects, or in the stuff of the sounding object. Distal views claim their superiority to non-distal competitors in virtue of their adherence to the spatial structure of auditory content. As we have seen, we do hear sounds both as externalized (hence auditory content is at odds with proximal views) and as distally located (hence auditory content is at odds with medial views).

There exist at least four varieties of the distal account of sounds: the Property Theory, the Located Event Theory, the Relational Event Theory, and the Dispositional Theory. These accounts all subscribe to the idea that sounds are distally located, but they differ in ascribing to sounds different ontological status. Let us take them in turn.

3.1 Sounds as Properties

According to the Property View, sounds are properties of material objects just like colors and shapes.

The Property View is in part endorsed by the founding fathers of modern philosophy of perception, Galileo and Locke, who opened a tradition of lumping various sensory items in the class of secondary qualities. The typical seventeenth-century list of secondary qualities includes colors, smells and sounds. No significant internal metaphysical differentiation is made within the class of sensory qualities, hence the charge of oversimplification cannot be directly addressed to historical accounts. Other philosophers may have added shapes to the list (as Berkeley did), without addressing the issue of the homogeneity of the class: the issue was only whether shapes are secondary as sounds are, not whether they are on a par with sounds as to their structure.

The Property View has contemporary endorsers (Pasnau 1999; although Pasnau takes sounds to be properties like colors, he comes close to the event view when he writes that sounds “either are the vibrations of [objects that have sounds], or supervene on those vibrations” [1999: 316]; indeed Pasnau 2007 rejoined the Event Theory). Leddington (2019) also defends a property view of sound within a distal position, since he claims that sound itself is not an event but a property of the event which produces it (i.e., a property of the collision).

The Property View faces a number of objections. First, we ordinarily describe objects as “having” colors or shapes, but we do not ordinarily describe sources as “having sounds”. Rather, we say that they make or produce sounds (conversely, a red table does not “produce” or “make” red). This is an ordinary language argument, and as such it might not be very strong.

The main consideration against the Property View is that it underestimates the important differences between colors and shapes on the one hand, and sounds on the other hand. The latter, unlike the former, are dynamic dependent individuals. And even if colors and shapes can be theoretically conceived as individuals, they are not dynamic. Sounds take up time. They start and cease. They are intrinsically temporal entities. Their temporal profile is essential to individuating them, in a way which has no analogue in the case of colors and shapes. However, Roberts (2017) explicitly defends the Property View by discussing salient disanalogies between sounds and colors. He also suggests an exhaustive taxonomy of the different positions available within the Property View space. Cohen (2010), although not directly endorsing the Property View, criticizes arguments that conclude that there is an asymmetry between sounds and colors, in particular with regard to temporality.

Di Bona and Santarcangelo (2018: chapter 4) discuss at some length the relationship between sound and time, especially when this relationship grounds a metaphysical difference between sounds and colors. They investigate different temporal experiences of sounds and conclude that temporal experiences of sounds are similar to some temporal experiences of colors. Sounds and colors differ with regard to temporality only insofar as one focuses on the role that time plays when the auditory system has to segregate auditory stimuli into auditory streams (Bregman 1990). When segregating colors, space is far more important than time. For a discussion of vision and audition with regard not only to sounds and colors but also to auditory and visual objects, always in relation to spatiality and temporality, see O’Callaghan 2008; Kubovy & Schutz 2010; and Di Bona & Santarcangelo 2017.

3.2 Sounds as Located Events

In this and the following section, we shall present two Event Theories of sounds, for which we use two distinct labels. An earlier one, the Located Event Theory, was defended by Casati and Dokic (1994). It has been rejoined by Pasnau (2009) and extended to the field of sonic art by Roden (2010). A second, more recent version, the Relational Event Theory, has received an articulated defense by O’Callaghan (2007, 2010a). The two accounts agree on categorizing sounds as events, that is, located temporal particulars, and diverge on some specifics of the class of particulars that are admitted to be sounds. The Relational Event Theory makes sounds depend upon the existence of a medium that carries information about them. In this sense, only a subclass of situations involving sounds for the Located Event Theory are situations involving sounds for the Relational Event Theory.

3.2.1 The located event theory

According to the Located Event Theory, sounds are events happening to material objects. They are located at their source, and are identical with, or at least supervene on, vibration processes in the source. On this view (Casati & Dokic 1994), auditory perception of sounds requires a medium which transmits information from the vibrating object to the ears; however, the transmitting medium is not essential to the existence of sounds. One can see at once the fit of this view with those features of sound which were sources of trouble in the cases discussed above when criticizing proximal and medial theories.

  1. Vibration processes in the sounding object do not move any more than sounds appear to.
  2. They do not propagate from the object, just as sounds do not appear to.
  3. Distal volume is constant: Like sounds, and unlike sound waves in the ambient medium, the intensity of vibration processes can remain the same through a period of time, even if one distances oneself from the source and hence hears them as less loud.
  4. Finally, and most importantly, tuning-forks and other sounding objects can be taken as continuing to vibrate irrespective of their being or not being immersed in a vibrating medium. One does not create sounds by surrounding vibrating objects with a medium—one simply reveals them.

A by-product of the Located Event Theory is that it makes plain what category sounds belong to, as opposed to views that construe sounds generically as qualities. Sounds are either instantaneous events or temporally extended processes. They start and cease. They are intrinsically temporal entities.

Another feature of the Located Event Theory is that it provides us with a clear example of the compatibility of a theory of non-direct perception, according to which we hear external events by hearing their perceptual deputies, with a non-phenomenalistic theory, according to which perceptual deputies are not mental items. The case of sound perception shows that there can be indirect perception without mental deputies. We hear coaches and telephones by hearing their sounds, i.e., by hearing some (cluster of) vibratory processes or events occurring in those objects. Sounds are both physical events and perceptual deputies.

3.2.2 Objections to the located event theory

We shall now briefly discuss some objections to the identification proposed by the Located Event Theory, as they allow us to highlight certain metaphysically interesting features of sounds.

Imprecise Location and Echoes

The first objection concerns sound location. Even if sounds are heard as located, it could be held that location is often imprecise or even erroneous, this in turn depending on, and being explained by, the nature of sound waves. Here is a relatively common echo example. Suppose you walk in the rain, your umbrella open. At some point you enter a building with a glass roof. Rain now falls on the roof, and no longer on your umbrella. The umbrella attenuates the noise from the roof, which is reflected by the ground. You hear the raindrops as if they were below you, and not above you. Erroneous location is here explained by the path of sound waves.

The temptation of identifying sounds with sound waves can arise because of this fact: that sounds can be mislocated in audition. They can be heard as located in a region which is larger than, or removed from, the one occupied by a sounding object, a region which it is reasonable to take as being occupied by sound waves.

This example poses no particular threat to the distal view. Consider again a visual analogy. Seeing an object in a mirror is not seeing another, immaterial object located in an immaterial space beyond the mirror-plane. There is no such immaterial object; we see the one and only material object, and we locate it incorrectly as if it were behind the mirror.

The mirror sophism should be credited to Hobbes’ Leviathan (1651: I, I; cf. Casati & Dokic 1994: 49–51), which explicitly linked perception in a mirror and perception of echoes:

The cause of sense, is the external body, or object, which presseth the organ proper to each sense, either immediately, as in the taste and touch; or mediately, as in seeing, hearing, and smelling; which pressure, by the mediation of the nerves, and other strings and membranes of the body, continued inwards to the brain and heart, causeth there a resistance, or counter-pressure, or endeavour of the heart to deliver itself, which endeavour, because outward, seemeth to be some matter without. And this seeming, or fancy, is that which men call sense; and consisteth, as to the eye, in a light, or colour figured; to the ear, in a sound…if those colours and sounds were in the bodies, or objects that cause them, they could not be severed from them, as by glasses, and in echoes by reflection, we see they are; where we know the thing we see is in one place, the appearance in another.

The deviant path of sound waves (in echoes) is responsible for the perceptual difficulty in locating sounds, much as the deviant path of light rays (in mirrors) is responsible for the analogous difficulty for visible objects (Casati & Dokic 1994; see also O’Callaghan 2007 for a lengthy discussion of echoes, and Fowler 2013 for arguments against O’Callaghan’s view). But it is not the case that sound waves are sounds just because of this responsibility. From the fact that a subject hears something as imperfectly located, it does not follow that she hears something which is imperfectly located.

Doppler Effect

The second objection concerns typical acoustical effects, like the Doppler effect, which are perfectly accounted for by appealing to (medial) sound waves. The Doppler effect is a shift in frequency of the sound heard by an observer who moves relative to the sound source. As waves in the direction of movement are compressed, and waves in the opposite direction are expanded, the frequency drops dramatically when the hearer and the source go past each other. Such explanations of the Doppler effect are harmless for a distal account. The Doppler effect is dependent on something going on in the medium, but this should not allow one to conclude that what we hear are sound waves in the medium. The situation can be described as follows in a way that is relatively uncommitting: When we hear sounds as undergoing the Doppler effect, we do not hear anything different from a vibration process in a sounding object, a process which is heard in a sort of perspectival shortening because the movement of the sounding object causes, among other things, the Doppler effect.

As a matter of fact, the objection could be turned on its head. On a train passing by, a trumpet player is delivering a concert. According to the distal view, the melody does not change, it is just perceived as changing. If we repeated the experience a number of times, we would find it suspect that the melody’s key drops only when the train passes by. Surely, we would infer, there is something wrong with the medium that blocks our perception of the true melody. And surely the train passengers would disconfirm our impression: they do not hear any key drop. The medial theory here indeed predicts that there are two melodies, the one that we hear from the platform, and the one the passengers hear.
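The arithmetic behind the train example can be made explicit with a minimal sketch. It assumes the textbook formula for a source moving directly toward or away from a stationary listener; the speed of sound, the train's speed, and the function name `heard_frequency` are illustrative assumptions, not values taken from the discussion above. The point it illustrates is the one the distal theorist insists on: the emitted frequency (a property of the vibration at the source) is a single fixed number, while the heard frequency varies with the listener's relation to the source.

```python
SPEED_OF_SOUND = 343.0  # m/s, roughly the value for air at 20 °C (assumed)


def heard_frequency(emitted_hz, source_speed, approaching, c=SPEED_OF_SOUND):
    """Frequency reaching a stationary listener from a source moving along
    the line that joins them (classical Doppler formula)."""
    if not 0 <= source_speed < c:
        raise ValueError("source speed must be non-negative and subsonic")
    # Wavefronts are compressed ahead of the source and stretched behind it.
    denominator = c - source_speed if approaching else c + source_speed
    return emitted_hz * c / denominator


# Hypothetical case: a trumpet playing A (440 Hz) on a train moving at 30 m/s.
emitted = 440.0
on_platform_before = heard_frequency(emitted, 30.0, approaching=True)
on_platform_after = heard_frequency(emitted, 30.0, approaching=False)
on_train = heard_frequency(emitted, 0.0, approaching=True)  # no relative motion
```

Under these assumptions the listener on the platform receives roughly 482 Hz before the train passes and roughly 405 Hz afterwards, whereas a passenger, at rest relative to the trumpet, receives 440 Hz throughout. The emitted value never changes in the computation, which is how the distal account describes the case.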

Properties of Sounds

A third objection is as follows.

Sounds are phenomenologically high or low (they have high or low pitch). But processes in objects cannot be high or low. Therefore sounds are not processes in objects.

This can be answered in the following way. Notice first that sound waves fare no better on this objection—they cannot literally be said to be high or low. But a more substantial answer is available. What one needs is a way of systematically correlating predicates like “…is high”, “…is higher than…” to processes in sounding objects. It is likely that a high sound corresponds to a quickly vibrating process, and so forth.

The Causal Link

A fourth objection has it that

surely there are sound waves in the ambient medium, otherwise no causal link could be set between the sounding object and our perception of the latter.

And such sound waves can certainly be measured and physically described. Now there is no point in denying that there are sound waves in the ambient medium: of course there are, and they are causally responsible for our aural perceptions when these are perceptions of anything at all. A defender of the Located Event Theory ought to just contend that such sound waves are not what we hear.

Consider an analogy we discussed before. Light is causally responsible for your perception of an object’s surface. But this does not make you see the light when you see the surface. We can see luminescent sources, but never light in itself: in order to be seen, light would have to emit light carrying information about it.

Space Filling Sounds?

Finally, another objection concerns the alleged meaningful use of expressions such as “the sound fills the room” or “sounds fill the room”. It seems that what makes these sentences true is best found in the spreading of sound waves, which could actually be everywhere in the room. But one should not be too impressed by idioms. “The sound fills the room” does not describe any phenomenological fact which is different from the fact that the sound is audible from any place in the room (in this respect sounds are unlike fog, which can literally be seen to fill a room).

However, this point deserves closer attention. For sometimes sounds do seem to fill space. Thunder seems to. This is a case in which the only vibrating entity is the medium. Nonetheless this case too can be accommodated by the Located Event Theory: what we hear is the sudden heating of air due to the electric discharge, whose impact is confusedly propagated by the medium. A portion of air (the portion that is suddenly heated up) is the vibrating object and another portion is the transmitting medium.

3.2.3 The relational event theory

According to the Relational Event Theory, sounds are events which involve both the source and the surrounding medium. They are relational rather than “monadic” events. (The distinction between monadic and relational events is not to be taken as cast in iron, since the latter can be reduced to the former by making the mereological sum of sources and surrounding medium the subject of sounds.) O’Callaghan (2002, 2007) has developed such a view at some length. He notes that the wave conception of sounds is not the only possible interpretation of Aristotle’s remarks about sounds in On the Soul. Aristotle writes that “everything that makes a sound does so by the impact of something against something else, across a space filled with air” (De Anima II.8 420b15). On O’Callaghan’s view, what Aristotle might have meant is that the sound itself is not a movement of the air, it is rather the event in which a vibrating object disturbs a surrounding medium and sets it “moving”. Waves in the medium are not the sounds themselves, but rather the effects of sounds. According to the Relational Event Theory, sounds are “disturbings” of a medium, hence depend existentially on a medium that is disturbed and that will transmit information to a listener. This account differs from the Located Event Theory insofar as the latter allows for sounds that exist in a vacuum, and thus distinguishes between a medium that hosts the vibration and a medium that transmits the vibration.

The Relational Event Theory shares with medial theories an endorsement of Berkeley’s argument that sounds do not exist in a vacuum. It is, in point of fact, a hybrid theory (O’Callaghan 2007: 55), sharing with medial theories the tenet of the indispensability of a medium to the existence of a sound.

An argument against the Relational Event Theory capitalizes upon the fact that we have the conceptual resources to distinguish between not hearing a sound because the sounding object is no longer resonating and not hearing it because we do not have informational access to the sounding object anymore. The Relational Event Theory faces the risk of collapsing into a medial position.

To develop this line of thought and make it vivid, imagine a vacuum jar which has the property of immediately creating a vacuum upon closing the lid, and of immediately recalling air upon opening the lid. Take now a sounding object like a tuning fork at 440 Hz and have it vibrate, supposing that the vibration fades and becomes inaudible after 10 seconds. What you hear is an A that becomes feebler and feebler until it disappears. Now, place the tuning fork inside the jar, have it vibrate as before, and repeatedly open and close the lid of the jar, say once in a bit less than a second. What do you hear? You may have the feeling that a few short sounds, each feebler than the preceding one, come into existence and pass away. But you may as well have the impression of a sound that is revealed by the opening and closing of the jar. Indeed, the fading of the sound should be audible from each “window” to the next, implying that there is a single sound that fades. If sounds were either sound waves in the medium surrounding the object, or items dependent on the medium as per the Relational Event Theory, we would be forced to admit that the tuning fork started and ceased to sound, because the relevant sound waves would not be present in the surrounding medium while the lid is closed. Only the first impression, that of a series of short sounds, would be accounted for. But the second impression, that of a continuous underlying sound that is revealed, is supported too. A visual analogue of this would be the perception of an object in the dark, on which light is shed at intervals. We would not have the impression that the object gets its colors and then loses them at intervals.

An advantage of the Relational Event Theory over the Located Event Theory is that the former provides a criterion for specifying which among vibratory events at a source are sounds, namely, those that create medial disturbances that are (or can be) heard. The Located Event Theorist can, on the one hand, observe that the minimal requirement of a metaphysics of sounds is to specify which type of entities are sounds, and not, more specifically, which, among entities of that type, are sounds. Thus she would claim that sounds are events at a source, without caring to discuss whether some events at a source are not sounds. On the other hand, she can observe that too fine-grained a classification would create a problem with those events that share with sounds all interesting metaphysical properties except for the property of being audible; a problem which, incidentally, affects a number of physicalist reductions of sounds.

3.2.4 Discussing the located event theory and the relational event theory

Relation between sound and sound source

The Relational Event Theory and the Located Event Theory differ concerning the way in which they account for the audible relationship between sound and sound source. One way of characterizing this relation within the Relational Event Theory is the part-whole relationship, according to which sounds are heard as the constituent parts of wholes that are everyday audible events (O’Callaghan 2011a, 2011b). This is a Mereological View, which presupposes a distinction between sound and the broad event which sound is part of. The parthood relationship explains a striking and quite ambiguous aspect of audition, namely, the fact that we hear the sound and its source as two different events but, at the same time, we hear them as not wholly distinct.

The advantages of the Mereological View over both the Property Instantiation view—which claims that sound is heard as a property of the sounding object—and the Causal View—for which sound is heard as the effect of its source (O’Callaghan 2011a: 383 and following)—are that:

  1. It explains why we often perceive sound as part of a multipartite occurrence which might also involve inaudible features. For O’Callaghan, there might be features of the event source that are not features of sound, and we may hear the sound without perceiving the features of the event source. For instance, the happening of “the stomping of the foot” (2011a: 396) includes non-audible features like a foot stomping or a release of energy, but also something that we hear, a sound.
  2. Mereology explains the intuitive fact that sounds may usually seem to be heard as bound or unified with their sources.
  3. Parthood explains the classical Berkeleyan account of auditory perception, according to which sound sources are perceived by virtue of the perception of the sounds they make.

The Located Event Theory proposes a metaphysical Ockhamization of the Mereological View (Casati, Di Bona, & Dokic 2013), which clears the metaphysical landscape of entities which are not necessary. As a starting point, it is useful to distinguish between two components of sound sources: thing sources (such as keys or musical instruments) and event sources (such as collisions or vibratory events at the sounding object) (2013: 462). The Located Event Theory maintains an Identity View according to which sound, instead of being a proper part of a distinct event that is its source, is identical with the event source. That is, the collision we hear is the sound we hear. The Identity View appears to simplify the metaphysical landscape depicted by the Mereological View (2013: 463). The metaphysical claim which grounds the Identity View is supported by considerations about the phenomenology of audition. These considerations suggest that we do not commonly hear sounds as part of larger events, at least when, with “larger events”, we refer to event sources. We hear sounds as a unified entity, and not as parts of larger events. The biggest price the Identity View has to pay is that we have to renounce some ordinary language statements: instead of saying “I heard the bang (produced by/and) the collision”, it would be more appropriate to say that we heard the collision. This is a disadvantage of the theory which is acceptable, though, if the advantage is to propose a plausible theory of sound (2013: 465).

As for the advantages that the Mereological View has over the Property View and the Causal View, the Identity View can account for them as well. One of the advantages of the Mereological View is that it justifies the supposedly different properties that we usually attribute to sounds and to sound sources. The Identity View responds to this by arguing that:

  1. These differences in the attributable properties are only apparent. We are tempted to introduce this distinction since it seems that we ascribe properties to the source and not to sound because of the non-auditory character of these non-audible properties. But the Identity View claims that from the fact that some features (such as the release of energy, vibrations, a painful sensation) are perceivable through other sense modalities, it does not follow that such properties are not features of sounds themselves (2013: 464). Moreover, the extra properties, if any, can perfectly well be ascribed to the thing source.
  2. The sense of unification we experience when we hear bangs and collisions can perfectly well be explained in terms of identity.
  3. The Mereological View would explain the Berkeleyan claim that we hear the source by virtue of hearing the sound, since one perceives a whole by virtue of perceiving its parts. The Identity View accounts for the Berkeleyan claim as well, since it states that the immediate object of auditory perception is just the event that is the sound, and that it is the thing source that we perceive indirectly by virtue of hearing the sound.

The Identity View and the Mereological View have an equal explanatory capacity, but the Identity View, in addition to the metaphysical reduction, seems to untangle the ambiguity of hearing sound and sound sources as two different events and, at the same time, of hearing them as not wholly distinct. This ambiguity evaporates if we identify sound with the event source.

Different options concerning the relationship between sound and sound sources and how we perceive them have been proposed. According to Matthen (2010) both sounds (and sound composites like melodies, harmonies or sequences of phonemes) and sound sources can be heard directly. Leddington supports the Heideggerian view of hearing, for which we “hear sound sources directly, in hearing the sounds they make—not, à la Berkeley, merely in virtue of hearing those sounds” (2014: 340). Nudds, instead, suggests that when listening to sounds we hear them as apparently being produced by the same source (2010a, 2010b). He suggests that we can also perceive the production of sound bi-modally: by simultaneously listening to and seeing the cause of sound (2001).

Nevertheless, issues about the relationship between sound and the thing source (namely, sound and the material object which produces it) and about how we hear both the objects that produce sounds, and the relation between objects and sounds still have to be fully worked out within both views. As a starting point to develop an account of the perception of sound sources or, at least, of some aspects of the thing source, Di Bona (2017) proposes an argument for the perception of a specific characteristic of human voices. She offers an account of how a certain feature of the sound source is perceivable when the source is the human voice. Di Bona focuses on gender and engages with the debate on the admissible content of auditory experience, which has been mostly developed within the field of visual perception, and argues for a “rich” view of the auditory experience. This view is defended by means of the method of contrast applied to a case of auditory adaptation to human voices. The idea is that we hear not only the low-level auditory properties of pitch, timbre, and loudness but also the high-level property of gender, that is, the property commonly referred to as being a female voice or a male voice. This high-level property displays adaptational effects and, given that displaying these effects is a clear mark that a property is perceivable, we can conclude that we hear gender properties instantiated by human voices. O’Callaghan (2011b) also engages with the debate on the admissible content of auditory perception, endorsing a more restricted view on the content of audition. He focuses on speech perception and argues that we do not hear the semantic properties of voices (which can also be seen as properties of sound sources, of voices) since what is auditorily perceivable are, instead, the phonological properties.

Some objections to both event theories

Soteriou (2018) claims that both versions of the event theory are too strict about what they count as sounds: the Located Event Theory counts as sounds only monadic events happening at material objects; the Relational Event Theory, instead, admits only relational events that are bearers of acoustic properties. Soteriou challenges the assumption that sounds are one kind of thing and suggests that the sounds we hear are both pure audibilia (such as the barking of a dog), which are events or act-types bearing acoustic properties that can also be disconnected from their material causes and cannot exist in a vacuum, and events or act-types that can lack acoustic properties (footsteps), which are perceivable by means of modalities that are different from audition, and can exist in a vacuum. Soteriou’s “simple view” or “catholic view” (2018: 48) starts from the need to reject the idea that experiences of echoes and recorded sounds are illusory experiences or distortions of space and time, which is how he thinks both event theories regard them in order to be consistent. The event theories classify the experiences of echoes and recorded sounds as illusory for the sake of preserving the idea that, usually, sounds are the bearers of acoustic properties located where we experience them to be located, namely distally. That is because if sound is something located at its own source and the echo seems to be a sound not perceived at its source, then we are hallucinating a sound. When hearing recorded sound, we experience an illusion since the bearer of the acoustic properties that we actually hear to be distally located is distinct from the “original” bearer of acoustic properties (2018: 44). Soteriou suggests that when hearing echoes or recorded sounds, even though we do not hear event-like individuals distally located, we do not have an illusory experience since what we hear are pure audibilia.

Soteriou’s proposal has the merit of broadening the range of the audible, but it might raise some worries because of the non-univocal characterization of what sounds are. For example, a challenge to this view is to explain, on the one hand, how pure audibilia are connected to their physical causes and, on the other hand, how collisions or non-audible act-types “acquire” acoustic properties that make them audible.

Nudds (2018) challenges both event theories by means of “the argument from the medium”, which is designed to show that environmental events are not the bearers of acoustic features (2018: 54). The main premise of the argument, the existence premise, states that we can hear a sound without the occurrence of an environmental event. This premise can be used against both event theories, which claim that sounds are identical either to environmental events or to parts of such events (2018: 67). Nudds imagines two situations, S1 and S2. In S1 there is a disturbance of the medium and an environmental event, such as a collision; S2 is just like S1 except that the disturbance of the medium takes place without there being an environmental event. In both situations you hear a sound, but in S2 there is no environmental event; therefore, it is possible to have a sound and its acoustic features without the occurrence of an environmental event. In order to test this conclusion, Nudds constructs an analogous case for vision and concludes that, while there are two different arguments showing that the premise is false in vision, neither of them applies to the auditory case. He asks us to imagine a situation S3 in which you see a red cube, and a situation S4 in which the pattern of light that determines the appearance of a red cube in S3 is preserved, except that there is no red cube. The arguments to show that the premise is false in the visual case are as follows.

(1) The premise is false since you cannot see a red cube unless there is something, a real thing, that is the bearer of the visual features that a red cube usually has. When it seems to you that you are seeing a red cube although there actually is no red cube, it is because you are hallucinating it: there is nothing that instantiates the visual features of a real red cube. Can we apply the same reasoning to audition in order to claim that the premise is false for auditory appearances too? When hearing a sound in S2, you do not hallucinate the sound of a collision, since you can perfectly well hear a sound without there being a collision, as when listening to music played through loudspeakers. Therefore, there is still something that is the bearer of the auditory qualities that you hear in S2, something which can be produced, for example, by loudspeakers, and you are clearly not having a hallucination (Nudds 2018: 56). That is to say that while in the visual case we seem to hallucinate, in the auditory case we are clearly hearing a sound, and we can conclude that the premise is true in audition. Nudds’ conclusion can be challenged by the event view, since one can say that an environmental event is still present in S2, namely the event involved in the mechanics responsible for the functioning of the loudspeaker. On this reply, the premise is false in the auditory case as well, but for a different reason: not because you are hallucinating a sound, but because the loudspeaker case does not show that you hear a sound in the absence of an environmental event.

(2) Nudds (2018: 56) adds that the falsity of the existence premise in the visual case can also be shown by imagining that what you see in S4 is a hologram. If what you see when it seems to you that you see a red cube is a hologram of a red cube (that is, you have an experience that is indistinguishable from seeing a real red cube), you do not literally see the object, but you do not hallucinate it either. When seeing a hologram of a red cube, you see something which is the bearer of visual qualities without the actual object that has these visual features being present. Starting from Martin’s (2012) discussion of visual images, according to which a hologram is the appearance of an object “in the absence of any object which might possess it” (2012: 339), Nudds concludes that a hologram presents the appearance of something that it does not instantiate. Therefore, the existence premise is false not because you literally see nothing in S4, as usually happens in hallucinations, but because what you see is not the bearer of visual features; and yet it is indistinguishable from a real red cube. Is there a similar case in audition which can be used to show that the existence premise is false? Given that auditory holograms, namely “something that presents the acoustic appearance of a sound-producing event, an appearance it does not itself instantiate” (2018: 58), do not seem to exist, it is difficult to claim that the existence premise is false in audition for the same reasons for which it is false in vision; therefore, Nudds concludes that the existence premise is true: environmental events are not the bearers of acoustic features. This conclusion is based on the debatable notions of auditory hologram and auditory “image”. Is it really legitimate to look for auditory analogues of notions that are authentically visual?
Important questions raised by Nudds’ argument are: how far can we go with the analogy between vision and audition, and how admissible is it to ground our reasoning on the analogy between the two sensory modalities, especially when dealing with items as intrinsically visual as holograms and images?

3.3 The Dispositional View

There are yet other ways to construe sounds under the distal umbrella. According to the dispositional account of sounds,

sounds are dispositions of objects to vibrate in response to being stimulated. Sounds are perceived transiently, but they are not perceived as being transient and they are not in fact transient. (Kulvicki 2008: 2)

The account takes up Pasnau’s original insight that sounds should be considered akin to colors in being located features of objects. Indeed, Kulvicki draws a set of analogies between sounds and colors based on the assumption that colors are dispositions to reflect light in a certain manner. The analogies include the fact that as colors exist in the dark (being dispositions) sounds exist in the vacuum, and that as light waves are a way to get knowledge of colors, compression waves in a medium are a way to get knowledge of sounds. To complete the analogy, as “colors give off light when stimulated by light, objects in a medium give off compression waves when they are thwacked” (2008: 4). More controversially, “Without vibrating, objects have sounds, but these sounds cannot be heard” (2008: 4). Thwacking is the key element in the dispositional proposal. Thwacking reveals sounds. A good thwack (an impulse that contains all relevant frequencies) is considered to be to sounds what white light is to colors. White light contains waves of all lengths in roughly the same proportion and thus samples adequately the dispositions of the object to reflect some of them. If you light up a surface with a monochromatic laser beam at 500 nm, any disposition the surface may have to reflect light of 600 nm will not be revealed. A good thwack stimulates the object across all relevant frequencies and thus reveals its vibratory modes, since it includes the frequencies at which the object responds best.

The impulse proposal solves an alleged disanalogy between color and sound, i.e., the absence of a normative analogue of white light in the case of sounds (O’Callaghan 2007). White light is normatively significant for revealing colors; a good impulse is normatively significant for revealing a sounding disposition.

Kulvicki’s dispositional theory neatly accounts for some distal intuitions about sounds. (Other intuitions, such as the idea that sounds have a loudness, are beyond the descriptive power of the theory, which on that score considers loudness a property not of the sound but of the thwack.) In particular, it highlights the importance of action in bringing about auditory information about an object: most objects sound because we deliberately impart a thwack to them, and in many cases in which we want to know how an object sounds, we do impart the thwack, and subtly modulate it. When we hear a sound we get knowledge of a kind of resistance an object opposes to thwacking; it is knowledge of an elastic disposition. The account explains the fact that the more the object reacts, the more it is sonorous. And an object (say, a guitar string) is “A sharp” because at that frequency it responds optimally to thwacking.

Kulvicki alleges that an analysis of the phenomenon of sound constancy, modeled on an analogue to color constancy, militates in favor of the dispositional view. Colors are notoriously resilient to many changes of illumination, and even when incident light composition is very distant from standard light, surfaces may be seen as having their standard color. “The green grass looks green in the orange light at dusk, but it looks like green grass illuminated oddly” (Kulvicki 2008: 9). This is in sync with the likely function of vision, which is to inform about distal, stable properties of surfaces. Analogously, Kulvicki claims, “objects sound roughly the same whether it is white noise [i.e., an impulse] or something reasonably far from it that stimulates them” (2008: 9). Our ability to recognize voices across a large variation of ways to produce them is a case in point. In general, humans are able to identify sounds in terms of their spectral slopes and spectral envelope patterns, which are “fairly constant across changes in the fundamental frequency produced by the vocal chords” (2008: 10). Now, it can be objected that voices are but a part of the auditory world, and they are the object of dedicated neural machinery. More generally, however, sound constancy is exhibited in hearing objects of various sorts.

There is no obvious reason the auditory system should exhibit constancy for something like spectral slope or envelope pattern, unless the auditory system has the function of identifying stable properties of objects. (2008: 10)

Facts of auditory constancy thus help out the dispositional theory insofar as the function of the auditory system is to find out whatever stable property of the environment is heard as being indeed constant.

Objections to the dispositional theory

We shall consider two objections to the dispositional account.

A first problem with the dispositional account is that of specifying what precisely the disposition in question is. An object’s disposition to vibrate at its natural frequencies does not exhaust the possibilities.

When struck, an object vibrates in an odd but characteristic fashion for a brief time before it settles into vibrating at its natural frequencies, if it has any. Like natural frequencies, these attack patterns are relatively stable dispositions of objects to vibrate when thwacked. Similarly, objects’ vibrations decay characteristically. Attacks and decays are stable dispositions of objects to vibrate when thwacked, just as natural frequencies of vibration are. (Kulvicki 2008: 14)

Hence there are three dispositions (at least) here: a disposition to react in the attack mode, a disposition to react in the decay mode, a disposition to vibrate in a certain way. There are many more. For instance, a disposition to respond by emitting a certain pattern of waves when rubbed or when brushed; or when broken. Many are the dispositions, and the dispositional account should endorse the idea that many are the (possibly unheard) sounds here. Call this the many-disposition problem.

Normativity only half-solves the many-disposition problem. Some ways of imparting energy to an object are more telling than others, but whereas light is the sole “thwacker” for colors (modulo differences in intensity and light spectrum), there are many thwackers for eliciting the sonority of an object. Still, the dispositional theory of colors suffers from a variant of the many-disposition problem as well: the color of many objects and stuffs depends on their temperature. At 1000 °C, iron glows. At −20 °C, water is white.

A different problem for the dispositional theory is: What are those individuals that we hear when the object sounds? The main problem a dispositional account of sounds has to face is that even if we accept that sounds are sound dispositions, there appear prima facie to exist, on top of those dispositions, the individual, occurrent sounds: those that last when the disposition is realized. (A similar problem plagued Chisholm’s account of events as repeatable entities: as Davidson pointed out, any such account should accommodate the problem of assigning a status to the single occurrences of the repetitions, which cannot be themselves repeatable items.) Indeed, as Kulvicki notes, “now and again one encounters objects making sounds they do not have” (2008: 8); he dismisses these encounters as rather exceptional, but no matter their frequency (and radio listening provides a very common example), the “sound made”, as distinct from the “sound had”, must be accommodated by any theory of sounds. Kulvicki addresses the problem in a general way, by distinguishing between sounds as dispositions and the hearings of sounds. There are no individual sounds, only episodes of hearing the dispositional sounds. We do hear sounds, and these are the dispositions, but the impression that sounds last depends on the fact that our hearing episodes last.

Kulvicki denies that this account entails an error theory because he thinks that intuitions are not particularly telling about the distinction between hearing as an individual event and hearing a sound. However, a threefold distinction should apply here. Grant the dispositional sound. Grant hearing episodes. Now, the duration of a hearing episode is different from the duration of an occurrent sound. You set a tuning fork in motion. It vibrates for thirty seconds. But then, after ten seconds, you get bored and block your ears with your hand. The duration of your hearing is ten seconds. But the sound is still there, for another twenty seconds. Generally speaking, unheard sounds are accounted for in the dispositional theory only as unrealized dispositions. Occurrent unheard sounds are invisible to the dispositional account.

Kulvicki answers this objection by saying that this phenomenon is consistent with the dispositional view (2014: 218). In order to have sounds, it is required that the dispositions of an object to vibrate in a certain way are realized, that these dispositions generate waves in a medium, and that ears can detect these waves. Occurrent unheard sounds, such as those produced while you block your ears and no longer hear the vibrating tuning fork, are not a problem simply because they are not sounds. For the dispositional view “sounds are not occurrent anythings: they are qualities” (2014: 218). But the questions that naturally arise and need to be answered are: if the unheard dispositions are not sounds, what are they? Moreover, suppose that while I am still blocking my ears with my hand, someone else is there listening to the tuning fork: would she be listening to sounds? What are the items she would be listening to?

Another critical point of the dispositional view is that it does not take into account the intuitive fact that sounds are heard as unfolding over time and as having a duration, which is a clear sign of their being events. Kulvicki (2014: 211) discusses this by assessing the plausibility of the argument from “perceptual seeming” that he attributes to those who claim that sounds are events. The argument is based on the commonsensical idea that if a sound appears to have a duration, it is because it actually has a duration. This argument rests on three assumptions:

  1. our experience of sounds can last a certain amount of time;
  2. we usually say we hear happenings, such as the slamming of a door or the falling of a tree, which we commonly refer to as “events”;
  3. the events we hear have a duration that seems to coincide with the duration of our experience of them.

For Kulvicki, even though the conjunction of these three assumptions does not imply that sounds have durations, the claim that sounds have a duration is still the best explanation of the three uncontroversial claims. Therefore, even though he does not propose a way to completely undermine this argument, he suggests a way to at least weaken it. The dispositional view can account for the third auditory intuition, according to which the time the events we hear take to occur often coincides with the duration of the related auditory experience, by emphasizing the role of mechanical stimulants. If we imagine a “long-lasting, broad-spectrum mechanical stimulant” (2014: 214) which thwacks the sounding object in a stable vibratory disposition, it is not surprising that we hear events for about as long as we hear sounds. Kulvicki himself recognizes that this is not a conclusive way of excluding that sounds have durations, but it is nonetheless a way of weakening this claim.

Recently Kulvicki (2017: 91) has proposed a more inclusive view of what sounds are. He admits that identifying sounds with the mere vibratory dispositions of objects does not seem to capture the full range of auditory experience. Therefore, among sounds he also includes aspects of events and the contexts in which those aspects are heard, and he distinguishes between non-perspectival and perspectival qualities. The former are the dispositions of objects to vibrate in response to being mechanically stimulated, but also happenings; the latter are “abstractions over features of objects, events, and environs” (2017: 87). Both qualities are taken to be sounds, which are a “mongrel” category. Kulvicki explains that the perspectival qualities are “abstraction over intrinsic and relational features of ordinary objects, environments, and range events in which they participate” (2017: 90). That is, in some respects things sound one way and, in other respects, they sound differently, depending on the perspective. When we imitate someone’s voice, the reproduction is similar to the original in some respects but different in others. The worry about Kulvicki’s view is that sounds turn out to be different kinds of things, very distant from and often in opposition to each other (features of objects but also happenings, perspectival properties but also non-perspectival qualities). Rather than claiming that these are different ways of characterizing sounds, Kulvicki is exhaustively telling us the different things we can hear, without explaining in full what sounds are.

4. Aspatial Theories

We divided up existing accounts of sounds according to the spatial location they assign to sounds. However, there are aspatial theories of sounds as well. Aspatial theories deny either (i) that sounds are intrinsically spatial, or (ii) that auditory perception is intrinsically spatial. Arguably, claim (i) implies claim (ii), but the converse is not true, which leaves room for an interesting aspatial theory of auditory perception which nevertheless acknowledges that sounds have some spatial locations.

4.1 Aspatial Sounds

We have seen the use of phenomenological arguments against both medial and proximal theories of sound. To sum them up, one may claim that, first, auditory experience has a spatial content whereby sounds seem to be located in egocentric space (to the left, above, in front of us, etc.). Second, unless one subscribes to an error theory of auditory experience, sounds are where they are (normally) heard to be, namely at their sources.

Before presenting aspatial theories, let us conclude the discussion of the spatial theories of sound by adding that they all share the assumption that reflections on the spatiality of audition are crucial for tackling issues in the metaphysics of sound; that is, we can say what sounds are when we know where they are located (Di Bona 2019). Moreover, considerations about the spatiality of audition and sounds are also important for understanding the segregation of auditory streams, which constitute the auditory landscape, and the experience of some musical compositions (Di Bona 2019). Finally, there is a remarkable difference between the spatiality of vision and the spatiality of audition: in audition, we perceive where sources are, the space between us and the sources (O’Callaghan 2010b), and the space between the sources, whereas in vision we also perceive where objects could potentially be seen (Nudds 2009). Despite this difference, spatial cues can also enter the auditory content and provide information about the volume of empty space, especially when derived from reverberations produced in a specific closed space (Young 2017).

But are sounds really located in space? There is in the literature a strong non-locational view of sounds. Strawson has made the plausibility of a thought-experiment of a purely auditory world rest upon the tenet of an intrinsic non-spatiality of sounds. (Philosophers and psychologists such as Lotze (1841 [1884: Ch. 6, 123–6]), Binet (1905: Ch. 3), Heymans (1911 [1905]: Ch. 1), Stumpf (1883–1890: Vol. 1, 29), Wellek (1934), Révész (1946), Ihde (1976: Ch. 6), Evans (1980 [1986: 248 ff.]), have investigated the phenomenology of auditory space and suggested the possibility of a purely auditory world.) Strawson (1959: 65–66) thus wrote:

Sounds…have no intrinsic spatial characteristics: such expressions as “to the left of”, “spatially above”, “nearer”, “farther” have no intrinsically auditory significance (…). A purely auditory concept of space…is an impossibility. The fact that, with the variegated types of sense-experience which we in fact have, we can, as we say, “on the strength of hearing alone” assign directions and distances to sounds, and things that emit or cause them, counts against this not at all. For this fact is sufficiently explained by the existence of correlations between the variations of which sound is intrinsically capable and other non-auditory features of sense-experience.

Strawson designed his thought-experiment in the context of an analysis of the Kantian claim that the notion of there being objective entities (entities which do not depend on our perception of them) involves the notion of space:

[t]he question we are to consider, then, is this: Could a being whose experience was purely auditory have a conceptual scheme which provided for objective particulars? (Strawson 1959: 66)

Thus, he imagined a character (later on called “Hero” by Evans 1980) who has only non-spatial auditory experiences. Hero perceives sounds, but he does not perceive them to be located in physical space. What Strawson tried to show is that Hero needs an “analogue” to the notion of space in order to “locate” sounds when they are not actually heard. This analogue is provided by what Strawson calls a “master-sound”, namely a constant sound varying in pitch. Any particular sound is heard against the background of the master-sound. Thanks to the master-sound, Hero can distinguish between experiencing the same particular sound again (when its “location” as provided by the master-sound is the same), and experiencing successively two particular sounds of the same type (when they have different “locations” on the master-sound map). Evans (1980) questioned the claim that the master-sound could play the role of physical space in grounding the notion of objective particulars; see also Strawson’s (1980) reply to Evans.

In evaluating Strawson’s thought-experiment, we should distinguish the claim that there can be non-spatial auditory experiences from the claim that there can be a world populated only with sounds. Strawson’s thought-experiment justifies the former claim, but it does not obviously lead to the latter. One cannot infer from the fact that we can perceptually represent a sound without representing its location, that we can perceptually represent a non-located sound.

4.2 Aspatial Auditory Perception

It is possible to argue that auditory perception is not intrinsically spatial independently of a commitment to the claim that sounds do not have spatial locations. This is the view of O’Shaughnessy, who writes that

while we have the auditory experience of hearing that a sound comes from p, we do not have any experience that it is here where it now sounds…And this is so for a very interesting reason: namely, that we absolutely never immediately perceive sounds to be at any place. (2000: 446)

However, O’Shaughnessy does not draw the conclusion that sounds have no spatial locations. On the contrary, as we have seen, he defends a proximal account of the location of sounds, according to which sounds are where hearers, rather than sources, are (2009).

O’Shaughnessy would not be impressed by allegedly phenomenological arguments according to which one normally hears sounds as located at their sources. One may still have the feeling that his sophisticated version of proximal theories does not locate sounds where an untutored description of what is perceived suggests they are. As a consequence, it appears to be another massive error theory of auditory perception.

4.3 Sounds as Pure Events

Scruton (2009a, 2009b; see also his 1997) proposes a non-physicalist account of sounds. He is impressed by the fact that when we hear sounds as music we (can) hear them as events detached from their physical causes. He then suggests that sounds are “pure events”, things that happen but which don’t happen to anything, and that they are “secondary objects”, entities whose nature is bound up with the way we perceive them.

One way of reconciling Scruton’s interesting suggestions with a physicalist account of sounds is to draw a distinction (which of course should be properly developed) between the ontology we need to account for our ecological ability to hear sounds in our natural environment and the ontology we need to account for our (at least partly acquired) ability to hear sounds as music. Eventually, the ontology of sounds as music, which Scruton wants to focus on, might be quite different from the ontology of natural sounds, which can still be of the physicalist kind.

5. Conclusion

We have suggested that a fruitful way to classify the various accounts of sounds that have been given in the literature is in terms of their spatial locations. If the sounds we hear have spatial locations, they can be thought to be located either where the material sources are (distal theories), or where the hearers are (proximal theories), or somewhere in between (medial theories). It has also been denied that sounds have any spatial locations, which gives rise to a fourth class of theories, aspatial theories. All these theories have interestingly different phenomenological, epistemological and metaphysical implications.


  • Aristotle, De Anima (On the Soul), in The Complete Works of Aristotle: The Revised Oxford Translation, Jonathan Barnes (ed.), Princeton, NJ: Princeton University Press, 1991.
  • Bennett, Jonathan, 1996, “What Events Are”, in Casati and Varzi 1996: 137–151.
  • Berkeley, George, 1713 [1975], Three Dialogues Between Hylas and Philonous, reprinted in M.R. Ayers (ed.), Philosophical Works, London: Dent, 1975.
  • Binet, Alfred, 1905, L’Âme et le Corps, Paris: Flammarion.
  • Blauert, Jens, 1974 [1983], Räumliches Hören, Stuttgart: S. Hirzel Verlag. English translation, Spatial Hearing, Cambridge, MA: MIT Press, 1983.
  • Bregman, Albert S., 1990, Auditory Scene Analysis, Cambridge, MA: MIT Press.
  • Bullot, Nicolas J. and Paul Égré, 2010, Objects and Sound Perception, special issue of Review of Philosophy and Psychology, 1(1). Includes Kubovy and Schutz 2010; Matthen 2010; Nudds 2010; O’Callaghan 2010; Roden 2010.
  • Butler, Ronald J. (ed.), 1965, Analytical Philosophy (Second Series), Oxford: Basil Blackwell.
  • Casati, Roberto, Elvira Di Bona, and Jérôme Dokic, 2013, “The Ockhamization of the Event Sources of Sound”, Analysis, 73(3): 462–466. doi:10.1093/analys/ant035
  • Casati, Roberto and Jérôme Dokic, 1994, La philosophie du son, Nîmes: Chambon.
  • –––, 2009, “Some Varieties of Spatial Hearing”, in Nudds and O’Callaghan 2009: 97–110.
  • Casati, Roberto and Achille C. Varzi (eds.), 1996, Events, (International Research Library of Philosophy 15), Aldershot: Dartmouth.
  • Churchland, Paul M., 2007, Neurophilosophy at Work, Cambridge: Cambridge University Press. doi:10.1017/CBO9780511498435
  • Cohen, Jonathan, 2010, “Sounds and Temporality”, in Zimmerman 2010: 303–320.
  • Crowther, Thomas and Clare Mac Cumhaill (eds.), 2018, Perceptual Ephemera, Oxford: Oxford University Press. doi:10.1093/oso/9780198722304.001.0001
  • Davidson, Donald, 1969, “The Individuation of Events”, in Essays in Honor of Carl G. Hempel, Nicholas Rescher (ed.), Dordrecht: Springer Netherlands, 216–234. Reprinted in Casati and Varzi 1996: 265–83. doi:10.1007/978-94-017-1466-2_11
  • –––, 1980, Essays on Actions and Events, Oxford: Clarendon Press.
  • Descartes, René, 1649, The Passions of the Soul, Indianapolis: Hackett Publishing Company, 1989.
  • Di Bona, Elvira, 2017, “Towards a Rich View of Auditory Experience”, Philosophical Studies, 174(11): 2629–2643. doi:10.1007/s11098-016-0802-4
  • –––, 2019, “Why Space Matters to an Understanding of Sound”, in Spatial Senses: Philosophy of Perception in an Age of Science, Tony Cheng, Ophelia Deroy, and Charles Spence (eds.), New York: Routledge.
  • Di Bona, Elvira and Vincenzo Santarcangelo (eds.), 2017, The Auditory Object, Special Issue, Rivista di Estetica, 66(3). [Di Bona and Santarcangelo 2017 available online]
  • –––, 2018, Il suono. L’esperienza uditiva e i suoi oggetti, Milan: Raffaello Cortina.
  • Dretske, Fred I., 1967, “Can Events Move?”, Mind, 76(304): 479–492. Reprinted in Casati and Varzi 1996: 415–428. doi:10.1093/mind/LXXVI.304.479
  • Ducasse, C. J., 1926, “On the Nature and the Observability of the Causal Relation”, The Journal of Philosophy, 23(3): 57–68. doi:10.2307/2014377
  • Evans, Gareth, 1980 [1986], “Things Without the Mind”, in Van Straaten 1980: 76–116. Reprinted in his Collected Papers, A. Phillips (ed.), Oxford: Oxford University Press, 1986, pp. 248–290.
  • Fowler, Gregory, 2013, “Against the Primary Sound Account of Echoes”, Analysis, 73(3): 466–473. doi:10.1093/analys/ant059
  • Galilei, Galileo, 1623 [1957], Il Saggiatore (The Assayer), Rome. Translated in Discoveries and Opinions of Galileo, Stillman Drake (trans.), New York: Anchor Books, 1957.
  • Grimshaw, Mark and Tom Garner, 2015, Sonic Virtuality: Sound as Emergent Perception, New York: Oxford University Press.
  • Hacker, P.M.S., 1987, Appearance and Reality: A Philosophical Investigation Into Perception and Perceptual Qualities, Oxford: Basil Blackwell.
  • Heymans, Gerard, 1911 [1905], Einführung in die Metaphysik auf Grundlage der Erfahrung, Leipzig: Barth; 1st edition, 1905.
  • Hobbes, Thomas, 1651 [1966], Leviathan, London. Reprinted in The English Works of Thomas Hobbes of Malmesbury, Aalen: Scientia Verlag.
  • Ihde, Don, 1976, Listening and Voice: A Phenomenology of Sound, Athens, OH: Ohio University Press. Second edition, Albany, NY: SUNY Press, 2007.
  • Kalderon, Mark Eli, 2017, Sympathy in Perception, Cambridge: Cambridge University Press. doi:10.1017/9781108303668
  • Kubovy, Michael and Michael Schutz, 2010, “Audio-Visual Objects”, Review of Philosophy and Psychology, 1(1): 41–61. doi:10.1007/s13164-009-0004-5
  • Kulvicki, John, 2008, “The Nature of Noise”, Philosophers’ Imprint, 8(011): 1–16. [Kulvicki 2008 available online]
  • –––, 2014, “Sound Stimulants”, in Perception and Its Modalities, Dustin Stokes, Mohan Matthen, and Stephen Biggs (eds.), New York: Oxford University Press, 205–221. doi:10.1093/acprof:oso/9780199832798.003.0009
  • –––, 2017, “Auditory Perspectives”, in Current Controversies in Philosophy of Perception, Bence Nanay (ed.), Oxford: Routledge, Taylor and Francis, 83–94.
  • Leddington, Jason P., 2014, “What We Hear”, in Consciousness Inside and Out: Phenomenology, Neuroscience, and the Nature of Experience, Richard Brown (ed.), Dordrecht: Springer Netherlands, 321–334. doi:10.1007/978-94-007-6001-1_21
  • –––, 2019, “Sounds Fully Simplified”, Analysis, 79(4): 621–629. doi:10.1093/analys/any075
  • Lotze, Hermann, 1841 [1884], Metaphysik, Leipzig: Hirzel. Translated as Metaphysics, in three books: Ontology, Cosmology, and Psychology, Bernard Bosanquet (ed.), Oxford: Clarendon Press, 1884.
  • Maclachlan, D.L.C., 1989, Philosophy of Perception, Englewood Cliffs, NJ: Prentice Hall.
  • Malpas, R.M.P., 1965, “The Location of Sound”, in Butler 1965: 131–144.
  • Martin, M. G. F., 2012, “Sounds and Images”, The British Journal of Aesthetics, 52(4): 331–351. doi:10.1093/aesthj/ays036
  • Matthen, Mohan, 2010, “On the Diversity of Auditory Objects”, Review of Philosophy and Psychology, 1(1): 63–89. doi:10.1007/s13164-009-0018-z
  • Meadows, P. J., 2018, “In Defense of Medial Theories of Sound”, American Philosophical Quarterly, 55(3): 293–302.
  • Moravcsik, Julius M.E., 1965, “Strawson and Ontological Priority”, in Butler 1965: 106–119.
  • Nagel, Thomas, 1965, “Physicalism”, The Philosophical Review, 74(3): 339–356. doi:10.2307/2183358
  • Nudds, Matthew, 2001, “Experiencing the Production of Sounds”, European Journal of Philosophy, 9(2): 210–229. doi:10.1111/1468-0378.00136
  • –––, 2009, “Sounds and Space”, in Nudds and O’Callaghan 2009: 69–96.
  • –––, 2010a, “What Sounds Are”, in Zimmerman 2010: 279–302.
  • –––, 2010b, “What Are Auditory Objects?”, Review of Philosophy and Psychology, 1(1): 105–122. doi:10.1007/s13164-009-0003-6
  • –––, 2018, “The Unitary Nature of Sound”, in Crowther and Mac Cumhaill 2018: 51–66.
  • Nudds, Matthew and Casey O’Callaghan (eds.), 2009, Sounds and Perception: New Philosophical Essays, Oxford: Oxford University Press. doi:10.1093/acprof:oso/9780199282968.001.0001
  • O’Callaghan, Casey, 2002, Sounds, Ph.D. Thesis, Princeton, NJ: Princeton University.
  • –––, 2007, Sounds: A Philosophical Theory, New York: Oxford University Press. doi:10.1093/acprof:oso/9780199215928.001.0001
  • –––, 2008, “Object Perception: Vision and Audition”, Philosophy Compass, 3(4): 803–829. doi:10.1111/j.1747-9991.2008.00145.x
  • –––, 2009, “Sounds and Events”, in Nudds and O’Callaghan 2009: 26–49.
  • –––, 2010a, “Constructing a Theory of Sounds”, in Zimmerman 2010: 247–270.
  • –––, 2010b, “Perceiving the Locations of Sounds”, Review of Philosophy and Psychology, 1(1): 123–140. doi:10.1007/s13164-009-0001-8
  • –––, 2011a, “Hearing Properties, Effects or Parts?”, Proceedings of the Aristotelian Society, 111(3): 375–405. doi:10.1111/j.1467-9264.2011.00315.x
  • –––, 2011b, “Lessons from beyond Vision (Sounds and Audition)”, Philosophical Studies, 153(1): 143–160. doi:10.1007/s11098-010-9652-7
  • –––, 2011c, “Against Hearing Meanings”, The Philosophical Quarterly, 61(245): 783–807. doi:10.1111/j.1467-9213.2011.704.x
  • O’Shaughnessy, Brian, 1957, “The Location of Sound”, Mind, 66(264): 471–490. doi:10.1093/mind/LXVI.264.471
  • –––, 2000, Consciousness and the World, Oxford: Clarendon Press. doi:10.1093/0199256721.001.0001
  • –––, 2009, “The Location of Perceived Sound”, in Nudds and O’Callaghan 2009: 97–110.
  • Pasnau, Robert, 1999, “What Is Sound?”, The Philosophical Quarterly, 49(196): 309–324. doi:10.1111/1467-9213.00144
  • –––, 2000, “Sensible Qualities: The Case of Sound”, Journal of the History of Philosophy, 38(1): 27–40. doi:10.1353/hph.2005.0100
  • –––, 2009, “The Event of Color”, Philosophical Studies, 142(3): 353–369. doi:10.1007/s11098-007-9191-z
  • Perkins, Moreland, 1983, Sensing the World, Indianapolis, IN: Hackett.
  • Révész, G., 1946, Einführung in die Musikpsychologie, Bern: Francke.
  • Roberts, Pendaran, 2017, “Turning up the Volume on the Property View of Sound”, Inquiry, 60(4): 337–357. doi:10.1080/0020174X.2016.1159979
  • Roden, David, 2010, “Sonic Art and the Nature of Sonic Events”, Review of Philosophy and Psychology, 1(1): 141–156. doi:10.1007/s13164-009-0002-7
  • Sacks, Oliver, 2008, Musicophilia: Tales of Music and the Brain, New York: Vintage.
  • Schnupp, Jan, Israel Nelken, and Andrew King, 2011, Auditory Neuroscience: Making Sense of Sound, Cambridge, MA: MIT Press.
  • Scruton, Roger, 1997, The Aesthetics of Music, Oxford: Clarendon Press. doi:10.1093/019816727X.001.0001
  • –––, 2009a, Understanding Music: Philosophy and Interpretation, London: Continuum.
  • –––, 2009b, “Sounds as Secondary Objects and Pure Events”, in Nudds and O’Callaghan 2009: 50–68.
  • –––, 2010, “Hearing Sounds”, in Zimmerman 2010: 271–278.
  • Sorensen, Roy, 2008, Seeing Dark Things: The Philosophy of Shadows, Oxford: Oxford University Press. doi:10.1093/acprof:oso/9780195326574.001.0001
  • Soteriou, Matthew, 2018, “Sound and Illusion”, in Crowther and Mac Cumhaill 2018: 32–49.
  • Strawson, P.F., 1959, Individuals: An Essay in Descriptive Metaphysics, London: Methuen.
  • –––, 1980, “Reply to Evans”, in Van Straaten 1980: 273–282.
  • Stumpf, Carl, 1883–1890, Tonpsychologie, Leipzig: Hirzel.
  • Urmson, J.O., 1968, “The Objects of the Five Senses”, Proceedings of the British Academy, 54: 117–131.
  • Van Straaten, Zak (ed.), 1980, Philosophical Subjects: Essays Presented to P. F. Strawson, Oxford: Clarendon Press.
  • Wellek, A., 1934, “Der Raum in der Musik”, Archiv für die gesamte Psychologie, 91: 395–443.
  • Young, Nick, 2017, “Hearing Spaces”, Australasian Journal of Philosophy, 95(2): 242–255. doi:10.1080/00048402.2016.1164202
  • Zimmerman, Dean (ed.), 2010, Oxford Studies in Metaphysics, Volume 5, Oxford: Oxford University Press.

This entry was prepared with the help of funds from the IST-2002-002114 Enactive Network of Excellence of the 6th Framework programme of the European Commission. Thanks to Maurizio Giri for help on some parts of the draft.

Copyright © 2020 by
Roberto Casati
Jerome Dokic
Elvira Di Bona
