Tuesday, July 10, 2018

Like rain from the mountain

六耳不同謀 
Six sets of ears are different. 

from Giun's Verse Commentary on Dōgen's Treasury of the True Dharma Eye (Heine 2020b, 114)

Modes of listening

In this text, I will discuss modes of listening. A mode of listening is a heuristic device that calls attention to the fact that music can be actualized in qualitatively different ways–through different modes of experience. A piece of music is not just a constellation of sounds and silences articulated in time, but, perhaps more importantly, a particular way of actualizing these sounds and silences–a particular mode through which these sounds and silences appear as music. Based on this simple definition, two themes will run through this text: the fact that different kinds of music seem to call for different modes of listening, and the fact that different people upon encountering the same piece of music might experience it through different modes of listening. Although the two themes are deeply interconnected, the first half of the text will focus primarily on the first theme and the second half primarily on the second.

To listen for what is important

When discussing modes of listening as something that the music calls for 'from itself', the theory of modes of listening is similar to what Ziff (1964) in relation to different styles of visual art has called acts of aspection. Paintings executed in the Venetian style, for example, "lend themselves to an act of aspection involving attention to balanced masses; contours are of no importance" (620). Some aspects, such as balanced masses, are emphasized, while others, such as contours, are less important. This is different from paintings of the Florentine school, which demand "attention to contours" as "the linear style predominates" (620). Paintings from the Venetian school and the Florentine school are seen with different acts of aspection because different aspects become the primary focus of the experience.

Carlson (2008) explains that the idea of acts of aspection points to the fact that different works of art have "different kinds of boundaries and different foci of aesthetic significance and so demand different acts of aspection" (119).  According to Ziff, one looks "for light in a Claude, for color in a Bonnard, for contoured volumes in a Signorelli" (1964, 620). For Ziff, these differences are not pre-given in the phenomenal world but come about when a subject enacts visual phenomena; we do not merely see contoured volumes in a Signorelli but look for them–although this 'looking for' is often, as we will see below, unconscious and spontaneous rather than effortful. Despite the automaticity with which we come to take on different acts of aspection, appreciating art is a skillful know-how. Enacting the meaningfulness that emerges when phenomenal appearances are placed in a hierarchy of relative importance is an acquired, embodied skill. 

Similar to Ziff's idea of the viewer conducting their gaze around different foci of aesthetic significance, a musical mode of listening describes the components or aspects of the music that are discerned and that appear simultaneously in people's consciousness and how these different aspects of the phenomena relate to each other. Certain aspects are distinguished or emphasized in hearing, while others are not discernible or treated as minor components.


In much music of Xenakis, we are listening to large moving masses of sound; detailed attention to the single sounds is of no importance. In the music of the classical gǔqín repertoire, we are on the contrary invited to dwell in an intimate relationship with the detailed articulation of each sound. If we were only to listen to the larger gestalts in this music, we would miss the point.


When listening to a traditional symphony, it is part of that music’s adequate mode of listening to 'ignore' ambient sounds. Saito (2007) describes how, in symphonic listening, "[t]he outside traffic noise, the cough of the audience […] are […] consciously ignored, though they are part of our experience contemporaneous with the symphonic sound" (7). When listening to the Number Pieces by John Cage, Haskins describes a relationship to ambient sounds that is contrary to the symphonic listening described by Saito. Haskins describes the feeling of the music increasingly taking equal precedence with the other events around him; it is "gently enveloping" him until he sees and hears "minute details of everyday life with a fresh, uncluttered clarity" (2004).


These two pairs of examples illustrate how different kinds of music invite different modes of listening. Certain aspects are distinguished and emphasized in one mode of listening while downplayed in another.


Modes of listening as worlds


While there are modes of listening that can be characterized as 'analytical' or 'theoretical'–such as when doing an entrance exam to a music school and transcribing a melody–what interests me in this text are the 'authentic' modes of listening that are non-conceptual in nature. Wallrup (2012) writes sensitively how:


"the average listener neither listens philosophically, nor analytically (looking for themes and their development, trying to figure out how the work is harmonically construed). No, a common way of listening is to engage oneself in the musical work as a world, to let oneself be changed by the world emerging in music." (Wallrup 2012, 385)


To listen to music is to be attuned to a musical world. If formulations such as 'acts of aspection' and 'modes of listening' sound like they are describing a kind of 'selective awareness'–an interest-driven conceptual awareness that seeks to "arrange phenomena into hierarchies of relative importance" (LaFleur 1983, 88)–, then speaking of a listening mode as a world acts as a corrective to this in emphasizing the non-conceptual and attunemental nature of music listening. The music's world is the non-conceptual, unobjective, and unperceived 'something other' that in a dialectic process with the 'matter' of music brings it into presence as music. Since the world is constituted in a certain way–it has a certain kind of temporality, mobility, spatiality, and materiality (Wallrup, 2012) that distinguishes it from other worlds–, distinct phenomenal appearances carry different importance because these are heard from the 'perspective' of that world. In this way, things can appear to be meaningfully distinguished without being explained as the result of cognitivistic grids that interpret sense data. Instead, they result from a non-conceptual attunement to a world. This kind of thinking draws upon Heidegger's notion of the world as given in the following famous passage:


"The world is not the mere collection of the countable or uncountable, familiar and unfamiliar things that are at hand. But neither is it a merely imagined framework added by our representation to the sum of such given things. The world worlds .... World is never an object that stands before us and can be seen. World is the ever-nonobjective to which we are subject." (Heidegger 1971b, 44)


The world of a piece of music is neither the 'mere collection' of sounds nor is it a 'merely imagined framework' that we impose on the sum of these sounds. Instead, it is that which brings the sound into presence as music. Wallrup's use of the word attunement emphasizes the nondual, pre-reflective quality of musical modes of listening: music is a 'world' to which we, for the duration of the performance, are attuned:


"Music is disclosed as a world in a certain attunement or in a combination of attunements, and it is according to the attunement that the listener relates to the musical events. Attunement is that which allows us to perceive different events in this world as events at all: it is according to the attunement that we can describe the musical work and its constituent characters." (Wallrup 2012, 382)


Green and Ford (2015) even more explicitly equate the Heideggerian notion of "world" that informs Wallrup's analysis of musical attunement with a piece of music's "style" (157). The mere combination of musical matter such as tones, metres, and scales is not music per se; it requires "something else to become music", and this "something else" is the piece's "style". This can be regarded as a way of demystifying what Heidegger means by "world"–which for Heidegger was more on the level of an era, epoch, or zeitgeist–and bringing this notion and dialectical process solely into the realm of the experience of listening to music. "Style" is then not only the "sum" of the musical objects–it is not merely a matter of analyzing characteristic patterns, cadences, and so forth as musical objects and considering the style to be the statistical distribution of these. "Style", as Green and Ford intend the term, is not just the superficial differences in musical objects, such as how we can hear the harpsichord in Baroque music and the Hammerklavier in Classical and Romantic music, even though the word "style" might carry such technical connotations for those of us who through our schooling have taken many exams in music history. A style is rather the way of being of a world and is as such pre-reflective.


Continually unfolding worlds


Modes of listening are not fixed perspectives but are continually unfolding in a dialectical process where the mode of listening is constantly re-adjusted and reinterpreted according to new sensory intuitions and knowledge. It is in a sense not too dissimilar to how the process of interpretation has been described through the hermeneutical circle. This process of interpretation unfolds both throughout the course of our lives—as we return to pieces of music, our understanding of them has changed—as well as being a dynamic process that unfolds during the course of listening to the music. As the music unfolds, our interpretation changes, and our mode of listening is constantly in a process of adjustment. A mode of listening is never a fixed perspective. Using the dynamic hermeneutical circle as a metaphor helps to shed light on the following bit of conversation between Xenakis and Feldman recorded the day after both of them attended a performance of Feldman's Trio (1980):


Xenakis: I must tell you that usually I can’t stand such a long piece, but yesterday I could, although it was very late. I could follow the things that you were doing and I was attracted by what I heard. This is a positive thing because when you’re not attracted then you’ll forget it. I was pinned by the sounds and by the preparation of the sounds, which I think is the most important thing you have done. Of course, that comes from the quality of what I heard, including the performance-quality. Except for that chord that I didn’t quite understand. . .

Feldman: The loud chord?

Xenakis: Yes, the loud chord.


Quietness is an important parameter of Trio that is crucial in establishing a certain mode of listening. When I listen to this piece, the quietness invites me to adopt a 'detailed' listening where the sounds are enacted intimately. The sudden loud chord on page 24 of Trio, at least 30 minutes into a piece that prior to this has been all pianissimo, comes as a total surprise and is impossible to integrate into the established mode of listening. The loud burst breaks my intimate and careful relationship with the sounds. Here, we are receiving a new sensory intuition that requires us to adjust our mode of listening, but in what way does this loud chord inform such an adjustment? After this sudden loudness, nothing is compositionally 'done' with the chord by Feldman; the music continues quietly just like before without any second loud chord.


It is not like the many pieces by Sciarrino that similarly begin soft but later introduce loud bursts. In those pieces, we take this new sensation into account and adjust our mode of listening in such a way as not to experience a jump scare again. In Sciarrino's music, this adjustment is positively affirmed by more sounds that fill the entire dynamic range. There, the softness was never what determined the mode of listening; the piece just began soft, and we erred in believing that the softness was a determining factor for a certain mode of listening.


In the Feldman piece, however, this is not the case and the one loud chord simply exists unexplained. This is why Xenakis cannot understand it. It does not fit with the mode of listening that he was absorbed in–it does not fit in the musical world that he was attuned to. After the loud chord, it makes the most sense for me when listening to this piece to simply 'go back' to the previous mode of listening, but with, depending on my attitude, an uneasy feeling that I am listening inadequately, or (more positively) that the piece exists beyond my conventional 'understanding'—that its mysteriousness is beyond my grasp.


Learning modes of listening


For Ziff, the performance of different acts of aspection was enabled in a subject primarily by prior education. Stylistic knowledge of the different 'schools of art', their conventions, and the classification of styles was "of the essence". Familiarity with the classification of art enables us to appreciate it in an adequate manner. Having a knowledge of style means that we understand where the aesthetic significance lies and are able to focus our experience around it.


In music, we often speak of knowledge of different 'genres', and a similar argument to that of Ziff has been put forth in relation to music by Ola Stockfelt (2004), who spoke of 'adequate modes of listening'. Stockfelt explains that adequate modes of listening occur "when one listens to music according to the exigencies of a given social situation and according to the predominant sociocultural conventions of the subculture to which the music belongs" (Stockfelt 2004, 91). For Stockfelt, adequate listening is the prerequisite that enables "real communication" between musicians and the audience. It is what enables musical understanding. To master a specific adequate mode of listening means that "one masters and develops the ability to listen for what is relevant to the genre in the music, for what is adequate to understanding according to the specific genre's comprehensible context" (2004, 91).


The way in which Stockfelt words his arguments should not be interpreted to mean that learning to listen to music is merely about 'de-coding' a set of stylistic tropes. Listening for "what is relevant to the genre of the music" does not mean that we try to classify the music as if on a music history exam; it means that we master that which allows us to be attuned to a non-conceptual musical world. But in order to attune ourselves to the music's world, we must 'understand' the music in a sense–otherwise the music will just appear as meaningless noise.


For both Ziff and Stockfelt, 'understanding' art is something that has to be learned. There is, however, a wide spectrum of views about what such learning would entail, and to fully understand what they might have in mind, we must look closer into different theories of learning.


Language and culture


According to one view, in order to 'understand' any given music and to engage it with the adequate mode of listening, we have to be proper members of the same cultural world in which the music originally arises. It seems to me that Stockfelt argues this point, and it can also be considered Heidegger's view if we transpose his theory of art in general to music specifically. When the (extra-musical) world–the culture and customs that initially made an artwork meaningful–is gone, the artwork becomes something only for historians or something only for our most superficial delight. We may take pleasure in such art, as we do when experiencing the art of ancient Greece, but we cannot perceive it adequately. Learning the adequate mode of listening cannot, on this view, come about only by listening to the piece of music, but is a skill acquired through belonging to a certain culture.


This argument points to the insight that music does not exist outside of cultural contexts. Modes of listening are intersubjectively meaningful ways of being in a world. A mode of listening is learned–appropriated–by taking part in an unfolding intersubjective, culturally constructed reality. Appreciating this point will have important implications for music education. Learning to listen adequately to music does not, on this view, come about by 'merely listening' but is dependent upon how we frame the listening with material and linguistic cultures that draw out particular meanings. As Minette Mans writes, 


"research and experience have shown that in many cases, it is learners’ investigation and understanding of extrinsic factors that lead to a deeper enjoyment of the music […] It is only by understanding values within a musical culture that true appreciation begins to develop" (Mans 2009, 183). 


What has come to be known as the 'sociocultural perspective' is probably the school of thought that most heavily emphasizes other human beings as the primary agents for the transmission of modes of listening. According to this perspective, it is difficult to discover for oneself what is valuable to discern in each situation; we need to have access to linguistic descriptions and analyses–in other words, linguistic discourses–and through communication become involved in communities of action and meaning (Säljö 2000, 62). According to this perspective, language is primary. For example, the word tonic is in this tradition what enables us to experience a chord as the home tonality of a tonal piece.


Taken to its extreme, the sociocultural perspective leads to a heavily anthropocentric and language-centric form of social constructionism. In section XLII of the Laṅkāvatāra Sūtra, the Buddha explicitly rejects the view that words give rise to things–the view that things exist because we use names for them–and argues that words are not even essential for communication. If social constructionism contends that the world is created only through human social interactions, with human language as the primary agent in shaping this world, it fails to recognize that mountains, the spring breeze, and musical sounds are also agents (as empty, illusory, and relational as humans) that take equal part in constructing the world. To many of us, the idea that sounds are somehow secondary to the linguistic, cultural framing of sounds when developing the skill to listen adequately simply seems unintuitive. Indeed, many of us have encountered music far from our home culture and still had experiences of this music that were deeply meaningful, despite not having access to adequate cultural framing. Despite the sociocultural perspective's insistence that it is impossible to discover in this way what is valuable to discern in this music, we seem to be able to pick something up from just the sounds.


Sense perception


Other perspectives have therefore argued that the sociocultural perspective relies too heavily on descriptions of music and does not take into consideration music's non-verbal quality and the actual sense perception of music. Sounds are themselves agents that can point out to us what is important to discern. By listening, we enter into a community of praxis with sound. As Clarke (2005) writes,


"Ideologies and discourses, however powerful or persuasive they may seem to be, cannot simply impose themselves arbitrarily on the perceptual sensitivities of human beings, which are rooted in (though not defined by) the common ground of immediate experience." (Clarke, 2005, 43)


Clarke argues that calling a sonority 'tonic' would simply not make sense to a person who has not already experienced tonal music on which one can map this term. If there is no basis grounded in immediate experience, it would be impossible to appropriate words as meaningful. The sociocultural perspective's focus on language leads to a disregard for music's sensuous appearing–it leads to the idea that any sound can be perceived in any way depending on the discourses that frame them. It, therefore, leads to the kind of relativism that Tia DeNora (2003) associates with sociologists. Contrary to musicologists' emphasis on sound's material properties, sociologists one-sidedly focus on linguistic discourses:


"Most sociologists do not bother with the question of music’s specifically musical properties and how these properties may ‘act’ upon those who encounter them. Indeed, sociologists tend to infuriate musicologists when they suggest that musical meaning – music’s perceived associations, connotations, and values – derive exclusively from the ways in which music is framed and appropriated, from what is ‘said’ about it. Musicologists often assume (and in some cases correctly) that this notion overrides any concept of music’s own properties (conventions, physical properties of sound) as active in the process of perceived meanings." (2003, 36)


Yet, going too far in the direction opposite to the sociocultural perspective–in the direction of DeNora's musicologists–would lead to another extreme in the form of determinism: the view that everyone who can hear and encounters a certain sound will hear it in the same way, because the acoustic properties of sound determine the experience.


This kind of determinism was famously valorized by the qín player Bó Yá (伯牙), who was active around 300 B.C. For Bó Yá, there was no point in making music unless all listeners heard exactly the same thing. In Bó Yá's aesthetics, all listeners–including the musicians–must share the same experience when encountering the same properties of sounds for the musical situation to be successful. The exactness with which Bó Yá and his friend Zhōng Zīqī (鍾子期) could describe the meaning of sounds–as retold in the "Tang Wen" chapter of the Liezi (列子)–illustrates the valorization of this kind of determinism:


Bó Yá was a good lute-player, and Zhōng Zīqī was a good listener.

Bó Yá strummed his lute, with his mind on climbing high mountains; and Zhōng Zīqī said:

Good! Lofty, like Mount Tài!

When his mind was on flowing waters, Zhōng Zīqī said:

Good! Boundless, like the Yellow River and the Yangtze! 

Whatever came into Bó Yá's thoughts, Zhōng Zīqī always grasped it.

(Graham 1960, 109)


In the ideal of listening described here, both listeners (in this case, a performer and an audience member) share the same mental images evoked by the music. To be a good listener means to be able to understand the sounds in a very exact way. The meaning of the music is determined by the sounds alone, and a good listener is able to hear the meaning that resides in the sounds.  


The absurd consequence of adopting a deterministic view of music listening is illustrated in the retelling of what happened when Zhōng Zīqī (鍾子期) passed away. Because Zhōng was the only one who understood Bó Yá's playing, Bó Yá destroyed his instrument upon Zhōng's passing: there was no point in continuing to play without anyone to 'understand the sound' (zhīyīn 知音). If the listeners did not share the exact experience, the music was meaningless.


Two worlds


The extremes of both determinism and relativism are clearly wrong. What is needed is a middle way that recognizes that the sensuous qualities of sound are relevant but do not invariably lead to the same experience in every listener.


The way Wallrup expresses such a middle way is by describing how a certain musical attunement arises from a dialectical process between two worlds that represent the music's properties on the one hand and the surrounding culture on the other hand. The musical attunement cannot be solely determined by either but arises as a dialectical process involving two interdependent worlds:


"We seem to have one world constituted by the cultural context of the artwork, and then another world worlding in the same work of art - but they have to do with each other, they cannot be separated. When the world belonging to the work is gone, the world worlding in the work is radically changed" (2012, 339). 


Wallrup agrees with the view of Heidegger that changing the cultural world means changing the attunements that are possible for artworks to facilitate. But Wallrup differs from Heidegger in describing that new attunements can arise from the world in the work of art when it meets a new world in the form of a new cultural context. This new attunement will bring forth a world that is different from the work's 'original' world; a new interpretation will arise. The new interpretation is, however, a meaningful one. It does not necessarily have to be as Heidegger claims–that we in the present day only can enjoy Greek art as historical curiosities or as 'pretty objects'. We can make them sources of genuine attunement:


“Attunement is at work when there is resonance between music and listener and a musical world emerges. However, we must not forget that 'world' is ambiguous. When the world that belongs to the artwork has faded away, the world of the work is changed” (Wallrup 2012, 343).


Affordances


DeNora's suggestion for finding a middle way is instead to introduce the idea of affordances, a term that comes from the ecological theory of perception (Gibson & Gibson 1955). Affordances are features of the environment that offer relationally constituted possibilities for certain behaviors and interactions. They neither invariably cause every perceiving subject to have the same experience, nor are they unrelated to the actual properties of the music. Affordances are opportunities for interaction. They arise neither in the material world nor 'inside' of sentient beings but are relational. Different species, for example, experience the material world in different ways because they perceive different affordances. Varela et al. (2016) explain that for Gibson,


"affordances consist in the opportunities for interaction that things in the environment possess relative to the sensorimotor capacities of the animal. For example, relative to certain animals, some things, such as trees, are climbable or afford climbing. Thus affordances are distinctly ecological features of the world." (Varela et al. 2016, 203)


The idea is that talking about the music's affordances points toward musical listening as a perceptually guided activity: the affordances are potentials for interaction that may or may not be acted upon. The ecological perspective emphasizes that musical listening is a creative act, not just a state of being "merely receptive" (see Schmidt 2016, 140) because the affordance structures are not 'causes' or 'stimuli' that lead to certain reactions but are opportunities for interaction by encultured beings. Music is "something acted with and acted upon" (DeNora, 2003, 48). When acted upon, they are not necessarily acted upon in exactly the same manner. In this way, the ecological perspective seeks to avoid determinism while at the same time giving primacy to sense perception. 

 

Exploratory perception


That listening is not passive–we do not merely receive sense data from the 'outside'–but rather a creative, perceptually guided activity is a key idea in the ecological perspective. Gibson would typically frame such activities as 'exploratory': perception is an activity that searches for clarity and ways of acting 'optimally' with the environment. Gibson writes that "a system 'hunts' until it achieves clarity" (in Clarke 2011, 204). Clarke, who has elucidated Gibson’s work in relation to music, writes that


“Perception is essentially exploratory, seeking out sources of stimulation in order to discover more about the environment and to act optimally within it” (Clarke, 2011, 204). 


The description of perception as something exploratory closely mirrors a similar expression of this idea put forth by Merleau-Ponty, who famously wrote how


“for each object, as for each picture in an art gallery, there is an optimum distance from which it requires to be seen: . . . at a shorter or greater distance we have merely a perception blurred through excess or deficiency. We therefore tend towards the maximum of visibility, and seek a better focus as with a microscope” (1979, 302). 


Imagine going into an art gallery to find Monet’s painting of water lilies. If we stand too close, it will hinder us from seeing water lilies at all and leave us with only brushstrokes. Some art 'only' consists of brushstrokes, so in those cases, only seeing the brushstrokes is an adequate mode of perception. That kind of art does not afford a figurative reading of it. Monet's paintings, however, are figurative, so we are seeking to find a distance at which we can also read the figures–a position where we get an "optical grip" of the object. We, therefore, take a few steps back so that we can perceive the painting with maximum clarity. The painting affords the activity of taking a few steps back. Merleau-Ponty writes that what guides our placement in relation to the painting is our human mind-body's attempt to find a way of 'co-existing' with the world:


“My body is geared into the world when my perception presents me with a spectacle as varied and as clearly articulated as possible, and when my motor intentions, as they unfold, receive the responses they expect from the world. This maximum sharpness of perception and action points clearly to a perceptual ground, a basis of my life, a general setting in which my body can co-exist with the world” (1979, 250).


The wording of Merleau-Ponty is crucial here because he talks about the object as 'requiring' to be seen in a certain way. Phenomenologically, it does not feel as if we are subjects that are simply seeking out more clarity. It does not feel as if we are individual agents going out and trying to 'master' the world by achieving a "maximum sharpness of perception". Saying that it feels like the object itself invites, or even 'requires', a way of seeing it with clarity is more accurate, but even this could imply that this process is not automatic and hidden from our thinking unless we try to analyze the situation. From the ecological perspective, too, it would be a mistake to say that the 'opportunities' for interaction are consciously perceived as opportunities. The way I understand Merleau-Ponty and Gibson, this search for clarity is not really a conscious effort. This 'co-existing' rather expresses itself as a spontaneous adjustment. When it comes to music, it is such an effortless and non-reflective attunement to the world that Wallrup captures with his perspective introduced above.


Critiques of Merleau-Ponty and Gibson


In describing the process of finding this 'resonance' between us and the world–the process of finding the "maximum sharpness of perception and action" that allows us to "co-exist with" and attune ourselves to the world–, Merleau-Ponty relies on a certain conception of embodiment and sensorimotor perception. Shusterman (2009) has critiqued Merleau-Ponty's theory of embodiment as postulating a "universal and unchanging primordial body consciousness" (2009, 139) that misses out on articulating the subject's embeddedness in social and cultural contexts. This is likely a well-founded critique, but it would be a mistake to assume that for Merleau-Ponty, this primal body consciousness is all that is involved in the process of attuning ourselves to a certain world, or even just finding the right distance from which to see a Monet painting. If this were the case, everyone, because of the 'same' sensorimotor search for clarity, would find the same distance to the painting regardless of the cultural context that the work appears in, something that we know is not true. Only emphasizing this sensorimotor search for 'clarity' and basing this in a universal body consciousness and the perceptual principles that humans share as members of the same species would lead to the same form of determinism as when 'musicologists' (to use DeNora's caricature) believe that the meaning of a piece of music resides in the properties of sound. We have only then replaced the sounds with a more sophisticated theory of affordances.


This is, in a sense, the criticism that Varela et al. (2016) direct towards Gibson's ecological theory. Despite being ecological in orientation, Gibson's theory actually fails to fully articulate the deep relationality of perception. This is because for Gibson, that which is perceived by "picking up" and "seeking out sources of stimulation" is not itself fundamentally affected by the nature of perceptually guided action because it is invariant:


"The observer may or may not perceive or attend to the affordance, according to his needs, but the affordance, being invariant is always there the be perceived." (Gibson 2015, 130)


In Gibson's view, the affordances are unique to the species and the special interests of the subjects that perceive them but are not so much 'constructed' by those subjects as the subjects are enabled to 'pick out' those affordances from the environment. This perspective, despite DeNora arguing otherwise, therefore downplays the role of perception as a truly creative act, and it therefore leads to a kind of determinism.


These two critiques of Merleau-Ponty and Gibson can be applied to Clarke's conception of music listening as ecological: although Clarke argues that his perspective is a middle way between determinism and relativism, in reality he emphasizes the 'properties' (the affordances) of the perceived and the universal 'perceptual principles' of the subject to a far greater degree than he emphasizes the "cultural context" that he claims to be equally important (2005, 93). The cultural context seems to be there only to support the universal perceptual principles in detecting the existing affordances, so that the music is perceived adequately.


The enactive perspective


As long as one takes any one of the sides to be more important than the other, one does not go far enough in recognizing dependent origination. The perceived arises in interdependence with the perceiver and neither entity can be said to exist independently. Objects are neither apart from the subject nor the same as the subject: “[m]atter is no other than mind; mind, no other than matter. Without any obstruction, they are interrelated” (Kūkai quoted in Hakeda 1972, 229). A corrective to Gibson that fully acknowledges this interrelation and interdependency has been given by Varela, Thompson, and Rosch (2016[1991]), who explicitly ground their theory in the Madhyamaka philosophy of Nāgārjuna. Against Gibson’s positing of the environment as 'independent', they describe it as enacted. The term enactive means that


"cognition is not the representation of a pregiven world by a pregiven mind but is rather the enactment of a world and a mind on the basis of a history of the variety of actions that a being in the world performs." (2016, 6)


Perception is not 'detection' but is a result of the fact that "sensorimotor patterns enable actions to be perceptually guided" (2016, 203). Ultimately, there are no common entities out there in the world that can be seen in different ways. This amounts to abandoning the notion of "simple location". Music does not consist of sound waves that reach different individuals who then interpret this acoustical information. Experiences are not 'subjective happenings' in which some kind of interface between mind and body negotiates the experience to create a 'representation' of the world. Tashi Tsering explains that "there is no common object of" beings' "respective sensory consciousness" (in Yakherds, 2021, 282).


Rather than saying that perceptual appearances vary according to the different ways that different beings mentally construct and process 'the same' items, we say that they appear differently according to different modes of enactment that arise relationally. Neither the 'subject-side' nor the 'object-side' exists independently; both arise dependently. Music is neither in the sounds nor in the act of listening, but arises as a dialectical process in which music is created by listening and in turn creates listening. Kong Yingda expressed it beautifully when he wrote:


Music comes from people, yet returns to effect people. This is like rain coming from the mountain yet returning to rain upon the mountain, like fire coming from wood yet returning to burn wood. (in Cook 1995, 13)


Since all things arise interdependently, all things are "empty of any independent intrinsic nature" (Varela et al., 2016, 224). Like the arguments of Nāgārjuna, the theory of enactment does not entail dualism–the view that the world and subject are different–or monism–the view that they are the same. Being true to the tenor of Nāgārjuna's work, Varela et al. argue that because there are no foundations on which to 'ground' this theory of enactment, the theory itself is also 'groundless'. Even their own concept of enaction is therefore only a "provisional and conventional activity of the relative world" that hopes to point beyond itself to a "truer understanding of groundlessness" (228).


Hearing something as something and something as nothing


John Cage spoke of listening to his pieces of music in ways that sometimes made it seem like we listen to them without a musical mode of listening–outside of a musical world. Cage praised music that sounded like when he was "not listening to music, when sounds simply happen", and he praised music that sounded like "just sounds". He loved sounds just as they were and had "no need for them to be anything more":


And they say, these people who finally understand, they finally say, 'You mean it's just sounds?' thinking that for something to just be a sound is to be useless… whereas I love sounds, just as they are and I have no need for them to be anything more than what they are. (in Sebestik 1992)


Cage seems to suggest that listening to his music means having a kind of 'unmediated experience' of sound. But it is not possible to hear something unmediated and outside of worlds. We are always attuned. It does not make a difference whether we classify 'hearing sounds as just sounds' as a musical mode of listening or as an 'everyday-life' mode of listening (see McMahan 2008, 142): both are still part of a world. Cage's mode of listening in which the listener hears sounds as 'just sounds' is no less a constructed mode than that of listening to Classical music. They are both modes of listening that are rooted in praxis–they are skillful know-hows for enacting phenomena. When sounds are heard in a mode of listening, they are heard as relating to other sounds in certain ways. They are heard as something. This does not mean that all phenomena are heard symbolically and through concepts. It means that even hearing them as mere sounds in the music of Cage still gives them a meaningful, albeit non-conceptual, articulation.


While disagreeing on important points, all the perspectives drawn upon above–the enactive, ecological, socio-cultural, and phenomenological–agree that we always experience phenomena as something. They are all critical of the positivist idea that sense experience can be 'immediately' given to us in a way uncontaminated by past experiences and interests. They agree that we should be deeply critical of the realist idea "that there is a way that the world essentially is in itself independent of any conceptual framework and that the mind can know this world" (Thompson, 2020). There is no "pure experience" of ordinary phenomena that is not constructed–such a non-constructed 'pure experience' would be a kind of white noise since the world would lose its structure if everything were to be equally important (Marton & Booth 2000, 153). Wallrup writes how "[a]s soon as we take part in the world of a musical work, the materiality shows itself as a part of that world, and this means that we do not have any unmediated relation to it" (2012, 385). Sounds are always heard as something and as part of a world. If it is an unmediated experience, it is absolutely nothing.


The mistaken belief that we can have an unmediated experience of something was propagated by the influential 20th-century popularizer of Zen D.T. Suzuki, who described aesthetic expressions such as dry-landscape gardens and Zen paintings as being able to point toward such a "pure" and unmediated experience (Sharf 1995, 248). This had historical predecessors in the 'anti-symbolist' tendency of Zen art: Zen poems were often about "the simple recognition of phenomena" and about redirecting "our focused attention to phenomena for their own sake" with the purpose of "reversing the symbolizing habit of mind" (LaFleur 1983, 23). But D.T. Suzuki 'upped the ante', as David McMahan shows in his influential study of 'Buddhist Modernism', by taking this tendency further and postulating that Zen art presents reality completely pure and unfiltered. Zen art was said to be based on an "immediate access to and representation of reality that transcends the personal and social" (2008, 134). As was argued by the perspectives above, such a 'reality' that exists separate from the 'personal' and 'social' does not exist. As both Mādhyamikas and the enactive perspective view things, phenomena arise in interdependence with the perceiver and neither object nor subject can be said to exist independently. Because all phenomena are enacted, there is no pre-given world that a pre-given mind can have unfiltered access to. 


In Yogācāra terminology, the concept of paratantra is used to describe the basis for all phenomenal appearances. The Trisvabhāvanirdeśa explains this paratantra as "that which appears, in opposition to the way in which it appears" (Williams, 2009, 90). The paratantra is, according to the Saṃdhinirmocana Sūtra, "the dependent origination of dharmas, that is, the causal flow" (Williams, 2009, 90). That it is the essence of dependent origination means that it does not arise by itself outside of us as something we can have an objective perception of. When Kasulis describes Zen meditation as something that takes the meditator back to a point "where the specifics of the situation dissolve back into the meaningless flow, the as-ness or presencing", this 'meaningless flow' can at best be the paratantra, but as experienced without the imaginary concepts and dualisms that misapprehend this flux precisely as phenomena that exist by themselves–as phenomena with self-nature (svabhāva) that exist outside of us. The meaningless flow is one in which phenomena are seen as what they truly are, and what they truly are is neither something that exists outside of us as some kind of pre-given, pure reality, nor something that is only our own solipsistic Vorstellungen, what the Yogācārins call vijñaptimātra. What they are is dependent origination, which is emptiness.


From both Madhyamaka and Yogācāra perspectives, D.T. Suzuki's notion of a "pure experience" comes across as unorthodox. This is even more the case if we consider the view of the Zen philosopher Dōgen. According to contemporary philosophers such as Kasulis (2018) and Davis (2011), Dōgen proposed a kind of perspectivism: whatever appears always appears from some perspective. The "meaningless flow" achieved in zazen is not a "white noise" but is a phenomenality that is enacted from praxis–from a world. As Davis (2011) summarizes the view of Dōgen, engagement–even awakened engagement–is only ever possible from a "perspectival opening within the dynamically interweaving web of the world" (6). What "meaningless" means is an engagement that is free from self-interest and dualistic perceptions. From such an attitude, the meaningless flow is not a white noise–a view of nothing from nowhere–but becomes an "infinite resource out of which new situations and new meanings can arise" (Kasulis, 2018, 230). Importantly, and as we will see in more detail below, the "new situations and new meanings" that can arise are neither infinite in the conventional meaning of this term nor random, since what appears is determined by the process of dependent origination and rooted in praxis.


Freedom


The idea of modes of listening is valid in art experiences despite the fact that art experiences can feel 'free': we are not, when encountering art, merely receiving 'information' but usually feel 'co-creative' of the experience in some stimulating way. We are not bound by the art to perceive it in a predetermined way but feel a sense of freedom in our encounter with it. More than postulating a "pure perception" of phenomena, what I believe was even more important to Cage was giving a sense of freedom to the listener. In the continuation to the passage quoted above, Cage says that he does not want sounds to be heard symbolically:


I don't want a sound to pretend that it's a bucket, or that it's president, or that it's in love with another sound. I just want it to be a sound. (In Sebestik, 1992)


The problem with symbolic perception is that it fixates the meaning and makes the listener someone who merely interprets this meaning. In symbolic music, the listener comes to be someone like Zhōng Zīqī: rather than approaching the art with a sense of freedom, the audience decodes a message. But when we remove the message-decoding function from art, we are not left with a pure perception of objective reality. Instead, space has been freed up for the free play of the audience members' minds to spontaneously function. 


In the writings of Agnes Martin, an artist who values this freedom, we therefore find a philosophy of art that seems to be the total opposite of the ideal of zhīyīn. Martin wrote that "painters can't give anything to the observer / People get what they need from a painting" (1991, 36). Elsewhere, she wrote that the cause of a particular response to art "is not traceable in the work. An artist cannot and does not prepare for a certain response" (1991, 18). I believe that it would be a wrong interpretation to take this statement to mean that the way in which the art looks has nothing at all to do with what the viewer 'gets' from the painting. This statement should not be taken as an example of the kind of relativism that is opposite of Bó Yá's determinism. Rather, what the viewer 'gets' is conditioned by the relationally arising affordances. From any non-deterministic view, it is true that artists cannot 'give' anything to the observer in the way that Bó Yá 'gave' mental images like mountains and flowing waters to Zhōng Zīqī. In any non-deterministic view, the audience members have to actively create the experience, but they do not create it freely. They create it as agents in an intersubjectively meaningful reality in which the sensuous properties of the art and the cultural background of the audience are important in enacting affordance structures that seem to 'automatically' and attunementally invite certain modes of being.


One reason, I believe, that Martin is dedicated to downplaying the role of the painting in the experience is the importance she places on the audience feeling 'free' when they connect with her art. Feeling co-creative when looking at art or hearing music is part of the reason why we take pleasure in art and find it meaningful. Lee Ufan captured this perfectly when claiming that art objects (in this case literally "painting and sculpture") "are uninteresting if they are like naked words that can be understood by anyone" (2018, 273). According to Lee, Zhōng Zīqī's listening experience is not an interesting one because it is similar to receiving information:


"Works of art must take part in a living dialogue. When a work of art and the viewer are in a relationship that creates resonance between them, the result will be a secret dialogue. Dialogue is not created by works of art with little physicality and no secrecy, which attempt to have general appeal and be accessible to all sorts of people. Viewers are only required to understand them as silly information" (Lee, 2018, 273).


This feeling of freedom is, however, not removed just because we say that the affordances of the painting are important. On the contrary, being responsive to these relationally arising affordances is exactly what enables the feeling of freedom. In other words, it should not be thought that this talk of 'modes of listening' has anything to do with maximizing the clarity of some sort of information transfer. Art is often mysterious, polysemic, ambiguous, and arouses the imagination. Or put another way, it is an essential part of many modes of listening to precisely be secret. But to even be able to enter into such a 'secret dialogue' with the art requires some prerequisites, and it is these prerequisites that we are pointing to with 'modes of listening'.  


Hearing music differently


Modes of listening are deeply relational, culturally contingent, and brought forth nondually from unique histories of enacting and being enacted upon by an 'environment' that only exists conventionally. From such a view, some might surmise that it is obvious that people should always experience 'the same' phenomenon (such as the same piece of music) in significantly different ways: people have different interests, bodies, cultural values, and histories of engagement with phenomena–all of these variables will impact what kind of world they enact. Dōgen, for example, wrote in the 13th century:


"Not all beings see mountains and rivers in the same way. Some see water as a jeweled ornament, some see water as wondrous blossoms, and hungry ghosts see water as raging fire or pus and blood. Dragons and fish see water as a palace or a pavilion. Some beings see water as seven treasures or a wish-granting jewel and others see water as a forest or a partition. Some see it as the Dharma nature of pure liberation, the true human body, or the form of the body and the essence of mind." (Dōgen in Heine, 2020)


Lama Shabkar in turn sings in his masterpiece Khading Shoklap that 


“[appearances] have no other creator,

But appear according to how they are labeled and grasped

Through the habitual patterns and fixations of one’s conceptual thoughts.” (Kunsang 1986, 40)


Just like Dōgen, Shabkar emphasizes that what some sentient beings perceive as water, others perceive as nectar, and that "what is light for some is darkness for others" (Kunsang 1986, 39). Both of these authors were likely finding inspiration in the Yogācāra school’s common metaphor of the four views of one water (一水四見, Jp. issuishiken). This metaphor describes how a deity sees water as bejeweled, a hungry ghost sees it as pus, a fish as a palace, and a human as water. The "what" in the statement "what is light for some is darkness for others" does, as already explained above, not refer to material reality, like the same 'sound waves'. There is no objective 'stuff' out there (like sound waves) that is being processed by separate mind-bodies. When Asvabhāva in his Commentary on Compendium of the Great Vehicle writes that hungry spirits see a stream of water as pus, we should not interpret this to mean that there is some 'stuff' out there that different beings simply 'interpret differently':


"Due to the force of the ripening of their respective karmas, hungry spirits see a stream of water as things like pus. The same thing that is viewed by the hungry spirits appears to animals such as fish as their habitat, and they live there. Humans perceive it as sweet, clear, and refreshing water. They wash with it, drink it, and swim in it." (in Yakherds 2021, 281-282) 


The Compendium of the Great Vehicle even uses the fact that different people experience "the same object" differently as proof that there is no external object:


"We assert that objects do not exist because hungry spirits, animals, humans, and gods each perceive them differently in accordance with their respective natures." (in Yakherds, 2021, 281)


As the quote from Asvabhāva illustrates, the traditional Buddhist explanation for how different modes of enactment come about is karma. Dōgen and Shabkar refer to the fact that our individual karmic propensities–or just plainly our karma–will cause us to have different experiences. The theory of karma is a theory in which karma 'belongs' to individual mental streams: a singular mental stream (a "person") acts and the karmic consequences ripen in that very same mental continuum. Beings, because they have different mental continuums that are shaped by different previous actions, therefore experience the world differently.


Limitations to difference


For composers and musicians, the classical karmic framework makes for a difficult starting point because it seems to provide us with a framework for explaining that people do in fact experience different worlds. As composers and musicians, we are not primarily interested in how completely different everyone’s experience is. On the contrary, when we perform for an audience, we are interested in communicating with many people simultaneously. Ideally, we want everyone to take to heart what we play. We know very well, however, that due to people’s previous experiences, where someone hears melodies, someone else hears just 'textures'; where some hear rich chromatic harmony, someone else hears just 'wrong notes'; where someone hears a beautiful microtonal inflection, someone else hears a note that is out-of-tune; and where someone hears happiness, another one hears anguish (see Svensson, 2023 for a small scale empirical study that investigates different people's reactions to the same piece of music). Our experiences of talking to others about how they perceive the same music as us sometimes seem to validate the theory of karma's individualism. Dōgen even wrote that there is no limit to how phenomena can be actualized:


"If we are to inquire into the manner and style of the totality of phenomena, we should know that beyond their being visible as circularity or angularity, there is no limit to the other things the ocean or the mountains can be. We should bear in mind that there are many worlds everywhere." (Dōgen in Kasulis, 2018, 228)


The composer cannot take all these different responses into account when she creates a piece. She cannot create something that has the same effect on every listener. To achieve this, the music would have to be individually adapted for each listener. For the experience to be equivalent, it would not be possible to have the same organization of sounds–equality is not the same as equity. Musicians do not have the skill to communicate in an equitable manner in the way that a Buddha or high-level bodhisattva can. As recounted in numerous sūtras, they alone have the power to communicate "to all beings in accord with their mentalities". In the seventh book of the Buddhāvataṃsaka, we hear Mañjuśrī, empowered by the Buddha, exclaim that


"all the Buddhas in the worlds in the ten directions know that the inclinations of sentient beings are not the same, and so they teach and train them according to their needs and capacities. The extent of this activity is equal to the realm of space of the cosmos." (Cleary 1993, 272)


But how is it then that Zeami so confidently in Kakyō (花鏡 'A Mirror of the Flower') could speak of the master performer as one who has cultivated the ability to take on the perspective of the audience that is free from the ego's perspective (我見, gaken)? Zeami writes that "[w]hen you exercise your riken no ken [離見の見, the seeing of detached perception] you are of one mind with your audience" (quoted in Odin 2001, 115). If this audience is comprised of a multitude of perspectives, how can the performer take on them all? As Yusa explains, "[r]iken no ken is the mental eye by which the actor knows what the audience sees of him and identifies his viewpoint with that of the audience" (1987, 335). But how could this be possible if the audience is comprised of different subjective perspectives that enact different worlds? This is precisely where Dōgen is quick to add an important caveat:


"Although what is seen may differ drastically according to the one perceiving it, we should not be too hasty in accepting this as absolutely so. Are there really many variable ways of seeing any particular single object?" (in Heine, 2020)


Kasulis comments that for Dōgen, although the present moment is open to infinitely many meanings, there is also "an infinite number of meanings that do not fit the present occasion"; not just any interpretation is viable. Kasulis illustrates this by comparing the infinite number of integers to the infinite number of decimals found between two integers. "Like the domain of real numbers between 2 and 3, the number of possible meanings can be infinite but nonetheless limited." (2018, 232) The variation may be infinite, but takes place within limits.


The idea that variation in experiences of the 'same' phenomena happens within limits is also one of the insights of the phenomenographic research tradition. In this tradition, the goal is to map all possible variations of how different individuals qualitatively perceive the same phenomena. One might play a piece of music and through qualitative interviews study the 'outcome space' of all different variations in how this piece of music is actualized as a phenomenon (see Svensson, 2023 for a small-scale version of such a study). Usually in phenomenographic research, the outcome space is limited. Marton and Booth (2000) write that, on the one hand, it is true that each phenomenon is possible to experience in an infinite number of ways, but that, on the other hand, humans will still, paradoxically, experience this phenomenon in a limited number of qualitatively different ways—no matter which phenomenon we are dealing with (2000, 135).  Marton and Booth thus echo the insight of Dōgen, that an infinite number of variations happen within limitations. The rationale for this insight is, however, different. For Marton and Booth, it is because of a kind of 'cognitive' limit: there is a limit to the number of ways we can perceive phenomena because there is a limit to how many aspects of a phenomenon we can bring into focus. For Dōgen, however, what determines the limit of interpretations is a matter of praxis and context, something that has its base in intersubjective experiences and culture.  


According to Kasulis' interpretation of Dōgen, what makes a certain enaction of phenomena possible is a question of context and occasion. Even though there seem to be limitless possibilities, "we see and grasp only what reaches our eyes in our praxis"–"[w]e realize meaning through complete engagement with the present context" (Kasulis, 2018, 230). Because the present context is intersubjectively and socially conditioned, the focus on praxis and what is contextually 'appropriate' leads us straight back to the idea of adequate modes of listening. Modes of listening are neither in the music nor in the subject but arise as relational and as intersubjectively grounded in praxis. Yet, accounting for intersubjective understanding and shared experiences becomes complicated and unintuitive on the karmic model precisely because it preserves an ontology of discrete streams.


Collective karma


In order to account for intersubjective experiences and culture on the karmic model, there must be a way for different mental streams to have an impact on other mental streams. To account for intersubjective interactions, the theory of karma developed so as to say that phenomenal impressions not only are caused by 'individual karma'–what appears to me is not only the ripening of prior, private karma–but can also be caused by other mental streams as long as these are 'suitably linked'. Vasubandhu explains that:


"[t]here is mutual determination of impressions through reciprocal influence. [...] Mutual determination of impression occurs among all beings suitably linked by means of reciprocal influence of impressions. 'Mutual' means between one another. Accordingly the distinct impression arises in one mental stream from some distinct impression in another mental stream, not from a distinct external object" (in Siderits 2007, 168). 


Siderits explains that two streams become "suitably linked" when "the prior histories of each stream [...] have led to certain similarities in present experiences" (Siderits 2007, 170). Hungry ghosts (preta) have karma so different from mine that so far no hungry ghost has managed to interact with me in this life. Animals also have very different karma, but are similar enough for us to be able to share a world and be suitably linked. The idea that different mental streams are able to 'share' the same world and not just exist in solipsistic bubbles is explained by this idea of 'collective karma'. Generally shared karma accounts for why human beings seem to relate in comparable ways to the same objects and explains why it is possible for one mind to act upon, and thus influence, another mind. The more similar we are, the easier it will be to influence one another, leading to the human mind being socially constituted.


The Yogācāra classic Chéng Wéishì Lùn (成唯識論) speaks of how different minds are like different lamps shining together to form a singular beam of light that illuminates an intersubjectively shared object of perception. This metaphor explains "how the cognitions of the same things by different sentient beings come to "mutually resemble" (Chi.: xiangsi 相似) one another" (Brewster 2018, 124). In reality, there is no 'shared object'. There is no actual 'core' or 'essence' that is being 'interpreted' differently; these different variations and different beams of light are all there is. They are not variations of anything. As we said above, there is no objective 'stuff' out there (like sound waves) that is being processed by separate body-minds, but the power of the "reciprocal influence of impressions" is so strong that we perceive the world as if it is. 


From Yogācāra to Huáyán


Siderits brings attention to a potentially interesting problem in Yogācāra philosophy regarding artifacts such as artworks. Artifacts can be said to fall in between being explained as impressions from 'the natural world' and being impressions from other mental streams. As seen above, Vasubandhu holds that there are two sources of impressions: 


"In addition to the ripening of karmic seeds, impressions can also be caused in a mental stream by the occurrence of a distinct impression in another suitably linked mental stream." (Siderits 2007, 170)


The natural world is typically explained through the former, while, as Siderits (2007) explains, "some of our sensory experiences of other persons will be explained in terms of causal laws linking a desire in one mental stream with an impression in a suitably linked distinct mental stream" (172). This raises questions about how artifacts should be classified. This is an especially important issue in discussions of art and music. Siderits' lucid explanation is worth quoting in full:


"An artifact like a pot is the result of a desire on the potter's part, so the impressions-only theorist will want to explain our experience of it not in terms of karmic seeds, but in terms of a desire in a distinct mental stream (namely the potter's). But our sensory experiene of the pot isn't confined to just those times when we are 'suitably linked' with that mental stream. We can continue to have a pot experience when the potter isn't around anymore, for instance when the potter has died. Now the hypothesis of karmic seeds was meant to explain how something in the remote past could be the cause of a present effect when everything is momentary. The idea was that the cause produced a seed, which produced another seed, etc., in an unbroken series, until conditions bring about the ripening of a seed to produce an impression. And this makes sense when the remote cause and the seed series and the impression all being to the same mental stream. But it isn't clear how the seed hypothesis could work in the case of our experience of artifacts. The seeds couldn't be in the potter's mental stream, since we can have pot-experiences after that stream has ceased. So did the potter's pot-making desire cause seeds in the mental streams of those who now see the pot? Suppose the pot I see now was made ten years ago. Then the potter's desire would have caused a seed in 'my' mental stream ten years ago, and that seed would have been replicated in an unbroken series up to the present, when I finally have the experiences that count as the ripening conditions (such as the experience we call walking into a ceramins gallery.)" (2007, 171-172).


Siderits' point here is that the Yogācāra theory of karmic seeds becomes very complex once artifacts are taken into account. The theory of the plurality of sensory worlds–in which a plurality of discrete 'mental streams' with their own 'store-house-consciousnesses' influence each other–seems to lead to the conclusion that an artist plants seeds in every mental stream that in the future might encounter their art.


Moving away from seeds


While the Yogācāra theory of discrete streams and karmic seeds becomes very complex and perhaps strikes a contemporary reader steeped in scientific materialism as a theoretical fantasy, other perspectives have proposed models for intersubjective experiences that preserve the ontology of discrete streams without the theory of seeds. In more contemporary times, such models for intersubjectivity have been proposed by sociologists, social psychologists, phenomenologists, and developmental psychologists. Instead of explaining intersubjectivity through the planting of seeds, some theorists have seen it as basically an inferential process in which communication, and especially language, constructs the shared social world. For others, like Mead (1934), the foundation of the intersubjective world was the ability to imagine oneself in the roles of others and to see oneself from others' perspectives. For evolutionary theorists like Tomasello (2019), shared intentionality–the ability to cooperate with other individuals by creating a shared agent "we" that operates with shared knowledge, sociomoral values, and shared intentions–can be explained as an evolutionary adaptation unique to some sentient beings and especially strong in humans. For Heidegger, such a formation of collective intentionality was not merely a process of imagining oneself as seen by others or of creating a shared framework through shared language. Rather, it was to be explained as a more direct, non-conceptual attunement to others. For Heidegger, a basic constituent of Dasein was simply being-with, a thought later taken further by Di Paolo and De Jaegher's interactive brain hypothesis, which postulates that the brain "is primarily an organ of relational cognition" (Varela et al. 2016, xlix).
Drawing upon a formulation by Schilbach, they suggest that "the contents of mental states (of oneself or another) are experienced via quasi automatic attunement to others" (Schilbach et al., 2006, 727-728, quoted in Di Paolo and De Jaegher, 2012, 2).


All these non-Buddhist perspectives are similar to Yogācāra in that they preserve the ontology of discrete selves: selves that can interact with other selves or attune themselves to other selves. They are theoretically simpler because they do not have to explain intersubjectivity through the postulation of seeds. Yet, there are other perspectives that emphasize that only a more radical ontology can truly explain intersubjectivity. According to these perspectives, intersubjectivity is more than an inferential process that, through communication and imagination, comes to take on the perspectives of others, and also more than individual agents attuning themselves to one another. What is known as the perspective of dialogism suggests instead that relational wholes and interactions should be regarded as the "basic ontological primitives". Dialogists ground this ontology in arguments for "the interdependencies between self and others, and the fact that human beings have socially constituted minds" (Linell 2009, xxiiv)–minds that only exist relationally without own-being:


“relational complexes, whose relata cannot be regarded as preexisting entities (e.g., independent speakers, autonomous individual acts, etc.) but must be understood from within the relational interdependencies.” (Linell 2009, 15)


By denying autonomous individuals ontological primacy and instead focusing on interdependencies, dialogism moves away from having to describe intersubjectivity as subjective happenings in discrete selves. Instead, the field of interdependency becomes the ontological ground. This perspective has similarities to the Buddhist Huáyán perspective.


According to Kongyin Zhencheng (空印鎮澄), the Yogācāra's plurality of sensory worlds is ultimately subsumed in the relational field of dharmadhātu (see Brewster, 2018). On this perspective, the discrete karmic continuums are part of a wider relational sphere where every stream not only influences every other stream but in fact both contains and is contained by every other stream. On this model, which comes from Huáyán Buddhism, we no longer need to explain intersubjective experiences as a complex multitude of individual, separate streams that influence each other. These experiences can instead be explained by their mutual interfusion and interpenetration. If this interfusion were not the case, Chūjin writes, "there could be no cognition" (in LaFleur 1973, 105). In this view, it is not the case that intersubjective experiences of art come about by the artist planting seeds in every mental stream that in the future might encounter their art, but rather they come about because the streams already mutually contain each other. This relationality can be described as emptiness, or positively as the non-obstruction of phenomena–shì shì wú ài (事事無礙)–the conditioned origination of the dharmadhātu. The Yogācāra idea of separate mental streams–that the 'same world' is "comprised of multiple and discrete sensory worlds" (Brewster 2018, 120)–is only another conventional truth, while in reality there is only the one mind. The dharmadhātu "contains and encompasses the totality of the universe" (Brewster 2018, 125).


One mind with the audience


To speak of modes of listening is therefore not about 'imposing' your 'own' mode of listening onto others, or being normative in the usage of power (effectively asking others to hear the way you do). On the contrary, talking about modes of listening requires one to be able to be ego-free and to see reality from the intersubjective perspective–from the conditioned origination of the dharmadhātu and the non-obstruction of phenomena. It is about caring for others rather than controlling others. Since we are not Mañjuśrī, our only tools for "real communication", as Stockfelt calls it, are culturally conditioned modes of listening.


It is people who deny that there are any shared modes of listening at all who are the true egoists, despite claiming not to be so by 'not wanting to impose' anything on anyone else. Such individualists do not want to make any claims about how their music is perceived by others. While they think they give up control, they only give up caring. They make music only for themselves and cannot say anything about how other people perceive it. They come across as being uninterested in communicating or helping others with their art. In answer to the question of whether he thinks about the audience or not, the composer Michael Pisaro gives a modern, cynical reply: such thinking only leads "to market research and commercial music, ultimately" (Pisaro & Dougherty, 2018). Individualists subscribe to a view of people as isolated islands that only perceive their own realities, but in reality, we are all equally unestablished and codependent.


Xenakis, in the conversation with Feldman that was quoted above, has a very different perspective from Pisaro and considers this issue to be a 'non-problem': we are all "made of the same stuff", he says, and therefore one needs "never to think" about the perception of the audience. This "same stuff" is relational emptiness–one mind. It is because we are all made of the same interrelated stuff–which is nothing other than emptiness–that we can come to, as Zeami said, take on the perspective of the audience. Being of one mind with the audience is to be attuned to the collective and intersubjective nature of music. Ford and Green, who above reinterpreted Heidegger's concept of "world" in terms of musical style, write:


"...musical experience is neither "inner", of the "soul" or "spirit", or absolutely individual. Rather the reverse, for pieces do not throw listeners into inwardness, but rather open them out to a nonconceptual world which, whilst registered individually, is also collective. So, rather than having individual control over music, we offer ourselves up to musical experience within the freedom of a collective style. This idea is in accord with Kant's grounding of aesthetic judgement in universal subjective validity, though, and this is most important, with "universal" substituted." (2015, 163)


According to Ford and Green (2015), being attuned to a "collective style" means being attuned to intersubjectivity. The 13th-century Confucian poet Yán Yǔ (嚴羽) placed this idea in a soteriological framework; Yán Yǔ proposed that poetic "enlightenment" (wu 悟) is achieved by the assimilation of a collective style–a complete internalization of an orthodox tradition. The "tradition" is what is intersubjective, so internalizing it is tantamount to transcending the ego and thus mastering riken no ken. The awakened poet in Yán Yǔ's theory attained a state of being "where subjective self, medium of communication, and objective reality become one" (Lynn 2004, 216) and because of this could write masterful poems. Being able to hear, as Zeami emphasized, from a perspective without ego is how we can make our pieces communicate, and for Yán Yǔ, this perspective of non-ego necessarily led to assimilating the intersubjective 'tradition'.


Through Yán Yǔ's focus on a shared tradition, we have found our way back to Stockfelt's initial remark that stylistically informed adequate listening is the prerequisite that enables "real communication" between musicians and the audience. It is, however, important to not interpret either Yán Yǔ's orthodox tradition or Stockfelt's modes of listening as endorsing some kind of cultural conservatism. The insights of Yán Yǔ and Stockfelt do not lead to a poetics that valorizes the stagnation of styles or the establishment of fixed rules for creating music: musicians and composers create and invent new modes of listening and new styles all the time. What is important is that these new styles are created while being of one mind with the audience. If what is created is informed by an insight that self and world are interpenetrating aspects of a single process, what is created cannot not be intersubjectively grounded.  


As many artists know, hearing without ego is a skill that needs to be mastered. To this effect, the act of composition becomes an act of self-cultivation. Becoming a better composer is to become more attuned to the intersubjectively and relationally constructed reality, i.e. to cast off the ego and enter into emptiness. Composing is the site for exploring the relationality and union between the self and the world. As we develop as composers, we come to understand self and world as interpenetrating aspects of a single process and that our artworks arise from the total interpenetration of self and world. 


Music in mono-cultures and transcultural music


In the traditional Confucian view of music as found in the Record of Music (樂記), musical practices were not only seen as something merely enabled by intersubjectivity but also as influential agents in creating a harmonious society of shared values: they were creating intersubjectivity. Lǐ Zéhòu (2010) writes how music "caused the natural human emotions to become socialized" and he explains how music had the power to govern and "directly shape and mold humanized emotions" (26). In the Confucian tradition, music was thought to help the preservation of interpersonal and societal relations. It was said that we "learn to synchronize our emotions to others through music" (Park 2015, 127). Unlike the Buddhist tradition, where music was traditionally considered to be subservient to words (see Music and Buddhist Monastics), Mencius emphasized that rhythmic music has a greater effect on people than words because "without realizing it one’s feet begin to step in time to them and one’s hands dance according to their rhythms" (Park 2015, 126). Music has, in other words, a more immediate and attunemental effect on humans than that of words. The effect on people is so profound that "moral emotions cannot be stopped" (Park 2015, 127). 


The Confucian insight into the intersubjective value of music is echoed in contemporary anthropological and psychological theories. In summarizing contemporary findings on this topic, Henrich (2020) writes that by "moving in step with others", as in rituals with music, "the neurological mechanisms used to represent our own actions and those used for other's actions overlap in our brains" which effectively "blurs the distinction between ourselves and others, which leads us to perceive others as more like us and possibly even as extensions of ourselves" (76). Furthermore, synchronous patterns in ritual also "cause all participants to feel similarly" since synchrony leads to an abundance of mimicry cues from others, and "mimicry is one of the tools we use to help us infer other people's thoughts and emotions", causing the emergence of what Henrich calls a "virtuous feedback loop" (76-77). While this might be true of any synchronized actions, tapping one's feet and moving one's hands to the same music infuses the shared actions with a unifying mood that attunes all participants similarly. To make this point, Henrich quotes anthropological accounts by Lorna Marshall of a dance ritual:


"...Whatever their relationship, whatever state of their feelings, whether they like or dislike each other, whether they are on good terms or bad terms with each other, they become a unit, singing clapping, moving together in an extraordinary unison of stamping feet and clapping hands, swept along by the music." (Marshall 1999, 90, quoted in Henrich 2020, 77-78)


The Confucian tradition, therefore, saw great importance in establishing music 'properly'. If the state establishes the appropriate ritual music, "then all members of the state will be able to share its historical and cultural value" (Park 2015, 127).


The Confucian theory of music is based on an insight into intersubjectivity–our world is a shared world in which attunement is more effective than words. From the theoretical point of view outlined in this text, however (which takes as its paradigmatic music listener the contemporary concert-goer rather than the ritual dancer), it gives too much agency to the music. Transposed to the context of modern, more 'passive' listeners, the Confucian theory that there are musical properties that will act upon those who encounter them in the same way–generating the same effect–does not accord with our experiences of going to concerts with other people. From such a perspective, it even seems highly unlikely that music worked as 'effortlessly' as described in the Record of Music–that it automatically synchronized people to the same moral emotions and values. Maybe we are willing to accept that music can attune a group of people to a shared feeling for the duration of a ritual dance, but we are perhaps not willing to go the extra step and say that they thereby came to adopt the same moral characters and values, although Henrich seems to argue that finding the latter unlikely is merely indicative of our contemporary, Western, individualistic perspective. Perhaps we are therefore quick to point out, as Kasulis (2018) does, that in Confucian literature, "the descriptive is inextricably linked to the prescriptive" (353). The is and the ought are conflated, and descriptive statements in the Record of Music should primarily be read normatively–not necessarily describing reality as it is, anthropologically, but as the authors want it to be.


Yet, the idea that music has a similar effect on different listeners is not exclusive to the Confucian tradition. In the rich tradition of writings on aesthetics found in the Indian tradition, we find a remarkable acceptance that people necessarily experience a piece of art's rasa (the 'taste' of the attunement to art) in similar ways. As Pollock (2016) explains, 


"for Indian aesthetics, there really is no disputing in matters of taste, not because each reader has his own in accordance with the relativist-skeptical stance of modernity, but because all readers have, ideally, the same." (Pollock 2016)


Since all audience members are taken to have the same response to a rasa, there is no theory of how these rasa-s are learned. Indian aestheticians, however, saw that not everyone had the same capacity for enjoying art, but explained that this was not a matter of differences in education but rather a matter of differences in predispositions from birth. Vishvanatha (c. 1350) said that some people, simply due to "merit acquired in a former existence", are born with a "superabundance of sensitivity" that makes them susceptible to relishing rasa.


For Abhinavagupta, it is possible to become a better audience member, but this is not a matter of learning a set of conventions for how to interpret art but rather a process of becoming a clearer mirror that can be sensitive to the meanings already inherent in the art. Abhinavagupta argued that someone who has a "heart by nature like a spotless mirror," and a mind "no longer subject to the anger, confusion, craving, and so on typical of this phenomenal world" will make rasa manifest "with absolute clarity" (Pollock, 2016). By "polishing" the mirror of the heart, a spectator can improve their sympathetic responses to art and intensify "their capacity to recognize rasa" (Pollock 2016, 208).


Both the classical Confucian musical theory and the Indian rasa theory were deterministic to some degree in either believing that meaning resided in the music or believing that society was so homogeneous that everyone enacted the same meaning because everyone shared the same values and experiences. If we are so inclined, we may look back with nostalgia at what it must have been like to live in such 'mono-cultures', in which there were simply no problems concerning the reception of art. Art, we are led to believe, was directly moving humans, and there was no room to hear 'the wrong rasa'.


In contemporary times, we often experience a multicultural landscape rather than a shared culture, leading to fewer common cultural reference points. As a response, many modern musicians seem to move away from culturally narrow listening practices by focusing on more universal and elemental materials, such as the aesthetics of "materiality" and just intonation. This approach may promote a more effortless attunement to the listening modes appropriate for their pieces, in contrast to the learned, sometimes arbitrary-seeming stylistic tropes associated with certain musical practices.


In the pedagogical context, John Paynter argued something similar when he, in the 1980s, wrote of how music education in mandatory British public schools, which was often based on Western Classical music, alienated students whose home environments differed from the 'majority culture' that music education was based on. Paynter found that a much more democratic environment that gave students equal opportunities only came about when music education took cues from the avant-garde music scene by working with noise, graphic scores, free improvisation, and non-stylistically determined musical experimentation. Paynter argued that music composition should be based not upon different 'styles' but merely upon the "common ground we all occupy in being able to hear sounds at all" (1982, 114). By working with a musical expression 'indicative of no culture', we could find something that attempts to speak to what is common to all humans.


As suggested above, the adequate modes of listening to music based on such values might attune listeners more effortlessly than other modes of listening. For instance, much of Xenakis's music, which focuses on raw energies and eschews traditional harmonic language, could have a more immediate impact on listeners compared to the complex harmonic language found in late Romantic music. Yet, this music is not by any means automatically graspable, since all modes of listening are constructions contingent upon complex cultural processes. It is merely a modernist fantasy that we can create music that by emancipating itself from tradition becomes cross-cultural and universal.


But, as the views put forth in this text suggest, the fact that modes of listening arise in relationship to a being's unique histories and cultures–what Buddhists might call a being's saṃskāra-s; their unique dispositions and habitual patterns in constructing their world–does not mean that everyone hears a piece of music differently and that we should adopt a relativistic stance where 'everything goes' and where we cannot talk about adequate modes of listening. While there certainly might be some value in thinking about to what degree the music one creates is able to communicate with people from different backgrounds, and to what degree it relies for its meaningfulness on highly specific tropes, Xenakis himself argued that this is not so much a 'problem' that needs to be solved but merely a way of describing our fundamentally non-dual and relational mode of being–it is a 'non-problem'. To speak of modes of listening is not to talk about how beings construe their own, private versions of reality nor about wanting to dictate how they should construct their experiences. On the contrary, to speak of modes of listening is to talk about how we create and are created by this phenomenal world together, continually. As Dōgen put it, "Mountains, rivers, and earth are born at the same moment with each person" (2013, 129). Unlike 'ways of experiencing a piece of music', the natural world of mountains and rivers is obviously an intersubjectively shared phenomenon–it is not brought forth by just one person. Yet, Dōgen emphasizes that each person takes part in this construction and that the mountains and rivers depend on each person–the mountains and rivers are "born at the same moment with each person". With each person, mountains, rivers, and earth are brought forth. At the same time, each person does not create their own mountains because "when a person is born, this person's birth does not seem to be bringing forth additional mountains" (2013, 114).
The conclusion seems to be that individuals neither create nor do not create mountains and rivers. They both create and do not create the mountains and rivers. Another way to put this is that it is a collective process of intersubjective enaction. It is the same with any given mode of listening–individuals neither create nor not create them. It is exactly this kind of relational process that lacks ontological ground that Buddhists attempt to capture with the theory of emptiness.  


Wildflowers bloom at the old garrison,

The travelers echo in the empty woods.

Much rain in spring on the wood plank houses;

Daylight soon turns to darkness in the mountain town. 

Cinnabar Stream connects with the old Guo borders;

White Feather reaches to Jing Peak. 

If you see the fresh scenery of this western hill, 

You'll know the minds of Huang and Qi. 


(Wang Wei, trans. P. Rouzer, 2020, 69)