Interpreting music through the lens of linguistics
For his PhD thesis “Hearing Structure in Music”, Gabriele Cecchetti of the Digital and Cognitive Musicology Lab at CDH investigated how to structurally interpret the way we hear music using concepts from linguistics.
When listening to music, many of us feel immersed in the moment, fully present. Behind the scenes in our brains, however, we may be constructing synthetic structures, predicting what comes next, and readjusting our expectations and memories as the piece continues. For his doctoral thesis, Gabriele Cecchetti looked at the cognitive processes driving this activity, and explored the parallels between how we interpret music and how we interpret language.
Connecting “virtually unconnected chords”
Cecchetti’s central inquiry revolved around the cognitive processes underlying structural interpretation in music, seeking to understand whether listeners spontaneously form mental structures that reflect relationships between musical moments.
He distinguished two ways of hearing a passage of music: surface hearing and structural hearing. The former simply means registering the raw sounds as they resonate in the air around us: in surface hearing, the listener is a passive receiver. Structural hearing, by contrast, occurs in the listener’s mind and involves actively constructing synthetic structures based on perceived relationships between different “events”, or musical moments. This was the type of hearing that interested him.
“How can listeners go from hearing a series of virtually unconnected chords to hearing a network of relations?” he asked. To answer this question, Cecchetti employed a modeling approach, dividing structural interpretation into two components. On one hand, there is the problem of transforming raw musical material (chords, melodies, rhythms) into meaningful structures; on the other, there is the problem of parsing, the real-time process by which listeners construct and navigate the intricate network of musical relationships as they are exposed to a piece.
The complexity arises when listeners transition from hearing isolated chords to perceiving this intricate network. Cecchetti grappled with several questions: what mental processes allow listeners to connect seemingly unrelated chords into a coherent structure? When do these cognitive operations occur during music perception? And how demanding are they?
“Luckily for us, we’re not alone in addressing these questions,” Cecchetti explains. “This is also a core concern of linguistics.”
Parallels with linguistics
In his work, Cecchetti drew parallels with linguistics, referring to the Dependency Locality Theory (DLT), which suggests that structural integration—connecting a word with an earlier word—incurs a cognitive cost. The cost stems both from memory retrieval, which is impeded by the interference of intervening words, and from expectations about what is yet to come in a sentence, which listeners must maintain in mind until the expected target is finally encountered.
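As a rough illustration of the DLT idea (a deliberately simplified toy, not the formulation used in the thesis or in Gibson’s original papers), integration cost can be approximated by counting the referent-introducing words that intervene between a word and the earlier word it attaches to:

```python
# Toy sketch of Dependency Locality Theory (DLT) integration cost:
# attaching a word to an earlier head costs roughly one unit per
# intervening "new referent" (crudely approximated here by nouns
# and verbs). The sentence, tags, and indices are illustrative.

def integration_cost(words, tags, head_index, dependent_index):
    """Count new referents strictly between the head and the dependent."""
    between = tags[head_index + 1 : dependent_index]
    return sum(1 for t in between if t in ("NOUN", "VERB"))

# "The reporter who the senator attacked admitted ..."
words = ["The", "reporter", "who", "the", "senator", "attacked", "admitted"]
tags  = ["DET", "NOUN", "PRON", "DET", "NOUN", "VERB", "VERB"]

# Integrating "admitted" (index 6) with its subject "reporter" (index 1)
# must cross the relative clause, so the cost is comparatively high:
print(integration_cost(words, tags, 1, 6))  # counts "senator" and "attacked" -> 2
```

The farther back the head lies, and the more referents intervene, the higher the cost — which is the intuition Cecchetti carries over to chords.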
In music, Cecchetti envisions an algorithm that matches chord sequences with their structure, likening it to a computational infrastructure involving a buffer and stack system. The buffer reads new events one by one, while the stack fills memory slots, resulting in increased memory storage costs. Eventually, when a match is possible between the buffer and the stack, structural integration occurs, combining two partial structures into a bigger one.
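The buffer-and-stack idea can be sketched as a tiny shift-reduce parser. The chord “grammar” below (a supertonic resolving to a dominant, a dominant to its tonic) is an invented stand-in for illustration, not the rule system developed in the thesis:

```python
# Minimal shift-reduce sketch of the buffer-and-stack mechanism:
# shift reads one event from the buffer onto the stack; reduce
# integrates the top of the stack whenever a rule applies.
# RULES is a toy harmonic grammar: Dm -> G7 -> C (a ii-V-I cadence).
RULES = {("Dm", "G7"): "G7", ("G7", "C"): "C"}

def parse(chords):
    stack, buffer = [], list(chords)
    storage_cost = 0          # peak number of occupied memory slots
    while buffer:
        stack.append(buffer.pop(0))            # shift: read one new event
        storage_cost = max(storage_cost, len(stack))
        # reduce: integrate while the top two stack items match a rule
        while len(stack) >= 2 and (stack[-2], stack[-1]) in RULES:
            right = stack.pop()
            left = stack.pop()
            stack.append(RULES[(left, right)])  # combine partial structures
    return stack, storage_cost

print(parse(["Dm", "G7", "C"]))  # -> (['C'], 2)
```

On this cadence, the parser never needs more than two memory slots, because each new chord can be integrated with the previous one as soon as it arrives; sequences whose resolutions are delayed would fill the stack further, mirroring the storage costs Cecchetti describes.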
While this analogy works at a formal level, “It is still unclear to what extent music and language behave in the same way,” he says.
Does music behave like language?
Cecchetti conducted three experiments to investigate the parallels between linguistics and music. The first looked at syntactic representations: Cecchetti created chord progressions with their own syntactic structure. Listeners were exposed to a prime chord progression, followed by noise, and then a target chord progression, and were asked whether the stem of the target was the same as the stem of the prime. Results showed a clear improvement in performance when the prime and target shared the same underlying structure, indicating that the structure of chord progressions is represented in the minds of listeners, just as the structure of sentences is in language.
In his second experiment, on incremental parsing, listeners heard a rhythm, and at some point a short flash of light was shown. Their task was to report afterwards at what point in the rhythm the flash had occurred. Errors in the reported placement of the flash were influenced by the cognitive cost of processing the structure of the rhythms.
Cecchetti then explored retrospective reanalysis in his third experiment, comparing expected “preferred interpretations” with unexpected “garden path” interpretations of ambiguous melodies. In this experiment, listeners heard a piece of music and tried to predict what would come next, based on similar pieces they had heard before. Just as readers are startled by the temporary ambiguity halfway through a sentence like “When the band played the song pleased the customers”, listeners were shown to revise their interpretation of the ambiguous melodies in the experiment.
“The results of these experiments support the meaningfulness of a model of incremental syntactic processing for the real-time experience of listening to music,” Cecchetti explains, “and we can therefore move to build an algorithmic model similar to those that linguists have been developing for language.”
Going forward, the methodological challenge will be to devise paradigms that allow listeners to hear the stimuli more than once. From a theoretical perspective, the challenge is to characterize how the fully subconscious, automatic mode of listening interacts with more active ways of engaging with music: after all, listening to music is more a process of discovery than one of passive perception. According to Cecchetti, these conscious acts of discovery will need to be addressed in the future.
“For me, the most interesting thing of all of this is what comes next.”