Decoding Beethoven's music style using data science

Fabian C. Moss of the Digital and Cognitive Musicology Lab (DCML). © 2019 EPFL / Hillary Sanctuary

What makes Beethoven sound like Beethoven? EPFL researchers have completed a first analysis of Beethoven’s writing style, applying statistical techniques to unlock recurring patterns.


EPFL researchers are investigating Beethoven’s composition style, using statistical techniques to quantify and explore the patterns that characterize musical structures in the Western classical tradition. Their findings confirm what music theory predicts for the Classical era, but go beyond a music-theoretical approach by statistically characterizing Beethoven’s musical language for the very first time. The study is based on the set of compositions known as the Beethoven String Quartets, and the results were published in PLOS ONE on June 6th, 2019.

“New state-of-the-art methods in statistics and data science make it possible for us to analyze music in ways that were out of reach for traditional musicology. The young field of Digital Musicology is currently advancing a whole new range of methods and perspectives,” says Martin Rohrmeier, who leads EPFL’s Digital and Cognitive Musicology Lab (DCML) in the College of Humanities’ Digital Humanities Institute. “The aim of our lab is to understand how music works.”

The Beethoven String Quartets comprise 16 quartets encompassing 70 single movements that Beethoven composed throughout his lifetime. He completed his first string quartet at the turn of the 19th century, when he was almost 30 years old, and the last in 1826, shortly before his death. A string quartet is a musical ensemble of four musicians playing string instruments: two violins, a viola, and a cello.

From music analysis to big data

For the study, Rohrmeier and colleagues plowed through the scores of all 16 of Beethoven’s String Quartets in digital and annotated form. The most time-consuming part of the work was generating the dataset, which is based on tens of thousands of annotations by music theory experts.

“We essentially generated a large digital resource from Beethoven’s music scores to look for patterns,” says Fabian C. Moss, first author of the PLOS ONE study.

When played, the String Quartets represent over 8 hours of music. The scores themselves contain almost 30 000 chord annotations. A chord is a set of notes that sound at the same time, and a note corresponds to a pitch.

In music analysis, chords can be classified according to the role they play in the musical piece. Two well-known types of chords are the dominant and the tonic, which play central roles in building up and releasing tension and in establishing musical phrases. But there are many more types of chords, including many variants of the dominant and tonic chords. The Beethoven String Quartets contain over 1000 different types of these chords.

“Our approach exemplifies the growing research field of digital humanities, in which data science methods and digital technologies are used to advance our understanding of real-world sources, such as literary texts, music or paintings, under new digital perspectives,” explains co-author Markus Neuwirth.

Beethoven’s statistical signature

Beethoven’s creative choices are now apparent through the filter of statistical analysis, thanks to this new data set generated by the researchers.

The study finds that very few chords govern most of the music, a phenomenon that is also known in linguistics, where very few words dominate language corpora. As expected from music theory on music from the classical period, the study shows that the compositions are particularly dominated by the dominant and tonic chords and their many variants. Also, the most frequent transition from one chord to the next happens from the dominant to the tonic. The researchers also found that chords strongly select for their order and, thus, define the direction of musical time.
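The kind of counting described above can be illustrated with a minimal Python sketch. The chord sequence below is a hypothetical toy example written in Roman-numeral labels (I for the tonic, V for the dominant); it is not taken from the researchers' dataset, and the helper names are illustrative rather than the study's actual method.

```python
from collections import Counter

# Hypothetical toy sequence of chord labels (an assumption for illustration,
# not data from the study). "I" is the tonic, "V" the dominant.
chords = ["I", "V", "I", "IV", "V", "I", "ii", "V", "I", "V", "I"]


def chord_counts(seq):
    """Count how often each chord label occurs."""
    return Counter(seq)


def transition_counts(seq):
    """Count how often each chord is directly followed by another."""
    return Counter(zip(seq, seq[1:]))


unigrams = chord_counts(chords)
bigrams = transition_counts(chords)

# Chords ranked from most to least frequent.
print(unigrams.most_common())

# The single most frequent transition in the toy sequence.
print(bigrams.most_common(1))

# Order matters: compare a transition with its reverse.
print(bigrams[("V", "I")], bigrams[("I", "V")])
```

In this toy sequence the dominant-to-tonic transition occurs four times and the reverse only twice, a small-scale picture of what it means for chord order to be asymmetric.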

But the statistical methodology reveals more. It characterizes Beethoven’s specific composition style for the String Quartets, through a distribution of all the chords he used, how often they occur, and how they commonly transition from one to the other. In other words, it captures Beethoven’s composition style with a statistical signature.
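One simple way to picture such a signature is as two sets of numbers: the relative frequency of each chord, and the probability of moving from each chord to the next. The sketch below computes both for a hypothetical toy sequence; the data and function names are assumptions for illustration, and the published study may use different or more elaborate models.

```python
from collections import Counter, defaultdict

# Hypothetical toy sequence of chord labels (not data from the study).
chords = ["I", "V", "I", "IV", "V", "I", "ii", "V", "I", "V", "I"]


def relative_frequencies(seq):
    """Share of the sequence taken up by each chord label."""
    counts = Counter(seq)
    return {chord: n / len(seq) for chord, n in counts.items()}


def transition_probabilities(seq):
    """For each chord, the probability of each chord that directly follows it."""
    followers = defaultdict(Counter)
    for current, nxt in zip(seq, seq[1:]):
        followers[current][nxt] += 1
    return {
        current: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for current, c in followers.items()
    }


freqs = relative_frequencies(chords)
probs = transition_probabilities(chords)

print(freqs)
print(probs)
```

Computing the same two tables for different composers or periods would give a rough basis for comparing their styles, under the assumption that chord labels are annotated consistently.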

“This is just the beginning,” explains Moss. “We are continuing our work by extending the datasets to cover a broad range of composers and historical periods, and invite other researchers to join our search for the statistical basis of the inner workings of music.”


Author: Hillary Sanctuary

Source: EPFL