A Musical Journey: Visualization showing the evolution of music genres through the years

13 min read · Apr 14, 2022

More than 7,100 languages are spoken in the world today, yet this number does not account for the most universal language of them all: music. Music is something that we all listen to, whether while driving, in the elevator, at a restaurant, or sometimes even while working. It has been around for millennia, and it has evolved!

As Hans Christian Andersen rightly said, “Where words fail, music speaks”.

Here, the data is going to speak for the music. The motive behind this post is to highlight how music has changed over the years, and to stop and appreciate the history of where our favorite tunes came from!

Music has evolved alongside society and changes in the world. With technological advances, new styles and genres have come about. From radio to television, and records to the internet, music and the way it is consumed have changed dramatically. This blog post tells the tale of music, using data from Spotify’s Web API.

This visualization was inspired by Hans Rosling’s World Health Chart visualization which can be found here. The visualization shows the Global Life Expectancy versus the Income Per Person in countries across the years. While recreating this visualization with its original dataset, I started to realize how the same concept can be extremely powerful with other datasets. Hence, the inception of this post.

By the end of this blog post, you should be able to follow the tale of music through the years and witness how the world shifted from rock to R&B to pop and now to pop-rap. It is not just the genres that have changed: the number of people listening to and streaming music has also increased substantially over time.

Who is this for? Anyone with an interest in music, or a desire to know more about the history of music.

At the end of the post, I also extract and analyze my own Spotify data to see how Gen-Z I really am!

The Dataset

With over 320 million monthly users and home to 60 million tracks, four billion playlists, and 1.9 million podcasts, Spotify is one of the most popular music streaming platforms in existence. It houses historical data about music from the 1900s to this very day, hence the choice of this dataset.

Python was used to obtain the data using the library Spotipy. The articles referenced include https://towardsdatascience.com/extracting-song-data-from-the-spotify-api-using-python-b1e79388d50.
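As a rough sketch of what this extraction step might look like (not the exact script used for this post): Spotipy's `audio_features` method accepts a batch of track IDs, so the IDs can be requested in chunks. The helper names `chunked` and `fetch_audio_features` and the batch size of 100 are assumptions made for illustration.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_audio_features(client, track_ids, batch_size=100):
    """Collect audio-feature dicts for the given track IDs.

    `client` is expected to be a spotipy.Spotify instance (or any object with
    a compatible `audio_features(tracks=...)` method). Entries the API returns
    as None (unknown tracks) are skipped.
    """
    features = []
    for batch in chunked(list(track_ids), batch_size):
        result = client.audio_features(tracks=batch) or []
        features.extend(item for item in result if item is not None)
    return features
```

With Spotipy installed and API credentials set in the environment, a client could be created with `spotipy.Spotify(auth_manager=SpotifyClientCredentials())` (imported from `spotipy.oauth2`) and passed to the helper together with a list of track IDs.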

The data was collected from the Spotify API and covers 33,375 unique artists over the period 1921 to 2020. It also describes the attributes of each track in detail... excruciating detail.

acousticness: A confidence measure from 0 to 100 of whether the track is acoustic. 100 represents high confidence the track is acoustic.
danceability: Describes how suitable a track is for dancing based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity. A value of 0 is least danceable and 100 is most danceable.
energy: Energy is a measure from 0 to 100 and represents a perceptual measure of intensity and activity. Typically, energetic tracks feel fast, loud, and noisy. For example, death metal has high energy, while a Bach prelude scores low on the scale. Perceptual features contributing to this attribute include dynamic range, perceived loudness, timbre, onset rate, and general entropy.
instrumentalness: Predicts whether a track contains no vocals. “Ooh” and “aah” sounds are treated as instrumental in this context. Rap or spoken word tracks are clearly “vocal”. The closer the instrumentalness value is to 100, the greater likelihood the track contains no vocal content. Values above 50 are intended to represent instrumental tracks, but confidence is higher as the value approaches 100.
loudness: The overall loudness of a track in decibels (dB). Loudness values are averaged across the entire track and are useful for comparing the relative loudness of tracks. Loudness is the quality of a sound that is the primary psychological correlate of physical strength (amplitude). Values typically range between -60 and 0 dB.
speechiness: Detects the presence of spoken words in a track. The more exclusively speech-like the recording (e.g. talk show, audiobook, poetry), the closer to 100 the attribute value. Values above 66 describe tracks that are probably made entirely of spoken words. Values between 33 and 66 describe tracks that may contain both music and speech, either in sections or layered, including such cases as rap music. Values below 33 most likely represent music and other non-speech-like tracks.
tempo: The overall estimated tempo of a track in beats per minute (BPM). In musical terminology, the tempo is the speed or pace of a given piece and derives directly from the average beat duration.
valence: A measure from 0 to 100 describing the musical positiveness conveyed by a track. Tracks with high valence sound more positive (e.g. happy, cheerful, euphoric), while tracks with low valence sound more negative (e.g. sad, depressed, angry).
popularity: The value will be between 0 and 100, with 100 being the most popular. The popularity is calculated from the popularity of the song on Spotify, estimated by the number of streams, downloads, and so on.
duration: Length of the track in seconds.
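To make the threshold-based attributes more concrete, here is a minimal sketch that turns a speechiness score into a label using the cut-offs described above (33 and 66 on the 0 to 100 scale used in this post). The function name `describe_speechiness` and the label strings are made up for illustration.

```python
def describe_speechiness(value):
    """Map a speechiness score on a 0-100 scale to a rough label."""
    if not 0 <= value <= 100:
        raise ValueError("speechiness is expected to be between 0 and 100")
    if value > 66:
        return "mostly spoken word"
    if value >= 33:
        return "music and speech"
    return "mostly music"


print(describe_speechiness(45))  # music and speech
```

If your copy of the data stores the feature as a fraction between 0 and 1, multiply it by 100 before applying the thresholds.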

For more information about Spotify’s audio features, check out the official documentation at https://developer.spotify.com/web-api/get-audio-features/

The key attribute for the analysis is genre, which was obtained from another dataset from the same source called data_by_genres.csv. This data needed to be cleaned: each song belongs to multiple genres, so the genre column contained lists, which is not ideal for visualization.
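The cleaning step described above amounts to turning each list-valued genre cell into one row per genre. The sketch below shows one possible way to do that with the standard library only. It assumes each record is a dict whose `genres` field holds either a list or the string form of a list; the function name `explode_genres` and the sample record are hypothetical.

```python
import ast


def explode_genres(rows, column="genres"):
    """Return one row per (record, genre) pair.

    Each input row is a dict whose `column` holds either a real list or the
    string form of a list, e.g. "['pop', 'dance pop']". Rows without any
    genre are dropped.
    """
    exploded = []
    for row in rows:
        raw = row.get(column) or []
        genres = ast.literal_eval(raw) if isinstance(raw, str) else raw
        for genre in genres:
            flat = dict(row)
            flat[column] = genre
            exploded.append(flat)
    return exploded


rows = [{"artist": "Artist A", "genres": "['pop', 'dance pop']"}]
print(explode_genres(rows))
# [{'artist': 'Artist A', 'genres': 'pop'}, {'artist': 'Artist A', 'genres': 'dance pop'}]
```

With pandas, `DataFrame.explode("genres")` does the same job once the column holds real lists.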

Having assimilated and prepared the data, we move on to the analysis.

Have you seen The Sound of Music?

Well, here we are going to literally see the sound of music!

To begin telling the tale of music, a set of features must be chosen. Given the abundance of features, a thorough analysis of the features is needed. The following visualizations highlight features of the audio, and their relative importance to the story.

Attribute variance over time

I plotted how the attributes vary over time using a multi-line chart. This plot shows how the various track attributes have changed over the years.

Visual Encoding: The x-axis shows the years while the y-axis plots the trend of the various attributes over time. Color is used to differentiate between the various attributes.

Analysis: One of the key things to notice is that the acousticness of music has dropped dramatically over the years, while the energy has increased. This indicates that songs have become more upbeat, while the decline in acousticness shows that tracks have become less natural. Songs with high ‘acousticness’ consist mostly of natural acoustic sounds (think acoustic guitar, piano, orchestra, the unprocessed human voice), while songs with low ‘acousticness’ consist mostly of electric sounds (think electric guitars, synthesizers, drum machines, auto-tuned vocals and so on). Given the music that has been released in the last couple of years, this clearly makes sense. However, it does not tell us much about the tale of music specifically.
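One plausible way to produce the trend lines for such a chart is to average each attribute per release year. The snippet below is a small sketch under that assumption (the post does not state the exact aggregation); the function name `yearly_means` and the `year` key are illustrative.

```python
from collections import defaultdict
from statistics import mean


def yearly_means(tracks, attributes):
    """Average each attribute per release year.

    Returns {year: {attribute: mean value}} with years in ascending order.
    """
    by_year = defaultdict(list)
    for track in tracks:
        by_year[track["year"]].append(track)
    return {
        year: {attr: mean(t[attr] for t in group) for attr in attributes}
        for year, group in sorted(by_year.items())
    }
```

Each attribute then becomes one line in the chart, with the year on the x-axis and the yearly mean on the y-axis.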

Pre 2000s vs Post 2000s

I plotted the average of each attribute for pre-2000s versus post-2000s songs on a spider chart.

Visual Encoding: The Pre-2000s are represented by the blue plot and the Post-2000s are represented by the orange plot. Each point marks the mean value of an attribute.

Analysis: The stark difference in acousticness between the two eras is evident, as is the difference in energy.

Correlation Matrix

I plotted a correlation matrix to visualize the trend similarities between the attributes.

Visual Encoding: Both the x-axis and the y-axis list the attributes, so each cell pairs one attribute with another. The hue represents the strength of correlation between the two attributes.

Analysis: From this, we can clearly see that the highest correlation is between popularity and year. That is to say, music streaming became more popular over the years. Among other interesting findings, there is also a strong correlation between the energy and loudness of songs, indicating that the more energetic the song, the louder it is.
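For reference, a correlation matrix like this one can be computed by hand. The sketch below assumes Pearson correlation (the post does not say which coefficient was used), and the function names are illustrative.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(tracks, attributes):
    """Return {a: {b: r}} for every pair of attributes."""
    columns = {attr: [t[attr] for t in tracks] for attr in attributes}
    return {
        a: {b: pearson(columns[a], columns[b]) for b in attributes}
        for a in attributes
    }
```

Values close to 1 or -1 correspond to the darkest cells of the heatmap, while values near 0 indicate little linear relationship.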

Scatterplot

To further analyze the popularity of songs against other traits, I plotted scatterplots for popularity versus each attribute.

Visual Encoding: The y-axis represents the popularity in each graph, while the x-axis represents each of the other attributes.

Analysis: We notice that popularity and year have an almost linear relationship, while the other attributes do not seem to lend much to the intended storyline.
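One way to check how linear the relationship between year and popularity is would be an ordinary least-squares fit. This is only a sketch of that idea and not part of the original analysis; the function name `fit_line` and the numbers in the usage line are made up.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up points that lie exactly on a line, for illustration only.
slope, intercept = fit_line([1960, 1980, 2000, 2020], [10, 30, 50, 70])
print(slope, intercept)  # 1.0 -1950.0
```

Here the slope is the average change in popularity score per year, so a clearly positive slope would match the upward trend seen in the scatterplot.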

Observing the high correlation between the popularity and years, I decided to use these two attributes along with the number of songs released each year to tell the tale of music.

Coming to the tale of music

This visualization shows how the number of songs in a genre has changed over the years, against the popularity of the songs in the genre in that year.

Visual Encoding

The x-axis represents the popularity of the songs. Initially, I took the average popularity of the songs in a genre. However, the formula for calculating the popularity of songs has changed over the years, which resulted in the average popularity of all genres being similar in a particular year and the trend changing linearly over the years. To combat this and offer a clear differentiation, I summed the popularity of all the songs in a genre for each year and changed the display scale to a logarithmic one to highlight the differences in popularity between the genres.

The y-axis shows the count of the songs released in a particular genre in the selected year.

The size of the bubbles portrays the number of artists that released songs in this genre in the year.

The color of the bubbles represents each of the genres.

In other words, bubbles toward the top (north) have many songs and bubbles toward the bottom (south) have few. Bubbles toward the right (east) have high popularity and bubbles toward the left (west) have low popularity. The larger the bubble, the more artists released songs in the genre.

The dashboard can be accessed here.
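For readers who want to reproduce the bubble positions outside Tableau, the encoding described above can be sketched as follows. This is a minimal illustration and not the workbook's actual calculation: the function name `bubble_points`, the row keys (`year`, `genre`, `artist`, `popularity`) and the grouping by year and genre are my assumptions, and the log10 value simply mimics what a logarithmic axis would display.

```python
from collections import defaultdict
from math import log10


def bubble_points(rows):
    """Aggregate one bubble per (year, genre).

    Each row is a dict with `year`, `genre`, `artist` and `popularity` keys
    (one row per song and genre).
    """
    groups = defaultdict(lambda: {"popularity": 0, "songs": 0, "artists": set()})
    for row in rows:
        group = groups[(row["year"], row["genre"])]
        group["popularity"] += row["popularity"]
        group["songs"] += 1
        group["artists"].add(row["artist"])

    points = {}
    for key, group in groups.items():
        total = group["popularity"]
        points[key] = {
            "x_popularity_sum": total,
            # Position on a log10 axis; left undefined when the sum is 0.
            "x_log10": log10(total) if total > 0 else None,
            "y_song_count": group["songs"],
            "size_artist_count": len(group["artists"]),
        }
    return points
```

Summing rather than averaging means that genres with many popular songs move far to the right, which is why a logarithmic axis helps keep small and large genres readable in the same chart.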

The actual tale

We begin with the 1920s, when music was dominated by classical and lounge music, along with rhythm and blues and traveling dance bands (folk music). Given that the 1920s followed World War I, the music was heavily influenced by the traveling bands and was quite upbeat and optimistic with the economic boom. As the name implies, Rhythm ’n’ Blues (“R&B” or “RnB”) is a powerful combination of two old music genres. Gospel (the Rhythm) provides an infectious groovy tempo that was already noticeable in its early barbershop days with handclapping and rhythmical vibrating voices. R&B can be the middle ground between Blues and Rock, but also between Rap and Pop or Gospel and House, which makes it one of the most accessible music genres.

The 1930s saw the beginning of the Great Depression, which heavily influenced the music of the time. The influence was seen in the themes of songs, but different artists had different takes on the tempo. Some chose to remain optimistic about the conditions, while others took a sorrowful approach to the same issues. However, this era also saw the rise of radio as a platform for artists to present their music. As a result, several new genres emerged and even surpassed their predecessors in terms of popularity.

During World War II in the 1940s, most artists made music to entertain the troops and maintain morale. This brought back classical music. After the war ended in 1945, there was an increase in movie and Broadway music due to the demand for entertainment following a period of turmoil.

Despite the hardships faced during 1920–1940, the music remained mostly upbeat. What is more beautiful to see is that even with most of the world in despair, music was still a universal undertone, uniting the world. The pop stars of the time were primarily crooners (Rudy Vallee, Eddie Cantor, Al Jolson, Bing Crosby, Frank Sinatra) and big band leaders (Cab Calloway, Paul Whiteman, Benny Goodman, Tommy Dorsey, Glenn Miller).

Following this phase of despair, the genre of rock came about and dominated the charts for the next three decades. It is also important to see that rock quickly grew and branched out into several subgenres by the 1960s.

The 1970s served as a bridge between the seriousness of the music that came out of the late sixties and the excessiveness of music from the 1980s. Disco and Funk music came about and became so pervasive that several established artists who produced music in other genres also began releasing songs in these genres to keep up with the trend. Heavier rock music and punk rock also emerged during the decade, some of it as a reaction against disco.

Rock music maintained its popularity all the way to the 1980s, when the other genres began to catch up. In the 1980s, music was dramatically changed by the introduction of MTV (Music Television). This pushed artists to chase popularity (hence the shift in popularity in the visualization). It became a necessity for artists to influence the masses, particularly the youth, and sell records to maintain their stature. This led to a shift in priority from the actual music to the appearance of musicians. Michael Jackson was one of the dominant artists of the decade, bolstered by his creative music videos, dance style, and pure talent. He essentially set the standard for pop music. New Wave and Synth-Pop were popular genres and their electronic sounds fit perfectly with the beginnings of the computer age. Hair Metal bands also became popular during the decade with their theatrical and outrageous music videos and performances. Hip-Hop also came into the mainstream during the decade.

Like the sixties, the 1990s was a decade of extremes with under-produced, anti-establishment grunge bands and gangster rappers enjoying just as much success as the overly produced and studio manufactured pop groups. The decade was ruled by powerful singers with Mariah Carey, Celine Dion, and Whitney Houston topping the charts. However, much of the talent was overshadowed by personal problems faced by the artists during the time.

In the 1990s, the invention of Auto-Tune technology by Andy Hildebrand led to a boom in pop and rap music. As the name suggests, it could digitally tune an out-of-pitch voice to make it sound perfect. The number of pop and rap songs almost tripled and the number of artists producing this type of music quadrupled by the end of 2020.

Overall, we observe how the world’s musical interests evolved in sync, adding to the fact that music truly is a universal language. Tracing the journey from classical music, through the inception of rock music in the 1960s, rock branching out and dominating the charts, and the evolution of country music, to the birth and growth of pop into the alpha genre that it is today is truly amazing. Another striking thing to notice is that with time the music truly became diverse, and we saw the emergence of several new genres. New genres have their roots deep in their parent genres, some even going back to the times of Mozart and Beethoven. The current era of music is heavily dominated by the pop genre, which has its origin in the 1970s. Since its origin, it has evolved and branched out into several subgenres which have been extremely well-received by the generation today.

To read more in-depth about each of the genres, I would recommend the MusicMap website which dives into great detail about each of the genres and their origin stories.

Analyzing my own music

To make things interesting, I also headed over to Spotify and requested my own listening data for the last 2 years. I created a bunch of visualizations, which can be found in a Tableau workbook.

Visual encoding

For the Top Artists, the y-axis shows the artist while the x-axis shows the number of minutes listened. Color is used to differentiate each song.

For Artist breadth, the y-axis shows the artist while the x-axis shows the distinct songs which are also differentiated by color.

For Top Songs, the color intensity represents the number of minutes listened, increasing from light to dark.

For Genres, each bubble represents a Genre with the color showing the increasing number of minutes.

For Listenership by Month and Hour, the x-axis shows the time value while the y-axis shows the number of minutes listened. Listenership by Month also uses color for the different streaming histories.

The dashboard can be accessed here.
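The dashboard itself was built in Tableau, but the "minutes listened" measure behind the Top Artists chart can be sketched in a few lines of Python. The snippet assumes the personal data export contains one or more streaming-history JSON files whose entries have `artistName` and `msPlayed` fields; those field names, the helper names and any file paths passed in are assumptions about the export and are not taken from this post.

```python
import json
from collections import Counter


def load_streams(paths):
    """Read and concatenate several streaming-history JSON files."""
    streams = []
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            streams.extend(json.load(handle))
    return streams


def minutes_by_artist(streams):
    """Total listening time per artist in minutes, longest first."""
    totals = Counter()
    for stream in streams:
        totals[stream["artistName"]] += stream["msPlayed"] / 60000
    return totals.most_common()
```

The same pattern works for the other charts by swapping the grouping key, for example the track name for Top Songs or the month of the play timestamp for Listenership by Month.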

Analysis

Clearly, I have kept up with my generation and my playlist is heavily dominated by pop music. I realized how much time I have spent listening to Ed Sheeran and Maroon 5. My favorite songs, however, are Team by Lorde and Spirits by the Strumbellas.

In conclusion

With such a rich history and its roots so deep in our lives, music truly is a universal language and one whose story must be heard! The code for all the visualizations can be found in my GitHub repository here.

I hope you enjoyed reading this!

External Links:

Tale of Music Dashboard - https://public.tableau.com/views/AMusicalJourney/Sheet3?:language=en-US&publish=yes&:display_count=n&:origin=viz_share_link
Personal Spotify Dashboard - https://public.tableau.com/views/PersonalSpotifyAnalysis_16487882259450/PremiumDashboard?:language=en-US&:display_count=n&:origin=viz_share_link
GitHub Code repository - https://github.com/saishachhabria/TheTaleOfMusic

References:

World Health Chart - https://www.gapminder.org/fw/world-health-chart/
Gathering Data from the Spotify API - https://towardsdatascience.com/extracting-song-data-from-the-spotify-api-using-python-b1e79388d50
Spotify Developer API for Audio Features - https://developer.spotify.com/web-api/get-audio-features/
MusicMap - https://musicmap.info/