Spotify Project: Final Analysis and Presentation

Anılcan Atik
Dost Karaahmetli
Kutay Akalın
Tunahan Kılıç

December 11th, 2019

Key Takeaways

We analyzed the sentiments of the current Turkey, USA, Brazil and Japan Top 50 playlists on Spotify and compared them. We then analyzed how the sentiment distribution of the daily Turkey Top 200 playlists changed between 2017 and 2019 so far.

Across the four playlists combined, G is the most used key and D# is the least used. (Chapter 5.1)
According to the excitement formula we created, the Brazil playlist has the highest average excitement score and the United States playlist has the lowest. (Chapter 5.6)
According to our sentiment analysis, the Brazil playlist is overwhelmingly Happy/Joyful. The margin is much smaller in the other playlists, but Happy/Joyful is still the most common sentiment in the Turkey, Japan and United States lists. (Chapter 5.7.1)
In the Turkey Top 200 data analysis (Chapter 6.3.2), we see that in June 2016, Turbulent/Angry songs have a higher percentage than Happy/Joyful songs.
As a genuinely restless nation, we seem to have an allergy to “Chill/Peaceful” music. (Chapter 6.3.3)
We can see a climb in “Turbulent/Angry” music after the 1990s. The climb is even steeper in the 2010s. (Chapter 6.3.3)
While more “Happy/Joyful” music is listened to after the 1980s, we can see a significant decline after the 2010s. (Chapter 6.3.3)
Can these alterations be explained by the value shift in our society and the “Urban Anomie Theory”?

1. Data Explanation

Our data was obtained directly from the Spotify Web API. For the API connection, we created a “Client ID” and a “Client Secret” on the Spotify for Developers website and used the “spotifyr” package to make the connection.

2. Accessing Spotify Web API

library(httpuv)
library(spotifyr)
library(tidyverse)
library(knitr)
library(lubridate)
library(ggalt)
library(plotly)
library(scales)
library(kableExtra)
options(max.print=1000000)
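
The connection step itself is not shown in the chunk above. A minimal sketch, assuming the Client ID and Client Secret are stored in the environment variables SPOTIFY_CLIENT_ID and SPOTIFY_CLIENT_SECRET, which spotifyr reads by default. The guard around the call is our addition, so that the snippet does nothing when the credentials or the package are missing:

```r
# Read the credentials created on the Spotify for Developers website.
# Assumption: they are set outside the script, e.g. in ~/.Renviron.
client_id     <- Sys.getenv("SPOTIFY_CLIENT_ID")
client_secret <- Sys.getenv("SPOTIFY_CLIENT_SECRET")
has_credentials <- nzchar(client_id) && nzchar(client_secret)

access_token <- NULL
if (has_credentials && requireNamespace("spotifyr", quietly = TRUE)) {
  # Exchange the ID/secret pair for an access token
  access_token <- spotifyr::get_spotify_access_token(
    client_id = client_id,
    client_secret = client_secret
  )
}
```

Keeping the credentials in environment variables avoids writing them into the report source.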

Once the connection is made successfully, we can access many different types of data, such as artists, albums, tracks and user profiles; see the Spotify API reference for the full list. In our project, we mostly use playlist, artist and track data.

3. Gathering Turkey, USA, Japan and Brazil Top 50 Playlists

Our goal here is to download the Top 50 Playlists prepared by Spotify for countries in order to perform analysis. We put together these lists to compare musical differences between countries.

#Get Turkey Top 50
turkey_top_50_id="37i9dQZEVXbIVYVBNw9D5K"
turkey_top_50_audio_features <- get_playlist_audio_features("spotifycharts", turkey_top_50_id) %>% slice(-1)
#Get USA Top 50
usa_top_50_id = "37i9dQZEVXbLRQDuF5jeBp"
usa_top_50_audio_features <- get_playlist_audio_features("spotifycharts", usa_top_50_id)
#Get Japan Top 50
japan_top_50_id = "37i9dQZEVXbKXQ4mDTEBXq"
japan_top_50_audio_features <- get_playlist_audio_features("spotifycharts", japan_top_50_id)
#Get Brazil Top 50
brazil_top_50_id = "37i9dQZEVXbMXbN3EUUhlg"
brazil_top_50_audio_features <- get_playlist_audio_features("spotifycharts", brazil_top_50_id)
#Combining TR, USA, Japan and Brazil top 50 lists
combined_lists <- bind_rows(turkey_top_50_audio_features, usa_top_50_audio_features, japan_top_50_audio_features, brazil_top_50_audio_features)
glimpse(combined_lists)

## Observations: 199
## Variables: 61
## $ playlist_id                        <chr> "37i9dQZEVXbIVYVBNw9D5K", "...
## $ playlist_name                      <chr> "Turkey Top 50", "Turkey To...
## $ playlist_img                       <chr> "https://charts-images.scdn...
## $ playlist_owner_name                <chr> "spotifycharts", "spotifych...
## $ playlist_owner_id                  <chr> "spotifycharts", "spotifych...
## $ danceability                       <dbl> 0.801, 0.743, 0.628, 0.810,...
## $ energy                             <dbl> 0.688, 0.680, 0.725, 0.631,...
## $ key                                <int> 9, 5, 7, 4, 6, 7, 9, 6, 5, ...
## $ loudness                           <dbl> -6.620, -4.344, -7.387, -7....
## $ mode                               <int> 1, 0, 1, 0, 0, 0, 0, 1, 0, ...
## $ speechiness                        <dbl> 0.1130, 0.1030, 0.1100, 0.3...
## $ acousticness                       <dbl> 0.2780, 0.1130, 0.0266, 0.1...
## $ instrumentalness                   <dbl> 8.88e-03, 1.23e-01, 0.00e+0...
## $ liveness                           <dbl> 0.1500, 0.1830, 0.0549, 0.1...
## $ valence                            <dbl> 0.410, 0.694, 0.458, 0.407,...
## $ tempo                              <dbl> 158.003, 180.059, 173.952, ...
## $ track.id                           <chr> "2KlbLTnQ5Wch2oOelW0Y2k", "...
## $ analysis_url                       <chr> "https://api.spotify.com/v1...
## $ time_signature                     <int> 4, 4, 4, 4, 4, 4, 4, 4, 4, ...
## $ added_at                           <chr> "1970-01-01T00:00:00Z", "19...
## $ is_local                           <lgl> FALSE, FALSE, FALSE, FALSE,...
## $ primary_color                      <lgl> NA, NA, NA, NA, NA, NA, NA,...
## $ added_by.href                      <chr> "https://api.spotify.com/v1...
## $ added_by.id                        <chr> "", "", "", "", "", "", "",...
## $ added_by.type                      <chr> "user", "user", "user", "us...
## $ added_by.uri                       <chr> "spotify:user:", "spotify:u...
## $ added_by.external_urls.spotify     <chr> "https://open.spotify.com/u...
## $ track.artists                      <list> [<data.frame[2 x 6]>, <dat...
## $ track.available_markets            <list> [<"AD", "AE", "AR", "AT", ...
## $ track.disc_number                  <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
## $ track.duration_ms                  <int> 154251, 196583, 269002, 185...
## $ track.episode                      <lgl> FALSE, FALSE, FALSE, FALSE,...
## $ track.explicit                     <lgl> TRUE, FALSE, FALSE, TRUE, F...
## $ track.href                         <chr> "https://api.spotify.com/v1...
## $ track.is_local                     <lgl> FALSE, FALSE, FALSE, FALSE,...
## $ track.name                         <chr> "Wir sind Kral", "AYA", "To...
## $ track.popularity                   <int> 61, 70, 74, 78, 81, 77, 60,...
## $ track.preview_url                  <chr> NA, NA, "https://p.scdn.co/...
## $ track.track                        <lgl> TRUE, TRUE, TRUE, TRUE, TRU...
## $ track.track_number                 <int> 11, 1, 1, 1, 1, 1, 6, 1, 1,...
## $ track.type                         <chr> "track", "track", "track", ...
## $ track.uri                          <chr> "spotify:track:2KlbLTnQ5Wch...
## $ track.album.album_type             <chr> "album", "single", "single"...
## $ track.album.artists                <list> [<data.frame[2 x 6]>, <dat...
## $ track.album.available_markets      <list> [<"AD", "AE", "AR", "AT", ...
## $ track.album.href                   <chr> "https://api.spotify.com/v1...
## $ track.album.id                     <chr> "3gSlFOtOhV1OS3qTQx4g55", "...
## $ track.album.images                 <list> [<data.frame[3 x 3]>, <dat...
## $ track.album.name                   <chr> "Lights Out", "AYA", "Sarka...
## $ track.album.release_date           <chr> "2019-11-14", "2019-09-20",...
## $ track.album.release_date_precision <chr> "day", "day", "day", "day",...
## $ track.album.total_tracks           <int> 12, 1, 5, 1, 1, 1, 12, 1, 1...
## $ track.album.type                   <chr> "album", "album", "album", ...
## $ track.album.uri                    <chr> "spotify:album:3gSlFOtOhV1O...
## $ track.album.external_urls.spotify  <chr> "https://open.spotify.com/a...
## $ track.external_ids.isrc            <chr> "DECE71900765", "NLG6619008...
## $ track.external_urls.spotify        <chr> "https://open.spotify.com/t...
## $ video_thumbnail.url                <lgl> NA, NA, NA, NA, NA, NA, NA,...
## $ key_name                           <chr> "A", "F", "G", "E", "F#", "...
## $ mode_name                          <chr> "major", "minor", "major", ...
## $ key_mode                           <chr> "A major", "F minor", "G ma...

4. Adding Sentiments in Each Track

The function “classify_track_sentiment” is central to our work because it reveals the mood of songs and playlists. Energy and valence are two important factors in interpreting emotion in music. Both take values between 0 and 1, and their combination, with 0.5 as the cut-off on each axis, determines whether a song is classified as turbulent/angry, happy/joyful, sad/depressing or chill/peaceful.

According to Spotify's “Get Audio Features for a Track” reference, these two factors are defined as follows.

4.1. Energy

Energy is a measure from 0.0 to 1.0 and represents a perceptual measure of intensity and activity. Typically, energetic tracks feel fast, loud, and noisy. For example, death metal has high energy, while a Bach prelude scores low on the scale. Perceptual features contributing to this attribute include dynamic range, perceived loudness, timbre, onset rate, and general entropy.

4.2. Valence

A measure from 0.0 to 1.0 describing the musical positiveness conveyed by a track. Tracks with high valence sound more positive (e.g. happy, cheerful, euphoric), while tracks with low valence sound more negative (e.g. sad, depressed, angry).

classify_track_sentiment <- function(valence, energy) {
  if (is.na(valence) || is.na(energy)) {
    return(NA)
  }
  else if (valence >= .5) {
    if (energy >= .5) {
      return('Happy/Joyful')
    } else {
      return('Chill/Peaceful')
    }
  } else {
    if (energy >= .5) {
      return('Turbulent/Angry')
    } else {
      return('Sad/Depressing')
    }
  }
}
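#Classifying every track, referring to the valence and energy columns by name
track_sentiment <- character(nrow(combined_lists))
for (i in seq_len(nrow(combined_lists))) {
  track_sentiment[i] <- classify_track_sentiment(combined_lists$valence[i], combined_lists$energy[i])
}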
#Adding sentiment column to Combined of four countries
combined_lists<-cbind(combined_lists,track_sentiment)
#Selecting the track-level columns (including the artists list) from the combined list
track_audio_combined <- combined_lists %>% 
  select(track.name,track.id,track.artists,track.album.release_date,track.popularity,danceability:tempo,track_sentiment,track.duration_ms)
kable(head(track_audio_combined)) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive")) %>% scroll_box(width = "100%", height = "400px")

track.name	track.id	track.artists	track.album.release_date	track.popularity	danceability	energy	key	loudness	mode	speechiness	acousticness	instrumentalness	liveness	valence	tempo	track_sentiment	track.duration_ms
Wir sind Kral	2KlbLTnQ5Wch2oOelW0Y2k	list(href = c(“https://api.spotify.com/v1/artists/5pVRwX5ZQR7hfJ18w8ZYkl”, “https://api.spotify.com/v1/artists/6LnJKrtFnTEGdbWQ2riWCL”), id = c(“5pVRwX5ZQR7hfJ18w8ZYkl”, “6LnJKrtFnTEGdbWQ2riWCL”), name = c(“Ufo361”, “Ezhel”), type = c(“artist”, “artist”), uri = c(“spotify:artist:5pVRwX5ZQR7hfJ18w8ZYkl”, “spotify:artist:6LnJKrtFnTEGdbWQ2riWCL”), external_urls.spotify = c(“https://open.spotify.com/artist/5pVRwX5ZQR7hfJ18w8ZYkl”, “https://open.spotify.com/artist/6LnJKrtFnTEGdbWQ2riWCL”))	2019-11-14	61	0.801	0.688	9	-6.620	1	0.1130	0.2780	0.008880	0.1500	0.410	158.003	Turbulent/Angry	154251
AYA	4IJEw3fDvS6XF4sDc3bvjK	list(href = c(“https://api.spotify.com/v1/artists/2y1VzMKAa5nmfXKtJL9jnj”, “https://api.spotify.com/v1/artists/6LnJKrtFnTEGdbWQ2riWCL”), id = c(“2y1VzMKAa5nmfXKtJL9jnj”, “6LnJKrtFnTEGdbWQ2riWCL”), name = c(“Murda”, “Ezhel”), type = c(“artist”, “artist”), uri = c(“spotify:artist:2y1VzMKAa5nmfXKtJL9jnj”, “spotify:artist:6LnJKrtFnTEGdbWQ2riWCL”), external_urls.spotify = c(“https://open.spotify.com/artist/2y1VzMKAa5nmfXKtJL9jnj”, “https://open.spotify.com/artist/6LnJKrtFnTEGdbWQ2riWCL”))	2019-09-20	70	0.743	0.680	5	-4.344	0	0.1030	0.1130	0.123000	0.1830	0.694	180.059	Happy/Joyful	196583
Toz Taneleri	36ulbeGLdspdIYSFKXIlmN	list(href = “https://api.spotify.com/v1/artists/1KXTegXtnCPKXjRaX1llcD”, id = “1KXTegXtnCPKXjRaX1llcD”, name = “Sagopa Kajmer”, type = “artist”, uri = “spotify:artist:1KXTegXtnCPKXjRaX1llcD”, external_urls.spotify = “https://open.spotify.com/artist/1KXTegXtnCPKXjRaX1llcD”)	2019-11-29	74	0.628	0.725	7	-7.387	1	0.1100	0.0266	0.000000	0.0549	0.458	173.952	Turbulent/Angry	269002
Arkadaş	6bBnnrknLbDoOCUdKMkmnq	list(href = “https://api.spotify.com/v1/artists/2kS0jWMkkFBL0mrl0VotD0”, id = “2kS0jWMkkFBL0mrl0VotD0”, name = “Ben Fero”, type = “artist”, uri = “spotify:artist:2kS0jWMkkFBL0mrl0VotD0”, external_urls.spotify = “https://open.spotify.com/artist/2kS0jWMkkFBL0mrl0VotD0”)	2019-11-08	78	0.810	0.631	4	-7.855	0	0.3270	0.1170	0.000000	0.1190	0.407	144.978	Turbulent/Angry	185566
Dance Monkey	1rgnBhdG2JDFTbYkYRZAku	list(href = “https://api.spotify.com/v1/artists/2NjfBq1NflQcKSeiDooVjY”, id = “2NjfBq1NflQcKSeiDooVjY”, name = “Tones and I”, type = “artist”, uri = “spotify:artist:2NjfBq1NflQcKSeiDooVjY”, external_urls.spotify = “https://open.spotify.com/artist/2NjfBq1NflQcKSeiDooVjY”)	2019-05-10	81	0.825	0.593	6	-6.401	0	0.0988	0.6880	0.000161	0.1700	0.540	98.078	Happy/Joyful	209754
Nalan	1LNUxWJifZNEPpd273N2le	list(href = “https://api.spotify.com/v1/artists/4XP7cGw4t8BqZ8Du5q3bHg”, id = “4XP7cGw4t8BqZ8Du5q3bHg”, name = “Emir Can Igrek”, type = “artist”, uri = “spotify:artist:4XP7cGw4t8BqZ8Du5q3bHg”, external_urls.spotify = “https://open.spotify.com/artist/4XP7cGw4t8BqZ8Du5q3bHg”)	2019-09-06	77	0.540	0.418	7	-15.570	0	0.0761	0.1900	0.685000	0.1980	0.189	91.042	Sad/Depressing	199958
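
The loop above applies the classifier one track at a time. As a sketch of the same quadrant rule in vectorized form (the name classify_sentiment_vec is ours and not part of the original analysis), nested ifelse() calls can label a whole column at once. The sample values are taken from rows of the table above:

```r
# Hypothetical vectorized variant of classify_track_sentiment
classify_sentiment_vec <- function(valence, energy) {
  ifelse(
    is.na(valence) | is.na(energy), NA_character_,
    ifelse(
      valence >= 0.5,
      ifelse(energy >= 0.5, "Happy/Joyful", "Chill/Peaceful"),
      ifelse(energy >= 0.5, "Turbulent/Angry", "Sad/Depressing")
    )
  )
}

# Valence and energy of "Wir sind Kral", "AYA" and "Nalan"
sample_valence <- c(0.410, 0.694, 0.189)
sample_energy  <- c(0.688, 0.680, 0.418)
sample_sentiment <- classify_sentiment_vec(sample_valence, sample_energy)
print(sample_sentiment)
```

The three labels agree with the track_sentiment column of the table: Turbulent/Angry, Happy/Joyful and Sad/Depressing.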

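#Extracting the first listed artist of each track
artist_names <- character(nrow(track_audio_combined))
for (i in seq_len(nrow(track_audio_combined))) {
  artist_names[i] <- track_audio_combined$track.artists[[i]]$name[1]
}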
combined_lists <-cbind(combined_lists, artist_names)
glimpse(combined_lists)

## Observations: 199
## Variables: 63
## $ playlist_id                        <chr> "37i9dQZEVXbIVYVBNw9D5K", "...
## $ playlist_name                      <chr> "Turkey Top 50", "Turkey To...
## $ playlist_img                       <chr> "https://charts-images.scdn...
## $ playlist_owner_name                <chr> "spotifycharts", "spotifych...
## $ playlist_owner_id                  <chr> "spotifycharts", "spotifych...
## $ danceability                       <dbl> 0.801, 0.743, 0.628, 0.810,...
## $ energy                             <dbl> 0.688, 0.680, 0.725, 0.631,...
## $ key                                <int> 9, 5, 7, 4, 6, 7, 9, 6, 5, ...
## $ loudness                           <dbl> -6.620, -4.344, -7.387, -7....
## $ mode                               <int> 1, 0, 1, 0, 0, 0, 0, 1, 0, ...
## $ speechiness                        <dbl> 0.1130, 0.1030, 0.1100, 0.3...
## $ acousticness                       <dbl> 0.2780, 0.1130, 0.0266, 0.1...
## $ instrumentalness                   <dbl> 8.88e-03, 1.23e-01, 0.00e+0...
## $ liveness                           <dbl> 0.1500, 0.1830, 0.0549, 0.1...
## $ valence                            <dbl> 0.410, 0.694, 0.458, 0.407,...
## $ tempo                              <dbl> 158.003, 180.059, 173.952, ...
## $ track.id                           <chr> "2KlbLTnQ5Wch2oOelW0Y2k", "...
## $ analysis_url                       <chr> "https://api.spotify.com/v1...
## $ time_signature                     <int> 4, 4, 4, 4, 4, 4, 4, 4, 4, ...
## $ added_at                           <chr> "1970-01-01T00:00:00Z", "19...
## $ is_local                           <lgl> FALSE, FALSE, FALSE, FALSE,...
## $ primary_color                      <lgl> NA, NA, NA, NA, NA, NA, NA,...
## $ added_by.href                      <chr> "https://api.spotify.com/v1...
## $ added_by.id                        <chr> "", "", "", "", "", "", "",...
## $ added_by.type                      <chr> "user", "user", "user", "us...
## $ added_by.uri                       <chr> "spotify:user:", "spotify:u...
## $ added_by.external_urls.spotify     <chr> "https://open.spotify.com/u...
## $ track.artists                      <list> [<data.frame[2 x 6]>, <dat...
## $ track.available_markets            <list> [<"AD", "AE", "AR", "AT", ...
## $ track.disc_number                  <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
## $ track.duration_ms                  <int> 154251, 196583, 269002, 185...
## $ track.episode                      <lgl> FALSE, FALSE, FALSE, FALSE,...
## $ track.explicit                     <lgl> TRUE, FALSE, FALSE, TRUE, F...
## $ track.href                         <chr> "https://api.spotify.com/v1...
## $ track.is_local                     <lgl> FALSE, FALSE, FALSE, FALSE,...
## $ track.name                         <chr> "Wir sind Kral", "AYA", "To...
## $ track.popularity                   <int> 61, 70, 74, 78, 81, 77, 60,...
## $ track.preview_url                  <chr> NA, NA, "https://p.scdn.co/...
## $ track.track                        <lgl> TRUE, TRUE, TRUE, TRUE, TRU...
## $ track.track_number                 <int> 11, 1, 1, 1, 1, 1, 6, 1, 1,...
## $ track.type                         <chr> "track", "track", "track", ...
## $ track.uri                          <chr> "spotify:track:2KlbLTnQ5Wch...
## $ track.album.album_type             <chr> "album", "single", "single"...
## $ track.album.artists                <list> [<data.frame[2 x 6]>, <dat...
## $ track.album.available_markets      <list> [<"AD", "AE", "AR", "AT", ...
## $ track.album.href                   <chr> "https://api.spotify.com/v1...
## $ track.album.id                     <chr> "3gSlFOtOhV1OS3qTQx4g55", "...
## $ track.album.images                 <list> [<data.frame[3 x 3]>, <dat...
## $ track.album.name                   <chr> "Lights Out", "AYA", "Sarka...
## $ track.album.release_date           <chr> "2019-11-14", "2019-09-20",...
## $ track.album.release_date_precision <chr> "day", "day", "day", "day",...
## $ track.album.total_tracks           <int> 12, 1, 5, 1, 1, 1, 12, 1, 1...
## $ track.album.type                   <chr> "album", "album", "album", ...
## $ track.album.uri                    <chr> "spotify:album:3gSlFOtOhV1O...
## $ track.album.external_urls.spotify  <chr> "https://open.spotify.com/a...
## $ track.external_ids.isrc            <chr> "DECE71900765", "NLG6619008...
## $ track.external_urls.spotify        <chr> "https://open.spotify.com/t...
## $ video_thumbnail.url                <lgl> NA, NA, NA, NA, NA, NA, NA,...
## $ key_name                           <chr> "A", "F", "G", "E", "F#", "...
## $ mode_name                          <chr> "major", "minor", "major", ...
## $ key_mode                           <chr> "A major", "F minor", "G ma...
## $ track_sentiment                    <fct> Turbulent/Angry, Happy/Joyf...
## $ artist_names                       <fct> Ufo361, Murda, Sagopa Kajme...

5. Plot Analysis

5.1. Country Playlists by Key

The four lists share only a few songs (see Section 5.2), so the key counts of the playlists are largely independent of each other. The table and plot below show which musical keys are used most in each country's playlist.
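
The key_name and key_mode columns are readable versions of the integer key and mode columns. A small sketch of that mapping, assuming standard pitch-class notation (0 = C, 1 = C#, ..., 11 = B) and mode 1 = major, 0 = minor. The helper name key_label is hypothetical:

```r
# Pitch-class names indexed by the integer key (0 = C, ..., 11 = B)
pitch_classes <- c("C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B")

# Hypothetical helper: convert key/mode integers to readable labels.
# Assumes every key is in 0..11.
key_label <- function(key, mode) {
  key_name  <- pitch_classes[key + 1]  # R vectors are 1-indexed
  mode_name <- ifelse(mode == 1, "major", "minor")
  paste(key_name, mode_name)
}

# First three tracks of the combined list: key = 9, 5, 7 and mode = 1, 0, 1
first_labels <- key_label(c(9, 5, 7), c(1, 0, 1))
print(first_labels)
```

These labels match the first key_mode values in the glimpse() output of Section 3 ("A major", "F minor", "G major").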

country_by_key <- combined_lists%>%
  select(playlist_name, key_name, track.name)%>%
  group_by(playlist_name) %>% count(key_name, sort = TRUE)
  
kable(country_by_key) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

playlist_name	key_name	n
United States Top 50	G	10
Brazil Top 50	B	8
Japan Top 50	C#	8
Japan Top 50	G	8
United States Top 50	C	8
Turkey Top 50	A	7
United States Top 50	D	7
Brazil Top 50	D	6
Brazil Top 50	F#	6
Turkey Top 50	D	6
Turkey Top 50	F	6
Turkey Top 50	G#	6
United States Top 50	C#	6
Brazil Top 50	C#	5
Brazil Top 50	G#	5
Japan Top 50	D	5
Japan Top 50	E	5
Turkey Top 50	F#	5
Turkey Top 50	G	5
United States Top 50	F#	5
Brazil Top 50	A#	4
Brazil Top 50	F	4
Japan Top 50	A	4
Japan Top 50	A#	4
Japan Top 50	F#	4
Japan Top 50	G#	4
United States Top 50	A#	4
Brazil Top 50	C	3
Brazil Top 50	E	3
Brazil Top 50	G	3
Japan Top 50	B	3
Turkey Top 50	A#	3
Turkey Top 50	B	3
Turkey Top 50	C	3
Turkey Top 50	E	3
United States Top 50	E	3
United States Top 50	G#	3
Brazil Top 50	A	2
Japan Top 50	C	2
Japan Top 50	F	2
United States Top 50	B	2
Brazil Top 50	D#	1
Japan Top 50	D#	1
Turkey Top 50	C#	1
Turkey Top 50	D#	1
United States Top 50	A	1
United States Top 50	F	1

ggplot(country_by_key, aes(x = key_name, y = n, fill = playlist_name)) + 
  geom_bar(stat = "identity") +
  labs(title = "Playlists by Key Name", x = "Key Name", y = "Total Number of Keys") + 
  theme(title = element_text(size = 16, face = "bold"), plot.title = element_text(hjust = 0.5), 
        axis.title.x = element_text(size = 14, face = "bold"), 
        axis.title.y = element_text(size = 14, face = "bold"), legend.title = element_blank())
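
To see where the "most used" and "least used" keys come from, the per-playlist counts can be summed over the playlists. The sketch below does this in base R for the G and D# rows only, copied from the table above. The data frame key_counts is built by hand for illustration:

```r
# Rows for the keys G and D#, copied from the table above
key_counts <- data.frame(
  playlist_name = c("United States Top 50", "Japan Top 50", "Turkey Top 50",
                    "Brazil Top 50", "Brazil Top 50", "Japan Top 50",
                    "Turkey Top 50"),
  key_name = c("G", "G", "G", "G", "D#", "D#", "D#"),
  n = c(10, 8, 5, 3, 1, 1, 1),
  stringsAsFactors = FALSE
)

# Sum the counts over the playlists for each key
key_totals <- aggregate(n ~ key_name, data = key_counts, FUN = sum)
print(key_totals[order(-key_totals$n), ])
```

G appears 26 times and D# only 3 times among the 199 tracks. No D# track appears in the United States list.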

5.2. Common Songs in Playlists

The music charts of the four countries may share songs, which could affect our analysis. The following table and plot show the songs that appear on more than one chart and how many charts each appears on.

common_songs <- combined_lists %>% group_by(track.name, artist_names) %>%
  summarise(n_songs = n()) %>% 
  filter(n_songs >= 2) %>% 
  arrange(desc(n_songs))
kable(common_songs) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

track.name	artist_names	n_songs
Señorita	Shawn Mendes	4
bad guy	Billie Eilish	3
Dance Monkey	Tones and I	3
All I Want for Christmas Is You	Mariah Carey	2
everything i wanted	Billie Eilish	2
Memories	Maroon 5	2

ggplot(common_songs, aes(x = reorder(track.name, n_songs), y = n_songs, fill = artist_names)) + 
  geom_bar(stat = "identity") + 
  labs(title = "Common Songs on Playlists", x = "Song Name", y = "Number of Songs") + 
  theme(title = element_text(size = 16, face = "bold"), plot.title = element_text(hjust = 0.5), 
        axis.title.x = element_text(size = 14, face = "bold"), 
        axis.title.y = element_text(size = 14, face = "bold"), legend.title = element_blank()) +
  coord_flip()
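
The count above is the number of rows per track name and artist. A slightly stricter variant, sketched here on hypothetical toy data (the playlist and song names are made up), counts the distinct playlists a track appears on, so that a track listed twice in one playlist would not be counted twice:

```r
# Hypothetical toy data: one row per (playlist, track) pair
toy_lists <- data.frame(
  playlist_name = c("List A", "List B", "List C", "List A", "List B", "List C"),
  track.name    = c("Song X", "Song X", "Song X", "Song Y", "Song Y", "Song Z"),
  stringsAsFactors = FALSE
)

# Count the distinct playlists each track appears on
appearances <- aggregate(playlist_name ~ track.name, data = toy_lists,
                         FUN = function(x) length(unique(x)))
names(appearances)[2] <- "n_playlists"

# Keep tracks shared by at least two playlists, most shared first
common <- appearances[appearances$n_playlists >= 2, ]
common <- common[order(-common$n_playlists), ]
print(common)
```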

5.3. Danceability Density of Playlists

Danceability describes how suitable a track is for dancing, so plotting its distribution by country gives a first impression of which country's playlist is more fun and energetic. The emotions and feelings that the songs reflect are examined in the sentiment analysis later in this chapter.

ggplot(combined_lists, aes(x = danceability, fill = playlist_name)) + 
  geom_density(alpha = 0.7, color = NA)+
  labs(x = "Danceability", y = "Density") +
  guides(fill = guide_legend(title = "Playlist"))+
  theme_minimal()+
  ggtitle("Distribution of Danceability Data") +
  theme(title = element_text(size = 16, face = "bold"), plot.title = element_text(hjust = 0.5), 
        axis.title.x = element_text(size = 14, face = "bold"), 
        axis.title.y = element_text(size = 14, face = "bold"), legend.title = element_blank())

5.4. Energy and Valence Range of Playlists

In the steps above, we added a sentiment column to the songs, using the energy and valence values. The following table shows the range of energy and valence values for each country playlist.

playlist_feature_range <- combined_lists %>%
  group_by(playlist_name)%>%
  mutate(max_energy=max(energy), max_valence = max(valence))%>%
  mutate(min_energy=min(energy), min_valence = min(valence))%>%
  select(playlist_name, min_energy, max_energy, min_valence, max_valence)%>%
  unique()
kable(playlist_feature_range) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

playlist_name	min_energy	max_energy	min_valence	max_valence
Turkey Top 50	0.191	0.931	0.1050	0.928
United States Top 50	0.210	0.816	0.0605	0.947
Japan Top 50	0.320	0.979	0.1840	0.962
Brazil Top 50	0.426	0.958	0.1520	0.964

5.4.1. Energy and Valence Range of Playlists with Dumbbell Plot

The following chart is created with plotly, so you can move your cursor over the chart to see the maximum and minimum values.

energy_range_plot <- plot_ly(playlist_feature_range, color = I("gray80"),  
                hoverinfo = 'text') %>%
  add_segments(x = ~max_energy, xend = ~min_energy, y = ~playlist_name, yend = ~playlist_name, showlegend = FALSE) %>%
  add_segments(x = ~max_valence, xend = ~min_valence, y = ~playlist_name, yend = ~playlist_name, showlegend = FALSE) %>%
  add_markers(x = ~max_energy, y = ~playlist_name, name = "Maximum Energy Value", color = I("red"), size = 2.5, text=~paste('Max Energy: ', max_energy)) %>%
  add_markers(x = ~min_energy, y = ~playlist_name, name = "Minimum Energy Value", color = I("blue"), size = 2.5, text=~paste('Min Energy: ', min_energy))%>%
  add_markers(x = ~max_valence, y = ~playlist_name, name = "Maximum Valence Value", color = I("#395B74"), size = 2.5, text=~paste('Max Valence: ', max_valence)) %>%
  add_markers(x = ~min_valence, y = ~playlist_name, name = "Minimum Valence Value", color = I("#F7BC08"), size = 2.5, text=~paste('Min Valence: ', min_valence))%>%
  layout(
    title = "Playlist Energy and Valence Range",
    xaxis = list(title = "Energy and Valence"),
    yaxis= list(title="Country Lists"))
#energy_range_plot is already a plotly object, so it can be printed directly
energy_range_plot

5.5. Excitement of Playlists

While researching Spotify analyses, we came across an analysis that inspired this section, and we adapted its formula to suit our own purpose. Here, we imagine what kind of music would be fun to hear at a party or a festival. Values such as energy, danceability, tempo and loudness have a great impact on how energetic and happy a song feels. We add valence, which we also use in the sentiment analysis of the songs, to this group and use the following formula to find the average excitement of the playlists.

excitement_of_playlist <- combined_lists %>% group_by(playlist_name) %>% 
  select(playlist_name, track.name, valence, energy, loudness, danceability, tempo) %>% 
  mutate(excitement = loudness + tempo + (energy*100) + (danceability*100) + (valence*100), excitement_mean = mean(excitement))
ggplot(excitement_of_playlist, aes(x = excitement, fill = playlist_name, color = playlist_name)) + 
  geom_histogram(binwidth = 30, position = "identity", alpha = 0.7) +
  geom_vline(data = excitement_of_playlist, aes(xintercept = excitement_mean, color = playlist_name),
             linetype = "dashed") +
  labs(title = "Excitement Distribution of Playlists", y = "Count", x = "Excitement Scale") +
  theme(title = element_text(size = 16, face = "bold"), plot.title = element_text(hjust = 0.5), 
        axis.title.x = element_text(size = 14, face = "bold"), axis.title.y = element_text(size = 14, face = "bold"),
        legend.title = element_blank())
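
As a worked example of the formula, the sketch below computes the excitement score of two tracks, using the audio feature values from the track table in Section 4. The function name excitement_score is ours:

```r
# Excitement formula from the mutate() call above, as a standalone function
excitement_score <- function(loudness, tempo, energy, danceability, valence) {
  loudness + tempo + energy * 100 + danceability * 100 + valence * 100
}

# "Wir sind Kral": -6.620 + 158.003 + 68.8 + 80.1 + 41.0
score_kral <- excitement_score(loudness = -6.620, tempo = 158.003,
                               energy = 0.688, danceability = 0.801,
                               valence = 0.410)

# "AYA": -4.344 + 180.059 + 68.0 + 74.3 + 69.4
score_aya <- excitement_score(loudness = -4.344, tempo = 180.059,
                              energy = 0.680, danceability = 0.743,
                              valence = 0.694)

print(c(score_kral, score_aya))
```

The scores are about 341.28 and 387.42. Note that tempo enters in beats per minute and is not rescaled, so in these two examples it contributes more than any other single term.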

5.6. Mean Excitement of Playlists

The plot below shows the mean excitement of the playlists by country. Brazil’s playlist has a noticeably higher mean excitement than the others. Japan comes second, ahead of Turkey and the United States, and the means of these three countries are close to each other.

excitement_mean <- excitement_of_playlist %>% group_by(playlist_name) %>% select(excitement_mean) %>% unique()
kable(excitement_mean) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

playlist_name	excitement_mean
Turkey Top 50	296.3666
United States Top 50	291.7878
Japan Top 50	306.2224
Brazil Top 50	337.4382

ggplot(excitement_mean, aes(x = reorder(playlist_name, excitement_mean), y = excitement_mean, fill = playlist_name)) + 
  geom_bar(stat ="identity") + 
  labs(title = "Excitement Comparison of Playlists", x = "Country Playlist Names", y = "Means of Excitement", fill = "Country Charts", 
       caption = "The low score shows that the list is boring. \n Excitement Formula: (loudness + tempo + (energy*100) + (danceability*100) + (valence*100))") +
  theme(title = element_text(size = 16, face = "bold"), plot.title = element_text(hjust = 0.5), 
        axis.title.x = element_text(size = 14, face = "bold"), 
        axis.title.y = element_text(size = 14, face = "bold"), 
        legend.title = element_blank())

5.7. Sentiment Analysis of Country Playlists

We create a table by selecting the energy, valence and sentiment columns for each country.

sentiment_by_countries <- combined_lists %>% group_by(playlist_name) %>% 
  select(playlist_name, track.name, artist_names, valence, energy, track_sentiment)
kable(head(sentiment_by_countries,n=15L)) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

playlist_name	track.name	artist_names	valence	energy	track_sentiment
Turkey Top 50	Wir sind Kral	Ufo361	0.410	0.688	Turbulent/Angry
Turkey Top 50	AYA	Murda	0.694	0.680	Happy/Joyful
Turkey Top 50	Toz Taneleri	Sagopa Kajmer	0.458	0.725	Turbulent/Angry
Turkey Top 50	Arkadaş	Ben Fero	0.407	0.631	Turbulent/Angry
Turkey Top 50	Dance Monkey	Tones and I	0.540	0.593	Happy/Joyful
Turkey Top 50	Nalan	Emir Can İğrek	0.189	0.418	Sad/Depressing
Turkey Top 50	Yemin Olsun	Ufo361	0.310	0.582	Turbulent/Angry
Turkey Top 50	Neresi?	BEGE	0.652	0.562	Happy/Joyful
Turkey Top 50	Goal	Patron	0.461	0.750	Turbulent/Angry
Turkey Top 50	Güzel Kızlar Patron Dinler	Patron	0.537	0.723	Happy/Joyful
Turkey Top 50	Eskimiş Senelere	Aspova	0.496	0.515	Turbulent/Angry
Turkey Top 50	LOLO	Ezhel	0.566	0.603	Happy/Joyful
Turkey Top 50	Ay Tenli Kadın	Ufuk Beydemir	0.335	0.541	Turbulent/Angry
Turkey Top 50	YKKE	Ufo361	0.456	0.445	Sad/Depressing
Turkey Top 50	Neyse	Sagopa Kajmer	0.317	0.776	Turbulent/Angry

5.7.1. Sentiment Analysis of Country Playlists with Gradient Chart

The following table shows the number of songs in each sentiment class, grouped by playlist. The Brazilian playlist falls overwhelmingly in the Happy/Joyful class (41 of 50 songs). In the other playlists Happy/Joyful is still the largest class, but Turbulent/Angry is close behind (26 vs. 20 in Japan, 19 vs. 18 in both Turkey and the United States).

kable(sentiment_by_countries %>% count(track_sentiment, sort = TRUE)) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

playlist_name	track_sentiment	n
Brazil Top 50	Happy/Joyful	41
Japan Top 50	Happy/Joyful	26
Japan Top 50	Turbulent/Angry	20
Turkey Top 50	Happy/Joyful	19
United States Top 50	Happy/Joyful	19
Turkey Top 50	Turbulent/Angry	18
United States Top 50	Turbulent/Angry	18
Turkey Top 50	Sad/Depressing	9
United States Top 50	Sad/Depressing	7
United States Top 50	Chill/Peaceful	6
Brazil Top 50	Turbulent/Angry	5
Brazil Top 50	Sad/Depressing	3
Turkey Top 50	Chill/Peaceful	3
Japan Top 50	Chill/Peaceful	2
Japan Top 50	Sad/Depressing	2
Brazil Top 50	Chill/Peaceful	1
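
Because the Turkey list has 49 tracks rather than 50, shares are easier to compare than raw counts. A sketch that turns the counts in the table above into row-wise proportions (the matrix is typed in by hand from that table):

```r
# Counts copied from the table above (rows: playlist, columns: sentiment)
sentiment_counts <- rbind(
  "Brazil Top 50"        = c(41, 5, 3, 1),
  "Japan Top 50"         = c(26, 20, 2, 2),
  "Turkey Top 50"        = c(19, 18, 9, 3),
  "United States Top 50" = c(19, 18, 7, 6)
)
colnames(sentiment_counts) <- c("Happy/Joyful", "Turbulent/Angry",
                                "Sad/Depressing", "Chill/Peaceful")

# Share of each sentiment within a playlist (each row sums to 1)
sentiment_share <- round(prop.table(sentiment_counts, margin = 1), 3)
print(sentiment_share)
```

Happy/Joyful makes up 82% of the Brazil list, against 52% for Japan and roughly 38 to 39% for the United States and Turkey.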

ggplot(sentiment_by_countries,aes(x = valence, y = energy, color = track_sentiment)) + geom_point() +
  labs(color = "", title = "Sentiment Analysis by Each Country") +
  theme(title = element_text(size = 16, face = "bold"), plot.title = element_text(hjust = 0.5), 
        axis.title.x = element_text(size = 14, face = "bold"), 
        axis.title.y = element_text(size = 14, face = "bold"), legend.title = element_blank()) +
  scale_x_continuous(expand = c(0, 0), limits = c(0, 1)) + 
  scale_y_continuous(expand = c(0, 0), limits = c(0, 1)) +
  geom_label(aes(x = 0.12, y = 0.98, label = "Turbulent/Angry"), label.padding = unit(1, "mm"), fill = "grey", color="white") +
  geom_label(aes(x = 0.90, y = 0.98, label = "Happy/Joyful"), label.padding = unit(1, "mm"), fill = "grey", color="white") +
  geom_label(aes(x = 0.12, y = 0.025, label = "Sad/Depressing"), label.padding = unit(1, "mm"),  fill = "grey", color="white") +
  geom_label(aes(x = 0.895, y = 0.025, label = "Chill/Peaceful"), label.padding = unit(1, "mm"), fill = "grey", color="white") +
  geom_segment(aes(x = 1, y = 0, xend = 1, yend = 1)) +
  geom_segment(aes(x = 0, y = 0, xend = 0, yend = 1)) +
  geom_segment(aes(x = 0, y = 0, xend = 1, yend = 0)) +
  geom_segment(aes(x = 0, y = 0.5, xend = 1, yend = 0.5)) +
  geom_segment(aes(x = 0.5, y = 0, xend = 0.5, yend = 1)) +
  geom_segment(aes(x = 0, y = 1, xend = 1, yend = 1)) +
  facet_wrap(~ playlist_name)

6. Turkey Top 200 Daily Data Between 2017-2019

We obtained Turkey Top 200 daily playlist data between January 2017 and November 2019 from Spotify Charts. Because the data consists of 211,400 rows, we uploaded the data frame to our group GitHub page in RDS format.

topturkey200<-readRDS(url("https://github.com/pjournal/mef03g-spo-R-ify/blob/master/turkeytop200.rds?raw=true"))
glimpse(topturkey200)

## Observations: 211,400
## Variables: 6
## $ Position   <chr> "1", "2", "3", "4", "5", "6", "7", "8", "9", "10", ...
## $ Track.Name <chr> "Gece Gölgenin Rahatina Bak", "Starboy", "Affet", "...
## $ Artist     <chr> "Çagatay Akman", "The Weeknd", "Müslüm Gürses", "Cl...
## $ Streams    <chr> "80607", "44427", "34889", "28400", "25425", "23032...
## $ URL        <chr> "https://open.spotify.com/track/3P31rcl0ym5paqRdwSi...
## $ Date       <date> 2017-01-01, 2017-01-01, 2017-01-01, 2017-01-01, 20...

6.1. Monthly Change in Total Streams

First, we look at how total monthly streams change. The graph below shows total streams rising steeply over the period, which suggests that Spotify usage in Turkey grew rapidly.

# Total streams per month; Streams is stored as character, so convert it before summing
#topturkey200 %>% group_by(Artist)%>% summarise(Total_number=n()) %>% arrange(desc(Total_number))
change <- topturkey200 %>% mutate(Year_Month = format(Date, "%Y/%m")) %>%
  group_by(Year_Month) %>% summarise(Total_Stream = sum(as.numeric(Streams)))
ggplot(change, aes(x = Year_Month, y = Total_Stream, group = 1)) + geom_point() + geom_smooth() +
  theme(axis.text.x = element_text(angle = 90), title = element_text(size = 16, face = "bold"), plot.title = element_text(hjust = 0.5)) +
  labs(x = "Month", y = "Total Streams", title = "Total Stream Change") +
  scale_y_continuous(labels = comma)
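
The monthly aggregation can be checked on a few made-up rows. The sketch below uses base R and invented values (they are not taken from the chart data) to show the same steps: derive a year/month label from the date, convert the character stream counts to numbers, and sum per month.

```r
# Toy rows mimicking the chart layout; the values are invented for illustration.
toy <- data.frame(
  Date    = as.Date(c("2017-01-01", "2017-01-02", "2017-02-01")),
  Streams = c("100", "250", "400"),
  stringsAsFactors = FALSE
)

# Build the month label and convert the character counts before summing.
toy$Year_Month  <- format(toy$Date, "%Y/%m")
toy$Streams_num <- as.numeric(toy$Streams)

# One row per month with the summed streams.
monthly_total <- aggregate(Streams_num ~ Year_Month, data = toy, FUN = sum)
print(monthly_total)
```

Summing the character column directly would fail, which is why the conversion comes first.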

6.2. Most Streamed 20 Tracks

The most streamed song between 2017 and 2019 so far is Geceler by Ezhel. The 20 most streamed songs are shown in the table below.

rank<-topturkey200 %>% group_by(Artist,Track.Name) %>% summarise(Total_Stream=sum(as.numeric(Streams))) %>% arrange(desc(Total_Stream))
kable(head(rank,n=20L)) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

Artist	Track.Name	Total_Stream
Ezhel	Geceler	44258676
Ezhel	Felaket	43699884
Ufuk Beydemir	Ay Tenli Kadın	35480606
Norm Ender	Mekanın Sahibi	34730473
Ben Fero	Biladerim İçin	33648205
Ezhel	İmkansızım	32607042
Ben Fero	3 2 1	32086343
Ben Fero	Demet Akalın	28807007
Anıl Piyancı	KAFA10	28704551
Reynmen	Ela	28246604
Ezhel	Kazıdık Tırnaklarla	27783484
Yüzyüzeyken Konuşuruz	Ne Farkeder	27540693
Ceza	Neyim Var Ki (feat. Sagopa K)	27067016
Murda	AYA	26111866
Yüzyüzeyken Konuşuruz	Dinle Beni Bi’	25431753
Feride Hilal Akın	Yok Yok	24360307
MERO	Olabilir	23990740
Ed Sheeran	Shape of You	23958838
Anıl Piyancı	Bırakman Doğru Mu	23643712
Reynmen	Derdim Olsun	23545514

6.3. Sentiment Analysis of Tracks

6.3.1. Data Preparation

For sentiment analysis, we need the unique songs in the data frame, so that we can determine the sentiment of every track that entered the Top 200 at some point. To do this, we extract each track's Spotify ID from its URL into a new column and then keep one row per ID.

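# The track ID is everything after "https://open.spotify.com/track/" (31 characters)
top_200 <- topturkey200 %>% mutate(id = substring(URL, 32))
# Keep the first chart appearance of each track
top_200_unique <- top_200[!duplicated(top_200$id), ]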
glimpse(top_200_unique)

## Observations: 2,874
## Variables: 7
## $ Position   <chr> "1", "2", "3", "4", "5", "6", "7", "8", "9", "10", ...
## $ Track.Name <chr> "Gece Gölgenin Rahatina Bak", "Starboy", "Affet", "...
## $ Artist     <chr> "Çagatay Akman", "The Weeknd", "Müslüm Gürses", "Cl...
## $ Streams    <chr> "80607", "44427", "34889", "28400", "25425", "23032...
## $ URL        <chr> "https://open.spotify.com/track/3P31rcl0ym5paqRdwSi...
## $ Date       <date> 2017-01-01, 2017-01-01, 2017-01-01, 2017-01-01, 20...
## $ id         <chr> "3P31rcl0ym5paqRdwSiZps", "5aAx2yezTd8zXrkmtKl66Z",...
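
The fixed offset of 32 works because every URL in the data shares the same prefix. As a hedged alternative, the sketch below extracts the ID by pattern instead of by position. `extract_track_id` is a hypothetical helper name, and stripping a trailing query string is an assumption about URLs that might appear in other exports, not something observed in this data.

```r
# Hypothetical helper: take whatever follows "/track/" and drop any
# trailing query string or fragment.
extract_track_id <- function(url) {
  id <- sub("^.*/track/", "", url)
  sub("[?#].*$", "", id)
}

# Two URLs from the chart data, one of them repeated.
urls <- c(
  "https://open.spotify.com/track/3P31rcl0ym5paqRdwSiZps",
  "https://open.spotify.com/track/5aAx2yezTd8zXrkmtKl66Z",
  "https://open.spotify.com/track/3P31rcl0ym5paqRdwSiZps"
)

ids <- extract_track_id(urls)
unique_ids <- ids[!duplicated(ids)]  # keeps the first occurrence of each ID
print(unique_ids)
```

For the URLs in this data set both approaches give the same IDs.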

With the track IDs, we obtained the audio features of these songs using the Spotify Web API and the “spotifyr” R package. Because this step is slow, we saved the result as an RDS file in our GitHub repository and read it from there.

Id_list=top_200$id
# The code that obtains the track features is below. Because it takes a long time to run, the saved data frame is loaded from the GitHub repository instead.
#a<-unique(Id_list)
#tracks_features=get_track_audio_features(a[1])
#for (x in 2:length(a)){
#  tracks_features <- rbind(tracks_features,get_track_audio_features(a[x]))
#}
#tracks_features<-tracks_features%>%slice(-1) 
 
tracks_features<-readRDS(url("https://github.com/pjournal/mef03g-spo-R-ify/blob/master/top200_tracks_features.rds?raw=true"))
glimpse(tracks_features)

## Observations: 2,874
## Variables: 18
## $ danceability     <dbl> 0.769, 0.681, 0.424, 0.720, 0.476, 0.748, 0.4...
## $ energy           <dbl> 0.837, 0.594, 0.666, 0.763, 0.718, 0.524, 0.3...
## $ key              <int> 6, 7, 7, 9, 8, 8, 4, 4, 1, 5, 6, 6, 0, 8, 7, ...
## $ loudness         <dbl> -4.057, -7.028, -6.683, -4.068, -5.309, -5.59...
## $ mode             <int> 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, ...
## $ speechiness      <dbl> 0.1400, 0.2820, 0.0473, 0.0523, 0.0576, 0.033...
## $ acousticness     <dbl> 0.65200, 0.16500, 0.39600, 0.40600, 0.07840, ...
## $ instrumentalness <dbl> 0.00e+00, 3.49e-06, 1.69e-04, 0.00e+00, 1.02e...
## $ liveness         <dbl> 0.0986, 0.1340, 0.1200, 0.1800, 0.1220, 0.111...
## $ valence          <dbl> 0.8190, 0.5350, 0.2750, 0.7420, 0.1420, 0.661...
## $ tempo            <dbl> 100.962, 186.054, 160.079, 101.965, 199.864, ...
## $ type             <chr> "audio_features", "audio_features", "audio_fe...
## $ id               <chr> "3P31rcl0ym5paqRdwSiZps", "5aAx2yezTd8zXrkmtK...
## $ uri              <chr> "spotify:track:3P31rcl0ym5paqRdwSiZps", "spot...
## $ track_href       <chr> "https://api.spotify.com/v1/tracks/3P31rcl0ym...
## $ analysis_url     <chr> "https://api.spotify.com/v1/audio-analysis/3P...
## $ duration_ms      <int> 163960, 230453, 279475, 251088, 205947, 24496...
## $ time_signature   <int> 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, ...
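
Requesting features one track at a time and growing the result with `rbind` is the slow part of this step. A minimal sketch of a batched version follows. `fetch_in_batches` and `fake_fetch` are hypothetical names, and the batch size of 100 is an assumption about the API limit that should be confirmed in the spotifyr documentation. The stand-in fetch function lets the sketch run without API credentials.

```r
# Hypothetical helper: `fetch_fn` is any function that takes a character
# vector of track IDs and returns a data frame with one row per ID.
fetch_in_batches <- function(ids, fetch_fn, batch_size = 100) {
  ids <- unique(ids)
  if (length(ids) == 0) return(data.frame())
  # Split the IDs into consecutive groups of at most `batch_size`.
  batches <- split(ids, ceiling(seq_along(ids) / batch_size))
  results <- lapply(batches, fetch_fn)
  out <- do.call(rbind, results)
  rownames(out) <- NULL
  out
}

# Stand-in for the API call so the sketch runs offline.
fake_fetch <- function(ids) {
  data.frame(id = ids, energy = 0.5, stringsAsFactors = FALSE)
}

demo <- fetch_in_batches(sprintf("track%03d", 1:250), fake_fetch)
print(nrow(demo))
```

If `get_track_audio_features()` accepts a vector of IDs in one call, it could be passed as `fetch_fn` in place of the stand-in.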

After preparing the two data frames, we joined them into one. Then we added a sentiment column to the new data frame using our sentiment function.

top_200_audio_features <- inner_join(top_200_unique,tracks_features,by="id")
Sentiment=c()
for (i in 1:nrow(top_200_audio_features)){
  Sentiment[i]=classify_track_sentiment(valence=top_200_audio_features$valence[i],energy=top_200_audio_features$energy[i])
}
top_200_audio_features<-cbind(top_200_audio_features,Sentiment)
glimpse(top_200_audio_features)

## Observations: 2,874
## Variables: 25
## $ Position         <chr> "1", "2", "3", "4", "5", "6", "7", "8", "9", ...
## $ Track.Name       <chr> "Gece Gölgenin Rahatina Bak", "Starboy", "Aff...
## $ Artist           <chr> "Çagatay Akman", "The Weeknd", "Müslüm Gürses...
## $ Streams          <chr> "80607", "44427", "34889", "28400", "25425", ...
## $ URL              <chr> "https://open.spotify.com/track/3P31rcl0ym5pa...
## $ Date             <date> 2017-01-01, 2017-01-01, 2017-01-01, 2017-01-...
## $ id               <chr> "3P31rcl0ym5paqRdwSiZps", "5aAx2yezTd8zXrkmtK...
## $ danceability     <dbl> 0.769, 0.681, 0.424, 0.720, 0.476, 0.748, 0.4...
## $ energy           <dbl> 0.837, 0.594, 0.666, 0.763, 0.718, 0.524, 0.3...
## $ key              <int> 6, 7, 7, 9, 8, 8, 4, 4, 1, 5, 6, 6, 0, 8, 7, ...
## $ loudness         <dbl> -4.057, -7.028, -6.683, -4.068, -5.309, -5.59...
## $ mode             <int> 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, ...
## $ speechiness      <dbl> 0.1400, 0.2820, 0.0473, 0.0523, 0.0576, 0.033...
## $ acousticness     <dbl> 0.65200, 0.16500, 0.39600, 0.40600, 0.07840, ...
## $ instrumentalness <dbl> 0.00e+00, 3.49e-06, 1.69e-04, 0.00e+00, 1.02e...
## $ liveness         <dbl> 0.0986, 0.1340, 0.1200, 0.1800, 0.1220, 0.111...
## $ valence          <dbl> 0.8190, 0.5350, 0.2750, 0.7420, 0.1420, 0.661...
## $ tempo            <dbl> 100.962, 186.054, 160.079, 101.965, 199.864, ...
## $ type             <chr> "audio_features", "audio_features", "audio_fe...
## $ uri              <chr> "spotify:track:3P31rcl0ym5paqRdwSiZps", "spot...
## $ track_href       <chr> "https://api.spotify.com/v1/tracks/3P31rcl0ym...
## $ analysis_url     <chr> "https://api.spotify.com/v1/audio-analysis/3P...
## $ duration_ms      <int> 163960, 230453, 279475, 251088, 205947, 24496...
## $ time_signature   <int> 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, ...
## $ Sentiment        <fct> Happy/Joyful, Happy/Joyful, Turbulent/Angry, ...

kable(head(top_200_audio_features,n=10L)) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive")) %>% scroll_box(width = "100%", height = "400px")

Position	Track.Name	Artist	Streams	URL	Date	id	danceability	energy	key	loudness	mode	speechiness	acousticness	instrumentalness	liveness	valence	tempo	type	uri	track_href	analysis_url	duration_ms	time_signature	Sentiment
1	Gece Gölgenin Rahatına Bak	Çağatay Akman	80607	https://open.spotify.com/track/3P31rcl0ym5paqRdwSiZps	2017-01-01	3P31rcl0ym5paqRdwSiZps	0.769	0.837	6	-4.057	0	0.1400	0.65200	0.00e+00	0.0986	0.819	100.962	audio_features	spotify:track:3P31rcl0ym5paqRdwSiZps	https://api.spotify.com/v1/tracks/3P31rcl0ym5paqRdwSiZps	https://api.spotify.com/v1/audio-analysis/3P31rcl0ym5paqRdwSiZps	163960	4	Happy/Joyful
2	Starboy	The Weeknd	44427	https://open.spotify.com/track/5aAx2yezTd8zXrkmtKl66Z	2017-01-01	5aAx2yezTd8zXrkmtKl66Z	0.681	0.594	7	-7.028	1	0.2820	0.16500	3.50e-06	0.1340	0.535	186.054	audio_features	spotify:track:5aAx2yezTd8zXrkmtKl66Z	https://api.spotify.com/v1/tracks/5aAx2yezTd8zXrkmtKl66Z	https://api.spotify.com/v1/audio-analysis/5aAx2yezTd8zXrkmtKl66Z	230453	4	Happy/Joyful
3	Affet	Müslüm Gürses	34889	https://open.spotify.com/track/0ikRchpmFlaqlzLRgu9qWk	2017-01-01	0ikRchpmFlaqlzLRgu9qWk	0.424	0.666	7	-6.683	0	0.0473	0.39600	1.69e-04	0.1200	0.275	160.079	audio_features	spotify:track:0ikRchpmFlaqlzLRgu9qWk	https://api.spotify.com/v1/tracks/0ikRchpmFlaqlzLRgu9qWk	https://api.spotify.com/v1/audio-analysis/0ikRchpmFlaqlzLRgu9qWk	279475	4	Turbulent/Angry
4	Rockabye (feat. Sean Paul & Anne-Marie)	Clean Bandit	28400	https://open.spotify.com/track/5knuzwU65gJK7IF5yJsuaW	2017-01-01	5knuzwU65gJK7IF5yJsuaW	0.720	0.763	9	-4.068	0	0.0523	0.40600	0.00e+00	0.1800	0.742	101.965	audio_features	spotify:track:5knuzwU65gJK7IF5yJsuaW	https://api.spotify.com/v1/tracks/5knuzwU65gJK7IF5yJsuaW	https://api.spotify.com/v1/audio-analysis/5knuzwU65gJK7IF5yJsuaW	251088	4	Happy/Joyful
5	Let Me Love You	DJ Snake	25425	https://open.spotify.com/track/4pdPtRcBmOSQDlJ3Fk945m	2017-01-01	4pdPtRcBmOSQDlJ3Fk945m	0.476	0.718	8	-5.309	1	0.0576	0.07840	1.02e-05	0.1220	0.142	199.864	audio_features	spotify:track:4pdPtRcBmOSQDlJ3Fk945m	https://api.spotify.com/v1/tracks/4pdPtRcBmOSQDlJ3Fk945m	https://api.spotify.com/v1/audio-analysis/4pdPtRcBmOSQDlJ3Fk945m	205947	4	Turbulent/Angry
6	Closer	The Chainsmokers	23032	https://open.spotify.com/track/7BKLCZ1jbUBVqRi2FVlTVw	2017-01-01	7BKLCZ1jbUBVqRi2FVlTVw	0.748	0.524	8	-5.599	1	0.0338	0.41400	0.00e+00	0.1110	0.661	95.010	audio_features	spotify:track:7BKLCZ1jbUBVqRi2FVlTVw	https://api.spotify.com/v1/tracks/7BKLCZ1jbUBVqRi2FVlTVw	https://api.spotify.com/v1/audio-analysis/7BKLCZ1jbUBVqRi2FVlTVw	244960	4	Happy/Joyful
7	Haydi Söyle	Kalben	22231	https://open.spotify.com/track/3N5AZtKJALZZDHT4On8Ebx	2017-01-01	3N5AZtKJALZZDHT4On8Ebx	0.475	0.318	4	-4.836	0	0.0302	0.80000	0.00e+00	0.1680	0.317	81.050	audio_features	spotify:track:3N5AZtKJALZZDHT4On8Ebx	https://api.spotify.com/v1/tracks/3N5AZtKJALZZDHT4On8Ebx	https://api.spotify.com/v1/audio-analysis/3N5AZtKJALZZDHT4On8Ebx	166049	4	Sad/Depressing
8	Heathens	Twenty One Pilots	21631	https://open.spotify.com/track/6i0V12jOa3mr6uu4WYhUBr	2017-01-01	6i0V12jOa3mr6uu4WYhUBr	0.732	0.396	4	-9.348	0	0.0286	0.08410	3.58e-05	0.1050	0.548	90.024	audio_features	spotify:track:6i0V12jOa3mr6uu4WYhUBr	https://api.spotify.com/v1/tracks/6i0V12jOa3mr6uu4WYhUBr	https://api.spotify.com/v1/audio-analysis/6i0V12jOa3mr6uu4WYhUBr	195920	4	Chill/Peaceful
9	One Dance	Drake	20857	https://open.spotify.com/track/1xznGGDReH1oQq0xzbwXa3	2017-01-01	1xznGGDReH1oQq0xzbwXa3	0.791	0.619	1	-5.886	1	0.0532	0.00784	4.23e-03	0.3510	0.371	103.989	audio_features	spotify:track:1xznGGDReH1oQq0xzbwXa3	https://api.spotify.com/v1/tracks/1xznGGDReH1oQq0xzbwXa3	https://api.spotify.com/v1/audio-analysis/1xznGGDReH1oQq0xzbwXa3	173987	4	Turbulent/Angry
10	Lost on You	LP	20231	https://open.spotify.com/track/3SqvR3HYLlCTYzbDXJ52OC	2017-01-01	3SqvR3HYLlCTYzbDXJ52OC	0.433	0.724	5	-6.126	0	0.0372	0.10000	0.00e+00	0.0918	0.689	174.006	audio_features	spotify:track:3SqvR3HYLlCTYzbDXJ52OC	https://api.spotify.com/v1/tracks/3SqvR3HYLlCTYzbDXJ52OC	https://api.spotify.com/v1/audio-analysis/3SqvR3HYLlCTYzbDXJ52OC	268105	4	Happy/Joyful
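
The labels in the table above are consistent with a simple quadrant rule on valence and energy. The sketch below is not the `classify_track_sentiment` function used in this report, whose definition may differ. `classify_quadrant` is a hypothetical, vectorised stand-in that uses the 0.5 cut-offs drawn in the plots, and its treatment of values exactly at 0.5 is an assumption.

```r
# Hypothetical vectorised quadrant rule; ties at exactly 0.5 are assigned
# to the upper/right quadrant, which is an assumption.
classify_quadrant <- function(valence, energy) {
  ifelse(energy >= 0.5,
         ifelse(valence >= 0.5, "Happy/Joyful", "Turbulent/Angry"),
         ifelse(valence >= 0.5, "Chill/Peaceful", "Sad/Depressing"))
}

# Valence and energy of four tracks from the table above:
# Starboy, Affet, Haydi Söyle, Heathens.
valence <- c(0.535, 0.275, 0.317, 0.548)
energy  <- c(0.594, 0.666, 0.318, 0.396)

labels <- classify_quadrant(valence, energy)
print(labels)
```

For these four tracks the rule reproduces the Sentiment column of the table. Because it works on whole columns, it could replace the row-by-row loop, although the result would be a character vector rather than a factor.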

6.3.2. Sentiment Change in Months

For a proper sentiment analysis, we calculate each sentiment's percentage share of chart entries in every month. With percentages, we can compare listeners' emotional preferences across months.

df1a <- top_200 %>% mutate(Year_Month = format(Date, "%Y/%m"))
df1b <- df1a %>% left_join(select(top_200_audio_features, "Sentiment", "id"), by = "id")
monthly_sentiment <- df1b %>% group_by(Year_Month, Sentiment) %>% summarise(Count_Sentiment = n()) %>% ungroup() %>%
  group_by(Year_Month) %>% mutate(Month_Sum = sum(Count_Sentiment)) %>% ungroup() %>%
  mutate(Percent_in_Month = percent(Count_Sentiment / Month_Sum))
glimpse(monthly_sentiment)

## Observations: 140
## Variables: 5
## $ Year_Month       <chr> "2017/01", "2017/01", "2017/01", "2017/01", "...
## $ Sentiment        <fct> Chill/Peaceful, Happy/Joyful, Sad/Depressing,...
## $ Count_Sentiment  <int> 186, 2710, 848, 2456, 151, 2452, 690, 2307, 1...
## $ Month_Sum        <int> 6200, 6200, 6200, 6200, 5600, 5600, 5600, 560...
## $ Percent_in_Month <chr> "3.0%", "43.7%", "13.7%", "39.6%", "2.7%", "4...

kable(head(monthly_sentiment,n=10L)) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

Year_Month	Sentiment	Count_Sentiment	Month_Sum	Percent_in_Month
2017/01	Chill/Peaceful	186	6200	3.0%
2017/01	Happy/Joyful	2710	6200	43.7%
2017/01	Sad/Depressing	848	6200	13.7%
2017/01	Turbulent/Angry	2456	6200	39.6%
2017/02	Chill/Peaceful	151	5600	2.7%
2017/02	Happy/Joyful	2452	5600	43.8%
2017/02	Sad/Depressing	690	5600	12.3%
2017/02	Turbulent/Angry	2307	5600	41.2%
2017/03	Chill/Peaceful	183	6200	3.0%
2017/03	Happy/Joyful	2670	6200	43.1%

monthly_sentiment <- monthly_sentiment %>% mutate(Perc_Num = as.numeric(sub("%", "", Percent_in_Month, fixed = TRUE)))
ggplot(monthly_sentiment,aes(Year_Month,Perc_Num,group=Sentiment,color=Sentiment)) + geom_point() + geom_line(aes( color=Sentiment)) +  labs(title = "Monthly Sentiment Change", x = "Months", y = "Percentage") +
  theme(title = element_text(size = 16, face = "bold"), plot.title = element_text(hjust = 0.5),axis.text.x = element_text(angle = 90), 
        axis.title.x = element_text(size = 14, face = "bold"), 
        axis.title.y = element_text(size = 14, face = "bold"), legend.title = element_blank())
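
The percentage can also be kept numeric from the start, which avoids formatting it as text and parsing it back for plotting. The base R sketch below recomputes the shares for the first two months from the counts in the table above. It is an illustration of the calculation, not the code used for the report.

```r
# Counts for 2017/01 and 2017/02, copied from the table above.
counts <- data.frame(
  Year_Month = rep(c("2017/01", "2017/02"), each = 4),
  Sentiment  = rep(c("Chill/Peaceful", "Happy/Joyful",
                     "Sad/Depressing", "Turbulent/Angry"), times = 2),
  Count_Sentiment = c(186, 2710, 848, 2456, 151, 2452, 690, 2307),
  stringsAsFactors = FALSE
)

# Total chart entries per month, repeated on every row of that month.
counts$Month_Sum <- ave(counts$Count_Sentiment, counts$Year_Month, FUN = sum)

# Numeric share in percent, rounded to one decimal place.
counts$Perc_Num <- round(100 * counts$Count_Sentiment / counts$Month_Sum, 1)
print(counts)
```

The month totals come out as 6200 and 5600, and the rounded shares agree with the Percent_in_Month column.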

6.3.3. Sentiment Change in Years

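Spotify offers playlists of Turkish songs for the 1980s, 1990s, 2000s and 2010s. Each of these playlists includes only 50 songs, so we again use percentage shares to compare the decades with the last three years of chart data. Note that the decade shares count each playlist track once, whereas the 2017-2019 shares count every daily chart entry.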

df80<-get_playlist_audio_features("spotifycharts","37i9dQZF1DX4io1yPyoLtv") %>% mutate(Year="1980") 
df90<-get_playlist_audio_features("spotifycharts","37i9dQZF1DXb7MJRXLczzR") %>% mutate(Year="1990") 
df00<-get_playlist_audio_features("spotifycharts","37i9dQZF1DWYteTcNVQZNq") %>% mutate(Year="2000") 
df10<-get_playlist_audio_features("spotifycharts","37i9dQZF1DXaE9T4Nls8eC") %>% mutate(Year="2010") 
past_track_data <- rbind(df80, df90, df00, df10)
glimpse(past_track_data)

## Observations: 200
## Variables: 62
## $ playlist_id                        <chr> "37i9dQZF1DX4io1yPyoLtv", "...
## $ playlist_name                      <chr> "Türkçe 80'ler", "Türkçe 80...
## $ playlist_img                       <chr> "https://pl.scdn.co/images/...
## $ playlist_owner_name                <chr> "Spotify", "Spotify", "Spot...
## $ playlist_owner_id                  <chr> "spotify", "spotify", "spot...
## $ danceability                       <dbl> 0.395, 0.742, 0.469, 0.534,...
## $ energy                             <dbl> 0.675, 0.186, 0.486, 0.739,...
## $ key                                <int> 1, 8, 9, 4, 2, 4, 9, 7, 7, ...
## $ loudness                           <dbl> -5.881, -16.820, -15.076, -...
## $ mode                               <int> 0, 1, 0, 1, 0, 0, 0, 0, 1, ...
## $ speechiness                        <dbl> 0.0472, 0.0449, 0.0427, 0.1...
## $ acousticness                       <dbl> 0.6490, 0.8220, 0.1490, 0.6...
## $ instrumentalness                   <dbl> 0.00e+00, 0.00e+00, 1.21e-0...
## $ liveness                           <dbl> 0.0963, 0.1620, 0.1230, 0.2...
## $ valence                            <dbl> 0.496, 0.570, 0.595, 0.505,...
## $ tempo                              <dbl> 177.053, 114.982, 130.737, ...
## $ track.id                           <chr> "2rP7pI2WpMWcUraYAX2xiT", "...
## $ analysis_url                       <chr> "https://api.spotify.com/v1...
## $ time_signature                     <int> 4, 4, 4, 4, 3, 4, 5, 4, 4, ...
## $ added_at                           <chr> "2019-08-07T12:49:23Z", "20...
## $ is_local                           <lgl> FALSE, FALSE, FALSE, FALSE,...
## $ primary_color                      <lgl> NA, NA, NA, NA, NA, NA, NA,...
## $ added_by.href                      <chr> "https://api.spotify.com/v1...
## $ added_by.id                        <chr> "", "", "", "", "", "", "",...
## $ added_by.type                      <chr> "user", "user", "user", "us...
## $ added_by.uri                       <chr> "spotify:user:", "spotify:u...
## $ added_by.external_urls.spotify     <chr> "https://open.spotify.com/u...
## $ track.artists                      <list> [<data.frame[1 x 6]>, <dat...
## $ track.available_markets            <list> [<"AD", "AE", "AR", "AT", ...
## $ track.disc_number                  <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
## $ track.duration_ms                  <int> 259360, 257044, 239266, 229...
## $ track.episode                      <lgl> FALSE, FALSE, FALSE, FALSE,...
## $ track.explicit                     <lgl> FALSE, FALSE, FALSE, FALSE,...
## $ track.href                         <chr> "https://api.spotify.com/v1...
## $ track.is_local                     <lgl> FALSE, FALSE, FALSE, FALSE,...
## $ track.name                         <chr> "Tükenecegiz", "Bu Kalp Sen...
## $ track.popularity                   <int> 60, 48, 55, 50, 35, 56, 49,...
## $ track.preview_url                  <chr> "https://p.scdn.co/mp3-prev...
## $ track.track                        <lgl> TRUE, TRUE, TRUE, TRUE, TRU...
## $ track.track_number                 <int> 8, 1, 1, 6, 2, 2, 7, 8, 6, ...
## $ track.type                         <chr> "track", "track", "track", ...
## $ track.uri                          <chr> "spotify:track:2rP7pI2WpMWc...
## $ track.album.album_type             <chr> "album", "album", "album", ...
## $ track.album.artists                <list> [<data.frame[1 x 6]>, <dat...
## $ track.album.available_markets      <list> [<"AD", "AE", "AR", "AT", ...
## $ track.album.href                   <chr> "https://api.spotify.com/v1...
## $ track.album.id                     <chr> "13JKU1RyLFS73hDvWnTHr1", "...
## $ track.album.images                 <list> [<data.frame[3 x 3]>, <dat...
## $ track.album.name                   <chr> "Sen Aglama", "Bu Kalp Seni...
## $ track.album.release_date           <chr> "1984-09-06", "2002-02-20",...
## $ track.album.release_date_precision <chr> "day", "day", "day", "year"...
## $ track.album.total_tracks           <int> 11, 9, 10, 10, 10, 13, 11, ...
## $ track.album.type                   <chr> "album", "album", "album", ...
## $ track.album.uri                    <chr> "spotify:album:13JKU1RyLFS7...
## $ track.album.external_urls.spotify  <chr> "https://open.spotify.com/a...
## $ track.external_ids.isrc            <chr> "TR0061200300", "TR00806002...
## $ track.external_urls.spotify        <chr> "https://open.spotify.com/t...
## $ video_thumbnail.url                <lgl> NA, NA, NA, NA, NA, NA, NA,...
## $ key_name                           <chr> "C#", "G#", "A", "E", "D", ...
## $ mode_name                          <chr> "minor", "major", "minor", ...
## $ key_mode                           <chr> "C# minor", "G# major", "A ...
## $ Year                               <chr> "1980", "1980", "1980", "19...

Sentiment=c()
for (i in 1:nrow(past_track_data)){
    Sentiment[i]=classify_track_sentiment(valence=past_track_data$valence[i],energy=past_track_data$energy[i])
}
past_track_data<-cbind(past_track_data,Sentiment) 
past_track_data_sentiment<-past_track_data %>% group_by(Year,Sentiment) %>% summarise(Count= n())
df1c <- top_200 %>% mutate(Year = format(Date, "%Y"))
sent_count_yearly <- df1c %>% left_join(select(top_200_audio_features, "Sentiment", "id"), by = "id") %>%
  group_by(Year, Sentiment) %>% summarise(Count = n())
yearly_change <- rbind(past_track_data_sentiment, sent_count_yearly) %>% group_by(Year) %>%
  mutate(Year_Sum = sum(Count)) %>% ungroup() %>% mutate(Percent_in_Year = percent(Count / Year_Sum))
kable(yearly_change) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

Year	Sentiment	Count	Year_Sum	Percent_in_Year
1980	Chill/Peaceful	7	50	14.0%
1980	Happy/Joyful	21	50	42.0%
1980	Sad/Depressing	15	50	30.0%
1980	Turbulent/Angry	7	50	14.0%
1990	Chill/Peaceful	7	50	14.0%
1990	Happy/Joyful	29	50	58.0%
1990	Sad/Depressing	9	50	18.0%
1990	Turbulent/Angry	5	50	10.0%
2000	Chill/Peaceful	2	50	4.0%
2000	Happy/Joyful	30	50	60.0%
2000	Sad/Depressing	4	50	8.0%
2000	Turbulent/Angry	14	50	28.0%
2010	Chill/Peaceful	3	50	6.0%
2010	Happy/Joyful	28	50	56.0%
2010	Sad/Depressing	3	50	6.0%
2010	Turbulent/Angry	16	50	32.0%
2017	Chill/Peaceful	2764	72400	3.8%
2017	Happy/Joyful	34603	72400	47.8%
2017	Sad/Depressing	8352	72400	11.5%
2017	Turbulent/Angry	26681	72400	36.9%
2018	Chill/Peaceful	2899	73000	4.0%
2018	Happy/Joyful	29468	73000	40.4%
2018	Sad/Depressing	10070	73000	13.8%
2018	Turbulent/Angry	30563	73000	41.9%
2019	Chill/Peaceful	3668	66000	5.6%
2019	Happy/Joyful	24829	66000	37.6%
2019	Sad/Depressing	9439	66000	14.3%
2019	Turbulent/Angry	28064	66000	42.5%
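
To make the recent shift concrete, the change between 2017 and 2019 can be expressed in percentage points. The sketch below recomputes the shares from the counts in the table above. `change_pp` is a hypothetical helper name introduced only for this illustration.

```r
# Shares for the chart years, recomputed from the counts in the table above.
happy <- c("2017" = 34603 / 72400, "2018" = 29468 / 73000, "2019" = 24829 / 66000)
angry <- c("2017" = 26681 / 72400, "2018" = 30563 / 73000, "2019" = 28064 / 66000)

# Hypothetical helper: change from 2017 to 2019 in percentage points.
change_pp <- function(share) {
  round(100 * (share[["2019"]] - share[["2017"]]), 1)
}

print(change_pp(happy))  # -10.2
print(change_pp(angry))  # 5.7
```

Working from the unrounded counts gives a rise of 5.7 points for Turbulent/Angry, slightly different from the 5.6 obtained by subtracting the rounded percentages in the table.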

yearly_change <- yearly_change %>% mutate(Perc_Num = as.numeric(sub("%", "", Percent_in_Year, fixed = TRUE)))
ggplot(yearly_change,aes(Year,Perc_Num,group=Sentiment,color=Sentiment)) + geom_point() + geom_line(aes( color=Sentiment)) +  labs(title = "Yearly Sentiment Change", x = "Years", y = "Percentage") +
  theme(title = element_text(size = 16, face = "bold"), plot.title = element_text(hjust = 0.5),axis.text.x = element_text(angle = 90), 
        axis.title.x = element_text(size = 14, face = "bold"), 
        axis.title.y = element_text(size = 14, face = "bold"), legend.title = element_blank())

6.3.4. Sentiment Bar Graph of All Tracks

The bar chart below shows the sentiment counts of all unique songs that entered the chart between 2017 and 2019.

sent_count <- top_200_audio_features %>% group_by(Sentiment) %>% count()
ggplot(sent_count, aes(x=Sentiment, y=n, fill=Sentiment)) +
  geom_bar(stat="identity") + 
  labs(title = "Sentiment Count", x = "Sentiment Distribution", y = "Count of Sentiments") +
  theme(title = element_text(size = 16, face = "bold"), plot.title = element_text(hjust = 0.5), 
        axis.title.x = element_text(size = 14, face = "bold"), 
        axis.title.y = element_text(size = 14, face = "bold"), legend.title = element_blank())

6.3.5. Sentiment Gradient Chart

Finally, we mapped all unique Top 200 tracks by valence and energy and displayed them in a sentiment gradient chart.

ggplot(top_200_audio_features,aes(x = valence, y = energy, color = Sentiment)) + geom_point() +
  labs(color = "", title = "Sentiment Analysis of Turkey Top 200 Chart Between 2017 and 2019") +
    theme(title = element_text(size = 16, face = "bold"), plot.title = element_text(hjust = 0.5), 
        axis.title.x = element_text(size = 14, face = "bold"), 
        axis.title.y = element_text(size = 14, face = "bold"), legend.title = element_blank()) +
  scale_x_continuous(expand = c(0, 0), limits = c(0, 1)) + 
  scale_y_continuous(expand = c(0, 0), limits = c(0, 1)) +
  geom_label(aes(x = 0.25, y = 0.97, label = "Turbulent/Angry"), label.padding = unit(2, "mm"),  fill = "darkgrey", color="white") +
  geom_label(aes(x = 0.75, y = 0.97, label = "Happy/Joyful"), label.padding = unit(2, "mm"), fill = "darkgrey", color="white") +
  geom_label(aes(x = 0.25, y = 0.03, label = "Sad/Depressing"), label.padding = unit(2, "mm"),  fill = "darkgrey", color="white") +
  geom_label(aes(x = 0.75, y = 0.03, label = "Chill/Peaceful"), label.padding = unit(2, "mm"), fill = "darkgrey", color="white") +
  geom_segment(aes(x = 1, y = 0, xend = 1, yend = 1)) +
  geom_segment(aes(x = 0, y = 0, xend = 0, yend = 1)) +
  geom_segment(aes(x = 0, y = 0, xend = 1, yend = 0)) +
  geom_segment(aes(x = 0, y = 0.5, xend = 1, yend = 0.5)) +
  geom_segment(aes(x = 0.5, y = 0, xend = 0.5, yend = 1)) +
  geom_segment(aes(x = 0, y = 1, xend = 1, yend = 1))
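
The quadrant labels and dividing lines in this chart are fixed decorations rather than data. As far as we understand, a `geom_label(aes(...))` layer with constant positions inherits the plot data and draws one label per row, so a lighter alternative is to draw each decoration once with `annotate()`, `geom_hline()` and `geom_vline()`. The sketch below uses a few invented points instead of the chart data and assumes the ggplot2 package is installed.

```r
library(ggplot2)

# Toy points; the values are illustrative, not taken from the chart data.
toy <- data.frame(
  valence   = c(0.82, 0.28, 0.32, 0.55),
  energy    = c(0.84, 0.67, 0.32, 0.40),
  Sentiment = c("Happy/Joyful", "Turbulent/Angry", "Sad/Depressing", "Chill/Peaceful")
)

# Quadrant names and label positions kept together in one small data frame.
quadrants <- data.frame(
  x     = c(0.25, 0.75, 0.25, 0.75),
  y     = c(0.97, 0.97, 0.03, 0.03),
  label = c("Turbulent/Angry", "Happy/Joyful", "Sad/Depressing", "Chill/Peaceful")
)

p <- ggplot(toy, aes(x = valence, y = energy, color = Sentiment)) +
  geom_point() +
  geom_hline(yintercept = 0.5) +   # splits low and high energy
  geom_vline(xintercept = 0.5) +   # splits low and high valence
  annotate("label", x = quadrants$x, y = quadrants$y, label = quadrants$label,
           fill = "darkgrey", colour = "white") +
  scale_x_continuous(expand = c(0, 0), limits = c(0, 1)) +
  scale_y_continuous(expand = c(0, 0), limits = c(0, 1))
```

Printing `p` renders the chart. Swapping `toy` for the full data frame would keep the same overlay while drawing each label only once.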

7. Shiny Apps

7.1. Audio Features Analysis by Radar Chart

Click on the link to use our app, which compares, as a radar chart, the audio features of music charts created by Spotify or of playlists belonging to two different Spotify users.

7.2. Musical Horoscope

Click on the link to use our app, which predicts a personality type from the audio features and key characteristics of a Spotify user’s playlist.