Anılcan Atik
Dost Karaahmetli
Kutay Akalın
Tunahan Kılıç
December 11th, 2019
We analyzed the sentiments of the current Turkey, USA, Brazil and Japan Top 50 playlists on Spotify and compared them. We then analyzed how the sentiment distribution of the daily Turkey Top 200 playlist changed between 2017 and 2019 (to date).
Our data was obtained directly from the Spotify Web API. To connect to the API, we created a “Client ID” and a “Client Secret” on the Spotify for Developers website. The “spotifyr” package was used to make the connection.
library(httpuv)
library(spotifyr)
library(tidyverse)
library(knitr)
library(lubridate)
library(ggalt)
library(plotly)
library(scales)
library(kableExtra)
options(max.print=1000000)
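The connection step described above is not shown in the code. A minimal sketch follows, assuming the credentials are stored in the `SPOTIFY_CLIENT_ID` and `SPOTIFY_CLIENT_SECRET` environment variables, which is where `spotifyr` looks for them by default. The guard around the call and the message text are illustrative additions, not part of the original analysis.

```r
# Read the credentials from the environment so that secrets stay out of the script.
# (Set them beforehand, e.g. with Sys.setenv() or in an .Renviron file.)
client_id <- Sys.getenv("SPOTIFY_CLIENT_ID")
client_secret <- Sys.getenv("SPOTIFY_CLIENT_SECRET")
credentials_set <- nzchar(client_id) && nzchar(client_secret)

if (credentials_set && requireNamespace("spotifyr", quietly = TRUE)) {
  # Requests a token using the two environment variables above
  access_token <- spotifyr::get_spotify_access_token()
} else {
  message("Set SPOTIFY_CLIENT_ID and SPOTIFY_CLIENT_SECRET before calling the API.")
}
```

Keeping the secret in the environment rather than in the report also means the rendered document can be shared without exposing it.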
Once the connection is made successfully, we can access many different types of data, such as artists, albums, tracks and user profiles. The Spotify API reference lists them all. In our project, we mostly use playlist, artist and track data.
Our goal here is to download the Top 50 playlists that Spotify prepares for each country and combine them, so that we can compare musical differences between countries.
#Get Turkey Top 50
turkey_top_50_id="37i9dQZEVXbIVYVBNw9D5K"
turkey_top_50_audio_features <- get_playlist_audio_features("spotifycharts", turkey_top_50_id) %>% slice(-1)
#Get USA Top 50
usa_top_50_id = "37i9dQZEVXbLRQDuF5jeBp"
usa_top_50_audio_features <- get_playlist_audio_features("spotifycharts", usa_top_50_id)
#Get Japan Top 50
japan_top_50_id = "37i9dQZEVXbKXQ4mDTEBXq"
japan_top_50_audio_features <- get_playlist_audio_features("spotifycharts", japan_top_50_id)
#Get Brazil Top 50
brazil_top_50_id = "37i9dQZEVXbMXbN3EUUhlg"
brazil_top_50_audio_features <- get_playlist_audio_features("spotifycharts", brazil_top_50_id)
#Combining TR, USA, Japan and Brazil top 50 lists
combined_lists <- bind_rows(turkey_top_50_audio_features, usa_top_50_audio_features, japan_top_50_audio_features, brazil_top_50_audio_features)
glimpse(combined_lists)
## Observations: 199
## Variables: 61
## $ playlist_id <chr> "37i9dQZEVXbIVYVBNw9D5K", "...
## $ playlist_name <chr> "Turkey Top 50", "Turkey To...
## $ playlist_img <chr> "https://charts-images.scdn...
## $ playlist_owner_name <chr> "spotifycharts", "spotifych...
## $ playlist_owner_id <chr> "spotifycharts", "spotifych...
## $ danceability <dbl> 0.801, 0.743, 0.628, 0.810,...
## $ energy <dbl> 0.688, 0.680, 0.725, 0.631,...
## $ key <int> 9, 5, 7, 4, 6, 7, 9, 6, 5, ...
## $ loudness <dbl> -6.620, -4.344, -7.387, -7....
## $ mode <int> 1, 0, 1, 0, 0, 0, 0, 1, 0, ...
## $ speechiness <dbl> 0.1130, 0.1030, 0.1100, 0.3...
## $ acousticness <dbl> 0.2780, 0.1130, 0.0266, 0.1...
## $ instrumentalness <dbl> 8.88e-03, 1.23e-01, 0.00e+0...
## $ liveness <dbl> 0.1500, 0.1830, 0.0549, 0.1...
## $ valence <dbl> 0.410, 0.694, 0.458, 0.407,...
## $ tempo <dbl> 158.003, 180.059, 173.952, ...
## $ track.id <chr> "2KlbLTnQ5Wch2oOelW0Y2k", "...
## $ analysis_url <chr> "https://api.spotify.com/v1...
## $ time_signature <int> 4, 4, 4, 4, 4, 4, 4, 4, 4, ...
## $ added_at <chr> "1970-01-01T00:00:00Z", "19...
## $ is_local <lgl> FALSE, FALSE, FALSE, FALSE,...
## $ primary_color <lgl> NA, NA, NA, NA, NA, NA, NA,...
## $ added_by.href <chr> "https://api.spotify.com/v1...
## $ added_by.id <chr> "", "", "", "", "", "", "",...
## $ added_by.type <chr> "user", "user", "user", "us...
## $ added_by.uri <chr> "spotify:user:", "spotify:u...
## $ added_by.external_urls.spotify <chr> "https://open.spotify.com/u...
## $ track.artists <list> [<data.frame[2 x 6]>, <dat...
## $ track.available_markets <list> [<"AD", "AE", "AR", "AT", ...
## $ track.disc_number <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
## $ track.duration_ms <int> 154251, 196583, 269002, 185...
## $ track.episode <lgl> FALSE, FALSE, FALSE, FALSE,...
## $ track.explicit <lgl> TRUE, FALSE, FALSE, TRUE, F...
## $ track.href <chr> "https://api.spotify.com/v1...
## $ track.is_local <lgl> FALSE, FALSE, FALSE, FALSE,...
## $ track.name <chr> "Wir sind Kral", "AYA", "To...
## $ track.popularity <int> 61, 70, 74, 78, 81, 77, 60,...
## $ track.preview_url <chr> NA, NA, "https://p.scdn.co/...
## $ track.track <lgl> TRUE, TRUE, TRUE, TRUE, TRU...
## $ track.track_number <int> 11, 1, 1, 1, 1, 1, 6, 1, 1,...
## $ track.type <chr> "track", "track", "track", ...
## $ track.uri <chr> "spotify:track:2KlbLTnQ5Wch...
## $ track.album.album_type <chr> "album", "single", "single"...
## $ track.album.artists <list> [<data.frame[2 x 6]>, <dat...
## $ track.album.available_markets <list> [<"AD", "AE", "AR", "AT", ...
## $ track.album.href <chr> "https://api.spotify.com/v1...
## $ track.album.id <chr> "3gSlFOtOhV1OS3qTQx4g55", "...
## $ track.album.images <list> [<data.frame[3 x 3]>, <dat...
## $ track.album.name <chr> "Lights Out", "AYA", "Sarka...
## $ track.album.release_date <chr> "2019-11-14", "2019-09-20",...
## $ track.album.release_date_precision <chr> "day", "day", "day", "day",...
## $ track.album.total_tracks <int> 12, 1, 5, 1, 1, 1, 12, 1, 1...
## $ track.album.type <chr> "album", "album", "album", ...
## $ track.album.uri <chr> "spotify:album:3gSlFOtOhV1O...
## $ track.album.external_urls.spotify <chr> "https://open.spotify.com/a...
## $ track.external_ids.isrc <chr> "DECE71900765", "NLG6619008...
## $ track.external_urls.spotify <chr> "https://open.spotify.com/t...
## $ video_thumbnail.url <lgl> NA, NA, NA, NA, NA, NA, NA,...
## $ key_name <chr> "A", "F", "G", "E", "F#", "...
## $ mode_name <chr> "major", "minor", "major", ...
## $ key_mode <chr> "A major", "F minor", "G ma...
The function “classify_track_sentiment” is central to our work: it reveals the mood of individual songs and, through them, of whole playlists. Energy and valence are two important factors for interpreting emotion in music. Both take values between 0 and 1, and their combination classifies a song as turbulent/angry, happy/joyful, sad/depressing or chill/peaceful, using 0.5 as the threshold on each axis.
According to Spotify's “Get Audio Features for a Track” reference, the two factors are defined as follows.
Energy is a measure from 0.0 to 1.0 and represents a perceptual measure of intensity and activity. Typically, energetic tracks feel fast, loud, and noisy. For example, death metal has high energy, while a Bach prelude scores low on the scale. Perceptual features contributing to this attribute include dynamic range, perceived loudness, timbre, onset rate, and general entropy.
Valence is a measure from 0.0 to 1.0 describing the musical positiveness conveyed by a track. Tracks with high valence sound more positive (e.g. happy, cheerful, euphoric), while tracks with low valence sound more negative (e.g. sad, depressed, angry).
# Classify one track into a mood quadrant from its valence and energy (both between 0 and 1)
classify_track_sentiment <- function(valence, energy) {
  # Tracks with missing audio features cannot be classified
  if (is.na(valence) || is.na(energy)) {
    return(NA)
  } else if (valence >= .5) {
    if (energy >= .5) {
      return('Happy/Joyful')
    } else {
      return('Chill/Peaceful')
    }
  } else {
    if (energy >= .5) {
      return('Turbulent/Angry')
    } else {
      return('Sad/Depressing')
    }
  }
}
#Classifying every track; columns are referenced by name rather than by position
track_sentiment <- c()
for (i in seq_len(nrow(combined_lists))) {
  track_sentiment[i] <- classify_track_sentiment(combined_lists$valence[i], combined_lists$energy[i])
}
#Adding sentiment column to Combined of four countries
combined_lists <- cbind(combined_lists, track_sentiment)
#Selecting the track-level columns, including the nested artist data
track_audio_combined <- combined_lists %>%
  select(track.name,track.id,track.artists,track.album.release_date,track.popularity,danceability:tempo,track_sentiment,track.duration_ms)
kable(head(track_audio_combined)) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive")) %>% scroll_box(width = "100%", height = "400px")
track.name | track.id | track.artists | track.album.release_date | track.popularity | danceability | energy | key | loudness | mode | speechiness | acousticness | instrumentalness | liveness | valence | tempo | track_sentiment | track.duration_ms |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Wir sind Kral | 2KlbLTnQ5Wch2oOelW0Y2k | list(href = c(“https://api.spotify.com/v1/artists/5pVRwX5ZQR7hfJ18w8ZYkl”, “https://api.spotify.com/v1/artists/6LnJKrtFnTEGdbWQ2riWCL”), id = c(“5pVRwX5ZQR7hfJ18w8ZYkl”, “6LnJKrtFnTEGdbWQ2riWCL”), name = c(“Ufo361”, “Ezhel”), type = c(“artist”, “artist”), uri = c(“spotify:artist:5pVRwX5ZQR7hfJ18w8ZYkl”, “spotify:artist:6LnJKrtFnTEGdbWQ2riWCL”), external_urls.spotify = c(“https://open.spotify.com/artist/5pVRwX5ZQR7hfJ18w8ZYkl”, “https://open.spotify.com/artist/6LnJKrtFnTEGdbWQ2riWCL”)) | 2019-11-14 | 61 | 0.801 | 0.688 | 9 | -6.620 | 1 | 0.1130 | 0.2780 | 0.008880 | 0.1500 | 0.410 | 158.003 | Turbulent/Angry | 154251 |
AYA | 4IJEw3fDvS6XF4sDc3bvjK | list(href = c(“https://api.spotify.com/v1/artists/2y1VzMKAa5nmfXKtJL9jnj”, “https://api.spotify.com/v1/artists/6LnJKrtFnTEGdbWQ2riWCL”), id = c(“2y1VzMKAa5nmfXKtJL9jnj”, “6LnJKrtFnTEGdbWQ2riWCL”), name = c(“Murda”, “Ezhel”), type = c(“artist”, “artist”), uri = c(“spotify:artist:2y1VzMKAa5nmfXKtJL9jnj”, “spotify:artist:6LnJKrtFnTEGdbWQ2riWCL”), external_urls.spotify = c(“https://open.spotify.com/artist/2y1VzMKAa5nmfXKtJL9jnj”, “https://open.spotify.com/artist/6LnJKrtFnTEGdbWQ2riWCL”)) | 2019-09-20 | 70 | 0.743 | 0.680 | 5 | -4.344 | 0 | 0.1030 | 0.1130 | 0.123000 | 0.1830 | 0.694 | 180.059 | Happy/Joyful | 196583 |
Toz Taneleri | 36ulbeGLdspdIYSFKXIlmN | list(href = “https://api.spotify.com/v1/artists/1KXTegXtnCPKXjRaX1llcD”, id = “1KXTegXtnCPKXjRaX1llcD”, name = “Sagopa Kajmer”, type = “artist”, uri = “spotify:artist:1KXTegXtnCPKXjRaX1llcD”, external_urls.spotify = “https://open.spotify.com/artist/1KXTegXtnCPKXjRaX1llcD”) | 2019-11-29 | 74 | 0.628 | 0.725 | 7 | -7.387 | 1 | 0.1100 | 0.0266 | 0.000000 | 0.0549 | 0.458 | 173.952 | Turbulent/Angry | 269002 |
Arkadaş | 6bBnnrknLbDoOCUdKMkmnq | list(href = “https://api.spotify.com/v1/artists/2kS0jWMkkFBL0mrl0VotD0”, id = “2kS0jWMkkFBL0mrl0VotD0”, name = “Ben Fero”, type = “artist”, uri = “spotify:artist:2kS0jWMkkFBL0mrl0VotD0”, external_urls.spotify = “https://open.spotify.com/artist/2kS0jWMkkFBL0mrl0VotD0”) | 2019-11-08 | 78 | 0.810 | 0.631 | 4 | -7.855 | 0 | 0.3270 | 0.1170 | 0.000000 | 0.1190 | 0.407 | 144.978 | Turbulent/Angry | 185566 |
Dance Monkey | 1rgnBhdG2JDFTbYkYRZAku | list(href = “https://api.spotify.com/v1/artists/2NjfBq1NflQcKSeiDooVjY”, id = “2NjfBq1NflQcKSeiDooVjY”, name = “Tones and I”, type = “artist”, uri = “spotify:artist:2NjfBq1NflQcKSeiDooVjY”, external_urls.spotify = “https://open.spotify.com/artist/2NjfBq1NflQcKSeiDooVjY”) | 2019-05-10 | 81 | 0.825 | 0.593 | 6 | -6.401 | 0 | 0.0988 | 0.6880 | 0.000161 | 0.1700 | 0.540 | 98.078 | Happy/Joyful | 209754 |
Nalan | 1LNUxWJifZNEPpd273N2le | list(href = “https://api.spotify.com/v1/artists/4XP7cGw4t8BqZ8Du5q3bHg”, id = “4XP7cGw4t8BqZ8Du5q3bHg”, name = “Emir Can Igrek”, type = “artist”, uri = “spotify:artist:4XP7cGw4t8BqZ8Du5q3bHg”, external_urls.spotify = “https://open.spotify.com/artist/4XP7cGw4t8BqZ8Du5q3bHg”) | 2019-09-06 | 77 | 0.540 | 0.418 | 7 | -15.570 | 0 | 0.0761 | 0.1900 | 0.685000 | 0.1980 | 0.189 | 91.042 | Sad/Depressing | 199958 |
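The quadrant rule in `classify_track_sentiment` can also be applied to whole columns at once instead of one track at a time. The sketch below is an illustrative alternative, not the authors' code: `classify_sentiment_vec` is a hypothetical helper name, the first three value pairs are taken from the table above, and the last two are made-up cases that cover the remaining quadrant and a missing value.

```r
# Vectorized form of the quadrant rule: valence and energy are each split at 0.5.
classify_sentiment_vec <- function(valence, energy) {
  ifelse(
    valence >= 0.5,
    ifelse(energy >= 0.5, "Happy/Joyful", "Chill/Peaceful"),
    ifelse(energy >= 0.5, "Turbulent/Angry", "Sad/Depressing")
  )
}

# Rows 1-3: Wir sind Kral, AYA, Nalan (from the table); rows 4-5 are hypothetical.
valence <- c(0.410, 0.694, 0.189, 0.700, NA)
energy  <- c(0.688, 0.680, 0.418, 0.300, 0.500)

sentiment <- classify_sentiment_vec(valence, energy)
print(data.frame(valence, energy, sentiment))
```

A missing valence or energy propagates to a missing label, which matches the `NA` branch of the scalar function.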
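#Taking the first listed artist of each track as its artist name
artist_names <- c()
for (i in seq_len(nrow(track_audio_combined))) {
  artist_names[i] <- track_audio_combined$track.artists[[i]]$name[1]
}
combined_lists <- cbind(combined_lists, artist_names)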
glimpse(combined_lists)
## Observations: 199
## Variables: 63
## $ playlist_id <chr> "37i9dQZEVXbIVYVBNw9D5K", "...
## $ playlist_name <chr> "Turkey Top 50", "Turkey To...
## $ playlist_img <chr> "https://charts-images.scdn...
## $ playlist_owner_name <chr> "spotifycharts", "spotifych...
## $ playlist_owner_id <chr> "spotifycharts", "spotifych...
## $ danceability <dbl> 0.801, 0.743, 0.628, 0.810,...
## $ energy <dbl> 0.688, 0.680, 0.725, 0.631,...
## $ key <int> 9, 5, 7, 4, 6, 7, 9, 6, 5, ...
## $ loudness <dbl> -6.620, -4.344, -7.387, -7....
## $ mode <int> 1, 0, 1, 0, 0, 0, 0, 1, 0, ...
## $ speechiness <dbl> 0.1130, 0.1030, 0.1100, 0.3...
## $ acousticness <dbl> 0.2780, 0.1130, 0.0266, 0.1...
## $ instrumentalness <dbl> 8.88e-03, 1.23e-01, 0.00e+0...
## $ liveness <dbl> 0.1500, 0.1830, 0.0549, 0.1...
## $ valence <dbl> 0.410, 0.694, 0.458, 0.407,...
## $ tempo <dbl> 158.003, 180.059, 173.952, ...
## $ track.id <chr> "2KlbLTnQ5Wch2oOelW0Y2k", "...
## $ analysis_url <chr> "https://api.spotify.com/v1...
## $ time_signature <int> 4, 4, 4, 4, 4, 4, 4, 4, 4, ...
## $ added_at <chr> "1970-01-01T00:00:00Z", "19...
## $ is_local <lgl> FALSE, FALSE, FALSE, FALSE,...
## $ primary_color <lgl> NA, NA, NA, NA, NA, NA, NA,...
## $ added_by.href <chr> "https://api.spotify.com/v1...
## $ added_by.id <chr> "", "", "", "", "", "", "",...
## $ added_by.type <chr> "user", "user", "user", "us...
## $ added_by.uri <chr> "spotify:user:", "spotify:u...
## $ added_by.external_urls.spotify <chr> "https://open.spotify.com/u...
## $ track.artists <list> [<data.frame[2 x 6]>, <dat...
## $ track.available_markets <list> [<"AD", "AE", "AR", "AT", ...
## $ track.disc_number <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
## $ track.duration_ms <int> 154251, 196583, 269002, 185...
## $ track.episode <lgl> FALSE, FALSE, FALSE, FALSE,...
## $ track.explicit <lgl> TRUE, FALSE, FALSE, TRUE, F...
## $ track.href <chr> "https://api.spotify.com/v1...
## $ track.is_local <lgl> FALSE, FALSE, FALSE, FALSE,...
## $ track.name <chr> "Wir sind Kral", "AYA", "To...
## $ track.popularity <int> 61, 70, 74, 78, 81, 77, 60,...
## $ track.preview_url <chr> NA, NA, "https://p.scdn.co/...
## $ track.track <lgl> TRUE, TRUE, TRUE, TRUE, TRU...
## $ track.track_number <int> 11, 1, 1, 1, 1, 1, 6, 1, 1,...
## $ track.type <chr> "track", "track", "track", ...
## $ track.uri <chr> "spotify:track:2KlbLTnQ5Wch...
## $ track.album.album_type <chr> "album", "single", "single"...
## $ track.album.artists <list> [<data.frame[2 x 6]>, <dat...
## $ track.album.available_markets <list> [<"AD", "AE", "AR", "AT", ...
## $ track.album.href <chr> "https://api.spotify.com/v1...
## $ track.album.id <chr> "3gSlFOtOhV1OS3qTQx4g55", "...
## $ track.album.images <list> [<data.frame[3 x 3]>, <dat...
## $ track.album.name <chr> "Lights Out", "AYA", "Sarka...
## $ track.album.release_date <chr> "2019-11-14", "2019-09-20",...
## $ track.album.release_date_precision <chr> "day", "day", "day", "day",...
## $ track.album.total_tracks <int> 12, 1, 5, 1, 1, 1, 12, 1, 1...
## $ track.album.type <chr> "album", "album", "album", ...
## $ track.album.uri <chr> "spotify:album:3gSlFOtOhV1O...
## $ track.album.external_urls.spotify <chr> "https://open.spotify.com/a...
## $ track.external_ids.isrc <chr> "DECE71900765", "NLG6619008...
## $ track.external_urls.spotify <chr> "https://open.spotify.com/t...
## $ video_thumbnail.url <lgl> NA, NA, NA, NA, NA, NA, NA,...
## $ key_name <chr> "A", "F", "G", "E", "F#", "...
## $ mode_name <chr> "major", "minor", "major", ...
## $ key_mode <chr> "A major", "F minor", "G ma...
## $ track_sentiment <fct> Turbulent/Angry, Happy/Joyf...
## $ artist_names <fct> Ufo361, Murda, Sagopa Kajme...
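The `artist_names` column above holds one name per track, although `track.artists` can list several artists (for example Ufo361 and Ezhel on "Wir sind Kral"). The following sketch shows, on a two-row toy data frame built from values in the table above, how the first artist or all artists could be pulled out of such a list-column; `toy_tracks`, `first_artist` and `all_artists` are illustrative names, and the nested data frames are reduced to two columns.

```r
# Toy stand-in for combined_lists: each track.artists entry is a small data frame.
toy_tracks <- data.frame(
  track.name = c("Wir sind Kral", "Toz Taneleri"),
  stringsAsFactors = FALSE
)
toy_tracks$track.artists <- list(
  data.frame(id = c("5pVRwX5ZQR7hfJ18w8ZYkl", "6LnJKrtFnTEGdbWQ2riWCL"),
             name = c("Ufo361", "Ezhel"), stringsAsFactors = FALSE),
  data.frame(id = "1KXTegXtnCPKXjRaX1llcD",
             name = "Sagopa Kajmer", stringsAsFactors = FALSE)
)

# First listed artist only (what the report keeps)
first_artist <- vapply(toy_tracks$track.artists,
                       function(a) a$name[1], character(1))

# All credited artists, collapsed into one string per track
all_artists <- vapply(toy_tracks$track.artists,
                      function(a) paste(a$name, collapse = ", "), character(1))

print(data.frame(track = toy_tracks$track.name, first_artist, all_artists))
```

Keeping only the first artist is simpler for grouping, but it attributes collaborations to a single name.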
Only a few songs appear in more than one of the four lists (see Analysis 5.2), so the lists can be compared as largely distinct sets of songs. The table and plot below show which musical keys are used most in each country's chart.
country_by_key <- combined_lists%>%
select(playlist_name, key_name, track.name)%>%
group_by(playlist_name) %>% count(key_name, sort = TRUE)
kable(country_by_key) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
playlist_name | key_name | n |
---|---|---|
United States Top 50 | G | 10 |
Brazil Top 50 | B | 8 |
Japan Top 50 | C# | 8 |
Japan Top 50 | G | 8 |
United States Top 50 | C | 8 |
Turkey Top 50 | A | 7 |
United States Top 50 | D | 7 |
Brazil Top 50 | D | 6 |
Brazil Top 50 | F# | 6 |
Turkey Top 50 | D | 6 |
Turkey Top 50 | F | 6 |
Turkey Top 50 | G# | 6 |
United States Top 50 | C# | 6 |
Brazil Top 50 | C# | 5 |
Brazil Top 50 | G# | 5 |
Japan Top 50 | D | 5 |
Japan Top 50 | E | 5 |
Turkey Top 50 | F# | 5 |
Turkey Top 50 | G | 5 |
United States Top 50 | F# | 5 |
Brazil Top 50 | A# | 4 |
Brazil Top 50 | F | 4 |
Japan Top 50 | A | 4 |
Japan Top 50 | A# | 4 |
Japan Top 50 | F# | 4 |
Japan Top 50 | G# | 4 |
United States Top 50 | A# | 4 |
Brazil Top 50 | C | 3 |
Brazil Top 50 | E | 3 |
Brazil Top 50 | G | 3 |
Japan Top 50 | B | 3 |
Turkey Top 50 | A# | 3 |
Turkey Top 50 | B | 3 |
Turkey Top 50 | C | 3 |
Turkey Top 50 | E | 3 |
United States Top 50 | E | 3 |
United States Top 50 | G# | 3 |
Brazil Top 50 | A | 2 |
Japan Top 50 | C | 2 |
Japan Top 50 | F | 2 |
United States Top 50 | B | 2 |
Brazil Top 50 | D# | 1 |
Japan Top 50 | D# | 1 |
Turkey Top 50 | C# | 1 |
Turkey Top 50 | D# | 1 |
United States Top 50 | A | 1 |
United States Top 50 | F | 1 |
ggplot(country_by_key, aes(x = key_name, y = n, fill = playlist_name)) +
geom_bar(stat = "identity") +
labs(title = "Playlists by Key Name", x = "Key Name", y = "Total Number of Keys") +
theme(title = element_text(size = 16, face = "bold"), plot.title = element_text(hjust = 0.5),
axis.title.x = element_text(size = 14, face = "bold"),
axis.title.y = element_text(size = 14, face = "bold"), legend.title = element_blank())
The charts of the four countries may share songs, which could affect our analysis. The following table and plot show the songs that appear in more than one chart and the number of playlists each appears in.
common_songs <- combined_lists %>% group_by(track.name, artist_names) %>%
summarise(n_songs = n()) %>%
filter(n_songs >= 2) %>%
arrange(desc(n_songs))
kable(common_songs) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
track.name | artist_names | n_songs |
---|---|---|
Señorita | Shawn Mendes | 4 |
bad guy | Billie Eilish | 3 |
Dance Monkey | Tones and I | 3 |
All I Want for Christmas Is You | Mariah Carey | 2 |
everything i wanted | Billie Eilish | 2 |
Memories | Maroon 5 | 2 |
ggplot(common_songs, aes(x = reorder(track.name, n_songs), y = n_songs, fill = artist_names)) +
geom_bar(stat = "identity") +
labs(title = "Common Songs on Playlists", x = "Song Name", y = "Number of Songs") +
theme(title = element_text(size = 16, face = "bold"), plot.title = element_text(hjust = 0.5),
axis.title.x = element_text(size = 14, face = "bold"),
axis.title.y = element_text(size = 14, face = "bold"), legend.title = element_blank()) +
coord_flip()
Danceability describes how suitable a track is for dancing; we read it here as a rough indicator of how fun and energetic a song is. Plotting its distribution by country shows which country's playlist leans more toward danceable music. The emotions and feelings that the songs reflect are examined in the analyses that follow.
ggplot(combined_lists, aes(x = danceability, fill = playlist_name)) +
geom_density(alpha = 0.7, color = NA)+
labs(x = "Danceability", y = "Density") +
guides(fill = guide_legend(title = "Playlist"))+
theme_minimal()+
ggtitle("Distribution of Danceability Data") +
theme(title = element_text(size = 16, face = "bold"), plot.title = element_text(hjust = 0.5),
axis.title.x = element_text(size = 14, face = "bold"),
axis.title.y = element_text(size = 14, face = "bold"), legend.title = element_blank())
In the steps above, we added a sentiment column to the songs, derived from their energy and valence values. The following table and chart show the range of energy and valence values in each country's playlist.
playlist_feature_range <- combined_lists %>%
group_by(playlist_name)%>%
mutate(max_energy=max(energy), max_valence = max(valence))%>%
mutate(min_energy=min(energy), min_valence = min(valence))%>%
select(playlist_name, min_energy, max_energy, min_valence, max_valence)%>%
unique()
kable(playlist_feature_range) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
playlist_name | min_energy | max_energy | min_valence | max_valence |
---|---|---|---|---|
Turkey Top 50 | 0.191 | 0.931 | 0.1050 | 0.928 |
United States Top 50 | 0.210 | 0.816 | 0.0605 | 0.947 |
Japan Top 50 | 0.320 | 0.979 | 0.1840 | 0.962 |
Brazil Top 50 | 0.426 | 0.958 | 0.1520 | 0.964 |
The following chart is created with plotly, so you can move your cursor over the chart to see the maximum and minimum values.
energy_range_plot <- plot_ly(playlist_feature_range, color = I("gray80"),
hoverinfo = 'text') %>%
add_segments(x = ~max_energy, xend = ~min_energy, y = ~playlist_name, yend = ~playlist_name, showlegend = FALSE) %>%
add_segments(x = ~max_valence, xend = ~min_valence, y = ~playlist_name, yend = ~playlist_name, showlegend = FALSE) %>%
add_markers(x = ~max_energy, y = ~playlist_name, name = "Maximum Energy Value", color = I("red"), size = 2.5, text=~paste('Max Energy: ', max_energy)) %>%
add_markers(x = ~min_energy, y = ~playlist_name, name = "Minimum Energy Value", color = I("blue"), size = 2.5, text=~paste('Min Energy: ', min_energy))%>%
add_markers(x = ~max_valence, y = ~playlist_name, name = "Maximum Valence Value", color = I("#395B74"), size = 2.5, text=~paste('Max Valence: ', max_valence)) %>%
add_markers(x = ~min_valence, y = ~playlist_name, name = "Minimum Valence Value", color = I("#F7BC08"), size = 2.5, text=~paste('Min Valence: ', min_valence))%>%
layout(
title = "Playlist Energy and Valence Range",
xaxis = list(title = "Energy and Valence"),
yaxis= list(title="Country Lists"))
ggplotly(energy_range_plot)
While researching Spotify analyses, we came across a beautiful analysis whose formula we adapted to suit our own purpose. The idea is to imagine what kind of music would be fun to hear at a party or a festival. Energy, danceability, tempo and loudness all contribute to how energetic and upbeat a song feels. To this group we add valence, which we also use in the sentiment analysis of the songs, and compute an excitement score for each track with the formula below. The average excitement of a playlist is the mean of its tracks' scores.
excitement_of_playlist <- combined_lists %>% group_by(playlist_name) %>%
select(playlist_name, track.name, valence, energy, loudness, danceability, tempo) %>%
mutate(excitement = loudness + tempo + (energy*100) + (danceability*100) + (valence*100), excitement_mean = mean(excitement))
ggplot(excitement_of_playlist, aes(x = excitement, fill = playlist_name, color = playlist_name)) +
geom_histogram(binwidth = 30, position = "identity", alpha = 0.7) +
geom_vline(data = excitement_of_playlist, aes(xintercept = excitement_mean, color = playlist_name),
linetype = "dashed") +
labs(title = "Excitement Distribution of Playlists", y = "Count", x = "Excitement Scale") +
theme(title = element_text(size = 16, face = "bold"), plot.title = element_text(hjust = 0.5),
axis.title.x = element_text(size = 14, face = "bold"), axis.title.y = element_text(size = 14, face = "bold"),
legend.title = element_blank())
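To make the excitement formula concrete, the sketch below applies it to a single track, using the audio features of "Wir sind Kral" as printed in the tables earlier in this report. `excitement_score` is a hypothetical helper name introduced only for this example.

```r
# Excitement formula from the report, written as a standalone function.
# energy, danceability and valence are on a 0-1 scale and are multiplied by 100;
# tempo (BPM) and loudness (dB, usually negative) enter unscaled.
excitement_score <- function(loudness, tempo, energy, danceability, valence) {
  loudness + tempo + energy * 100 + danceability * 100 + valence * 100
}

# Audio features of "Wir sind Kral" (first row of the Turkey Top 50 table)
first_track <- excitement_score(
  loudness = -6.620, tempo = 158.003,
  energy = 0.688, danceability = 0.801, valence = 0.410
)
print(first_track)
```

The score comes to about 341.3, of which tempo alone contributes 158, so fast songs weigh heavily in this measure.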
The table and plot below show the mean excitement of each country's playlist. Brazil's playlist stands out as the most exciting and entertaining, with a mean of 337.4. Japan (306.2) comes second among the remaining three countries, whose averages are close to one another: Turkey is at 296.4 and the United States at 291.8.
excitement_mean <- excitement_of_playlist %>% group_by(playlist_name) %>% select(excitement_mean) %>% unique()
kable(excitement_mean) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
playlist_name | excitement_mean |
---|---|
Turkey Top 50 | 296.3666 |
United States Top 50 | 291.7878 |
Japan Top 50 | 306.2224 |
Brazil Top 50 | 337.4382 |
ggplot(excitement_mean, aes(x = reorder(playlist_name, excitement_mean), y = excitement_mean, fill = playlist_name)) +
geom_bar(stat ="identity") +
labs(title = "Excitement Comparison of Playlists", x = "Country Playlist Names", y = "Means of Excitement", fill = "Country Charts",
caption = "The low score shows that the list is boring. \n Excitement Formula: (loudness + tempo + (energy*100) + (danceability*100) + (valence*100))") +
theme(title = element_text(size = 16, face = "bold"), plot.title = element_text(hjust = 0.5),
axis.title.x = element_text(size = 14, face = "bold"),
axis.title.y = element_text(size = 14, face = "bold"),
legend.title = element_blank())
We create a table by selecting the energy, valence and sentiment columns for each country.
sentiment_by_countries <- combined_lists %>% group_by(playlist_name) %>%
select(playlist_name, track.name, artist_names, valence, energy, track_sentiment)
kable(head(sentiment_by_countries,n=15L)) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
playlist_name | track.name | artist_names | valence | energy | track_sentiment |
---|---|---|---|---|---|
Turkey Top 50 | Wir sind Kral | Ufo361 | 0.410 | 0.688 | Turbulent/Angry |
Turkey Top 50 | AYA | Murda | 0.694 | 0.680 | Happy/Joyful |
Turkey Top 50 | Toz Taneleri | Sagopa Kajmer | 0.458 | 0.725 | Turbulent/Angry |
Turkey Top 50 | Arkadaş | Ben Fero | 0.407 | 0.631 | Turbulent/Angry |
Turkey Top 50 | Dance Monkey | Tones and I | 0.540 | 0.593 | Happy/Joyful |
Turkey Top 50 | Nalan | Emir Can İğrek | 0.189 | 0.418 | Sad/Depressing |
Turkey Top 50 | Yemin Olsun | Ufo361 | 0.310 | 0.582 | Turbulent/Angry |
Turkey Top 50 | Neresi? | BEGE | 0.652 | 0.562 | Happy/Joyful |
Turkey Top 50 | Goal | Patron | 0.461 | 0.750 | Turbulent/Angry |
Turkey Top 50 | Güzel Kızlar Patron Dinler | Patron | 0.537 | 0.723 | Happy/Joyful |
Turkey Top 50 | Eskimiş Senelere | Aspova | 0.496 | 0.515 | Turbulent/Angry |
Turkey Top 50 | LOLO | Ezhel | 0.566 | 0.603 | Happy/Joyful |
Turkey Top 50 | Ay Tenli Kadın | Ufuk Beydemir | 0.335 | 0.541 | Turbulent/Angry |
Turkey Top 50 | YKKE | Ufo361 | 0.456 | 0.445 | Sad/Depressing |
Turkey Top 50 | Neyse | Sagopa Kajmer | 0.317 | 0.776 | Turbulent/Angry |
The following table shows how many songs fall into each sentiment class, grouped by country. The Brazilian playlist is dominated by the Happy/Joyful class (41 of 50 songs). In the other playlists Happy/Joyful is still the largest class, but Turbulent/Angry follows closely: 26 vs. 20 in Japan, and 19 vs. 18 in both Turkey and the United States.
kable(sentiment_by_countries %>% count(track_sentiment, sort = TRUE)) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
playlist_name | track_sentiment | n |
---|---|---|
Brazil Top 50 | Happy/Joyful | 41 |
Japan Top 50 | Happy/Joyful | 26 |
Japan Top 50 | Turbulent/Angry | 20 |
Turkey Top 50 | Happy/Joyful | 19 |
United States Top 50 | Happy/Joyful | 19 |
Turkey Top 50 | Turbulent/Angry | 18 |
United States Top 50 | Turbulent/Angry | 18 |
Turkey Top 50 | Sad/Depressing | 9 |
United States Top 50 | Sad/Depressing | 7 |
United States Top 50 | Chill/Peaceful | 6 |
Brazil Top 50 | Turbulent/Angry | 5 |
Brazil Top 50 | Sad/Depressing | 3 |
Turkey Top 50 | Chill/Peaceful | 3 |
Japan Top 50 | Chill/Peaceful | 2 |
Japan Top 50 | Sad/Depressing | 2 |
Brazil Top 50 | Chill/Peaceful | 1 |
ggplot(sentiment_by_countries,aes(x = valence, y = energy, color = track_sentiment)) + geom_point() +
labs(color = "", title = "Sentiment Analysis by Each Country") +
theme(title = element_text(size = 16, face = "bold"), plot.title = element_text(hjust = 0.5),
axis.title.x = element_text(size = 14, face = "bold"),
axis.title.y = element_text(size = 14, face = "bold"), legend.title = element_blank()) +
scale_x_continuous(expand = c(0, 0), limits = c(0, 1)) +
scale_y_continuous(expand = c(0, 0), limits = c(0, 1)) +
geom_label(aes(x = 0.12, y = 0.98, label = "Turbulent/Angry"), label.padding = unit(1, "mm"), fill = "grey", color="white") +
geom_label(aes(x = 0.90, y = 0.98, label = "Happy/Joyful"), label.padding = unit(1, "mm"), fill = "grey", color="white") +
geom_label(aes(x = 0.12, y = 0.025, label = "Sad/Depressing"), label.padding = unit(1, "mm"), fill = "grey", color="white") +
geom_label(aes(x = 0.895, y = 0.025, label = "Chill/Peaceful"), label.padding = unit(1, "mm"), fill = "grey", color="white") +
geom_segment(aes(x = 1, y = 0, xend = 1, yend = 1)) +
geom_segment(aes(x = 0, y = 0, xend = 0, yend = 1)) +
geom_segment(aes(x = 0, y = 0, xend = 1, yend = 0)) +
geom_segment(aes(x = 0, y = 0.5, xend = 1, yend = 0.5)) +
geom_segment(aes(x = 0.5, y = 0, xend = 0.5, yend = 1)) +
geom_segment(aes(x = 0, y = 1, xend = 1, yend = 1)) +
facet_wrap(~ playlist_name)
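Because the Turkey list has 49 songs here (its first row was dropped when it was downloaded) while the others have 50, shares are slightly easier to compare than raw counts. The sketch below re-enters the counts from the sentiment table above by hand and converts them to within-playlist proportions; `sentiment_counts` and `share` are illustrative names.

```r
# Counts copied from the sentiment table above (one block of four per playlist).
sentiment_counts <- data.frame(
  playlist_name = rep(c("Brazil Top 50", "Japan Top 50",
                        "Turkey Top 50", "United States Top 50"), each = 4),
  track_sentiment = rep(c("Happy/Joyful", "Turbulent/Angry",
                          "Sad/Depressing", "Chill/Peaceful"), times = 4),
  n = c(41,  5, 3, 1,
        26, 20, 2, 2,
        19, 18, 9, 3,
        19, 18, 7, 6),
  stringsAsFactors = FALSE
)

# Total number of songs per playlist
playlist_totals <- tapply(sentiment_counts$n, sentiment_counts$playlist_name, sum)

# Share of each sentiment class within its playlist
sentiment_counts$share <- sentiment_counts$n /
  ave(sentiment_counts$n, sentiment_counts$playlist_name, FUN = sum)

print(sentiment_counts)
```

On these counts, Happy/Joyful accounts for 82% of the Brazilian list but only about 38-39% of the Turkish and US lists.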
We obtained the daily Turkey Top 200 playlist data between January 2017 and November 2019 from Spotify Charts. Because the data consists of 211,400 rows, we uploaded the data frame to our group's GitHub page in RDS format and read it from there.
topturkey200<-readRDS(url("https://github.com/pjournal/mef03g-spo-R-ify/blob/master/turkeytop200.rds?raw=true"))
glimpse(topturkey200)
## Observations: 211,400
## Variables: 6
## $ Position <chr> "1", "2", "3", "4", "5", "6", "7", "8", "9", "10", ...
## $ Track.Name <chr> "Gece Gölgenin Rahatina Bak", "Starboy", "Affet", "...
## $ Artist <chr> "Çagatay Akman", "The Weeknd", "Müslüm Gürses", "Cl...
## $ Streams <chr> "80607", "44427", "34889", "28400", "25425", "23032...
## $ URL <chr> "https://open.spotify.com/track/3P31rcl0ym5paqRdwSi...
## $ Date <date> 2017-01-01, 2017-01-01, 2017-01-01, 2017-01-01, 20...
First, we looked at how total monthly streams changed. The graph below shows total streams growing steeply, at what appears to be an exponential rate. We can conclude that Spotify usage and stream volume increased rapidly in Turkey, at least among the tracks that reached the Top 200.
#First cal
#topturkey200 %>% group_by(Artist)%>% summarise(Total_number=n()) %>% arrange(desc(Total_number))
change<-topturkey200 %>% mutate(Year_Month = format(Date,"%Y/%m")) %>% group_by(Year_Month) %>% summarise(Total_Stream=sum(as.numeric(Streams)))
ggplot(change, aes(x = Year_Month,y=Total_Stream,group=1)) + geom_point() + geom_smooth() + theme(axis.text.x = element_text(angle = 90),title = element_text(size = 16, face = "bold"), plot.title = element_text(hjust = 0.5)) + labs(x = "Month", y = "Total Streams",title = "Total Stream Change") + scale_y_continuous(labels = comma)
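The monthly aggregation above relies on two details: `Streams` is stored as text and must be converted with `as.numeric()`, and `format(Date, "%Y/%m")` produces zero-padded labels that sort chronologically. The sketch below illustrates both on a four-row toy data frame in base R; the first three rows reuse values shown in the `glimpse()` output, and the February row is hypothetical.

```r
# Toy chart data: Streams is character, as in topturkey200.
toy_chart <- data.frame(
  Date = as.Date(c("2017-01-01", "2017-01-01", "2017-01-01", "2017-02-01")),
  Streams = c("80607", "44427", "34889", "50000"),  # last value is made up
  stringsAsFactors = FALSE
)

# Month label such as "2017/01"; zero padding keeps text order chronological
toy_chart$Year_Month <- format(toy_chart$Date, "%Y/%m")

# Sum the numeric stream counts within each month
monthly_total <- tapply(as.numeric(toy_chart$Streams), toy_chart$Year_Month, sum)
print(monthly_total)
```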
The most streamed song between 2017 and November 2019 is Geceler by Ezhel, with 44,258,676 streams counted while it was in the daily Top 200. The 20 most streamed songs are shown in the table below.
rank<-topturkey200 %>% group_by(Artist,Track.Name) %>% summarise(Total_Stream=sum(as.numeric(Streams))) %>% arrange(desc(Total_Stream))
kable(head(rank,n=20L)) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
Artist | Track.Name | Total_Stream |
---|---|---|
Ezhel | Geceler | 44258676 |
Ezhel | Felaket | 43699884 |
Ufuk Beydemir | Ay Tenli Kadın | 35480606 |
Norm Ender | Mekanın Sahibi | 34730473 |
Ben Fero | Biladerim İçin | 33648205 |
Ezhel | İmkansızım | 32607042 |
Ben Fero | 3 2 1 | 32086343 |
Ben Fero | Demet Akalın | 28807007 |
Anıl Piyancı | KAFA10 | 28704551 |
Reynmen | Ela | 28246604 |
Ezhel | Kazıdık Tırnaklarla | 27783484 |
Yüzyüzeyken Konuşuruz | Ne Farkeder | 27540693 |
Ceza | Neyim Var Ki (feat. Sagopa K) | 27067016 |
Murda | AYA | 26111866 |
Yüzyüzeyken Konuşuruz | Dinle Beni Bi’ | 25431753 |
Feride Hilal Akın | Yok Yok | 24360307 |
MERO | Olabilir | 23990740 |
Ed Sheeran | Shape of You | 23958838 |
Anıl Piyancı | Bırakman Doğru Mu | 23643712 |
Reynmen | Derdim Olsun | 23545514 |
For the sentiment analysis, we need the unique songs in the data frame, so that we can determine the sentiment of every track that entered the Top 200 at some point. We first extract each track's Spotify ID from its URL into a new column for further analysis, and then drop the duplicate IDs.
top_200<-topturkey200 %>% mutate(id=substring(topturkey200$URL,32))
top_200_unique<-top_200[!duplicated(top_200[,c('id')]),]
glimpse(top_200_unique)
## Observations: 2,874
## Variables: 7
## $ Position <chr> "1", "2", "3", "4", "5", "6", "7", "8", "9", "10", ...
## $ Track.Name <chr> "Gece Gölgenin Rahatina Bak", "Starboy", "Affet", "...
## $ Artist <chr> "Çagatay Akman", "The Weeknd", "Müslüm Gürses", "Cl...
## $ Streams <chr> "80607", "44427", "34889", "28400", "25425", "23032...
## $ URL <chr> "https://open.spotify.com/track/3P31rcl0ym5paqRdwSi...
## $ Date <date> 2017-01-01, 2017-01-01, 2017-01-01, 2017-01-01, 20...
## $ id <chr> "3P31rcl0ym5paqRdwSiZps", "5aAx2yezTd8zXrkmtKl66Z",...
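The ID extraction above works because every track URL starts with the same 31-character prefix, so the ID begins at character 32. The sketch below checks this on URLs rebuilt from two IDs shown in the output above (one repeated on purpose) and compares the positional approach with a pattern-based one; the variable names are illustrative.

```r
track_prefix <- "https://open.spotify.com/track/"

# Two IDs from the glimpse() output, with the first repeated to mimic a song
# that appears in the chart on more than one day.
ids_from_doc <- c("3P31rcl0ym5paqRdwSiZps", "5aAx2yezTd8zXrkmtKl66Z",
                  "3P31rcl0ym5paqRdwSiZps")
urls <- paste0(track_prefix, ids_from_doc)

# Positional extraction, as in the report: skip the fixed-length prefix
id_by_position <- substring(urls, nchar(track_prefix) + 1)

# Pattern-based extraction: does not depend on counting characters
id_by_pattern <- sub("^https://open\\.spotify\\.com/track/", "", urls)

# Keep the first occurrence of each ID
unique_ids <- id_by_position[!duplicated(id_by_position)]
print(unique_ids)
```

The pattern-based variant gives the same result here and would fail more visibly if the URL format ever changed, since unmatched URLs are returned unchanged.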
With the track IDs, we obtained the audio features of these songs using the Spotify Web API and the “spotifyr” R package. Because fetching them takes a long time, we uploaded the result to our GitHub repository as an RDS file and read it from there.
Id_list=top_200$id
#The code used to obtain the track features is below. Because it takes a long time to run, the data frame is read from our GitHub repository instead.
#a<-unique(Id_list)
#tracks_features=get_track_audio_features(a[1])
#for (x in 2:length(a)){
# tracks_features <- rbind(tracks_features,get_track_audio_features(a[x]))
#}
#tracks_features<-tracks_features%>%slice(-1)
tracks_features<-readRDS(url("https://github.com/pjournal/mef03g-spo-R-ify/blob/master/top200_tracks_features.rds?raw=true"))
glimpse(tracks_features)
## Observations: 2,874
## Variables: 18
## $ danceability <dbl> 0.769, 0.681, 0.424, 0.720, 0.476, 0.748, 0.4...
## $ energy <dbl> 0.837, 0.594, 0.666, 0.763, 0.718, 0.524, 0.3...
## $ key <int> 6, 7, 7, 9, 8, 8, 4, 4, 1, 5, 6, 6, 0, 8, 7, ...
## $ loudness <dbl> -4.057, -7.028, -6.683, -4.068, -5.309, -5.59...
## $ mode <int> 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, ...
## $ speechiness <dbl> 0.1400, 0.2820, 0.0473, 0.0523, 0.0576, 0.033...
## $ acousticness <dbl> 0.65200, 0.16500, 0.39600, 0.40600, 0.07840, ...
## $ instrumentalness <dbl> 0.00e+00, 3.49e-06, 1.69e-04, 0.00e+00, 1.02e...
## $ liveness <dbl> 0.0986, 0.1340, 0.1200, 0.1800, 0.1220, 0.111...
## $ valence <dbl> 0.8190, 0.5350, 0.2750, 0.7420, 0.1420, 0.661...
## $ tempo <dbl> 100.962, 186.054, 160.079, 101.965, 199.864, ...
## $ type <chr> "audio_features", "audio_features", "audio_fe...
## $ id <chr> "3P31rcl0ym5paqRdwSiZps", "5aAx2yezTd8zXrkmtK...
## $ uri <chr> "spotify:track:3P31rcl0ym5paqRdwSiZps", "spot...
## $ track_href <chr> "https://api.spotify.com/v1/tracks/3P31rcl0ym...
## $ analysis_url <chr> "https://api.spotify.com/v1/audio-analysis/3P...
## $ duration_ms <int> 163960, 230453, 279475, 251088, 205947, 24496...
## $ time_signature <int> 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, ...
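The commented-out loop requests one track per call and grows the data frame with `rbind()` on every iteration. A minimal sketch of a batched alternative follows. It assumes that the fetch function accepts a vector of IDs per call (the `spotifyr` documentation describes `get_track_audio_features()` as taking up to 100 IDs, but this should be checked against the installed version). `fetch_in_batches()` and `fake_fetch()` are hypothetical helpers, and `fake_fetch()` stands in for the API so that the sketch runs without credentials.

```r
# Hypothetical helper: split the IDs into batches, fetch each batch, bind once.
fetch_in_batches <- function(ids, fetch_fun, batch_size = 100) {
  ids <- unique(ids)
  batches <- split(ids, ceiling(seq_along(ids) / batch_size))
  results <- lapply(batches, fetch_fun)
  out <- do.call(rbind, results)
  rownames(out) <- NULL
  out
}

# Stand-in for the API call (assumption): returns one row per requested ID.
fake_fetch <- function(ids) {
  data.frame(id = ids, valence = 0.5, energy = 0.5, stringsAsFactors = FALSE)
}

# 250 made-up IDs, ten of them repeated to show the de-duplication.
ids <- sprintf("track%03d", 1:250)
features <- fetch_in_batches(c(ids, ids[1:10]), fake_fetch)
print(dim(features))
```

With real credentials, `fetch_fun` would be the `spotifyr` call, and the number of requests drops from one per track to one per batch.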
After preparing the two data frames, we joined them into one. Then we added a column to the new data frame using our sentiment function.
top_200_audio_features <- inner_join(top_200_unique,tracks_features,by="id")
Sentiment=c()
for (i in 1:nrow(top_200_audio_features)){
Sentiment[i]=classify_track_sentiment(valence=top_200_audio_features$valence[i],energy=top_200_audio_features$energy[i])
}
top_200_audio_features<-cbind(top_200_audio_features,Sentiment)
glimpse(top_200_audio_features)
## Observations: 2,874
## Variables: 25
## $ Position <chr> "1", "2", "3", "4", "5", "6", "7", "8", "9", ...
## $ Track.Name <chr> "Gece Gölgenin Rahatina Bak", "Starboy", "Aff...
## $ Artist <chr> "Çagatay Akman", "The Weeknd", "Müslüm Gürses...
## $ Streams <chr> "80607", "44427", "34889", "28400", "25425", ...
## $ URL <chr> "https://open.spotify.com/track/3P31rcl0ym5pa...
## $ Date <date> 2017-01-01, 2017-01-01, 2017-01-01, 2017-01-...
## $ id <chr> "3P31rcl0ym5paqRdwSiZps", "5aAx2yezTd8zXrkmtK...
## $ danceability <dbl> 0.769, 0.681, 0.424, 0.720, 0.476, 0.748, 0.4...
## $ energy <dbl> 0.837, 0.594, 0.666, 0.763, 0.718, 0.524, 0.3...
## $ key <int> 6, 7, 7, 9, 8, 8, 4, 4, 1, 5, 6, 6, 0, 8, 7, ...
## $ loudness <dbl> -4.057, -7.028, -6.683, -4.068, -5.309, -5.59...
## $ mode <int> 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, ...
## $ speechiness <dbl> 0.1400, 0.2820, 0.0473, 0.0523, 0.0576, 0.033...
## $ acousticness <dbl> 0.65200, 0.16500, 0.39600, 0.40600, 0.07840, ...
## $ instrumentalness <dbl> 0.00e+00, 3.49e-06, 1.69e-04, 0.00e+00, 1.02e...
## $ liveness <dbl> 0.0986, 0.1340, 0.1200, 0.1800, 0.1220, 0.111...
## $ valence <dbl> 0.8190, 0.5350, 0.2750, 0.7420, 0.1420, 0.661...
## $ tempo <dbl> 100.962, 186.054, 160.079, 101.965, 199.864, ...
## $ type <chr> "audio_features", "audio_features", "audio_fe...
## $ uri <chr> "spotify:track:3P31rcl0ym5paqRdwSiZps", "spot...
## $ track_href <chr> "https://api.spotify.com/v1/tracks/3P31rcl0ym...
## $ analysis_url <chr> "https://api.spotify.com/v1/audio-analysis/3P...
## $ duration_ms <int> 163960, 230453, 279475, 251088, 205947, 24496...
## $ time_signature <int> 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, ...
## $ Sentiment <fct> Happy/Joyful, Happy/Joyful, Turbulent/Angry, ...
kable(head(top_200_audio_features,n=10L)) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive")) %>% scroll_box(width = "100%", height = "400px")
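The row-by-row loop can also be written as a single vectorized call. The sketch below is not the `classify_track_sentiment()` function used above. `classify_quadrant()` is a hypothetical stand-in that assumes the four sentiments are the quadrants of the valence-energy plane split at 0.5, and that values exactly equal to 0.5 fall in the upper or right-hand quadrant.

```r
# Hypothetical quadrant classifier (assumption: both cut points are 0.5).
classify_quadrant <- function(valence, energy, cut = 0.5) {
  label <- ifelse(energy >= cut,
                  ifelse(valence >= cut, "Happy/Joyful", "Turbulent/Angry"),
                  ifelse(valence >= cut, "Chill/Peaceful", "Sad/Depressing"))
  factor(label, levels = c("Chill/Peaceful", "Happy/Joyful",
                           "Sad/Depressing", "Turbulent/Angry"))
}

# First three tracks from the output above, plus one made-up low-energy track.
tracks <- data.frame(valence = c(0.819, 0.535, 0.275, 0.200),
                     energy  = c(0.837, 0.594, 0.666, 0.300))
tracks$Sentiment <- classify_quadrant(tracks$valence, tracks$energy)
print(tracks)
```

Under these assumptions the first three rows reproduce the labels shown in the `glimpse()` output (Happy/Joyful, Happy/Joyful, Turbulent/Angry).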
For a proper sentiment analysis, we calculate the percentage share of each sentiment in every month. With these percentages, we can compare listeners' emotional preferences across months.
df1a<-top_200 %>% mutate(Year_Month = format(Date,"%Y/%m"))
df1b<-df1a %>% left_join(select(top_200_audio_features, "Sentiment","id"), by = "id")
monthly_sentiment<-df1b %>% group_by(Year_Month,Sentiment) %>% summarise(Count_Sentiment = n()) %>% ungroup() %>% group_by(Year_Month) %>% mutate (Month_Sum=sum(Count_Sentiment)) %>% ungroup() %>% mutate(Percent_in_Month = percent(Count_Sentiment/Month_Sum))
glimpse(monthly_sentiment)
## Observations: 140
## Variables: 5
## $ Year_Month <chr> "2017/01", "2017/01", "2017/01", "2017/01", "...
## $ Sentiment <fct> Chill/Peaceful, Happy/Joyful, Sad/Depressing,...
## $ Count_Sentiment <int> 186, 2710, 848, 2456, 151, 2452, 690, 2307, 1...
## $ Month_Sum <int> 6200, 6200, 6200, 6200, 5600, 5600, 5600, 560...
## $ Percent_in_Month <chr> "3.0%", "43.7%", "13.7%", "39.6%", "2.7%", "4...
kable(head(monthly_sentiment,n=10L)) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
Year_Month | Sentiment | Count_Sentiment | Month_Sum | Percent_in_Month |
---|---|---|---|---|
2017/01 | Chill/Peaceful | 186 | 6200 | 3.0% |
2017/01 | Happy/Joyful | 2710 | 6200 | 43.7% |
2017/01 | Sad/Depressing | 848 | 6200 | 13.7% |
2017/01 | Turbulent/Angry | 2456 | 6200 | 39.6% |
2017/02 | Chill/Peaceful | 151 | 5600 | 2.7% |
2017/02 | Happy/Joyful | 2452 | 5600 | 43.8% |
2017/02 | Sad/Depressing | 690 | 5600 | 12.3% |
2017/02 | Turbulent/Angry | 2307 | 5600 | 41.2% |
2017/03 | Chill/Peaceful | 183 | 6200 | 3.0% |
2017/03 | Happy/Joyful | 2670 | 6200 | 43.1% |
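To make the percentage step concrete, the following base R sketch recomputes the January and February 2017 shares from the counts in the table above. It keeps the share as a number (`Perc_Num`) and only formats it as text for display. The `counts` data frame is typed in by hand from the table, and `ave()` is used here in place of the grouped `mutate()`.

```r
counts <- data.frame(
  Year_Month = rep(c("2017/01", "2017/02"), each = 4),
  Sentiment = rep(c("Chill/Peaceful", "Happy/Joyful",
                    "Sad/Depressing", "Turbulent/Angry"), times = 2),
  Count_Sentiment = c(186, 2710, 848, 2456, 151, 2452, 690, 2307),
  stringsAsFactors = FALSE
)

# Total number of chart entries in each month, repeated on every row.
counts$Month_Sum <- ave(counts$Count_Sentiment, counts$Year_Month, FUN = sum)

# Numeric share for plotting, text label for tables.
counts$Perc_Num <- 100 * counts$Count_Sentiment / counts$Month_Sum
counts$Percent_in_Month <- sprintf("%.1f%%", counts$Perc_Num)
print(counts)
```

Keeping the numeric column from the start avoids parsing the percentage text back into numbers before plotting.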
monthly_sentiment<-monthly_sentiment%>% mutate(Perc_Num=as.numeric(sub("%", "", Percent_in_Month, fixed = TRUE)))
ggplot(monthly_sentiment,aes(Year_Month,Perc_Num,group=Sentiment,color=Sentiment)) + geom_point() + geom_line(aes( color=Sentiment)) + labs(title = "Monthly Sentiment Change", x = "Months", y = "Percentage") +
theme(title = element_text(size = 16, face = "bold"), plot.title = element_text(hjust = 0.5),axis.text.x = element_text(angle = 90),
axis.title.x = element_text(size = 14, face = "bold"),
axis.title.y = element_text(size = 14, face = "bold"), legend.title = element_blank())
Spotify provides playlists of Turkish music for the 1980s, 1990s, 2000s and 2010s. Each of these playlists contains only 50 songs. We therefore use the sentiment percentages again to compare the decades with the data from the last three years.
df80<-get_playlist_audio_features("spotifycharts","37i9dQZF1DX4io1yPyoLtv") %>% mutate(Year="1980")
df90<-get_playlist_audio_features("spotifycharts","37i9dQZF1DXb7MJRXLczzR") %>% mutate(Year="1990")
df00<-get_playlist_audio_features("spotifycharts","37i9dQZF1DWYteTcNVQZNq") %>% mutate(Year="2000")
df10<-get_playlist_audio_features("spotifycharts","37i9dQZF1DXaE9T4Nls8eC") %>% mutate(Year="2010")
past_track_data <- rbind(df80, df90, df00, df10)
glimpse(past_track_data)
## Observations: 200
## Variables: 62
## $ playlist_id <chr> "37i9dQZF1DX4io1yPyoLtv", "...
## $ playlist_name <chr> "Türkçe 80'ler", "Türkçe 80...
## $ playlist_img <chr> "https://pl.scdn.co/images/...
## $ playlist_owner_name <chr> "Spotify", "Spotify", "Spot...
## $ playlist_owner_id <chr> "spotify", "spotify", "spot...
## $ danceability <dbl> 0.395, 0.742, 0.469, 0.534,...
## $ energy <dbl> 0.675, 0.186, 0.486, 0.739,...
## $ key <int> 1, 8, 9, 4, 2, 4, 9, 7, 7, ...
## $ loudness <dbl> -5.881, -16.820, -15.076, -...
## $ mode <int> 0, 1, 0, 1, 0, 0, 0, 0, 1, ...
## $ speechiness <dbl> 0.0472, 0.0449, 0.0427, 0.1...
## $ acousticness <dbl> 0.6490, 0.8220, 0.1490, 0.6...
## $ instrumentalness <dbl> 0.00e+00, 0.00e+00, 1.21e-0...
## $ liveness <dbl> 0.0963, 0.1620, 0.1230, 0.2...
## $ valence <dbl> 0.496, 0.570, 0.595, 0.505,...
## $ tempo <dbl> 177.053, 114.982, 130.737, ...
## $ track.id <chr> "2rP7pI2WpMWcUraYAX2xiT", "...
## $ analysis_url <chr> "https://api.spotify.com/v1...
## $ time_signature <int> 4, 4, 4, 4, 3, 4, 5, 4, 4, ...
## $ added_at <chr> "2019-08-07T12:49:23Z", "20...
## $ is_local <lgl> FALSE, FALSE, FALSE, FALSE,...
## $ primary_color <lgl> NA, NA, NA, NA, NA, NA, NA,...
## $ added_by.href <chr> "https://api.spotify.com/v1...
## $ added_by.id <chr> "", "", "", "", "", "", "",...
## $ added_by.type <chr> "user", "user", "user", "us...
## $ added_by.uri <chr> "spotify:user:", "spotify:u...
## $ added_by.external_urls.spotify <chr> "https://open.spotify.com/u...
## $ track.artists <list> [<data.frame[1 x 6]>, <dat...
## $ track.available_markets <list> [<"AD", "AE", "AR", "AT", ...
## $ track.disc_number <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
## $ track.duration_ms <int> 259360, 257044, 239266, 229...
## $ track.episode <lgl> FALSE, FALSE, FALSE, FALSE,...
## $ track.explicit <lgl> FALSE, FALSE, FALSE, FALSE,...
## $ track.href <chr> "https://api.spotify.com/v1...
## $ track.is_local <lgl> FALSE, FALSE, FALSE, FALSE,...
## $ track.name <chr> "Tükenecegiz", "Bu Kalp Sen...
## $ track.popularity <int> 60, 48, 55, 50, 35, 56, 49,...
## $ track.preview_url <chr> "https://p.scdn.co/mp3-prev...
## $ track.track <lgl> TRUE, TRUE, TRUE, TRUE, TRU...
## $ track.track_number <int> 8, 1, 1, 6, 2, 2, 7, 8, 6, ...
## $ track.type <chr> "track", "track", "track", ...
## $ track.uri <chr> "spotify:track:2rP7pI2WpMWc...
## $ track.album.album_type <chr> "album", "album", "album", ...
## $ track.album.artists <list> [<data.frame[1 x 6]>, <dat...
## $ track.album.available_markets <list> [<"AD", "AE", "AR", "AT", ...
## $ track.album.href <chr> "https://api.spotify.com/v1...
## $ track.album.id <chr> "13JKU1RyLFS73hDvWnTHr1", "...
## $ track.album.images <list> [<data.frame[3 x 3]>, <dat...
## $ track.album.name <chr> "Sen Aglama", "Bu Kalp Seni...
## $ track.album.release_date <chr> "1984-09-06", "2002-02-20",...
## $ track.album.release_date_precision <chr> "day", "day", "day", "year"...
## $ track.album.total_tracks <int> 11, 9, 10, 10, 10, 13, 11, ...
## $ track.album.type <chr> "album", "album", "album", ...
## $ track.album.uri <chr> "spotify:album:13JKU1RyLFS7...
## $ track.album.external_urls.spotify <chr> "https://open.spotify.com/a...
## $ track.external_ids.isrc <chr> "TR0061200300", "TR00806002...
## $ track.external_urls.spotify <chr> "https://open.spotify.com/t...
## $ video_thumbnail.url <lgl> NA, NA, NA, NA, NA, NA, NA,...
## $ key_name <chr> "C#", "G#", "A", "E", "D", ...
## $ mode_name <chr> "minor", "major", "minor", ...
## $ key_mode <chr> "C# minor", "G# major", "A ...
## $ Year <chr> "1980", "1980", "1980", "19...
Sentiment=c()
for (i in 1:nrow(past_track_data)){
Sentiment[i]=classify_track_sentiment(valence=past_track_data$valence[i],energy=past_track_data$energy[i])
}
past_track_data<-cbind(past_track_data,Sentiment)
past_track_data_sentiment<-past_track_data %>% group_by(Year,Sentiment) %>% summarise(Count= n())
df1c<-top_200 %>% mutate(Year = format(Date,"%Y"))
sent_count_yearly<-df1c %>% left_join(select(top_200_audio_features, "Sentiment","id"), by = "id") %>% group_by(Year,Sentiment) %>% summarise(Count= n())
yearly_change<- rbind(past_track_data_sentiment,sent_count_yearly) %>% group_by(Year) %>% mutate (Year_Sum=sum(Count)) %>% ungroup() %>% mutate(Percent_in_Year = percent(Count/Year_Sum))
kable(yearly_change) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
Year | Sentiment | Count | Year_Sum | Percent_in_Year |
---|---|---|---|---|
1980 | Chill/Peaceful | 7 | 50 | 14.0% |
1980 | Happy/Joyful | 21 | 50 | 42.0% |
1980 | Sad/Depressing | 15 | 50 | 30.0% |
1980 | Turbulent/Angry | 7 | 50 | 14.0% |
1990 | Chill/Peaceful | 7 | 50 | 14.0% |
1990 | Happy/Joyful | 29 | 50 | 58.0% |
1990 | Sad/Depressing | 9 | 50 | 18.0% |
1990 | Turbulent/Angry | 5 | 50 | 10.0% |
2000 | Chill/Peaceful | 2 | 50 | 4.0% |
2000 | Happy/Joyful | 30 | 50 | 60.0% |
2000 | Sad/Depressing | 4 | 50 | 8.0% |
2000 | Turbulent/Angry | 14 | 50 | 28.0% |
2010 | Chill/Peaceful | 3 | 50 | 6.0% |
2010 | Happy/Joyful | 28 | 50 | 56.0% |
2010 | Sad/Depressing | 3 | 50 | 6.0% |
2010 | Turbulent/Angry | 16 | 50 | 32.0% |
2017 | Chill/Peaceful | 2764 | 72400 | 3.8% |
2017 | Happy/Joyful | 34603 | 72400 | 47.8% |
2017 | Sad/Depressing | 8352 | 72400 | 11.5% |
2017 | Turbulent/Angry | 26681 | 72400 | 36.9% |
2018 | Chill/Peaceful | 2899 | 73000 | 4.0% |
2018 | Happy/Joyful | 29468 | 73000 | 40.4% |
2018 | Sad/Depressing | 10070 | 73000 | 13.8% |
2018 | Turbulent/Angry | 30563 | 73000 | 41.9% |
2019 | Chill/Peaceful | 3668 | 66000 | 5.6% |
2019 | Happy/Joyful | 24829 | 66000 | 37.6% |
2019 | Sad/Depressing | 9439 | 66000 | 14.3% |
2019 | Turbulent/Angry | 28064 | 66000 | 42.5% |
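Because the decade playlists contain 50 tracks each while the yearly rows count every daily chart entry, only the within-year shares are comparable, not the raw counts. A small sketch of this normalization follows, using the 1980 and 2019 rows of the table above and base R's `xtabs()` and `prop.table()` rather than the `dplyr` pipeline. The `yearly` data frame is typed in by hand from the table.

```r
yearly <- data.frame(
  Year = rep(c("1980", "2019"), each = 4),
  Sentiment = rep(c("Chill/Peaceful", "Happy/Joyful",
                    "Sad/Depressing", "Turbulent/Angry"), times = 2),
  Count = c(7, 21, 15, 7, 3668, 24829, 9439, 28064)
)

# Year-by-sentiment contingency table of counts.
tab <- xtabs(Count ~ Year + Sentiment, data = yearly)

# Row-wise shares in percent: each year sums to roughly 100 after rounding.
shares <- round(100 * prop.table(tab, margin = 1), 1)
print(shares)
```

The wide layout puts the two periods side by side, which makes the shift from Sad/Depressing towards Turbulent/Angry easier to read than the long table.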
yearly_change<-yearly_change%>% mutate(Perc_Num=as.numeric(sub("%", "", Percent_in_Year, fixed = TRUE)))
ggplot(yearly_change,aes(Year,Perc_Num,group=Sentiment,color=Sentiment)) + geom_point() + geom_line(aes( color=Sentiment)) + labs(title = "Yearly Sentiment Change", x = "Years", y = "Percentage") +
theme(title = element_text(size = 16, face = "bold"), plot.title = element_text(hjust = 0.5),axis.text.x = element_text(angle = 90),
axis.title.x = element_text(size = 14, face = "bold"),
axis.title.y = element_text(size = 14, face = "bold"), legend.title = element_blank())
The chart below shows the sentiment count of all unique songs between 2017 and 2019.
sent_count <- top_200_audio_features %>% group_by(Sentiment) %>% count()
ggplot(sent_count, aes(x=Sentiment, y=n, fill=Sentiment)) +
geom_bar(stat="identity") +
labs(title = "Sentiment Count", x = "Sentiment Distribution", y = "Count of Sentiments") +
theme(title = element_text(size = 16, face = "bold"), plot.title = element_text(hjust = 0.5),
axis.title.x = element_text(size = 14, face = "bold"),
axis.title.y = element_text(size = 14, face = "bold"), legend.title = element_blank())
Finally, we mapped all Top 200 tracks by their sentiment and displayed them on a chart of energy against valence.
ggplot(top_200_audio_features,aes(x = valence, y = energy, color = Sentiment)) + geom_point() +
labs(color = "", title = "Sentiment Analysis of Turkey Top 200 Chart Between 2017 and 2019") +
theme(title = element_text(size = 16, face = "bold"), plot.title = element_text(hjust = 0.5),
axis.title.x = element_text(size = 14, face = "bold"),
axis.title.y = element_text(size = 14, face = "bold"), legend.title = element_blank()) +
scale_x_continuous(expand = c(0, 0), limits = c(0, 1)) +
scale_y_continuous(expand = c(0, 0), limits = c(0, 1)) +
geom_label(aes(x = 0.25, y = 0.97, label = "Turbulent/Angry"), label.padding = unit(2, "mm"), fill = "darkgrey", color="white") +
geom_label(aes(x = 0.75, y = 0.97, label = "Happy/Joyful"), label.padding = unit(2, "mm"), fill = "darkgrey", color="white") +
geom_label(aes(x = 0.25, y = 0.03, label = "Sad/Depressing"), label.padding = unit(2, "mm"), fill = "darkgrey", color="white") +
geom_label(aes(x = 0.75, y = 0.03, label = "Chill/Peaceful"), label.padding = unit(2, "mm"), fill = "darkgrey", color="white") +
geom_segment(aes(x = 1, y = 0, xend = 1, yend = 1)) +
geom_segment(aes(x = 0, y = 0, xend = 0, yend = 1)) +
geom_segment(aes(x = 0, y = 0, xend = 1, yend = 0)) +
geom_segment(aes(x = 0, y = 0.5, xend = 1, yend = 0.5)) +
geom_segment(aes(x = 0.5, y = 0, xend = 0.5, yend = 1)) +
geom_segment(aes(x = 0, y = 1, xend = 1, yend = 1))
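In the chart code above, the labels and guide lines are added with `geom_label(aes(...))` and `geom_segment(aes(...))`, which inherit the plot data and are therefore drawn once per row. A lighter variant, sketched below on a made-up four-point data frame, uses `annotate()`, `geom_hline()` and `geom_vline()` so that each label and line is drawn only once. The `demo` data and the `quadrant_labels` table are illustrative assumptions, and the border segments are left out for brevity.

```r
library(ggplot2)

# Made-up points: three from the output above plus one low-energy example.
demo <- data.frame(
  valence = c(0.819, 0.535, 0.275, 0.200),
  energy = c(0.837, 0.594, 0.666, 0.300),
  Sentiment = c("Happy/Joyful", "Happy/Joyful", "Turbulent/Angry", "Sad/Depressing")
)

# One row per quadrant label, same positions as in the chart above.
quadrant_labels <- data.frame(
  x = c(0.25, 0.75, 0.25, 0.75),
  y = c(0.97, 0.97, 0.03, 0.03),
  label = c("Turbulent/Angry", "Happy/Joyful", "Sad/Depressing", "Chill/Peaceful")
)

p <- ggplot(demo, aes(x = valence, y = energy, color = Sentiment)) +
  geom_point() +
  geom_hline(yintercept = 0.5) +
  geom_vline(xintercept = 0.5) +
  annotate("label", x = quadrant_labels$x, y = quadrant_labels$y,
           label = quadrant_labels$label, fill = "darkgrey", colour = "white") +
  scale_x_continuous(expand = c(0, 0), limits = c(0, 1)) +
  scale_y_continuous(expand = c(0, 0), limits = c(0, 1)) +
  theme(legend.title = element_blank())
```

Printing `p` renders the chart. The design choice is only about drawing cost and crisp label edges; the quadrants themselves are unchanged.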
Click on the link to use our app, which analyzes and compares the audio features of music charts created by Spotify or belonging to two different Spotify users as a radar chart.
Click on the link to use our app, which makes predictions on personality type based on the user playlist, using the audio features and key characteristic of the Spotify user’s playlist.