Let’s import the necessary libraries.
library(tidyverse)
library(readr)
library(ggplot2)
Let’s load our data from the main source.
df <- read_csv("https://github.com/ygterl/EDA-Netflix-2020-in-R/raw/master/netflix_titles.csv")
Let’s take a look at our data.
head(df)
As the graph below shows, the number of TV shows and movies released each year increases sharply after 2000.
# count() groups by release_year itself, so a separate group_by() is not needed
df %>% count(release_year) %>%
  ggplot(aes(x = release_year, y = n, fill = n)) +
  geom_bar(stat = 'identity') +
  ggtitle("Number of movies and TV shows released over time") +
  xlab("Release Years") + ylab("Number of movies and TV shows")
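To put a number on the increase rather than reading it off the chart, one option is to compare the counts on either side of 2000. The sketch below uses a small made-up vector of release years (toy_years is hypothetical, not the Netflix data); with the real data, the same lines would take df$release_year instead.

```r
# Hypothetical release years standing in for df$release_year
toy_years <- c(1995, 1998, 2001, 2005, 2010, 2015, 2015, 2018, 2019, 2020)

# Flag the titles released after 2000
after_2000 <- toy_years > 2000

# Count titles on each side of the cut-off
year_counts <- table(factor(after_2000,
                            levels = c(FALSE, TRUE),
                            labels = c("2000 or earlier", "after 2000")))

# Share of titles released after 2000 (mean of a logical vector)
share_after_2000 <- mean(after_2000)

print(year_counts)
print(share_after_2000)
```

The cut-off year is only taken from the sentence above; any other year can be substituted in the comparison.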
Let’s find out which directors have directed the most movies or TV shows. Some titles list several directors in the director column, so we first split that column with the strsplit function, using a comma followed by a space as the delimiter, and build a new data frame with one row per director and title type (movie or TV show). We then count the titles per director and type, keep the 10 largest counts, and plot them in a bar chart.
k <- strsplit(df$director, split = ", ")
directors <- data.frame(type = rep(df$type, sapply(k, length)), directors = unlist(k))
directors$directors <- as.character(directors$directors)
# .groups = "drop" returns an ungrouped result and avoids the summarise() grouping message
amount_by_directors <- na.omit(directors) %>%
  group_by(directors, type) %>%
  summarise(count = n(), .groups = "drop") %>%
  arrange(desc(count))
top_10_directors <- head(amount_by_directors, 10)
# unique() guards against duplicated levels when a director appears for both types
top_10_directors$directors <- factor(top_10_directors$directors, levels = unique(top_10_directors$directors))
top_10_directors %>%
  ggplot(aes(x = directors, y = count, fill = count)) +
  geom_bar(stat = 'identity') +
  ggtitle("Directors who directed the most movies and TV shows") +
  xlab("Directors") + ylab("Number of directed movies and TV shows") +
  theme(axis.text.x = element_text(angle = 60, hjust = 1, size = 12))
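The splitting step above can be easier to follow on a few rows. The sketch below repeats the same strsplit and rep idea on a made-up data frame (toy and its director names are hypothetical, not taken from the Netflix data) and counts the result with base R only.

```r
# Hypothetical toy data: one row per title, co-directors separated by ", "
toy <- data.frame(
  type     = c("Movie", "Movie", "TV Show", "Movie"),
  director = c("A. One, B. Two", "A. One", NA, "B. Two, C. Three"),
  stringsAsFactors = FALSE
)

# strsplit() returns one character vector per title (an NA stays a single NA)
parts <- strsplit(toy$director, split = ", ")

# Repeat each title's type once per director so both columns line up
long <- data.frame(
  type      = rep(toy$type, lengths(parts)),
  directors = unlist(parts),
  stringsAsFactors = FALSE
)

# Drop titles without a listed director
long <- na.omit(long)

# Count titles per director and type, largest counts first
director_counts <- as.data.frame(
  table(directors = long$directors, type = long$type),
  stringsAsFactors = FALSE
)
director_counts <- director_counts[director_counts$Freq > 0, ]
director_counts <- director_counts[order(-director_counts$Freq), ]

print(director_counts)
```

Because a title with two directors becomes two rows, it is counted once for each of them, which is the behaviour of the code above as well.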