First, the columns are renamed to clearer names. We then inspected the data to understand what it contains. We used the filter function to keep only the rows from 2013 onward, which gets rid of the N/A values.
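library(readxl)   # read_xlsx()
library(dplyr)    # rename(), filter(), glimpse(), %>%
library(ggplot2)  # ggplot(), geom_line()
# Read the first 131 data rows of the EVDS export
data_bonus <- read_xlsx("C:/Users/bsivas/Desktop/Bonus Assignment/EVDS_istanbul_property_data.xlsx", n_max = 131)
# Rename the EVDS series codes to readable column names (new = old)
data_frame_v4 <- rename(data_bonus,
  "Date" = "Tarih",
  "Total_Sales" = "TP AKONUTSAT1 T40",
  "mortgage_Sales" = "TP AKONUTSAT2 T40",
  "first_hand_sales" = "TP AKONUTSAT3 T40",
  "second_hand_sales" = "TP AKONUTSAT4 T40",
  "foreigner_Sales" = "TP DISKONSAT ISTANBUL",
  "new_price_rate" = "TP HEDONIKYKFE IST",
  "price_rate" = "TP HKFE02",
  "unit_price(TL/m^2)" = "TP TCBF02 ISTANBUL")
# Date is a "YYYY-MM" string, so this is a text comparison that keeps 2013 onward
data_frame_v5 <- data_frame_v4 %>% filter(Date >= "2013")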
glimpse(data_frame_v5)
## Rows: 93
## Columns: 9
## $ Date <chr> "2013-01", "2013-02", "2013-03", "2013-04", "2...
## $ Total_Sales <dbl> 18235, 18971, 21570, 20791, 22030, 19357, 2066...
## $ mortgage_Sales <dbl> 8423, 8836, 10164, 9726, 10805, 9762, 10071, 6...
## $ first_hand_sales <dbl> 8298, 8277, 9542, 8751, 9371, 8160, 9034, 6960...
## $ second_hand_sales <dbl> 9937, 10694, 12028, 12040, 12659, 11197, 11634...
## $ foreigner_Sales <dbl> 138, 120, 198, 209, 188, 155, 192, 170, 156, 1...
## $ new_price_rate <dbl> 49.5, 50.4, 51.1, 51.6, 52.2, 53.0, 53.8, 54.8...
## $ price_rate <dbl> 47.8, 48.8, 49.9, 50.9, 51.6, 52.4, 52.9, 53.5...
## $ `unit_price(TL/m^2)` <dbl> 2063.8, 2113.7, 2157.5, 2186.3, 2217.8, 2262.9...
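The filtering step can be sketched on a small toy table. This is only an illustration: `toy`, `by_year`, `by_na`, and `Month` are hypothetical names, and the 2012 rows with missing values are made up to mimic what the filter is meant to remove. It shows that comparing "YYYY-MM" strings against "2013" keeps 2013 onward, how missing values could be dropped explicitly instead, and how the text dates could be parsed into real dates.

```r
library(dplyr)

# Toy data standing in for the renamed sheet (the 2012 rows are invented)
toy <- tibble(
  Date = c("2012-11", "2012-12", "2013-01", "2013-02"),
  Total_Sales = c(NA, NA, 18235, 18971)
)

# Text comparison: "2013-01" sorts after "2013", "2012-12" sorts before it
by_year <- toy %>% filter(Date >= "2013")

# Explicit alternative: keep only rows where the value is present
by_na <- toy %>% filter(!is.na(Total_Sales))

# Parse "YYYY-MM" into a Date (first day of the month) for a continuous time axis
by_year <- by_year %>% mutate(Month = as.Date(paste0(Date, "-01")))
```

Filtering on `is.na()` states the intent directly, whereas the year cutoff only removes missing values if they all fall before 2013.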
Looking at the plots, mortgage sales appear to reach their highest point in the period when credit was cheapest. Foreigner sales stay nearly flat within a narrow range. The total number of sales drops sharply after the COVID-19 outbreak.
# One line per sales series; x = Date is inherited from the base aes()
ggplot(data_frame_v5, aes(x = Date, y = Total_Sales, color = "Total_Sales", group = 1)) +
  geom_line() +
  geom_line(aes(y = mortgage_Sales, color = "mortgage_Sales")) +
  geom_line(aes(y = first_hand_sales, color = "first_hand_sales")) +
  geom_line(aes(y = second_hand_sales, color = "second_hand_sales")) +
  geom_line(aes(y = foreigner_Sales, color = "foreigner_Sales"))
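The same multi-series plot could also be built by reshaping the table to long format, so that a single geom_line() draws every series. A minimal sketch, assuming tidyr is available; `toy`, `long`, `series`, `sales`, and `Month` are illustrative names, and only three months and three columns from the output above are used.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# First three months of three series, copied from the glimpse() output
toy <- tibble(
  Date = c("2013-01", "2013-02", "2013-03"),
  Total_Sales = c(18235, 18971, 21570),
  mortgage_Sales = c(8423, 8836, 10164),
  foreigner_Sales = c(138, 120, 198)
)

# One row per (month, series) pair, plus a parsed date for the x axis
long <- toy %>%
  pivot_longer(cols = -Date, names_to = "series", values_to = "sales") %>%
  mutate(Month = as.Date(paste0(Date, "-01")))

# color = series gives one line and one legend entry per original column
p <- ggplot(long, aes(x = Month, y = sales, color = series)) +
  geom_line() +
  labs(x = "Date", y = "Number of sales", color = NULL)
```

With the long layout, adding another series only means adding a column to the input rather than another geom_line() layer.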
Here, unit house prices and total housing sales give a typical example of the supply-demand relationship, one of the most fundamental rules of economics.
# Total sales together with the unit price per square metre
ggplot(data_frame_v5, aes(x = Date, y = Total_Sales, color = "Total_Sales", group = 1)) +
  geom_line() +
  geom_line(aes(y = `unit_price(TL/m^2)`, color = "unit_price(TL/m^2)"))
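Sales counts and unit prices sit on very different scales, so their movements are hard to compare on one axis. One possible way to make them comparable, sketched here under the assumption that indexing to the first month is acceptable, is to rescale both series so that January 2013 equals 100. The names `toy`, `indexed`, `unit_price`, and `Month` are illustrative; the values are the first six months from the output above.

```r
library(dplyr)
library(ggplot2)

# First six months of both series, copied from the glimpse() output
toy <- tibble(
  Date = c("2013-01", "2013-02", "2013-03", "2013-04", "2013-05", "2013-06"),
  Total_Sales = c(18235, 18971, 21570, 20791, 22030, 19357),
  unit_price = c(2063.8, 2113.7, 2157.5, 2186.3, 2217.8, 2262.9)
)

# Index each series to its first value (first month = 100)
indexed <- toy %>%
  mutate(across(c(Total_Sales, unit_price), ~ .x / first(.x) * 100),
         Month = as.Date(paste0(Date, "-01")))

# Both lines now share one axis measured relative to January 2013
p <- ggplot(indexed, aes(x = Month)) +
  geom_line(aes(y = Total_Sales, color = "Total_Sales")) +
  geom_line(aes(y = unit_price, color = "unit_price")) +
  labs(x = "Date", y = "Index (2013-01 = 100)", color = NULL)
```

Indexing shows relative change only, so it does not by itself establish a supply-demand effect.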
Unfortunately, these data do not let us say anything about the relationship between the price rate of new houses and sales demand.
ggplot(data_frame_v5, aes(x = Date, y = new_price_rate, color = "new_price_rate", group = 1)) +
  geom_line() +
  geom_line(aes(y = first_hand_sales, color = "first_hand_sales"))
## Warning: Removed 1 row(s) containing missing values (geom_path).
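A numeric check could complement the plot, since the price index (around 50) is barely visible next to sales counts in the thousands. The sketch below computes a Pearson correlation on complete rows only. It is an illustration, not a result: `toy` and `r` are hypothetical names, the values are the first six months from the output above, and the position of the missing value is an assumption made to mirror the warning.

```r
# First six months of both series; the NA position is assumed for illustration
toy <- data.frame(
  new_price_rate = c(49.5, 50.4, 51.1, 51.6, 52.2, NA),
  first_hand_sales = c(8298, 8277, 9542, 8751, 9371, 8160)
)

# use = "complete.obs" drops rows where either value is missing
r <- cor(toy$new_price_rate, toy$first_hand_sales, use = "complete.obs")
print(r)
```

Both series trend over time, so a correlation on levels can be misleading; comparing month-to-month changes with diff() may be a more careful follow-up.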