Reading Data
The EVDS_istanbul_property_data Excel file contains 130 rows to analyze, so I read only the first 130 rows of the sheet.
library(readxl)     # read_xlsx()
library(tidyverse)  # dplyr verbs, %>% and ggplot2
read_df <- read_xlsx("/Users/emredemirhan/Desktop/r_exercises/EVDS_istanbul_property_data.xlsx", sheet = "EVDS", n_max = 130)
Renaming columns
I renamed the columns by position to make them easier to understand.
# new_name = column position: keep the first nine columns and rename them
my_df <- read_df %>%
  select(Date = 1, Total = 2, Mortgage = 3, FirstHand = 4,
         SecondHand = 5, Foreign = 6, NewHousePriceIndex = 7,
         HousePriceIndex = 8, UnitPrice = 9)
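As a minimal sketch of how renaming by position works in `select()`, the toy data frame below is renamed by column index. The raw column names and values here are invented for illustration and are not taken from the EVDS file.

```r
library(dplyr)

# Hypothetical stand-in for the raw sheet; names and values are made up
toy_raw <- data.frame(
  "raw col a" = c("2013-01", "2013-02"),
  "raw col b" = c(100, 120),
  "raw col c" = c(40, 50),
  check.names = FALSE
)

# new_name = position selects the column and renames it in one step
toy_df <- toy_raw %>% select(Date = 1, Total = 2, Mortgage = 3)

names(toy_df)
```

Only the columns listed in `select()` are kept, so any column whose position is not mentioned is dropped.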
Analyzing the Data Frame
Total Sales
I started my analysis by visualizing total property sales between 2013 and 2020. Total sales fluctuated around 20,000 per month until October 2018. From that date, sales dropped significantly, and only 10,000 properties were sold in June 2019. After July 2019, sales rose until January 2020. The Covid-19 lockdown then cut sales sharply in April and May 2020. Following the decrease in interest rates, property sales reached their best performance, averaging 30,980 per month between June and September 2020.
# Keep the months that have a Total value, in date order
plot_1_df <- my_df %>%
  filter(!is.na(Total)) %>%
  select(Date, Total) %>%
  arrange(Date)
# Line chart of monthly total sales; x-axis labels and ticks are hidden
ggplot(plot_1_df, aes(x = Date, y = Total, group = 1, color = Total)) +
  geom_line() +
  ylab("Total sales") +
  ggtitle("Total Property Sales in Istanbul", "Jan 2013 - September 2020") +
  theme(axis.text.x = element_blank(), axis.ticks = element_blank(),
        plot.title = element_text(hjust = 0.5), plot.subtitle = element_text(hjust = 0.5))
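A period average like the June to September 2020 figure can be computed with `filter()` and `summarise()`. The sketch below uses invented values and assumes that `Date` is stored as "YYYY-MM" text, which may not match the actual sheet.

```r
library(dplyr)

# Toy data with made-up values; Date is ASSUMED to be "YYYY-MM" text
toy_sales <- data.frame(
  Date  = c("2020-04", "2020-05", "2020-06", "2020-07", "2020-08", "2020-09"),
  Total = c(100, 200, 300, 400, 500, 600)
)

# Mean monthly sales over an inclusive date window
period_avg <- toy_sales %>%
  filter(Date >= "2020-06", Date <= "2020-09") %>%
  summarise(AvgTotal = mean(Total)) %>%
  pull(AvgTotal)

period_avg  # 450 for the toy values above
```

If `Date` is a real date or date-time column instead, the same pattern works with `as.Date()` bounds in the `filter()` call.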
Mortgage Sales vs Total Sales
To analyze the share of mortgage sales in total sales, I plot the two series together below. Mortgage sales and total sales both stayed fairly stable, without sharp fluctuations, until 2018. From 2018 onward, however, the two series no longer appear correlated, and it is very hard to explain sales by the interest rate or the depreciation of the Turkish lira.
# Monthly total sales, dropping months without a value
plot_1_df <- my_df %>%
  filter(!is.na(Total)) %>%
  select(Date, Total) %>%
  arrange(Date)
# Monthly mortgage sales, dropping months without a value
plot_2_df <- my_df %>%
  filter(!is.na(Mortgage)) %>%
  select(Date, Mortgage) %>%
  arrange(Date)
# Both series as points; the color labels become the legend entries
ggplot() +
  geom_point(data = plot_1_df, aes(x = Date, y = Total, color = "Total Sales")) +
  geom_point(data = plot_2_df, aes(x = Date, y = Mortgage, color = "Mortgage Sales")) +
  ylab("Sales") +
  ggtitle("Property Sales in Istanbul", "Jan 2013 - September 2020") +
  theme(axis.text.x = element_blank(), axis.ticks = element_blank(),
        plot.title = element_text(hjust = 0.5), plot.subtitle = element_text(hjust = 0.5))
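The mortgage share itself can also be computed directly, which makes it easier to read than comparing two point clouds. This is a minimal sketch on invented values; `MortgageShare` is a name introduced here for illustration.

```r
library(dplyr)

# Toy data with made-up values; the last month has no Total
toy_df <- data.frame(
  Date     = c("2013-01", "2013-02", "2013-03"),
  Total    = c(200, 250, NA),
  Mortgage = c(80, 100, 90)
)

# Share of mortgage sales in total sales, for months where both are known
share_df <- toy_df %>%
  filter(!is.na(Total), !is.na(Mortgage)) %>%
  mutate(MortgageShare = Mortgage / Total) %>%
  arrange(Date)

share_df
```

Plotting `MortgageShare` against `Date` would then show the share as a single line.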
First Hand Sales vs Second Hand Sales
Until 2018, the first-hand and second-hand sales ratios did not fluctuate much, but from 2018 onward buyers' preferences shifted strongly toward second-hand sales.
First-hand sales ratio (black line)
Second-hand sales ratio (red line)
# First-hand sales as a fraction of total sales
plot_21_df <- my_df %>%
  filter(!is.na(Total)) %>%
  transmute(Date, FirstHandRatio = FirstHand / Total) %>%
  arrange(Date)
# Second-hand sales as a fraction of total sales
plot_22_df <- my_df %>%
  filter(!is.na(Total)) %>%
  transmute(Date, SecondHandRatio = SecondHand / Total) %>%
  arrange(Date)
# Unit prices by month (not used in this plot)
plot_23_df <- my_df %>%
  filter(!is.na(UnitPrice)) %>%
  select(Date, UnitPrice) %>%
  arrange(Date)
# Black line: first-hand ratio; red line: second-hand ratio
ggplot() +
  geom_line(data = plot_21_df, aes(x = Date, y = FirstHandRatio, group = 1), color = "black") +
  geom_line(data = plot_22_df, aes(x = Date, y = SecondHandRatio, group = 1), color = "red") +
  ylab("First Hand/Second Hand Ratio - Percentage") +
  ggtitle("First Hand/Second Hand Ratio in Istanbul", "Jan 2013 - September 2020") +
  theme(axis.text.x = element_blank(), axis.ticks = element_blank(),
        plot.title = element_text(hjust = 0.5), plot.subtitle = element_text(hjust = 0.5))
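An alternative to describing the line colors in the text is to reshape the ratios into long format and let ggplot2 build the legend. The sketch below uses invented values; `Type` and `Ratio` are names introduced here, and `tidyr` is an extra dependency.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Toy data with made-up values
toy_df <- data.frame(
  Date       = c("2013-01", "2013-02", "2013-03"),
  Total      = c(100, 200, 400),
  FirstHand  = c(40, 100, 100),
  SecondHand = c(60, 100, 300)
)

# One row per month and sale type, so color can be mapped to the type
ratio_long <- toy_df %>%
  filter(!is.na(Total)) %>%
  transmute(Date,
            FirstHandRatio  = FirstHand / Total,
            SecondHandRatio = SecondHand / Total) %>%
  pivot_longer(cols = c(FirstHandRatio, SecondHandRatio),
               names_to = "Type", values_to = "Ratio")

# The legend is generated from the Type column
p <- ggplot(ratio_long, aes(x = Date, y = Ratio, color = Type, group = Type)) +
  geom_line() +
  ylab("Share of total sales")
```

Printing `p` draws the chart with one colored line per sale type.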
Unit Price effect on Total Sales
In this graph, I wanted to show the effect of unit price changes on total property sales. Although the unit price rose, total property sales did not decrease. This can be explained by population growth in Istanbul and the depreciation of the Turkish lira.
# Monthly total sales, dropping months without a value
plot_1_df <- my_df %>%
  filter(!is.na(Total)) %>%
  select(Date, Total) %>%
  arrange(Date)
# Monthly unit prices, dropping months without a value
plot_23_df <- my_df %>%
  filter(!is.na(UnitPrice)) %>%
  filter(Date > 2013) %>%
  select(Date, UnitPrice) %>%
  arrange(Date)
# Unit price as a line and total sales as points, on a shared y-axis
ggplot() +
  geom_line(data = plot_23_df, aes(x = Date, y = UnitPrice, group = 1, color = UnitPrice)) +
  geom_point(data = plot_1_df, aes(x = Date, y = Total, group = 1, color = Total)) +
  ylab("Total sales / Unit price") +
  ggtitle("Unit Price and Total Property Sales in Istanbul", "Jan 2013 - September 2020") +
  theme(axis.text.x = element_blank(), axis.ticks = element_blank(),
        plot.title = element_text(hjust = 0.5), plot.subtitle = element_text(hjust = 0.5))
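If the two series differ a lot in magnitude, sharing one y-axis can flatten one of them. One possible approach, sketched here on invented values, is to rescale the unit price onto the sales scale and add a secondary axis with `sec_axis()`; `scale_factor` is a helper name introduced for this example.

```r
library(ggplot2)

# Toy data with made-up values
toy_df <- data.frame(
  Date      = c("2013-01", "2013-02", "2013-03"),
  Total     = c(20000, 10000, 30000),
  UnitPrice = c(2000, 3000, 4000)
)

# Factor that maps the unit price range onto the total sales range
scale_factor <- max(toy_df$Total) / max(toy_df$UnitPrice)

p <- ggplot(toy_df, aes(x = Date, group = 1)) +
  geom_point(aes(y = Total)) +
  # Unit price is drawn on the sales scale ...
  geom_line(aes(y = UnitPrice * scale_factor)) +
  # ... and the secondary axis converts the ticks back to unit prices
  scale_y_continuous(name = "Total sales",
                     sec.axis = sec_axis(~ . / scale_factor, name = "Unit price"))
```

Dual axes can suggest a relationship that depends on the chosen scaling, so the factor should be reported alongside the chart.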