library(dplyr)
library(tidyverse)
library(ggplot2)
library(readxl)
library(lubridate)

Introduction and Data Overview

This report analyzes EPIAS Turkey Energy Market price data for September 2020. The code chunk below reads the dataset and converts the variables to the right types. MCP. stands for ‘Market Clearing Price’ and SMP. stands for ‘System Marginal Price’.

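# Read the hourly price data (the path is specific to the author's machine)
data<-read.csv("C:/Users/erenm/OneDrive/Masaüstü/ptf-smf.csv")
# Remove every thousands separator before converting the text columns to numeric
data$SMP.<-as.numeric(gsub(",","",data$SMP.,fixed=TRUE))
data$Positive.Imbalance.Price..TL.MWh.<-as.numeric(gsub(",","",data$Positive.Imbalance.Price..TL.MWh.,fixed=TRUE))
data$Negative.Imbalance.Price..TL.MWh.<-as.numeric(gsub(",","",data$Negative.Imbalance.Price..TL.MWh.,fixed=TRUE))
# Parse day.month.year hour:minute timestamps
data$Date<-as.POSIXct(data$Date,format="%d.%m.%y %H:%M")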

str(data)
## 'data.frame':    720 obs. of  6 variables:
##  $ Date                             : POSIXct, format: "2020-09-01 00:00:00" "2020-09-01 01:00:00" ...
##  $ MCP.                             : num  302 300 293 290 290 ...
##  $ SMP.                             : num  332 325 318 320 330 ...
##  $ Positive.Imbalance.Price..TL.MWh.: num  293 291 284 281 281 ...
##  $ Negative.Imbalance.Price..TL.MWh.: num  342 335 327 330 340 ...
##  $ SMP.Direction                    : chr  "? Energy Deficit" "? Energy Deficit" "? Energy Deficit" "? Energy Deficit" ...
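
The conversion steps above can be tried on a small stand-in for the raw file. This is a minimal sketch: the two rows are made up, and tz = "UTC" is an assumption added only to keep the result reproducible across machines.

```r
# Toy rows in the same shape as the raw CSV (values are made up)
raw <- data.frame(
  Date = c("01.09.20 00:00", "01.09.20 01:00"),
  SMP. = c("1,332.50", "325.00"),
  stringsAsFactors = FALSE
)

# Strip thousands separators, then convert the text to numeric
raw$SMP. <- as.numeric(gsub(",", "", raw$SMP., fixed = TRUE))

# Parse day.month.year hour:minute; tz = "UTC" is an assumption, not from the report
raw$Date <- as.POSIXct(raw$Date, format = "%d.%m.%y %H:%M", tz = "UTC")

str(raw)
```

gsub() is used here instead of sub() because sub() replaces only the first comma in each value.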

Distribution of MCP and SMP

The plot below shows that both the min-max range and the interquartile range (Q1-Q3) are wider for SMP than for MCP. MCP has more outliers but less variability overall. In the next plots we limit the MCP and SMP ranges to check whether the relationship is homoskedastic.

data %>%
  pivot_longer(cols = c(MCP.,SMP.), names_to = "PriceGroup")%>%
  ggplot(aes(x=PriceGroup, y=value, fill=PriceGroup)) +
  geom_boxplot() +
  theme_test() +
  labs(title = "MCP and SMP Boxplot", y = "Price")+
  scale_y_continuous(breaks=seq(0,2000,400))
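
The ranges read off the boxplot can also be computed directly. The sketch below is one possible way to do that with the same pivot_longer() reshaping. The toy data frame and the names spread, Min, Q1, Median, Q3, Max and IQR are illustrative assumptions, not part of the original analysis.

```r
library(dplyr)
library(tidyr)

# Made-up prices, only to make the sketch runnable
toy <- data.frame(
  MCP. = c(290, 300, 305, 310, 320),
  SMP. = c(200, 280, 310, 350, 500)
)

# One row per price group with the statistics a boxplot draws
spread <- toy %>%
  pivot_longer(cols = c(MCP., SMP.), names_to = "PriceGroup") %>%
  group_by(PriceGroup) %>%
  summarise(
    Min = min(value),
    Q1 = quantile(value, 0.25, names = FALSE),
    Median = median(value),
    Q3 = quantile(value, 0.75, names = FALSE),
    Max = max(value),
    IQR = IQR(value)
  )

spread
```

Replacing toy with data should give the corresponding figures for the report's dataset.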

Comparing MCP and SMP

Below are two plots: the first without axis limits and the second with them. In the second plot, the X axis is limited to (180,400) and the Y axis to (180,500), ranges chosen with the help of the boxplot above. In the first plot, SMP tends to increase as MCP increases. There also does not appear to be a heteroskedastic trend.

data%>%ggplot(aes(x = MCP., y = SMP.)) + 
  geom_point(colour = "#00AFBB",size=1.7) + 
  geom_smooth(method=lm) +
  theme_test() + 
  labs(x="MCP", y="SMP", title = "MCP and SMP Price Comparison")

In the second plot, the trend is not as clear as in the first. Once the outliers are removed, the spread of SMP is no longer constant across MCP values, which points to heteroskedasticity (unequal variability) between MCP and SMP.

data%>%ggplot(aes(x = MCP., y = SMP.)) + 
  geom_point(colour = "#00AFBB",size=1.7) + 
  geom_smooth(method=lm) +
  xlim(180,400)+
  ylim(180,500)+
  theme_test() + 
  labs(x="MCP", y="SMP", title = "MCP and SMP Price Comparison")
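
Note that xlim() and ylim() drop observations outside the limits before geom_smooth() fits its line, so the regression in the limited plot is fitted on a subset of the data. The sketch below shows one way to make that subset explicit and count what is excluded. The toy values and the names in_window, n_dropped and fit are assumptions for illustration.

```r
library(dplyr)

# Made-up prices, only to make the sketch runnable
toy <- data.frame(
  MCP. = c(150, 250, 300, 350, 450),
  SMP. = c(100, 260, 320, 480, 900)
)

# Same window as xlim(180,400) and ylim(180,500); between() is inclusive
in_window <- toy %>%
  filter(between(MCP., 180, 400), between(SMP., 180, 500))

# Number of observations that fall outside the window
n_dropped <- nrow(toy) - nrow(in_window)
n_dropped

# Linear fit on the retained rows only
fit <- lm(SMP. ~ MCP., data = in_window)
coef(fit)
```

Filtering first also makes it easy to report how many hours the limited view leaves out.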

Daily and Hourly Energy Direction Ratios

The code chunks below explore how the energy direction is distributed. If SMP<MCP the system is in Surplus; if SMP=MCP it is In Balance; and if SMP>MCP it is in Deficit. The first output is based on hourly prices: about 67% of the hours are in Deficit (SMP>MCP). The second output is based on daily average prices: no day has equal average MCP and SMP, which is expected given the high variability of the hourly prices.

data %>%
  summarise(Surplus = sum(MCP.>SMP.)/nrow(data), InBalance=sum(MCP.==SMP.)/nrow(data), 
            Deficit = sum(MCP.<SMP.)/nrow(data)) %>% 
  pivot_longer(cols = c(Surplus, InBalance, Deficit), names_to = "SMP_Direction")
## # A tibble: 3 x 2
##   SMP_Direction value
##   <chr>         <dbl>
## 1 Surplus       0.179
## 2 InBalance     0.147
## 3 Deficit       0.674

# Daily averages of MCP and SMP
daily_data<-data %>%
  group_by(DayofMonth=day(Date))%>%
  summarise(MCP_Daily = mean(MCP.), SMP_Daily = mean(SMP.))

daily_data %>%
  summarise(Surplus = sum(MCP_Daily>SMP_Daily)/nrow(daily_data), InBalance=sum(MCP_Daily==SMP_Daily)/nrow(daily_data), 
            Deficit = sum(MCP_Daily<SMP_Daily)/nrow(daily_data)) %>% 
  pivot_longer(cols = c(Surplus, InBalance, Deficit), names_to = "SMP_Direction")
## # A tibble: 3 x 2
##   SMP_Direction value
##   <chr>         <dbl>
## 1 Surplus       0.233
## 2 InBalance     0    
## 3 Deficit       0.767
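
The Surplus / In Balance / Deficit rule can also be written as an explicit label per row. The sketch below is an alternative formulation using case_when() and count(), not the code that produced the outputs above. The toy prices and the name direction_share are assumptions.

```r
library(dplyr)

# Made-up prices: one surplus hour, one balanced hour, two deficit hours
toy <- data.frame(
  MCP. = c(300, 300, 300, 300),
  SMP. = c(250, 300, 350, 400)
)

direction_share <- toy %>%
  mutate(SMP_Direction = case_when(
    SMP. < MCP.  ~ "Surplus",
    SMP. == MCP. ~ "InBalance",
    SMP. > MCP.  ~ "Deficit"
  )) %>%
  count(SMP_Direction) %>%          # number of rows per label
  mutate(value = n / sum(n))        # share of rows per label

direction_share
```

Because == compares doubles exactly, averaged prices will rarely land in InBalance, which is consistent with the daily output above.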

Daily and Hourly Average Price Comparison

The code chunks below show daily and hourly average prices in September 2020. SMP is more variable than MCP, as expected. In the first plot, weekend MCP seems to be lower than weekday MCP: the daily average dips on each weekend.

data %>% 
  group_by(DayofMonth = day(Date),weekend = wday(Date, week_start = getOption("lubridate.week.start", 1)) > 5) %>% 
  summarise(MCP_Average = mean(MCP.), SMP_Average = mean(SMP.)) %>% 
  pivot_longer(cols=c(MCP_Average, SMP_Average), names_to="PriceGroup")%>%
  ggplot(aes(x=DayofMonth, y=value)) + 
  geom_line(aes(color=PriceGroup),size=1.25) +
  scale_fill_manual(values=c("white","purple"))+
  geom_tile(aes(x = DayofMonth, height = Inf, fill = weekend), alpha = .025)+
  theme_test() + 
  labs(x="Day of Month", y="Price", title = "Daily Average Price Comparison") + 
  theme(legend.position="right")+
  scale_x_continuous(breaks=seq(0,30,3))
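
The weekend flag in the chunk above relies on wday() with week_start = 1, which numbers Monday as 1 and Sunday as 7, so values above 5 are Saturday and Sunday. A minimal sketch on four dates from the same month (the names dates, day_index and is_weekend are illustrative):

```r
library(lubridate)

# Friday to Monday in the first week of September 2020
dates <- as.Date(c("2020-09-04", "2020-09-05", "2020-09-06", "2020-09-07"))

# With week_start = 1, Monday is 1 and Sunday is 7
day_index <- wday(dates, week_start = 1)

# Saturday (6) and Sunday (7) are flagged as weekend
is_weekend <- day_index > 5

data.frame(dates, day_index, is_weekend)
```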

The second plot shows hourly average prices. Prices decrease from 7 pm until the early morning, then rise sharply and peak at 4 pm. A likely explanation is that demand rises strongly around 4 pm, which pushes prices up.

data %>% 
  group_by(HourofDay = hour(Date)) %>% 
  summarise(MCP_average = mean(MCP.), SMP_average = mean(SMP.)) %>% 
  pivot_longer(cols=c(MCP_average, SMP_average), names_to="PriceGroup")%>%
  ggplot(aes(x=HourofDay, y=value, color=PriceGroup)) + 
  geom_line(size=1.25) + 
  theme_test() + 
  labs(x="Hour of Day", y="Price", title = "Hourly Average Price Comparison") + 
  theme(legend.position="right")+
  scale_x_continuous(breaks=seq(0,23,3))
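
The peak and trough hours read off the plot can be confirmed numerically. The sketch below is one way to do so with slice_max() and slice_min(). The toy data frame and the names hourly, peak and trough are assumptions for illustration.

```r
library(dplyr)

# Made-up prices for three hours of the day, two observations each
toy <- data.frame(
  HourofDay = c(3, 3, 16, 16, 19, 19),
  MCP. = c(250, 260, 330, 340, 310, 320)
)

# Average price per hour, as in the chunk above
hourly <- toy %>%
  group_by(HourofDay) %>%
  summarise(MCP_average = mean(MCP.))

# Hours with the highest and lowest average price
peak <- hourly %>% slice_max(MCP_average, n = 1)
trough <- hourly %>% slice_min(MCP_average, n = 1)

peak
trough
```

On the report's dataset, the same two calls applied to the hourly averages should return the hours discussed above.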