In this report, we analyze electricity prices using data from EPIAS, where the data can be downloaded and more information is available. The analysis covers the electricity prices from 01.07.2020 to 31.07.2020.
After downloading the data, we import it into an object called data. We can then inspect it with the glimpse function.
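library(tidyverse)  # dplyr (%>%, glimpse, select, ...) and ggplot2
library(lubridate)  # wday() and hour(), used further below
data = read.csv("ptf-smf.csv")
data %>% glimpse()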
## Rows: 744
## Columns: 6
## $ Tarih <chr> "01.07.20 00:00", "01.07.20 01:...
## $ PTF <chr> "323,85", "326,95", "324,31", "...
## $ SMF <chr> "211,00", "201,00", "211,00", "...
## $ Pozitif.Dengesizlik.Fiyatı..TL.MWh. <chr> "204,67", "194,97", "204,67", "...
## $ Negatif.Dengesizlik.Fiyatı..TL.MWh. <chr> "333,57", "336,76", "334,04", "...
## $ SMF.Yön <chr> "?Enerji Fazlası", "?Enerji Faz...
Looking at the data, the date field and all price fields are stored as strings, and the prices use a comma as the decimal separator. So, we need to convert the prices to numbers and the date field to a date type. We also rename the fields, translating them from Turkish to English.
data$PTF = as.numeric(gsub(",", ".", gsub("\\.", "", data$PTF)))
data$SMF = as.numeric(gsub(",", ".", gsub("\\.", "", data$SMF)))
data$Negatif.Dengesizlik.Fiyatı..TL.MWh. = as.numeric(gsub(",", ".", gsub("\\.", "", data$Negatif.Dengesizlik.Fiyatı..TL.MWh.)))
data$Pozitif.Dengesizlik.Fiyatı..TL.MWh. = as.numeric(gsub(",", ".", gsub("\\.", "", data$Pozitif.Dengesizlik.Fiyatı..TL.MWh.)))
data$Tarih = gsub(pattern = "\\.","-",data$Tarih)
data_last = data %>%
  select(Date = Tarih, MCP = PTF, SMP = SMF,
         NIP = Negatif.Dengesizlik.Fiyatı..TL.MWh.,
         PIP = Pozitif.Dengesizlik.Fiyatı..TL.MWh.,
         SMPDirection = SMF.Yön) %>%
  mutate(DateTime = as.POSIXct(Date, format = "%d-%m-%y %H:%M")) %>%
  mutate(Day = wday(DateTime, week_start = 1), Hour = hour(DateTime), Date = as.Date(Date, format = "%d-%m-%y %H:%M"))
data_last %>% glimpse()
## Rows: 744
## Columns: 9
## $ Date <date> 2020-07-01, 2020-07-01, 2020-07-01, 2020-07-01, 2020-...
## $ MCP <dbl> 323.85, 326.95, 324.31, 322.11, 320.00, 286.21, 210.13...
## $ SMP <dbl> 211.00, 201.00, 211.00, 211.00, 201.00, 181.00, 113.75...
## $ NIP <dbl> 333.57, 336.76, 334.04, 331.77, 329.60, 294.80, 216.43...
## $ PIP <dbl> 204.67, 194.97, 204.67, 204.67, 194.97, 175.57, 110.34...
## $ SMPDirection <chr> "?Enerji Fazlası", "?Enerji Fazlası", "?Enerji Fazlası...
## $ DateTime <dttm> 2020-07-01 00:00:00, 2020-07-01 01:00:00, 2020-07-01 ...
## $ Day <dbl> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, ...
## $ Hour <int> 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, ...
At the end of this process, we have 9 columns, including the derived date fields DateTime, Day and Hour.
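The conversion steps above can be checked on single values before they are applied to whole columns. The following is a minimal sketch in base R; the helper name parse_tr_number and the sample strings are our own illustrations, not part of the original analysis. It also uses the base R format code "%u" (weekday with Monday as 1), which should give the same coding as wday(DateTime, week_start = 1).

```r
# Hypothetical helper (not in the original report): convert number strings
# with "." as thousands separator and "," as decimal separator to numeric.
parse_tr_number <- function(x) {
  no_thousands <- gsub(".", "", x, fixed = TRUE)         # "1.234,56" -> "1234,56"
  as.numeric(sub(",", ".", no_thousands, fixed = TRUE))  # "1234,56"  -> 1234.56
}

# Illustrative inputs; "323,85" has the same layout as the PTF column
prices <- parse_tr_number(c("323,85", "1.234,56"))
print(prices)

# Parse a timestamp in the layout of the Tarih column after dots become dashes.
# tz = "UTC" is an assumption made here to keep the example reproducible.
ts <- as.POSIXct("01-07-20 06:00", format = "%d-%m-%y %H:%M", tz = "UTC")
print(format(ts, "%Y-%m-%d %H:%M"))

# Weekday with Monday = 1 and hour of day, using base R only
day_of_week <- as.integer(format(ts, "%u"))
hour_of_day <- as.integer(format(ts, "%H"))
print(c(day_of_week, hour_of_day))
```

Wrapping the two gsub calls in one function would also avoid repeating the same expression for each of the four price columns.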
Since our data has date information, we can group it and compute aggregated information about the electricity prices: the averages by day of the week and by hour, and the maximum value for each hour, for both MCP and SMP.
First, we compute the average MCP for each day of the week (Day, where 1 is Monday and 7 is Sunday) and plot it.
MCP_Daily = data_last %>%
  group_by(Day) %>%
  summarize(Avg = mean(MCP))
MCP_Daily %>% glimpse()
## Rows: 7
## Columns: 2
## $ Day <dbl> 1, 2, 3, 4, 5, 6, 7
## $ Avg <dbl> 302.0398, 298.8338, 306.9160, 297.0389, 299.2423, 291.8207, 275...
ggplot(MCP_Daily, aes(Day, Avg)) +
  geom_col() +
  expand_limits(y = 0)
The plot shows that the average MCP is lower on the weekend (days 6 and 7) than on weekdays, where the averages are very close to each other. We can repeat these steps by hour.
MCP_Hourly = data_last %>%
  group_by(Hour) %>%
  summarize(Avg = mean(MCP))
MCP_Hourly %>% glimpse()
## Rows: 24
## Columns: 2
## $ Hour <int> 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, ...
## $ Avg <dbl> 297.3552, 305.8410, 298.1139, 284.3048, 284.8490, 246.7165, 21...
ggplot(MCP_Hourly, aes(Hour, Avg)) +
  geom_line() +
  expand_limits(y = 0)
Around 06:00, the average MCP is much lower than at other hours. At the other hours, the averages are comparatively close to each other.
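The two observations above (lower prices on the weekend and a dip around 06:00) can also be checked numerically instead of being read off the plots. The sketch below shows one possible way with dplyr. The toy data frame and its values are made up for illustration and only reuse the column names of data_last; the DayType column is our own addition.

```r
library(dplyr)

# Made-up toy data using the column names of data_last (not real prices)
toy <- tibble(
  Day  = c(1, 1, 6, 6, 7, 7),
  Hour = c(6L, 14L, 6L, 14L, 6L, 14L),
  MCP  = c(200, 320, 190, 300, 180, 290)
)

# Average price on weekdays (Day 1-5) versus the weekend (Day 6-7)
weekend_cmp <- toy %>%
  mutate(DayType = if_else(Day >= 6, "Weekend", "Weekday")) %>%
  group_by(DayType) %>%
  summarize(Avg = mean(MCP), .groups = "drop")

# Hour of the day with the lowest average price
lowest_hour <- toy %>%
  group_by(Hour) %>%
  summarize(Avg = mean(MCP), .groups = "drop") %>%
  slice_min(Avg, n = 1)

print(weekend_cmp)
print(lowest_hour)
```

Replacing toy with data_last should give the corresponding figures for July 2020.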
options(tibble.print_max = 24)
data_last %>%
  group_by(Hour) %>%
  top_n(1, MCP) %>%
  select(Hour, MCP) %>%
  arrange(desc(MCP))
## # A tibble: 24 x 2
## # Groups: Hour [24]
## Hour MCP
## <int> <dbl>
## 1 14 350
## 2 15 350
## 3 16 350
## 4 17 331.
## 5 11 330.
## 6 21 330.
## 7 10 330.
## 8 13 329.
## 9 20 329.
## 10 22 328.
## 11 9 328.
## 12 18 327.
## 13 1 327.
## 14 19 325.
## 15 8 325.
## 16 12 324.
## 17 2 324.
## 18 0 324.
## 19 23 323.
## 20 3 322.
## 21 4 320
## 22 7 318.
## 23 5 302.
## 24 6 293
options(tibble.print_max = 10)
The three highest hourly maxima of MCP in July 2020 occurred at 14:00, 15:00 and 16:00, each reaching 350.
Next, we compute the average SMP for each day of the week and plot it.
SMP_Daily = data_last %>%
  group_by(Day) %>%
  summarize(Avg = mean(SMP))
SMP_Daily %>% glimpse()
## Rows: 7
## Columns: 2
## $ Day <dbl> 1, 2, 3, 4, 5, 6, 7
## $ Avg <dbl> 316.7281, 314.4270, 275.7717, 300.8255, 294.4326, 310.8506, 289...
ggplot(SMP_Daily, aes(Day, Avg)) +
  geom_col() +
  expand_limits(y = 0)
The plot shows that the average SMP is higher on Monday and Tuesday than on the other days, and lowest on Wednesday. We can repeat these steps by hour.
SMP_Hourly = data_last %>%
  group_by(Hour) %>%
  summarize(Avg = mean(SMP))
SMP_Hourly %>% glimpse()
## Rows: 24
## Columns: 2
## $ Hour <int> 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, ...
## $ Avg <dbl> 294.3274, 307.5423, 297.3471, 282.9535, 280.0355, 248.3558, 21...
ggplot(SMP_Hourly, aes(Hour, Avg)) +
  geom_line() +
  expand_limits(y = 0)
Around 06:00, the average SMP is also much lower than at other hours. At the other hours, the averages are comparatively close to each other.
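Since the MCP and SMP averages are computed with the same grouping, both can be obtained in one pass. The following is a minimal sketch using across() from dplyr (available from version 1.0.0). The toy data is made up for illustration and only reuses the column names of data_last.

```r
library(dplyr)

# Made-up toy data using the column names of data_last (not real prices)
toy <- tibble(
  Hour = c(0L, 0L, 6L, 6L),
  MCP  = c(300, 310, 200, 220),
  SMP  = c(290, 330, 210, 190)
)

# Average both price columns per hour in a single summarize() call
hourly_avg <- toy %>%
  group_by(Hour) %>%
  summarize(across(c(MCP, SMP), mean), .groups = "drop")

print(hourly_avg)
```

The result has one row per hour and one average column per price, which makes it easier to compare the two series side by side.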
options(tibble.print_max = 24)
data_last %>%
  group_by(Hour) %>%
  top_n(1, SMP) %>%
  select(Hour, SMP) %>%
  arrange(desc(SMP))
## # A tibble: 26 x 2
## # Groups: Hour [24]
## Hour SMP
## <int> <dbl>
## 1 14 460
## 2 15 460
## 3 16 435
## 4 21 419.
## 5 17 404.
## 6 20 403.
## 7 13 402
## 8 13 402
## 9 12 402.
## 10 18 402.
## # ... with 16 more rows
options(tibble.print_max = 10)
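The three highest hourly maxima of SMP in July 2020 occurred at 14:00, 15:00 and 16:00. The result has 26 rows rather than 24 because top_n keeps ties (hour 13 appears twice, for example), and since 26 exceeds the tibble.print_max value of 24, only the first 10 rows are printed.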
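If exactly one row per hour is wanted, the ties can be dropped. The sketch below uses slice_max() from dplyr, which supersedes top_n() since dplyr 1.0.0 and has a with_ties argument. The toy data is made up for illustration; only the column names follow data_last.

```r
library(dplyr)

# Made-up toy data: hour 13 reaches its maximum twice (not real prices)
toy <- tibble(
  Hour = c(13L, 13L, 13L, 14L, 14L),
  SMP  = c(402, 402, 350, 460, 410)
)

# Default behaviour keeps ties, so hour 13 contributes two rows
with_ties <- toy %>%
  group_by(Hour) %>%
  slice_max(SMP, n = 1) %>%
  ungroup()

# with_ties = FALSE keeps exactly one row per hour
one_per_hour <- toy %>%
  group_by(Hour) %>%
  slice_max(SMP, n = 1, with_ties = FALSE) %>%
  ungroup()

print(nrow(with_ties))
print(nrow(one_per_hour))
```

With one row per hour, the result for the full data would have 24 rows and fit within tibble.print_max = 24.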