knitr::opts_chunk$set(fig.align = 'center', echo = TRUE)
The exist_july dataset provides July 2020’s hourly electricity prices in the Turkish energy market in terms of Market Clearing Price (MCP) and System Marginal Price (SMP), together with the related Positive Imbalance Price and Negative Imbalance Price. The data can be downloaded from the EPIAS/EXIST MCP/SMP page.
In this report, July 2020’s electricity prices will be analyzed mainly with dplyr, ggplot2 and RMarkdown. I will also use packages such as readr, knitr and lubridate.
library(tidyverse)
library(lubridate)
library(readr)
library(dplyr)
library(ggplot2)
library(knitr)
exist_july <- read.csv("C:/Users/tugrul/Desktop/ie48a/week2/dplyr/ptf-smf.csv")
Sys.setlocale("LC_TIME", "C")
Let’s start by taking a glance at the dataset.
exist_july %>% glimpse
## Rows: 744
## Columns: 6
## $ Date <chr> "01.07.20 00:00", "01.07.20 01:00...
## $ MCP. <dbl> 323.85, 326.95, 324.31, 322.11, 3...
## $ SMP. <dbl> 211.00, 201.00, 211.00, 211.00, 2...
## $ Positive.Imbalance.Price..TL.MWh. <dbl> 204.67, 194.97, 204.67, 204.67, 1...
## $ Negative.Imbalance.Price..TL.MWh. <dbl> 333.57, 336.76, 334.04, 331.77, 3...
## $ SMP.Direction <chr> "?Energy Surplus", "?Energy Surpl...
Here, the variables are renamed to make them more readable. Three new columns (DateTime, Day, Hour) are also added so that differences over time can be shown throughout the report.
existj_one <- exist_july %>%
  # Replace the names mangled by read.csv() with readable ones
  rename("MCP" = MCP.,
         "SMP" = SMP.,
         "Positive Imbalance Price" = Positive.Imbalance.Price..TL.MWh.,
         "Negative Imbalance Price" = Negative.Imbalance.Price..TL.MWh.,
         "SMP Direction" = SMP.Direction) %>%
  # Parse the "dd.mm.yy HH:MM" text into a date-time
  mutate(DateTime = as.POSIXct(Date, format = "%d.%m.%y %H:%M")) %>%
  # Derive the weekday (Monday first) and the hour of the day
  mutate(Day = wday(DateTime, label = TRUE, week_start = 1), Hour = hour(DateTime))
existj_one %>% glimpse()
## Rows: 744
## Columns: 9
## $ Date <chr> "01.07.20 00:00", "01.07.20 01:00", "01....
## $ MCP <dbl> 323.85, 326.95, 324.31, 322.11, 320.00, ...
## $ SMP <dbl> 211.00, 201.00, 211.00, 211.00, 201.00, ...
## $ `Positive Imbalance Price` <dbl> 204.67, 194.97, 204.67, 204.67, 194.97, ...
## $ `Negative Imbalance Price` <dbl> 333.57, 336.76, 334.04, 331.77, 329.60, ...
## $ `SMP Direction` <chr> "?Energy Surplus", "?Energy Surplus", "?...
## $ DateTime <dttm> 2020-07-01 00:00:00, 2020-07-01 01:00:0...
## $ Day <ord> Wed, Wed, Wed, Wed, Wed, Wed, Wed, Wed, ...
## $ Hour <int> 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12...
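The date handling above can be checked on a single value. The following is a minimal sketch in base R, not the report's own code: it parses the first timestamp shown in the glimpse() output and reads the hour and weekday from it. The `tz = "UTC"` argument is an assumption added here so the result does not depend on the local time zone, and base R numbers weekdays from 0 (Sunday), unlike the labelled factor returned by lubridate's wday().

```r
# Parse one timestamp in the same "dd.mm.yy HH:MM" layout as the Date column.
# tz = "UTC" is an assumption made for this sketch; the report's call uses
# the session's default time zone.
stamp <- as.POSIXct("01.07.20 00:00", format = "%d.%m.%y %H:%M", tz = "UTC")

# as.POSIXlt() exposes the individual components of the date-time.
parts <- as.POSIXlt(stamp)
hour_of_day <- parts$hour   # 0 to 23
weekday_num <- parts$wday   # 0 = Sunday, ..., 6 = Saturday

print(format(stamp, "%Y-%m-%d %H:%M"))
print(c(hour = hour_of_day, weekday = weekday_num))
```

If the format string did not match the text, as.POSIXct() would return NA, so a quick `anyNA(existj_one$DateTime)` check after the mutate() step is a cheap safeguard.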
With the dataset modified, the first analysis compares the Market Clearing Price with the System Marginal Price.
ggplot(existj_one, aes(x = MCP, y = SMP, color = DateTime)) +
  geom_point() +
  labs(title = "Comparison of MCP and SMP", subtitle = "July 2020", x = "Market Clearing Price", y = "System Marginal Price") +
  theme_minimal() +
  theme(legend.position = "right", axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1))
This scatter plot shows that MCP values concentrate in the 300-325 TL/MWh interval, while SMP values spread over a wider interval, roughly 150-400 TL/MWh. This is a natural result of the difference between the Day Ahead Market and the Balancing Power Market. Since SMP is set at the last moment and at the end of a penalty process, its values can land in less expected areas of the plot.
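The visual impression of spread can be backed by numbers. The sketch below uses only the first five hourly prices shown in the glimpse() output, so it illustrates the calculation rather than the month as a whole. `spread_summary()` is a hypothetical helper introduced here; in the report's session it could be applied to `existj_one$MCP` and `existj_one$SMP` instead.

```r
# First five hourly prices (TL/MWh) copied from the glimpse() output above.
mcp <- c(323.85, 326.95, 324.31, 322.11, 320.00)
smp <- c(211.00, 201.00, 211.00, 211.00, 201.00)

# Hypothetical helper: summarise how widely a price series is spread.
spread_summary <- function(x) {
  c(min = min(x), max = max(x), range = max(x) - min(x), sd = sd(x))
}

# One row per price series, one column per statistic.
spread <- rbind(MCP = spread_summary(mcp), SMP = spread_summary(smp))
print(round(spread, 2))
```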
Next, to show that negative imbalance (energy deficit) hours usually outnumber positive imbalance (energy surplus) hours, I compute the share of hours in each energy direction.
energy_direction <- existj_one %>%
  # mean() of a logical vector gives the share of TRUE values
  summarise(Surplus = mean(MCP > SMP),
            Deficit = mean(MCP <= SMP)) %>%
  pivot_longer(cols = c(Surplus, Deficit), names_to = "Energy_Direction")
ggplot(energy_direction, aes(x = "", y = value, fill = Energy_Direction)) +
  geom_bar(stat = "identity", width = 1) +
  coord_polar("y") +
  theme_void() +
  geom_text(aes(label = scales::percent(round(value, 2))), position = position_stack(vjust = 0.5)) +
  ggtitle("Energy Direction", subtitle = "July 2020")
As can easily be seen, Energy Deficit dominates in terms of frequency. This means that, most of the time, consumers and producers considerably underestimate their consumption.
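The classification rule behind the pie chart can be written out on its own. This is a small sketch under the same rule as the report (MCP above SMP counts as surplus, otherwise deficit), applied to the first five rows from the glimpse() output. `classify_direction()` is a hypothetical helper name. For these five hours the rule agrees with the reported SMP Direction column, which reads "Energy Surplus".

```r
# First five hourly prices (TL/MWh) copied from the glimpse() output above.
mcp <- c(323.85, 326.95, 324.31, 322.11, 320.00)
smp <- c(211.00, 201.00, 211.00, 211.00, 201.00)

# Hypothetical helper applying the report's rule: MCP > SMP means surplus,
# anything else (including equality) is counted as deficit.
classify_direction <- function(mcp, smp) {
  ifelse(mcp > smp, "Surplus", "Deficit")
}

direction <- classify_direction(mcp, smp)

# Fix the levels so that a direction with zero hours still appears.
shares <- prop.table(table(factor(direction, levels = c("Surplus", "Deficit"))))
print(shares)
```

Counting equal prices as deficit follows the report's `MCP <= SMP` condition; treating them as a separate balanced category would be an alternative design.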
In this part of the report, I plot the average MCP and SMP by day of the week and by hour of the day. This helps to draw conclusions from the data, especially about the days and hours on which prices fall or rise.
mcp_daily <- existj_one %>%
  group_by(Day) %>%
  summarize(avg_mcp = mean(MCP), .groups = 'drop')
kable(mcp_daily, format = "markdown")
Day | avg_mcp |
---|---|
Mon | 302.0398 |
Tue | 298.8338 |
Wed | 306.9160 |
Thu | 297.0389 |
Fri | 299.2423 |
Sat | 291.8207 |
Sun | 275.0998 |
ggplot(mcp_daily, aes(x = Day, y = avg_mcp)) +
  geom_col(fill = "royalblue3") +
  labs(x = "Days of Week",
       y = "Average MCP",
       title = "MCP Averages Over Days",
       subtitle = "July 2020") +
  theme_minimal()
As seen above, MCP falls on weekends because production decreases across the country. The significant difference between Saturday and Sunday prices can be explained by two factors: most private sector companies still work on Saturday, and many people spend Sunday on excursions to rural areas, where little electricity is consumed.
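The size of the weekend effect can be read directly from the daily table. The sketch below copies the seven averages from that table and compares Saturday and Sunday with the mean of the five weekday averages. Note the assumption: averaging daily averages weights each weekday equally, which differs slightly from averaging all weekday hours because the weekdays do not all occur the same number of times in July 2020.

```r
# Daily MCP averages (TL/MWh) copied from the table above.
avg_mcp <- c(Mon = 302.0398, Tue = 298.8338, Wed = 306.9160, Thu = 297.0389,
             Fri = 299.2423, Sat = 291.8207, Sun = 275.0998)

# Mean of the five weekday averages (equal weight per weekday).
weekday_mean <- mean(avg_mcp[c("Mon", "Tue", "Wed", "Thu", "Fri")])

# How far each weekend day sits below the weekday level.
sat_gap <- weekday_mean - avg_mcp[["Sat"]]
sun_gap <- weekday_mean - avg_mcp[["Sun"]]

print(round(c(weekday_mean = weekday_mean, sat_gap = sat_gap, sun_gap = sun_gap), 2))
```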
mcp_hourly <- existj_one %>%
  group_by(Hour) %>%
  summarize(avg_mcp = mean(MCP), .groups = 'drop')
kable(mcp_hourly, format = "markdown")
Hour | avg_mcp |
---|---|
0 | 297.3552 |
1 | 305.8410 |
2 | 298.1139 |
3 | 284.3048 |
4 | 284.8490 |
5 | 246.7165 |
6 | 213.3771 |
7 | 268.8835 |
8 | 294.5987 |
9 | 288.2713 |
10 | 300.7368 |
11 | 305.1348 |
12 | 297.5416 |
13 | 304.1765 |
14 | 311.8487 |
15 | 312.6694 |
16 | 312.4710 |
17 | 310.6632 |
18 | 308.5587 |
19 | 310.3419 |
20 | 316.6100 |
21 | 316.9803 |
22 | 314.4574 |
23 | 308.1403 |
ggplot(mcp_hourly, aes(x = Hour, y = avg_mcp)) +
  geom_line(color = "darkred", size = 1.1) +
  expand_limits(y = 0) +
  labs(x = "Hours of Day",
       y = "Average MCP",
       title = "MCP Averages Over Hours",
       subtitle = "July 2020") +
  theme_minimal()
There is a remarkable drop in MCP between 3 and 6 A.M., where the hourly average reaches its lowest value. One explanation is that production facilities are not yet fully automated, so shift changes still take place in these hours. The reduction in household consumption while workers sleep also contributes to this drop.
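The trough described above can be located programmatically instead of by eye. This is a minimal sketch that copies the 24 hourly averages from the table and finds the hours with the lowest and highest average MCP. In the report's session the same calls could be applied to `mcp_hourly$avg_mcp`.

```r
# Hourly MCP averages (TL/MWh) for hours 0 to 23, copied from the table above.
hour <- 0:23
avg_mcp <- c(297.3552, 305.8410, 298.1139, 284.3048, 284.8490, 246.7165,
             213.3771, 268.8835, 294.5987, 288.2713, 300.7368, 305.1348,
             297.5416, 304.1765, 311.8487, 312.6694, 312.4710, 310.6632,
             308.5587, 310.3419, 316.6100, 316.9803, 314.4574, 308.1403)

# which.min()/which.max() return positions, so map them back to hours.
trough_hour <- hour[which.min(avg_mcp)]
peak_hour <- hour[which.max(avg_mcp)]
peak_to_trough <- max(avg_mcp) - min(avg_mcp)

print(c(trough_hour = trough_hour, peak_hour = peak_hour))
print(round(peak_to_trough, 2))
```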
smp_daily <- existj_one %>%
  group_by(Day) %>%
  summarize(avg_smp = mean(SMP), .groups = 'drop')
kable(smp_daily, format = "markdown")
Day | avg_smp |
---|---|
Mon | 316.7281 |
Tue | 314.4270 |
Wed | 275.7717 |
Thu | 300.8255 |
Fri | 294.4326 |
Sat | 310.8506 |
Sun | 289.1030 |
ggplot(smp_daily, aes(x = Day, y = avg_smp)) +
  geom_col(fill = "royalblue3") +
  labs(x = "Days of Week",
       y = "Average SMP",
       title = "SMP Averages Over Days",
       subtitle = "July 2020") +
  theme_minimal()
The fall in SMP after the first two days of the week suggests that facilities and providers make more accurate purchases for the rest of the week and are penalized less. The sharp increase on Saturday may be connected to hard-to-predict electricity usage in the entertainment sector on the first day of the weekend.
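Putting the two daily tables side by side makes the day-level pattern easier to judge. The sketch below copies the daily averages from both tables and computes the SMP minus MCP gap per day. The `gap` column is introduced here for illustration; a positive value means the balancing price averaged above the day-ahead price on that weekday.

```r
# Daily averages (TL/MWh) copied from the MCP and SMP tables above.
day <- c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
avg_mcp <- c(302.0398, 298.8338, 306.9160, 297.0389, 299.2423, 291.8207, 275.0998)
avg_smp <- c(316.7281, 314.4270, 275.7717, 300.8255, 294.4326, 310.8506, 289.1030)

# Keep Monday-first ordering and add the gap between the two prices.
daily_gap <- data.frame(
  day = factor(day, levels = day),
  avg_mcp = avg_mcp,
  avg_smp = avg_smp,
  gap = avg_smp - avg_mcp
)

print(daily_gap)
```

In the report's session the same comparison could be built with a join of `mcp_daily` and `smp_daily` on `Day`.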
smp_hourly <- existj_one %>%
  group_by(Hour) %>%
  summarize(avg_smp = mean(SMP), .groups = 'drop')
kable(smp_hourly, format = "markdown")
ggplot(smp_hourly, aes(x = Hour, y = avg_smp)) +
  geom_line(color = "darkred", size = 1.1) +
  expand_limits(y = 0) +
  labs(x = "Hours of Day",
       y = "Average SMP",
       title = "SMP Averages Over Hours",
       subtitle = "July 2020") +
  theme_minimal()
Unlike the MCP Averages Over Hours plot, the SMP plot shows sudden rises and falls during the day. This follows from the purpose of the Balancing Power Market, which is to provide a suitable ground for last-minute sales or purchases.
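One way to put a number on how jumpy an hourly series is, offered here as a sketch rather than as the report's method, is the mean absolute change from one hour to the next. `mean_abs_change()` is a hypothetical helper. It is shown on the MCP averages for hours 3 to 8 from the hourly MCP table; applying it to `mcp_hourly$avg_mcp` and `smp_hourly$avg_smp` in the report's session would allow a direct comparison of the two markets.

```r
# Hypothetical helper: average size of the hour-to-hour change in a series.
mean_abs_change <- function(x) {
  mean(abs(diff(x)))
}

# MCP averages (TL/MWh) for hours 3 to 8, copied from the hourly MCP table.
mcp_3_to_8 <- c(284.3048, 284.8490, 246.7165, 213.3771, 268.8835, 294.5987)

print(round(mean_abs_change(mcp_3_to_8), 2))
```

A larger value indicates a series that moves more between consecutive hours; a perfectly flat series gives zero.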
In this report, I manipulated and analyzed the exist_july dataset, which provides July 2020’s electricity prices, and tried to draw meaningful conclusions from these analyses. As seen throughout the report, a market based on a clearing mechanism often produces imbalances such as energy deficits and surpluses. Producers and consumers try to predict their consumption and make their daily and hourly purchases as accurately as possible in order not to pay more than necessary.