In this study, an exploratory data analysis will be made about July 2020’s electricity prices using the dataset which is accessible at EPIAS/EXIST MCP/SMP Page.
library(dplyr)
library(ggplot2)
library(reshape2)
df <- readxl::read_xls('ptf-smf.xls')
Let’s investigate the summary of the dataset by glimpse.
glimpse(df)
## Rows: 744
## Columns: 6
## $ Date <chr> "01.07.20 00:00", "01.07.20 01:00…
## $ MCP <chr> "323.85", "326.95", "324.31", "32…
## $ SMP <chr> "211.00", "201.00", "211.00", "21…
## $ `Positive Imbalance Price (TL/MWh)` <chr> "204.67", "194.97", "204.67", "20…
## $ `Negative Imbalance Price (TL/MWh)` <chr> "333.57", "336.76", "334.04", "33…
## $ `SMP Direction` <chr> "↓Energy Surplus", "↓Energy Surpl…
It seems that we need to make datetime operations to utilize date columns. Also, price columns should be converted to numeric since they are in string as is and columns should be renamed to make them easily callable.
cnames <- c("date", "mcp", "smp", "pip", "nip", "direction")
colnames(df) <- cnames
df$date2 <- strptime(df$date,format="%d.%m.%y %H:%M")
df$hour <- lubridate::hour(df$date2)
df$day <- lubridate::day(df$date2)
df$mcp <- as.numeric(df$mcp)
df$smp <- as.numeric(df$smp)
df$pip <- as.numeric(df$pip)
df$nip <- as.numeric(df$nip)
After this quick preprocess, let’s see some sample rows from top and bottom of the dataset.
head(df)
## # A tibble: 6 x 9
## date mcp smp pip nip direction date2 hour day
## <chr> <dbl> <dbl> <dbl> <dbl> <chr> <dttm> <int> <int>
## 1 01.07.20 … 324. 211 205. 334. ↓Energy Su… 2020-07-01 00:00:00 0 1
## 2 01.07.20 … 327. 201 195. 337. ↓Energy Su… 2020-07-01 01:00:00 1 1
## 3 01.07.20 … 324. 211 205. 334. ↓Energy Su… 2020-07-01 02:00:00 2 1
## 4 01.07.20 … 322. 211 205. 332. ↓Energy Su… 2020-07-01 03:00:00 3 1
## 5 01.07.20 … 320 201 195. 330. ↓Energy Su… 2020-07-01 04:00:00 4 1
## 6 01.07.20 … 286. 181 176. 295. ↓Energy Su… 2020-07-01 05:00:00 5 1
tail(df)
## # A tibble: 6 x 9
## date mcp smp pip nip direction date2 hour day
## <chr> <dbl> <dbl> <dbl> <dbl> <chr> <dttm> <int> <int>
## 1 31.07.20… 311. 206. 200. 320. ↓Energy Sur… 2020-07-31 18:00:00 18 31
## 2 31.07.20… 318. 260 252. 327. ↓Energy Sur… 2020-07-31 19:00:00 19 31
## 3 31.07.20… 325. 205 199. 335. ↓Energy Sur… 2020-07-31 20:00:00 20 31
## 4 31.07.20… 326. 267. 259. 336. ↓Energy Sur… 2020-07-31 21:00:00 21 31
## 5 31.07.20… 325. 340. 315. 350. ↑ Energy De… 2020-07-31 22:00:00 22 31
## 6 31.07.20… 322. 210 204. 331. ↓Energy Sur… 2020-07-31 23:00:00 23 31
Let’s compare MCP and SMP values at midday.
ggplot(melt(filter(df, hour == 12)[, c("day", "mcp", "smp")], id=c("day")), aes(x = day, y = value, group = variable)) +
geom_line(aes(color = variable, linetype = variable)) +
scale_color_manual(values = c("darkred", "steelblue")) +
ggtitle("MCP vs. SMP - Midday") +
theme(axis.text.x = element_text(angle = 55, vjust = 0.5, hjust=1)) +
labs(x="Days in July", y="Price")
Let’s also see this comparison at day ends. It seems that MCP flattens out late in the day.
ggplot(melt(filter(df, hour == 23)[, c("day", "mcp", "smp")], id=c("day")), aes(x = day, y = value, group = variable)) +
geom_line(aes(color = variable, linetype = variable)) +
scale_color_manual(values = c("darkred", "steelblue")) +
ggtitle("MCP vs. SMP - End of the Days") +
theme(axis.text.x = element_text(angle = 55, vjust = 0.5, hjust=1)) +
labs(x="Days in July", y="Price")
The imbalance prices are calculated as below; - Positive Imbalance Price (TL/MWh) = 0,97min(MCP,SMP) - Negative Imbalance Price (TL/MWh)= 1,03max(MCP,SMP)
Let’s compare them at midday.
ggplot(melt(filter(df, hour == 12)[, c("day", "pip", "nip")], id=c("day")), aes(x = day, y = value, group = variable)) +
geom_line(aes(color = variable, linetype = variable)) +
scale_color_manual(values = c("darkred", "steelblue")) +
ggtitle("Positive vs. Negative Imbalance Prices - Midday") +
theme(axis.text.x = element_text(angle = 55, vjust = 0.5, hjust=1)) +
labs(x="Days in July", y="Price")
Similar to previous, let’s see the comparison again at the end of days. It seems that the gap between them is getting larger at the end of days, right?
ggplot(melt(filter(df, hour == 23)[, c("day", "pip", "nip")], id=c("day")), aes(x = day, y = value, group = variable)) +
geom_line(aes(color = variable, linetype = variable)) +
scale_color_manual(values = c("darkred", "steelblue")) +
ggtitle("Positive vs. Negative Imbalance Prices - End of the Days") +
theme(axis.text.x = element_text(angle = 55, vjust = 0.5, hjust=1)) +
labs(x="Days in July", y="Price")
At the end let’s also see the breakdown of energy directions in all the days of July.
df$dummy <- 1
dd <- melt(df[, c("day", "direction", "dummy")], id=c("day", "direction"))
df3 <- dd %>% group_by(day, direction) %>% summarise(Periods = sum(value))
## `summarise()` regrouping output by 'day' (override with `.groups` argument)
ggplot(df3, aes(fill=direction, y=Periods, x=day)) +
geom_bar(position="stack", stat="identity") +
scale_fill_viridis(discrete = T) +
ggtitle("Balance Direction") +
#theme_ipsum() +
coord_cartesian(xlim = c(1, 31), ylim = c(0, 24)) +
theme(axis.text.x = element_text(angle = 55, vjust = 0.5, hjust=1)) +
xlab("")