This report on July 2020 electricity prices is prepared using RMarkdown, dplyr and ggplot2. The data is available at EPIAS Website.
Electricity markets are designed with “balance” in mind. Balance means electricity production should be equal to electricity consumption as much as possible at all times. Every day, consumers declares next day’s hourly forecast on energy demand. Producers try to predict how much energy would be needed hourly as well. ‘MCP’ represents Market Clearing Price and ‘SMP’ represents System Marginal Price. ‘SMP’ is always higher than ‘MCP’ if system has Energy Deficit, and lower if there is Energy Surplus. Click here for more information about Energy Markets.
prices
prices %>% glimpse()
## Rows: 744
## Columns: 6
## $ Date <fct> 01.07.20 00:00, 01.07.20 01:00, 0...
## $ MCP. <dbl> 323.85, 326.95, 324.31, 322.11, 3...
## $ SMP. <dbl> 211.00, 201.00, 211.00, 211.00, 2...
## $ Positive.Imbalance.Price..TL.MWh. <dbl> 204.67, 194.97, 204.67, 204.67, 1...
## $ Negative.Imbalance.Price..TL.MWh. <dbl> 333.57, 336.76, 334.04, 331.77, 3...
## $ SMP.Direction <fct> ?Energy Surplus, ?Energy Surplus,...
Some rearrangements are required since ‘Date’ is type of <factor>
. Only ‘MCP’ and ‘SMP’ columns has been selected and ‘Date’ column has been changed to type of <datetime>
as follows. Keep in mind that the data is hourly.
prices <- prices %>%
select(Date, MCP., SMP.) %>%
mutate(Date = as.POSIXct(Date, format="%d.%m.%y %H:%M"))
prices %>% glimpse()
## Rows: 744
## Columns: 3
## $ Date <dttm> 2020-07-01 00:00:00, 2020-07-01 01:00:00, 2020-07-01 02:00:00...
## $ MCP. <dbl> 323.85, 326.95, 324.31, 322.11, 320.00, 286.21, 210.13, 318.20...
## $ SMP. <dbl> 211.00, 201.00, 211.00, 211.00, 201.00, 181.00, 113.75, 201.00...
Let’s create a scatterplot of MCP versus SMP prices in July, 2020 and divide hours into 4 times of day which are ‘Early Morning’, ‘Morning’, ‘Afternoon’ and ‘Evening’. In addition, the graphic has been made more readable by showing the parts of the day in different colors.
As can be clearly seen from the graph below, while both prices are generally low in the early of the day, they are increasing considerably towards the end of the day.
plot_comparison <- prices %>%
mutate(times_of_day = case_when(
lubridate::hour(Date) >= 1 & lubridate::hour(Date) <= 6 ~ "01:00-06:00 Early Morning",
lubridate::hour(Date) >= 7 & lubridate::hour(Date) <= 12 ~ "07:00-12:00 Morning",
lubridate::hour(Date) >= 13 & lubridate::hour(Date) <= 18 ~ "13:00-18:00 Afternoon",
lubridate::hour(Date) >= 19 & lubridate::hour(Date) <= 23 ~ "19:00-00:00 Evening",
lubridate::hour(Date) == 0 ~ "19:00-00:00 Evening")) %>%
select(times_of_day, MCP., SMP.)
ggplot(plot_comparison) +
geom_point(aes(x = MCP., y = SMP., color=times_of_day)) +
xlim(100,400) +
ylim(0,500) +
theme_minimal() +
labs(x="MCP", y="SMP", color="Times of Day", title = "Comparison: MCP vs SMP", subtitle = "Price comparison in July 2020, times/parts of day") +
theme(legend.position="right",axis.text.x = element_text(angle=45,hjust=1,vjust=1))
In cases where MCP is greater than SMP, there is an surpluss of energy, ie. the estimated amount of energy is higher than the actual required amount of energy.
In cases where SMP is greater than MCP, there is an deficit of energy, ie. the estimated amount of energy is lower than the actual required amount of energy.
Considering July 2020, there is an energy surplus in 25% cases, while there is energy deficit in 75% cases. A higher energy deficit is obviously not desirable. In such cases, generating the necessary extra energy is challenging for the producers and also costly to the consumers.
plot_energy_direction <- prices %>%
summarise(Surplus = sum(MCP.>SMP.)/nrow(prices), Deficit = sum(MCP.<=SMP.)/nrow(prices)) %>%
pivot_longer(cols = c(Surplus, Deficit), names_to = "Energy_Direction")
ggplot(plot_energy_direction, aes(x = "", y = value, fill = Energy_Direction)) +
geom_col() +
coord_polar(theta = "y") +
geom_text(aes(label = scales::percent(round(value,2))), position = position_stack(vjust = 0.5)) +
theme_void() +
theme(plot.title = element_text(vjust = 0.5)) +
ggtitle("Pie Chart of Energy Direction (Surplus/Deficit) of July,2020", subtitle = "Data Source: EPIAS")
Now, we all know that the energy deficit proportion is 75%. We can use the bar chart below in order to see easily when these shortages happened during July 2020. We see that there is a deficit of energy at the beginning and end of the month. Also, we do not see any energy deficit between July 7 - 9. However, we can easily say that there is a shortage in other times.
plot_MCP_SMP_hourly <- prices %>%
mutate(EnergyDirection = ifelse(MCP.>SMP.,"Surplus","Deficit")) %>%
mutate(maxvalue = case_when(
MCP.>SMP. ~ MCP.,
MCP.<=SMP. ~ SMP.)) %>%
pivot_longer(EnergyDirection)
ggplot(plot_MCP_SMP_hourly, aes(x=Date, y=maxvalue, fill=value)) +
geom_bar(stat="identity") +
theme_test() +
labs(title = "Energy Direction (Surplus/Deficit) over July,2020", subtitle = "Data Source: EPIAS", y="Price (TL/MWh)", value="Energy Direction")
First, the data is grouped by day in order to find daily average prices. After reaching necessary information, we can draw the change of daily average prices as line graph.
Looking at the line chart below, the both prices are generally in equilibrium, with a movement in one affecting the other. However, we can say that MCP is more stable or its standard deviation is lower and SMP is more fragile. For both prices, they had a higher averages at the beginning of the month, while they fell slightly at the end of the month. They also appear to be in an uptrend, albeit small, which can be seen in more detail in the next analysis.
plot_daily_average <- prices %>%
group_by(day = lubridate::day(Date)) %>%
summarise(mean_MCP = mean(MCP.), mean_SMP = mean(SMP.)) %>%
ungroup() %>%
select(day, mean_MCP, mean_SMP) %>%
pivot_longer(cols=c(mean_MCP, mean_SMP), names_to="PriceType")
ggplot(plot_daily_average, aes(x=day, y=value, color=PriceType)) +
geom_line() +
theme_test() +
labs(x="Days of July,2020", y="Price (TL/MWh)", title = "Daily Average Price Comparison: MCP vs SMP", subtitle = "Data Source: EPIAS") +
theme(legend.position="right")
The average hourly percentage change of MCP and SMP in July,2020 can be found as follows. (e.g. Suppose hourly price is 300 in hour 1 and 320 in hour 2, then the change should be (320-300)/300 = 0.07)
As mentioned before, it seems there are slight upward trends.
prices %>%
mutate(lag_MCP = lag(MCP.,1), lag_SMP = lag(SMP.,1)) %>%
mutate(perc_change_MCP = (MCP. - lag_MCP)/lag_MCP, perc_change_SMP = (SMP. - lag_SMP)/lag_SMP) %>%
summarise(average_change_MCP = mean(perc_change_MCP, na.rm = TRUE), average_change_SMP = mean(perc_change_SMP, na.rm = TRUE)) %>%
select(average_change_MCP, average_change_SMP)
## average_change_MCP average_change_SMP
## 1 0.005842365 0.02357685
Boxplot is very effective for gaining insight into the distributions of multiple data.
Looking at the Boxplot below, the distribution of prices cannot be said to be similar. The standard deviation of MCP is quite low while SMP is high. On the other hand, MCP has many outliers. The median of SMP is above the MCP. While SMP was reached higher and lower prices, MCP advanced more stable.
boxplot <- prices %>%
pivot_longer(cols = c(MCP.,SMP.), names_to = "PriceType")
ggplot(boxplot, aes(x=PriceType, y=value, fill=PriceType)) +
geom_boxplot() +
theme_test() +
labs(title = "Boxplot of MCP and SMP", subtitle = "Data Source: EPIAS", y = "Price (TL/MWh)")
Analyzing the days of the week will be important in understanding the data more accurately. Visualizing this with the bar chart will help us. We can also see average prices in the charts.
Looking at the bar chart below, we can say that SMP watched the lowest price on Wednesday while watching a decrease in general on weekdays. MCP, on the other hand, is more stable on weekdays and is around 300 TL / MWh. At the weekend, both prices followed different patterns in contrast to the weekdays. SMP is considerably higher than MCP on the weekends.
plot_weekday <- prices %>%
mutate(weekday = wday(Date, label = TRUE, week_start = 1)) %>%
group_by(weekday) %>%
summarise(mean_MCP = mean(MCP.), mean_SMP = mean(SMP.)) %>%
ungroup() %>%
pivot_longer(cols=c(mean_MCP, mean_SMP), names_to="PriceType")
ggplot(plot_weekday, aes(x=weekday, y=value, fill=PriceType)) +
theme_minimal() +
geom_bar(stat="identity", position="dodge") +
labs(title = "Avg. prices of weekday in July,2020", subtitle = "Data Source: EPIAS", x="Weekday", y="Price (TL/MWh)") +
geom_text(aes(label = format(value,digits=3)), size=4, position = position_dodge(0.9),vjust = 10) +
theme(axis.text.x = element_text(angle = 45), legend.position = "right")
First, the data is grouped by day in order to find daily min, average and max prices. We can here also calculate the daily range (max-min). After reaching necessary information, we can draw the all changes of daily prices as line graph.
plot_MCP <- prices %>%
group_by(day = lubridate::day(Date)) %>%
summarise(min_MCP = min(MCP.), mean_MCP = mean(MCP.), max_MCP = max(MCP.)) %>%
ungroup() %>%
mutate(range_MCP = max_MCP - min_MCP) %>%
select(day, min_MCP, mean_MCP, max_MCP, range_MCP)
plot_MCP
## # A tibble: 31 x 5
## day min_MCP mean_MCP max_MCP range_MCP
## <int> <dbl> <dbl> <dbl> <dbl>
## 1 1 210. 320. 332. 122.
## 2 2 199. 298. 322. 122.
## 3 3 293 317. 331. 37.6
## 4 4 238. 306. 314. 76.2
## 5 5 174. 258. 306. 132.
## 6 6 165. 290. 327. 162.
## 7 7 222. 307. 350. 128.
## 8 8 200. 304. 321. 121.
## 9 9 197. 288. 317. 120.
## 10 10 199. 282. 306. 107.
## 11 11 192. 273. 306. 114.
## 12 12 197. 261. 306. 110.
## 13 13 140. 289. 311. 171.
## 14 14 194. 281. 310. 116.
## 15 15 197. 292. 316. 119.
## 16 16 222. 306. 327. 105.
## 17 17 199. 303. 320. 122.
## 18 18 195. 296. 316. 122.
## 19 19 195. 292. 320. 125.
## 20 20 250. 313. 335. 85.0
## 21 21 198. 297. 316. 118.
## 22 22 202. 304. 322. 121.
## 23 23 222. 304. 325. 103.
## 24 24 245. 315. 350. 105.
## 25 25 193. 292. 313. 119.
## 26 26 200. 290. 319. 119.
## 27 27 229. 316. 350 121.
## 28 28 233. 310. 325. 91.6
## 29 29 233. 314. 350. 117.
## 30 30 211. 288. 312. 101.
## 31 31 196. 280. 326. 130.
Looking at the line chart below, we can say that the average price is closer to the maximum price compared to the minimum price. The highest of minimum price is on July 3. The lowest price observed was observed on July 13. The average and maximum price fluctuates less. Average MCP price varied around 300 TL / MWh during July, 2020. The highest price was observed in a few days. The specific date-time of the minimum and maximum observed has been found in the next analysis.
plot_MCP %>%
select(day, min_MCP, mean_MCP, max_MCP) %>%
pivot_longer(.,-day) %>%
ggplot(.,aes(x=day, y=value, color=name)) +
geom_line() +
theme_test() +
labs(x="Days of July,2020", y="Price (TL/MWh)", color="Labels", title = "Overview of 'MCP'", subtitle = "Data Source: EPIAS") +
theme(legend.position="right")
We can also investigate the datetimes which gives the min or max ‘MCP’ by following codes.
prices %>%
mutate(the_rank = rank(MCP., ties.method = "random")) %>%
# There is only one datetime gives the min 'MCP', so the_rank == 1.
filter(the_rank == 1) %>%
select(Date, MCP.)
## Date MCP.
## 1 2020-07-13 06:00:00 140.01
prices %>%
mutate(the_rank = rank(-MCP., ties.method = "random")) %>%
# There are three datetimes give the max 'MCP', so the_rank < 3.
filter(the_rank <= 3) %>%
select(Date, MCP.)
## Date MCP.
## 1 2020-07-27 14:00:00 350
## 2 2020-07-27 15:00:00 350
## 3 2020-07-27 16:00:00 350
The data is grouped by day in order to find daily min, average and max prices. We can here also calculate the daily range (max-min). After reaching necessary information, we can draw the all changes of daily prices as line graph.
plot_SMP <- prices %>%
group_by(day = lubridate::day(Date)) %>%
summarise(min_SMP = min(SMP.), mean_SMP = mean(SMP.), max_SMP = max(SMP.)) %>%
ungroup() %>%
mutate(range_SMP = max_SMP - min_SMP) %>%
select(day, min_SMP, mean_SMP, max_SMP, range_SMP)
plot_SMP
## # A tibble: 31 x 5
## day min_SMP mean_SMP max_SMP range_SMP
## <int> <dbl> <dbl> <dbl> <dbl>
## 1 1 114. 268. 377. 264.
## 2 2 229. 357. 404. 175.
## 3 3 335. 373. 460 125.
## 4 4 235 321. 365 130
## 5 5 120 198. 266. 146.
## 6 6 156. 307. 367. 211.
## 7 7 179 298. 353. 174.
## 8 8 151. 191. 235. 84.1
## 9 9 124. 189. 207. 83.7
## 10 10 177. 292. 344. 167.
## 11 11 232 309. 346 114
## 12 12 200. 283. 346. 147.
## 13 13 140. 272. 340. 199.
## 14 14 200. 302. 369. 169.
## 15 15 232. 337. 386. 154.
## 16 16 209. 312. 362. 153.
## 17 17 199. 263. 320. 122.
## 18 18 205. 308. 353. 149.
## 19 19 222. 325. 360 138.
## 20 20 203. 335. 385 182.
## 21 21 208. 323. 361 153.
## 22 22 201. 286. 350. 149.
## 23 23 222. 331. 365. 143.
## 24 24 245. 346. 389. 144.
## 25 25 203. 305. 333. 129.
## 26 26 266. 350. 419. 153.
## 27 27 244. 353. 435 191.
## 28 28 212. 335. 374. 161.
## 29 29 209. 296. 365 156.
## 30 30 246. 315. 362. 116.
## 31 31 10 199. 340. 330.
Looking at the line chart below, we can say that the average price is closer to the maximum price compared to the minimum price. The highest of minimum price is on July 3. The lowest price observed was observed on July 31 with a very low price. Average MCP price varied around 300 TL / MWh but with a higher standard deviation during July, 2020. The highest price was observed also on July 3. The specific date-time of the minimum and maximum observed has been found in the next analysis.
plot_SMP %>%
select(day, min_SMP, mean_SMP, max_SMP) %>%
pivot_longer(.,-day) %>%
ggplot(.,aes(x=day, y=value, color=name)) +
geom_line() +
theme_test() +
labs(x="Days of July,2020", y="Price (TL/MWh)", color="Labels", title = "Overview of 'SMP'", subtitle = "Data Source: EPIAS") + theme(legend.position="right")
We can also investigate the datetimes which gives the min or max ‘SMP’ by following codes.
prices %>%
mutate(the_rank = rank(SMP., ties.method = "random")) %>%
# There are three datetime gives the min 'SMP', so the_rank <= 3.
filter(the_rank <= 3) %>%
select(Date, SMP.)
## Date SMP.
## 1 2020-07-31 10:00:00 10
## 2 2020-07-31 11:00:00 10
## 3 2020-07-31 12:00:00 10
prices %>%
mutate(the_rank = rank(-SMP., ties.method = "random")) %>%
# There are two datetime gives the max 'SMP', so the_rank <= 2.
filter(the_rank <= 2) %>%
select(Date, SMP.)
## Date SMP.
## 1 2020-07-03 14:00:00 460
## 2 2020-07-03 15:00:00 460
End of the Assignment 2. Visit my Progress Journal