Introduction
Dataset Structure
This report analyzes prices from Izmir’s fish market. The dataset is loaded and prepared below.
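library(tidyverse)  # assumed setup (not shown in the rendered report): provides %>%, dplyr, tidyr, stringr, ggplot2
# Read the semicolon-separated price file (UTF-8, with a header row)
raw_df <- read.csv("https://raw.githubusercontent.com/MineKara95/R/main/balik_hal_fiyatlari.csv", stringsAsFactors = FALSE, header = TRUE, sep = ";", encoding = "UTF-8")
# Rename the Turkish column names to English and derive the month from the date
fish_market <- raw_df %>%
  select(date = "TARIH", product_type = "MAL_TURU", product_name = "MAL_ADI", units = "BIRIM", min_price = "ASGARI_UCRET", max_price = "AZAMI_UCRET") %>%
  mutate(month = lubridate::month(date, label = TRUE))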
str(fish_market)
## 'data.frame': 18369 obs. of 7 variables:
## $ date : chr "2021-01-02 00:00:00" "2021-01-02 00:00:00" "2021-01-02 00:00:00" "2021-01-02 00:00:00" ...
## $ product_type: chr "BALIK" "BALIK" "BALIK" "BALIK" ...
## $ product_name: chr "TIRSI (DENİZ)" "KIRLANGIÇ (DENİZ)" "ÇİMÇİM (DENİZ)" "HANOS ( DENİZ )" ...
## $ units : chr "KG" "KG" "KG" "KG" ...
## $ min_price : num 5.83 3 3.5 2.5 45 130 38 25 10 2.5 ...
## $ max_price : num 12.5 80 8 5 45 130 38 55 10 6.67 ...
## $ month : Ord.factor w/ 12 levels "Jan"<"Feb"<"Mar"<..: 1 1 1 1 1 1 1 1 1 1 ...
As the output above shows, the dataset contains the date, product type, product name, unit, minimum price, and maximum price of each product, plus the month derived from the date. It has 18,369 observations.
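The `str()` output above shows that `date` is stored as text. As a minimal sketch, assuming every value follows the `YYYY-MM-DD HH:MM:SS` pattern seen in the sample rows, the column could be parsed into a proper `Date` with base R. The data frame `toy` and the columns `date_parsed` and `month_num` are illustrative names, not part of the report.

```r
# Toy rows mimicking the date format printed by str(); not the real data.
toy <- data.frame(
  date = c("2021-01-02 00:00:00", "2021-02-15 00:00:00", "2021-10-30 00:00:00"),
  stringsAsFactors = FALSE
)

# Parse the text into a Date. The format string is an assumption based on
# the sample values shown in the report.
toy$date_parsed <- as.Date(toy$date, format = "%Y-%m-%d %H:%M:%S")

# Derive the month number from the parsed date.
toy$month_num <- as.integer(format(toy$date_parsed, "%m"))

str(toy)
```

Working with a real `Date` column makes later filtering and sorting by time less error-prone than comparing strings.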
Null Values
We check whether the dataset contains missing (NA) values. The total count below is zero, so no values need to be dropped or replaced.
head(is.na(fish_market))
## date product_type product_name units min_price max_price month
## [1,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [2,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [3,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [4,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [5,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [6,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
sum(is.na(fish_market))
## [1] 0
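A single total hides which column a missing value would come from, and `is.na()` does not catch empty strings. The following base R sketch, run on a small made-up data frame (`toy` is hypothetical, not the market data), counts both per column.

```r
# Made-up rows for illustration only; the real dataset reports zero NAs.
toy <- data.frame(
  product_name = c("SAZAN ( TATLI )", "", NA),
  min_price = c(5.0, NA, 12.5),
  stringsAsFactors = FALSE
)

# Number of NA values in each column.
na_per_column <- colSums(is.na(toy))

# Number of blank (empty or whitespace-only) strings in each character column;
# non-character columns are reported as 0.
blank_per_column <- sapply(toy, function(col) {
  if (!is.character(col)) return(0L)
  sum(trimws(col) == "", na.rm = TRUE)
})

print(na_per_column)
print(blank_per_column)
```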
Number of Unique Fishes
There are 4 product types and 126 distinct product names.
unique(fish_market[c("product_type")])
## product_type
## 1 BALIK
## 8 KÜLTÜR
## 42 İTHAL (DONUK)
## 43 TATLI SU
count(unique(fish_market[c("product_name")]))
## n
## 1 126
unique(fish_market$product_name)
## [1] "TIRSI (DENİZ)" "KIRLANGIÇ (DENİZ)" "ÇİMÇİM (DENİZ)"
## [4] "HANOS ( DENİZ )" "KILIÇ (DENİZ)" "LÜFER ( DENİZ)"
## [7] "USKUMRU (DENİZ)" "LEVREK ( KÜLTÜR )" "KÖPEK (DENİZ)"
## [10] "ISTAVRIT (DENİZ)" "IZMARIT (DENİZ)" "KALAMAR (DENİZ)"
## [13] "SÜBYE (DENİZ)" "HAMSI (DENİZ)" "KOLYOZ (DENİZ)"
## [16] "LİDAKİ (DENİZ)" "PALAMUT-TORİK (DENİZ" "KEREVİT ( DENİZ )"
## [19] "MEZGİT (DENİZ)" "SOKKAN(DENİZ)" "BARBUN (TEKİR)"
## [22] "KOLORİT(DENİZ)" "DIL" "BARBUN(KAYA)"
## [25] "ÇIPURA (DENİZ)" "YENGEÇ (DENİZ)" "SARPA (DENİZ)"
## [28] "DİL(CANGIDEZ)" "ÇİPURA ( KÜLTÜR )" "KARİDES (DENİZ)"
## [31] "YABANİ KALAMAR" "MASKO DENİZ" "PEYGAMBER (DENİZ)"
## [34] "MERCAN(KÜÇÜK BOY)" "MERCAN (BÜYÜKBOY)" "LEVREK (DENİZ)"
## [37] "YAZILIORKINOS(DENİZ)" "LÜFER (KÜÇÜK )" "KAMİT(DENİZ)"
## [40] "BAKALYAR(BÜYÜKBOY)" "GRANYOZ (DENİZ)" "ITHAL USKUMRU(DONUK)"
## [43] "SAZAN ( TATLI )" "TEKİR(DENİZ)" "AHTOPOT (DENİZ)"
## [46] "KEFAL (DENİZ)" "KUPEZ (DENİZ)" "SARDALYA (DENİZ)"
## [49] "BAKALYARO(KÜÇÜKBOY)" "İSTAVRİT(SARI KANAT)" "KRAÇA ( DENİZ )"
## [52] "FENER (DENİZ)" "KARAGÖZ (DENİZ)" "SİNARİT (DENİZ)"
## [55] "VATOZ (DENİZ)" "ÇAPAK(TATLI)" "JAPON HAMSİ"
## [58] "İTHAL SOMON(TATLI)" "ISKORPIT (DENİZ)" "PAPAĞAN ( DENİZ )"
## [61] "ALABALIK ( TATLI )" "BÜLBÜL (DENİZ )" "ORKİNOZ (DENİZ)"
## [64] "ZARGANA (DENİZ)" "KALKAN (DENİZ)" "ISPAROZ (DENİZ)"
## [67] "AHTAPOT(DONUK)" "SOMON ( TATLI )" "MIRMIR (DENİZ)"
## [70] "MEZGİT" "YAYIN ( TATLI )" "LAHOZ (DENİZ)"
## [73] "EŞKİNA(DENİZ)" "SARGOZ(DENİZ)" "KOFANA ( DENİZ )"
## [76] "İSTAKOZ (DENİZ)" "DONUK(KALAMAR)" "PİSİ ( DENİZ )"
## [79] "YILAN ( TATLI )" "MINEKOP (DENİZ)" "İSPENDEK(DENİZ)"
## [82] "TURNA (DENİZ)" "LÜFER(İRİ BOY)" "DONUK TOMBİK"
## [85] "PALASKA(DENİZ)" "MELANUR (DENİZ)" "TOMBIK(DENİZ)"
## [88] "FANGRI (DENİZ)" "LOKUM(DENİZ)" "AKYA (DENİZ)"
## [91] "SUDAK ( TATLI )" "MARYA(DENİZ)" "TRANÇA (DENİZ)"
## [94] "ÇİMÇİM (DONUK)" "DONUK HAMSİ" "DONUK SARDALYA"
## [97] "TİRSİ (DONUK)" "KUPEZ (DONUK)" "DONUK DİL"
## [100] "LÜFER(DONUK)" "DONUK PALAMUT" "ÇUÇUNA"
## [103] "MİDYE DONUK" "GELİNCİK(DENİZ)" "ORFOZ (DENİZ)"
## [106] "ISTAVRİT (DONUK)" "TURNA ( TATLI )" "DENİZ PATLICANI"
## [109] "USKUMRU (DONUK)" "ÖKSÜZ(DENİZ)" "ALABALIK(DONUK)"
## [112] "GÖÇMEN" "TEKİR (DONUK)" "DONUK SOMON"
## [115] "LİPSÖZ ( DENİZ )" "GÜMÜŞ ( TATLI )" "KOLYOZ (DONUK)"
## [118] "BARBUN(DONUK)" "MİDYE" "KEFAL (DONUK)"
## [121] "MEZGİT (DONUK)" "BARBUN KAYA (DONUK)" "AKYA(DONUK)"
## [124] "KRAÇA (DONUK)" "KARİDES(DONUK)" "KELER(DENİZ)"
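The names above are not spaced consistently around parentheses (compare "SAZAN ( TATLI )" with "ÇAPAK(TATLI)"), so names that differ only in spacing would be counted separately. The sketch below shows one possible normalization in base R. The helper `normalize_name` is hypothetical, and "SAZAN(TATLI)" is an invented variant used only to show two spellings collapsing into one. Whether this would change the count of 126 cannot be told from the output above.

```r
# Hypothetical helper: make spacing around parentheses uniform.
normalize_name <- function(x) {
  x <- gsub("\\s*\\(\\s*", " (", x)  # one space before "(", none after it
  x <- gsub("\\s*\\)", ")", x)       # no space before ")"
  x <- gsub("\\s+", " ", x)          # collapse repeated whitespace
  trimws(x)                          # drop leading/trailing whitespace
}

# "SAZAN(TATLI)" is an invented variant for illustration.
names_raw <- c("SAZAN ( TATLI )", "SAZAN(TATLI)", "TURNA ( TATLI )", "DIL")
names_clean <- normalize_name(names_raw)

print(names_clean)
print(length(unique(names_raw)))    # distinct names before normalization
print(length(unique(names_clean)))  # distinct names after normalization
```

This only handles spacing; spelling variants such as dotted and dotless "I" would need a separate, manually reviewed mapping.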
Monthly Average Min and Max Price of Product Types
Monthly Average Min Prices
avg_max_min_price <- fish_market %>% select(date, product_type, product_name, units, min_price, max_price) %>%
group_by(month = lubridate::month(date, label = TRUE), product_type) %>%
summarize(avg_min_price = mean(min_price), avg_max_price = mean(max_price))
avg_max_min_price
## # A tibble: 40 × 4
## # Groups: month [10]
## month product_type avg_min_price avg_max_price
## <ord> <chr> <dbl> <dbl>
## 1 Jan BALIK 23.9 52.2
## 2 Jan İTHAL (DONUK) 11.7 17.2
## 3 Jan KÜLTÜR 18.3 61.3
## 4 Jan TATLI SU 24.3 29.8
## 5 Feb BALIK 25.5 56.6
## 6 Feb İTHAL (DONUK) 16.0 22.9
## 7 Feb KÜLTÜR 17.6 65.3
## 8 Feb TATLI SU 26.5 30.8
## 9 Mar BALIK 26.6 63.4
## 10 Mar İTHAL (DONUK) 19.1 24.1
## # … with 30 more rows
ggplot(avg_max_min_price, aes(month, avg_min_price, group = product_type, color=product_type)) +
geom_line() +
labs(x= "Month",
y = "Average Min Price",
title = "Monthly Average Min Prices of Product Types")
The monthly average minimum price of “KÜLTÜR” (aquaculture) products is the lowest in almost all months. The largest price changes occur in “İTHAL (DONUK)” (imported, frozen) products.
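One way to put a number on "largest price changes" is the spread between the highest and lowest monthly average for each product type. The sketch below uses only the January and February rows printed earlier as toy input, so its result illustrates the calculation and is not a conclusion about the full year. The names `monthly` and `spread` are illustrative.

```r
# January and February values copied from the printed summary table.
monthly <- data.frame(
  month = rep(c("Jan", "Feb"), each = 4),
  product_type = rep(c("BALIK", "İTHAL (DONUK)", "KÜLTÜR", "TATLI SU"), times = 2),
  avg_min_price = c(23.9, 11.7, 18.3, 24.3, 25.5, 16.0, 17.6, 26.5),
  stringsAsFactors = FALSE
)

# Spread of the monthly average minimum price within each product type.
spread <- tapply(monthly$avg_min_price, monthly$product_type,
                 function(v) max(v) - min(v))

# Product types ordered from most to least variable.
print(sort(spread, decreasing = TRUE))
```

With all ten months available, the standard deviation (`sd`) would be a less outlier-sensitive alternative to the max-minus-min spread.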
Monthly Average Max Prices
ggplot(avg_max_min_price, aes(month, avg_max_price, group = product_type, color=product_type)) +
geom_line() +
labs(x= "Month",
y = "Average Max Price",
title = "Monthly Average Max Prices of Product Types")
The monthly average maximum prices of “BALIK” and “İTHAL (DONUK)” products are higher than those of the other types. “BALIK” prices are much higher in the summer, most likely because of the fishing ban.
Monthly Price Difference of Product Types
avg_prices <- avg_max_min_price %>% pivot_longer(c(avg_min_price, avg_max_price), names_to = "average_prices", values_to="values") %>%
group_by(month, product_type) %>%
select(month, product_type, average_prices, values)
avg_prices
## # A tibble: 80 × 4
## # Groups: month, product_type [40]
## month product_type average_prices values
## <ord> <chr> <chr> <dbl>
## 1 Jan BALIK avg_min_price 23.9
## 2 Jan BALIK avg_max_price 52.2
## 3 Jan İTHAL (DONUK) avg_min_price 11.7
## 4 Jan İTHAL (DONUK) avg_max_price 17.2
## 5 Jan KÜLTÜR avg_min_price 18.3
## 6 Jan KÜLTÜR avg_max_price 61.3
## 7 Jan TATLI SU avg_min_price 24.3
## 8 Jan TATLI SU avg_max_price 29.8
## 9 Feb BALIK avg_min_price 25.5
## 10 Feb BALIK avg_max_price 56.6
## # … with 70 more rows
ggplot(avg_prices, aes(month, values, fill=average_prices)) +
geom_col(position = position_dodge(width = 0.4), alpha = 0.8) +
scale_fill_brewer(palette = "Set1") +
labs(x= "Month",
y = "Average Min/Max Prices",
title = "Monthly Price Difference of Product Types")+
facet_wrap(~product_type)
Prices of “BALIK” and “KÜLTÜR” products vary more: the gap between the average maximum and average minimum price is larger for these two types than for the others.
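The gap described above can be computed directly as the average maximum price minus the average minimum price. This sketch uses the four January rows from the printed summary as toy input; `jan` and `price_gap` are illustrative names.

```r
# January rows copied from the printed summary table.
jan <- data.frame(
  product_type = c("BALIK", "İTHAL (DONUK)", "KÜLTÜR", "TATLI SU"),
  avg_min_price = c(23.9, 11.7, 18.3, 24.3),
  avg_max_price = c(52.2, 17.2, 61.3, 29.8),
  stringsAsFactors = FALSE
)

# Gap between the average maximum and average minimum price.
jan$price_gap <- jan$avg_max_price - jan$avg_min_price

# Product types ordered from the largest to the smallest gap.
print(jan[order(-jan$price_gap), c("product_type", "price_gap")])
```

For January, the two largest gaps belong to “KÜLTÜR” and “BALIK”, in line with the observation above.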
Most Expensive Fishes and Their Max Prices
expensive_fishes <- fish_market %>%
group_by(product_name) %>%
summarize(max_price = max(max_price)) %>%
arrange(desc(max_price))
head(expensive_fishes)
## # A tibble: 6 × 2
## product_name max_price
## <chr> <dbl>
## 1 BARBUN (TEKİR) 3500
## 2 İSTAKOZ (DENİZ) 750
## 3 LEVREK (DENİZ) 500
## 4 MERCAN (BÜYÜKBOY) 500
## 5 PEYGAMBER (DENİZ) 500
## 6 KARİDES (DENİZ) 450
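The top value, 3500, stands far apart from the rest of the list. As a rough sketch, each maximum can be compared with the median of the listed maxima to flag entries worth a closer look. The threshold of 3 is an arbitrary assumption for illustration, and `top`, `ratio_to_median`, and `flagged` are hypothetical names.

```r
# The six rows printed above.
top <- data.frame(
  product_name = c("BARBUN (TEKİR)", "İSTAKOZ (DENİZ)", "LEVREK (DENİZ)",
                   "MERCAN (BÜYÜKBOY)", "PEYGAMBER (DENİZ)", "KARİDES (DENİZ)"),
  max_price = c(3500, 750, 500, 500, 500, 450),
  stringsAsFactors = FALSE
)

# How many times larger each maximum is than the median of the listed maxima.
top$ratio_to_median <- top$max_price / median(top$max_price)

# Flag entries above an arbitrary threshold (3 is an assumption, not a rule).
flagged <- top[top$ratio_to_median > 3, ]
print(flagged)
```

A flagged value is not necessarily wrong; it may be a data-entry error or a genuine price, which is why the product is examined in more detail next.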
Investigating “BARBUN”
We select all “BARBUN” products and show their average minimum and maximum prices for each month.
barbun_monthly_avg <- fish_market %>%
filter(str_detect(product_name, "BARBUN")) %>%
group_by(month, product_name) %>%
summarize(avg_min_price = mean(min_price), avg_max_price = mean(max_price))
barbun_monthly_avg_pivot <- barbun_monthly_avg %>% pivot_longer(c(avg_min_price, avg_max_price), names_to = "average_prices", values_to="values") %>%
group_by(month, product_name) %>%
select(month, product_name, average_prices, values)
barbun_monthly_avg
## # A tibble: 24 × 4
## # Groups: month [10]
## month product_name avg_min_price avg_max_price
## <ord> <chr> <dbl> <dbl>
## 1 Jan BARBUN (TEKİR) 9.93 125.
## 2 Jan BARBUN(KAYA) 64.1 143.
## 3 Feb BARBUN (TEKİR) 9.81 139.
## 4 Feb BARBUN(KAYA) 69.7 145.
## 5 Mar BARBUN (TEKİR) 9.81 174.
## 6 Mar BARBUN(KAYA) 68.5 156.
## 7 Apr BARBUN (TEKİR) 16.6 93.9
## 8 Apr BARBUN(KAYA) 76.7 147.
## 9 May BARBUN (TEKİR) 17.8 70.8
## 10 May BARBUN(DONUK) 100 100
## # … with 14 more rows
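Not every Barbun product appears in every month, so it helps to list the months available for each product before plotting. The sketch below does this in base R for the ten rows printed above only; the remaining 14 rows are not reproduced, so the result is partial. The names `barbun` and `months_per_product` are illustrative.

```r
# Month and product name of the ten rows printed above.
barbun <- data.frame(
  month = c("Jan", "Jan", "Feb", "Feb", "Mar", "Mar", "Apr", "Apr", "May", "May"),
  product_name = c("BARBUN (TEKİR)", "BARBUN(KAYA)", "BARBUN (TEKİR)", "BARBUN(KAYA)",
                   "BARBUN (TEKİR)", "BARBUN(KAYA)", "BARBUN (TEKİR)", "BARBUN(KAYA)",
                   "BARBUN (TEKİR)", "BARBUN(DONUK)"),
  stringsAsFactors = FALSE
)

# For each product, collect the months in which it appears (in row order).
months_per_product <- tapply(barbun$month, barbun$product_name,
                             function(m) paste(unique(m), collapse = ", "))

print(months_per_product)
```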
Barbun (Tekir)
From January to October, the gap between the minimum and maximum price of Barbun (Tekir) is very large in almost all months, as the graph below shows. The maximum price peaks in October.
barbun_tekir <- barbun_monthly_avg_pivot %>%
filter(product_name == "BARBUN (TEKİR)")
ggplot(barbun_tekir, aes(month, values, fill=average_prices)) +
geom_col(position = position_dodge(width = 0.4)) +
scale_fill_brewer(palette = "Set1") +
labs(x= "Month",
y = "Average Min/Max Prices",
title = "Monthly Price Difference of Barbun (Tekir)")
Barbun (Kaya)
From January to October, the gap between the minimum and maximum price of Barbun (Kaya) is large in almost all months, although it is smaller than the gap for Barbun (Tekir).
barbun_kaya <- barbun_monthly_avg_pivot %>%
filter(product_name == "BARBUN(KAYA)")
ggplot(barbun_kaya, aes(month, values, fill=average_prices)) +
geom_col(position = position_dodge(width = 0.4)) +
scale_fill_brewer(palette = "Set1") +
labs(x= "Month",
y = "Average Min/Max Prices",
title = "Monthly Price Difference of Barbun (Kaya)")
Barbun (Donuk)
Barbun (Donuk) does not have data for every month; only May and June are available. As the graph below shows, the minimum and maximum prices are equal in both months.
barbun_donuk <- barbun_monthly_avg_pivot %>%
filter(product_name == "BARBUN(DONUK)")
ggplot(barbun_donuk, aes(month, values, fill=average_prices)) +
geom_col(position = position_dodge(width = 0.4)) +
scale_fill_brewer(palette = "Set1") +
labs(x= "Month",
y = "Average Min/Max Prices",
title = "Monthly Price Difference of Barbun (Donuk)")
Barbun Kaya (Donuk)
Barbun Kaya (Donuk) does not have data for every month either; only July and August are available. The minimum and maximum prices differ in July but not in August.
barbun_kaya_donuk <- barbun_monthly_avg_pivot %>%
filter(product_name == "BARBUN KAYA (DONUK)")
ggplot(barbun_kaya_donuk, aes(month, values, fill=average_prices)) +
geom_col(position = position_dodge(width = 0.4)) +
scale_fill_brewer(palette = "Set1") +
labs(x= "Month",
y = "Average Min/Max Prices",
title = "Monthly Price Difference of Barbun Kaya (Donuk)")