İzmir Fish Prices for 2021

You can reach the data for all years from here.
Or you can download the data from here.

Understanding the shape of the data and its columns

# dplyr provides %>%, group_by(), summarise() and the other verbs used below
library(dplyr)
df <- read.csv(file = 'https://raw.githubusercontent.com/pjournal/mef05-stuncers/gh-pages/balik_hal_fiyatlari%20(4).csv',
               stringsAsFactors = FALSE, header = TRUE, sep = ";", encoding = "UTF-8")
head(df)
##                 TARIH MAL_TURU           MAL_ADI BIRIM ASGARI_UCRET AZAMI_UCRET
## 1 2021-01-02 00:00:00    BALIK     TIRSI (DENIZ)    KG         5.83        12.5
## 2 2021-01-02 00:00:00    BALIK KIRLANGIÇ (DENIZ)    KG         3.00        80.0
## 3 2021-01-02 00:00:00    BALIK    ÇIMÇIM (DENIZ)    KG         3.50         8.0
## 4 2021-01-02 00:00:00    BALIK   HANOS ( DENIZ )    KG         2.50         5.0
## 5 2021-01-02 00:00:00    BALIK     KILIÇ (DENIZ)    KG        45.00        45.0
## 6 2021-01-02 00:00:00    BALIK    LÜFER ( DENIZ)    KG       130.00       130.0
summary(df)
##     TARIH             MAL_TURU           MAL_ADI             BIRIM          
##  Length:23311       Length:23311       Length:23311       Length:23311      
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##   ASGARI_UCRET     AZAMI_UCRET     
##  Min.   :  0.00   Min.   :   0.42  
##  1st Qu.:  5.00   1st Qu.:  15.00  
##  Median : 10.00   Median :  35.00  
##  Mean   : 31.13   Mean   :  67.50  
##  3rd Qu.: 35.00   3rd Qu.:  90.00  
##  Max.   :800.00   Max.   :5150.00
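
The head() output also shows that product names are not spaced consistently, for example HANOS ( DENIZ ) and LÜFER ( DENIZ). If the same product ever appears with different spacing, group_by(MAL_ADI) would treat the variants as separate products. Below is a minimal base R sketch of one way to tidy such names. The helper normalise_name is hypothetical and is not part of the original analysis.

```r
# Hypothetical helper: tidy the spacing in product names so that variants
# such as "HANOS ( DENIZ )" and "HANOS (DENIZ)" collapse to one spelling.
normalise_name <- function(x) {
  x <- trimws(x)                 # drop leading/trailing whitespace
  x <- gsub("\\s+", " ", x)      # collapse runs of whitespace to one space
  x <- gsub("\\(\\s+", "(", x)   # no space right after "("
  x <- gsub("\\s+\\)", ")", x)   # no space right before ")"
  x
}

examples <- c("HANOS ( DENIZ )", "LÜFER ( DENIZ)", "TIRSI (DENIZ)")
normalise_name(examples)
```

Whether the real file contains such duplicate spellings would need to be checked, for instance by comparing the number of unique names before and after normalising.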

Finding the maximum, minimum, and standard deviation of the prices

# Extremes of the daily maximum (AZAMI_UCRET) and minimum (ASGARI_UCRET) prices per product
max_min_prices <- df %>%
  group_by(MAL_ADI, BIRIM) %>%
  summarise(MAX_MAX_PRICE = max(AZAMI_UCRET),
            MAX_MIN_PRICE = min(AZAMI_UCRET),
            MIN_MAX_PRICE = max(ASGARI_UCRET),
            MIN_MIN_PRICE = min(ASGARI_UCRET),
            .groups = 'drop') %>%
  arrange(desc(MAX_MAX_PRICE)) %>%
  head(5)
max_min_prices
## # A tibble: 5 x 6
##   MAL_ADI          BIRIM MAX_MAX_PRICE MAX_MIN_PRICE MIN_MAX_PRICE MIN_MIN_PRICE
##   <chr>            <chr>         <dbl>         <dbl>         <dbl>         <dbl>
## 1 BARBUN (TEKIR)   KG             5150            30            45           2  
## 2 ITHAL SOMON(TAT~ KG             1153            50           140          17.5
## 3 ISTAKOZ (DENIZ)  KG              800            20           800           3  
## 4 LEVREK (DENIZ)   KG              500            30           140           5  
## 5 MERCAN (BÜYÜKBO~ KG              500            20           100           2
# Standard deviation of the daily maximum and minimum prices per product
SD_MAX_MIN <- df %>%
  group_by(MAL_ADI, BIRIM) %>%
  summarise(sd_max = sd(AZAMI_UCRET), sd_min = sd(ASGARI_UCRET), .groups = 'drop') %>%
  arrange(desc(sd_max)) %>%
  head(5)
SD_MAX_MIN
## # A tibble: 5 x 4
##   MAL_ADI           BIRIM sd_max sd_min
##   <chr>             <chr>  <dbl>  <dbl>
## 1 BARBUN (TEKIR)    KG     334.    5.75
## 2 ISTAKOZ (DENIZ)   KG     169.  137.  
## 3 MERCAN (BÜYÜKBOY) KG      84.3   9.06
## 4 FANGRI (DENIZ)    KG      82.7  72.4 
## 5 SINARIT (DENIZ)   KG      80.2  66.8

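The analysis shows that some of these prices cannot be correct: BARBUN (TEKIR), for example, reaches 5150 per KG, and its maximum price has a standard deviation of 334. Either the price or the unit must have been entered incorrectly. Either way, we have to remove the outliers to get healthy data.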
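
To see why a very large standard deviation points to bad records rather than real price swings, here is a small sketch with made-up prices (not taken from the dataset): four plausible values and one value that looks like a typo.

```r
# Made-up prices for one product: four plausible values and one likely typo.
prices <- c(30, 35, 40, 45, 5150)

sd_with_typo    <- sd(prices)
# 1000 is an arbitrary cut that only makes sense for this toy vector.
sd_without_typo <- sd(prices[prices < 1000])

round(c(with_typo = sd_with_typo, without_typo = sd_without_typo), 2)
```

A single extreme value dominates the standard deviation, which is why a rule based on quartiles is a more robust choice for the cleaning step.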

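# Rows flagged as outliers: the daily spread (FARK = max - min) lies outside
# the 1.5 * IQR fences computed per product
df_outliers <- df %>%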
  group_by(MAL_ADI) %>%
  transmute(AZAMI_UCRET, ASGARI_UCRET, FARK = AZAMI_UCRET - ASGARI_UCRET,
            IQR_FARK = IQR(FARK),
            UPPER = quantile(FARK, 0.75) + IQR_FARK * 1.5,
            LOWER = quantile(FARK, 0.25) - IQR_FARK * 1.5) %>%
  filter(!(FARK <= UPPER & FARK >= LOWER))


# Same fences, but this time keep only the rows that lie inside them
df_cutoff_outliers <- df %>%
  group_by(MAL_ADI) %>%
  mutate(FARK = AZAMI_UCRET - ASGARI_UCRET,
         IQR_FARK = IQR(FARK),
         UPPER = quantile(FARK, 0.75) + IQR_FARK * 1.5,
         LOWER = quantile(FARK, 0.25) - IQR_FARK * 1.5) %>%
  filter(FARK <= UPPER & FARK >= LOWER) %>%
  ungroup()

df_cutoff_outliers
## # A tibble: 22,338 x 10
##    TARIH   MAL_TURU MAL_ADI BIRIM ASGARI_UCRET AZAMI_UCRET  FARK IQR_FARK  UPPER
##    <chr>   <chr>    <chr>   <chr>        <dbl>       <dbl> <dbl>    <dbl>  <dbl>
##  1 2021-0~ BALIK    TIRSI ~ KG            5.83       12.5   6.67     5.83  22.1 
##  2 2021-0~ BALIK    KIRLAN~ KG            3          80    77       71    244.  
##  3 2021-0~ BALIK    ÇIMÇIM~ KG            3.5         8     4.5      8.62  33.6 
##  4 2021-0~ BALIK    HANOS ~ KG            2.5         5     2.5      2.5    6.75
##  5 2021-0~ BALIK    KILIÇ ~ KG           45          45     0       20     50   
##  6 2021-0~ BALIK    LÜFER ~ KG          130         130     0       35     97.5 
##  7 2021-0~ BALIK    USKUMR~ KG           38          38     0       39    112.  
##  8 2021-0~ KÜLTÜR   LEVREK~ KG           25          55    30       21     96.5 
##  9 2021-0~ BALIK    KÖPEK ~ KG           10          10     0        8     20   
## 10 2021-0~ BALIK    ISTAVR~ KG            2.5         6.67  4.17     3.33  10.8 
## # ... with 22,328 more rows, and 1 more variable: LOWER <dbl>
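
The filter above is the usual 1.5 * IQR rule applied to FARK, the spread between the maximum and minimum price. A minimal base R sketch of the same rule on a made-up vector of spreads makes the fences easier to follow. The numbers are invented for illustration only.

```r
# Made-up daily spreads (max price minus min price) for one product.
spread <- c(2, 3, 3, 4, 5, 5, 6, 40)

q1  <- quantile(spread, 0.25, names = FALSE)
q3  <- quantile(spread, 0.75, names = FALSE)
iqr <- IQR(spread)                 # same value as q3 - q1

lower <- q1 - 1.5 * iqr            # lower fence
upper <- q3 + 1.5 * iqr            # upper fence

kept    <- spread[spread >= lower & spread <= upper]
dropped <- spread[spread <  lower | spread >  upper]

list(lower = lower, upper = upper, dropped = dropped)
```

Because the rule is applied to the spread and not to the prices themselves, a row whose minimum and maximum are both unusually high but close to each other would still pass this filter.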

We removed the outliers, but to focus on freshly caught fish we should also exclude the fishing ban period, taken here as 15 April to 31 August 2021.

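# Compare as dates, not text: as text, "2021-08-31 00:00:00" sorts after "2021-08-31"
av_serbest <- df_cutoff_outliers %>%
  mutate(Zaman = lubridate::date(TARIH), Ay = lubridate::month(TARIH)) %>%
  filter(!(Zaman >= as.Date("2021-04-15") & Zaman <= as.Date("2021-08-31")))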
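
TARIH is stored as text with a time part (see the head() output), and comparing text is not the same as comparing dates at the end of a range. A small sketch with made-up timestamps shows why a range filter is safer on Date values. The variable names here are only for illustration.

```r
# Made-up timestamps in the same text format as the TARIH column.
tarih <- c("2021-04-14 00:00:00", "2021-04-15 00:00:00",
           "2021-08-31 00:00:00", "2021-09-01 00:00:00")

# Text comparison: "2021-08-31 00:00:00" sorts after "2021-08-31",
# so the last day of the range falls outside it.
in_ban_text <- tarih >= "2021-04-15" & tarih <= "2021-08-31"

# Date comparison: drop the time part first, then compare calendar dates.
gun <- as.Date(substr(tarih, 1, 10))
in_ban_date <- gun >= as.Date("2021-04-15") & gun <= as.Date("2021-08-31")

data.frame(tarih, in_ban_text, in_ban_date)
```

The two columns differ only for 31 August, which the text comparison leaves outside the ban period.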

Time to check when we can eat, let's say, a big juicy pandora (Mercan).

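# Large pandora only; daily_change is the max-min spread, daily_avg the midpoint price
mercan <- av_serbest %>%
  filter(MAL_ADI == "MERCAN (BÜYÜKBOY)") %>%
  mutate(ay = lubridate::month(TARIH), daily_change = AZAMI_UCRET - ASGARI_UCRET,
         daily_avg = (AZAMI_UCRET + ASGARI_UCRET) / 2)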



boxplot(daily_avg~ay,
        data=mercan,
        main="Mercan Data",
        xlab="Month Number",
        ylab="Daily Average",
        col="green",
        border="red"
)  

The boxplot shows that the winter months are the best for pandora, and that the best month to eat it is April. I also looked it up on Google, and it agrees.
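
Reading medians off a boxplot by eye can be imprecise, so the same comparison can also be made numerically. The sketch below uses made-up values, not the real Mercan data, and assumes that "best" means the lowest typical price, which the text does not state explicitly.

```r
# Made-up daily averages for a handful of months (not the real Mercan data).
toy <- data.frame(
  ay        = c(1, 1, 2, 2, 4, 4, 12, 12),
  daily_avg = c(60, 70, 65, 75, 40, 50, 80, 90)
)

# Median daily average per month, sorted from lowest to highest.
monthly <- aggregate(daily_avg ~ ay, data = toy, FUN = median)
monthly <- monthly[order(monthly$daily_avg), ]
monthly
```

With the real data, toy would be replaced by the mercan data frame, which has the same ay and daily_avg columns.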

Thank you for reading…

I want to thank both Meryem and Emirhan. Their analyses were a great help.
