Izmir Fish Price Analysis

How’s the data?

First, we inspect the data to determine its structure, format, size, and any defects.

library(dplyr)  # provides %>%, filter(), mutate(), summarise() used below

fish_price=read.csv(file = 'https://raw.githubusercontent.com/pjournal/mef05-ustame/gh-pages/balik_hal_fiyatlari.csv',
                    stringsAsFactors = FALSE, header = TRUE,sep = ";", encoding="UTF-8")

head(fish_price)
##                 TARIH MAL_TURU           MAL_ADI BIRIM ASGARI_UCRET AZAMI_UCRET
## 1 2021-01-02 00:00:00    BALIK     TIRSI (DENİZ)    KG         5.83        12.5
## 2 2021-01-02 00:00:00    BALIK KIRLANGIÇ (DENİZ)    KG         3.00        80.0
## 3 2021-01-02 00:00:00    BALIK    ÇİMÇİM (DENİZ)    KG         3.50         8.0
## 4 2021-01-02 00:00:00    BALIK   HANOS ( DENİZ )    KG         2.50         5.0
## 5 2021-01-02 00:00:00    BALIK     KILIÇ (DENİZ)    KG        45.00        45.0
## 6 2021-01-02 00:00:00    BALIK    LÜFER ( DENİZ)    KG       130.00       130.0
summary(fish_price)
##     TARIH             MAL_TURU           MAL_ADI             BIRIM          
##  Length:18369       Length:18369       Length:18369       Length:18369      
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##   ASGARI_UCRET     AZAMI_UCRET     
##  Min.   :  0.00   Min.   :   0.42  
##  1st Qu.:  5.00   1st Qu.:  15.00  
##  Median : 10.00   Median :  35.00  
##  Mean   : 30.46   Mean   :  65.53  
##  3rd Qu.: 34.00   3rd Qu.:  90.00  
##  Max.   :650.00   Max.   :3500.00
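The output above shows two things worth handling before further analysis: TARIH is stored as character, and the spacing inside the parentheses of MAL_ADI is inconsistent (for example "HANOS ( DENİZ )" versus "TIRSI (DENİZ)"). The following is a minimal sketch on toy rows, not part of the original analysis; `normalize_name` and `MAL_ADI_CLEAN` are hypothetical names, and whether differently spaced names actually refer to the same fish is an assumption that would need checking against the real data.

```r
# Toy rows mimicking the columns shown above (illustrative values only)
toy <- data.frame(
  TARIH   = c("2021-01-02 00:00:00", "2021-01-03 00:00:00"),
  MAL_ADI = c("HANOS ( DENİZ )", "HANOS (DENİZ)"),
  stringsAsFactors = FALSE
)

# Parse the timestamp strings into Date objects (the time part is ignored)
toy$TARIH <- as.Date(toy$TARIH, format = "%Y-%m-%d")

# Hypothetical helper: tidy the spacing around parentheses
normalize_name <- function(x) {
  x <- gsub("\\(\\s+", "(", x)  # "( DENİZ" -> "(DENİZ"
  x <- gsub("\\s+\\)", ")", x)  # "DENİZ )" -> "DENİZ)"
  x <- gsub("\\s+", " ", x)     # collapse repeated whitespace
  trimws(x)
}
toy$MAL_ADI_CLEAN <- normalize_name(toy$MAL_ADI)
```

With names cleaned this way, both toy rows would fall into the same group in a later `group_by`.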

Basic Exploration

Min-Max Fish

Using the 2021 data, we can find which fish stay cheap all year. First, we calculate the average seasonal price for each fish (“MAL_ADI”). The goal is to determine which fish are good buys this year.

1. The five fish with the most volatile prices.

# Volatility: half the min-max spread as a percentage of the daily midpoint price
fish_price %>%
  filter(MAL_TURU == 'BALIK') %>%
  mutate(ay = lubridate::month(TARIH),
         daily_change = AZAMI_UCRET - ASGARI_UCRET,
         daily_avg = (AZAMI_UCRET + ASGARI_UCRET) / 2,
         avg_change = (daily_change / 2) / daily_avg * 100) %>%
  group_by(MAL_ADI) %>%
  summarise(avg_seasonal_change = mean(avg_change), avg_seasonal = mean(daily_avg)) %>%
  arrange(desc(avg_seasonal_change)) %>% head(5)
## # A tibble: 5 x 3
##   MAL_ADI           avg_seasonal_change avg_seasonal
##   <chr>                           <dbl>        <dbl>
## 1 MERCAN (BÜYÜKBOY)                86.3        106. 
## 2 KUPEZ (DENİZ)                    84.9         11.2
## 3 LİDAKİ (DENİZ)                   78.7         35.9
## 4 BARBUN (TEKİR)                   78.5         88.0
## 5 ISKORPIT (DENİZ)                 77.8         57.8
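To make the volatility measure concrete, here is the same arithmetic applied to a single row, using the prices of the first record shown by `head(fish_price)`. The variable names `min_price` and `max_price` are introduced only for this illustration.

```r
# Prices from the first row of head(fish_price): TIRSI (DENİZ)
min_price <- 5.83   # ASGARI_UCRET
max_price <- 12.5   # AZAMI_UCRET

daily_change <- max_price - min_price         # full min-max spread
daily_avg    <- (max_price + min_price) / 2   # midpoint price

# Half the spread, expressed as a percentage of the midpoint
avg_change <- (daily_change / 2) / daily_avg * 100

round(avg_change, 2)  # 36.39
```

A value of 0 means the minimum and maximum prices were equal that day, and the value approaches 100 as the minimum price approaches zero.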

2. We take a closer look at the fish with the most volatile price, “MERCAN (BÜYÜKBOY)”.

The boxplot below shows the distribution of its daily average price by month.

mercan=fish_price %>% filter(MAL_ADI=="MERCAN (BÜYÜKBOY)") %>% 
  mutate (ay=lubridate::month(TARIH),daily_change=(AZAMI_UCRET-ASGARI_UCRET),daily_avg=(AZAMI_UCRET+ASGARI_UCRET)/2)

  boxplot(daily_avg~ay,
          data=mercan,
          main="Mercan Data",
          xlab="Month Number",
          ylab="Daily Average",
          col="orange",
          border="brown"
  )  

On average, “MERCAN” can be bought at a reasonable price during the first four months of the year.
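A reading taken from a boxplot can be cross-checked with a small monthly summary table. The sketch below uses a toy stand-in for the `mercan` data frame with made-up values; `toy_mercan`, `median_price`, and `n_days` are hypothetical names, and applying the same pipeline to `mercan` is left as an assumption about how one might verify the plot.

```r
library(dplyr)

# Toy stand-in for the mercan data frame (illustrative values only)
toy_mercan <- data.frame(
  ay        = c(1, 1, 2, 2, 2),
  daily_avg = c(80, 100, 120, 130, 150)
)

# Median price and number of observations per month
monthly <- toy_mercan %>%
  group_by(ay) %>%
  summarise(median_price = median(daily_avg), n_days = n())

monthly
```

Including the per-month count makes it easier to see when a box in the plot rests on only a handful of observations.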

3. The five fish with the least volatile prices.

# Same volatility measure as before; tail(5) picks the five lowest values
fish_price %>%
  filter(MAL_TURU == 'BALIK') %>%
  mutate(ay = lubridate::month(TARIH),
         daily_change = AZAMI_UCRET - ASGARI_UCRET,
         daily_avg = (AZAMI_UCRET + ASGARI_UCRET) / 2,
         avg_change = (daily_change / 2) / daily_avg * 100) %>%
  group_by(MAL_ADI) %>%
  summarise(avg_seasonal_change = mean(avg_change), avg_seasonal = mean(daily_avg)) %>%
  arrange(desc(avg_seasonal_change)) %>% tail(5)
## # A tibble: 5 x 3
##   MAL_ADI          avg_seasonal_change avg_seasonal
##   <chr>                          <dbl>        <dbl>
## 1 KELER(DENİZ)                       0         60  
## 2 KRAÇA (DONUK)                      0          2.5
## 3 LİPSÖZ ( DENİZ )                   0         60  
## 4 MARYA(DENİZ)                       0         33  
## 5 MEZGİT (DONUK)                     0         25

Next, we examine the distribution of the daily average price of “MEZGİT (DONUK)” by month.

mezgit=fish_price %>% filter(MAL_ADI=="MEZGİT (DONUK)") %>% 
  mutate (ay=lubridate::month(TARIH),daily_change=(AZAMI_UCRET-ASGARI_UCRET),daily_avg=(AZAMI_UCRET+ASGARI_UCRET)/2)

  boxplot(daily_avg~ay,
          data=mezgit,
          main="Mezgit Data",
          xlab="Month Number",
          ylab="Daily Average",
          col="red",
          border="brown"
  )  

Mezgit data

mezgit  
##                 TARIH MAL_TURU        MAL_ADI BIRIM ASGARI_UCRET AZAMI_UCRET ay
## 1 2021-06-02 00:00:00    BALIK MEZGİT (DONUK)    KG           25          25  6
##   daily_change daily_avg
## 1            0        25

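After this analysis, I realized that there is not enough data for “MEZGİT (DONUK)” because it was sold on only one day. A volatility of zero here reflects the lack of observations, not a stable price.

I therefore excluded fish with fewer than 180 sales records.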

elimanted=fish_price %>% filter(MAL_TURU=='BALIK') %>% group_by(MAL_ADI) %>% 
 count() %>%filter(n>=180)
 
fish_prise_eliminated=fish_price %>%inner_join(elimanted,by="MAL_ADI")

str(fish_prise_eliminated)
## 'data.frame':   13283 obs. of  7 variables:
##  $ TARIH       : chr  "2021-01-02 00:00:00" "2021-01-02 00:00:00" "2021-01-02 00:00:00" "2021-01-02 00:00:00" ...
##  $ MAL_TURU    : chr  "BALIK" "BALIK" "BALIK" "BALIK" ...
##  $ MAL_ADI     : chr  "TIRSI (DENİZ)" "KIRLANGIÇ (DENİZ)" "ÇİMÇİM (DENİZ)" "HANOS ( DENİZ )" ...
##  $ BIRIM       : chr  "KG" "KG" "KG" "KG" ...
##  $ ASGARI_UCRET: num  5.83 3 3.5 2.5 130 38 10 2.5 3 90 ...
##  $ AZAMI_UCRET : num  12.5 80 8 5 130 38 10 6.67 6 120 ...
##  $ n           : int  281 247 247 193 209 194 181 247 259 277 ...
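As the `str` output shows, `inner_join` carries the helper column `n` into the result. If that column is not wanted, a filtering join is one alternative. The sketch below uses toy data and a lower threshold purely for illustration; `toy`, `min_records`, `frequent`, and `toy_filtered` are hypothetical names, and this is not the approach used in the original analysis.

```r
library(dplyr)

# Toy data: fish "A" has 3 records, fish "B" only 1 (illustrative values)
toy <- data.frame(
  MAL_ADI      = c("A", "A", "A", "B"),
  ASGARI_UCRET = c(10, 12, 11, 25),
  AZAMI_UCRET  = c(15, 18, 14, 25)
)
min_records <- 2  # stand-in for the 180 used above

# Fish with enough records
frequent <- toy %>% count(MAL_ADI) %>% filter(n >= min_records)

# semi_join keeps the matching rows of toy without adding the n column
toy_filtered <- toy %>% semi_join(frequent, by = "MAL_ADI")
```

Note that both versions count rows per fish. If a fish could appear more than once on the same date, counting distinct dates would be closer to "days of sales".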

Again, we determine the five fish with the least volatile prices, this time on the filtered data.

# Same volatility measure, now restricted to fish with at least 180 records
datamin = fish_prise_eliminated %>%
  filter(MAL_TURU == 'BALIK') %>%
  mutate(ay = lubridate::month(TARIH),
         daily_change = AZAMI_UCRET - ASGARI_UCRET,
         daily_avg = (AZAMI_UCRET + ASGARI_UCRET) / 2,
         avg_change = (daily_change / 2) / daily_avg * 100) %>%
  group_by(MAL_ADI) %>%
  summarise(avg_seasonal_change = mean(avg_change), avg_seasonal = mean(daily_avg)) %>%
  arrange(desc(avg_seasonal_change)) %>% tail(5)
 
datamin 
## # A tibble: 5 x 3
##   MAL_ADI          avg_seasonal_change avg_seasonal
##   <chr>                          <dbl>        <dbl>
## 1 LÜFER(İRİ BOY)                  18.7        103. 
## 2 LAHOZ (DENİZ)                   17.3        163. 
## 3 MASKO DENİZ                     17.3         14.9
## 4 LÜFER ( DENİZ)                  14.4        108. 
## 5 KOFANA ( DENİZ )                10.5         83.0

“KOFANA ( DENİZ )” has the least volatile price.

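# Join on the last row of datamin only (Kofana); joining on all of datamin would keep all five fish
kofana=fish_prise_eliminated %>% inner_join(tail(datamin, 1),by="MAL_ADI") %>% 
  mutate (ay=lubridate::month(TARIH),daily_change=(AZAMI_UCRET-ASGARI_UCRET),daily_avg=(AZAMI_UCRET+ASGARI_UCRET)/2)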
  
  boxplot(daily_avg~ay,
          data=kofana,
          main="Kofana Data",
          xlab="Month Number",
          ylab="Daily Average",
          col="green",
          border="brown"
  )  

“Kofana” can be bought all season long because its price changes less than that of the other fish species.