Izmir’s Fish Market Analysis

Introduction

In this assignment I explore Izmir’s Fish Market data to look for interesting findings.

The data covers 2021, from 2 January to 18 October.

Let’s start by importing the necessary libraries.

library(tidyverse) # data manipulation and visualization
library(lubridate) # date handling
library(readr) # readr CSV readers such as read_csv(); read.csv() used below is base R

Data Exploration

Read the CSV file using the read.csv() function. The file is semicolon-separated, so sep is set to ';', and the fileEncoding parameter is specified as “utf-8” because of the Turkish characters.

df <- read.csv("balik_hal_fiyatlari.csv", sep = ";", fileEncoding = "utf-8")
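To illustrate what the sep argument does, here is a minimal sketch that parses a small semicolon-separated snippet from inline text instead of the file. The two sample rows are copied from the head() output in this report; the header layout and the names sample_text and sample_df are assumptions made for the example.

```r
# Hypothetical inline sample, assumed to mirror the layout of balik_hal_fiyatlari.csv
sample_text <- "TARIH;MAL_TURU;MAL_ADI;BIRIM;ASGARI_UCRET;AZAMI_UCRET
2021-01-02 00:00:00;BALIK;TIRSI (DENIZ);KG;5.83;12.5
2021-01-02 00:00:00;BALIK;HANOS ( DENIZ );KG;2.50;5.0"

# sep = ";" tells read.csv() that fields are separated by semicolons, not commas
sample_df <- read.csv(text = sample_text, sep = ";")

str(sample_df)  # two rows, six columns; the price columns are parsed as numeric
```

With the default sep = ",", each row would be read as a single field, which is why the separator has to be set explicitly.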

First 6 rows of the data.

head(df)
##                 TARIH MAL_TURU           MAL_ADI BIRIM ASGARI_UCRET AZAMI_UCRET
## 1 2021-01-02 00:00:00    BALIK     TIRSI (DENIZ)    KG         5.83        12.5
## 2 2021-01-02 00:00:00    BALIK KIRLANGIÇ (DENIZ)    KG         3.00        80.0
## 3 2021-01-02 00:00:00    BALIK    ÇIMÇIM (DENIZ)    KG         3.50         8.0
## 4 2021-01-02 00:00:00    BALIK   HANOS ( DENIZ )    KG         2.50         5.0
## 5 2021-01-02 00:00:00    BALIK     KILIÇ (DENIZ)    KG        45.00        45.0
## 6 2021-01-02 00:00:00    BALIK    LÜFER ( DENIZ)    KG       130.00       130.0

Summary statistics of the data.

summary(df)
##     TARIH             MAL_TURU           MAL_ADI             BIRIM          
##  Length:18369       Length:18369       Length:18369       Length:18369      
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##   ASGARI_UCRET     AZAMI_UCRET     
##  Min.   :  0.00   Min.   :   0.42  
##  1st Qu.:  5.00   1st Qu.:  15.00  
##  Median : 10.00   Median :  35.00  
##  Mean   : 30.46   Mean   :  65.53  
##  3rd Qu.: 34.00   3rd Qu.:  90.00  
##  Max.   :650.00   Max.   :3500.00

The data has 6 attributes:

  • TARIH : Date
  • MAL_TURU : Fish Type
  • MAL_ADI : Fish Name
  • BIRIM : Unit
  • ASGARI_UCRET : Minimum Price
  • AZAMI_UCRET : Maximum Price

The TARIH column is read as character, which is not the right data type. Convert it to a date type and look at the summary again.

We can now see the minimum and maximum dates.

df$TARIH <- as.Date(df$TARIH)
summary(df$TARIH)
##         Min.      1st Qu.       Median         Mean      3rd Qu.         Max. 
## "2021-01-02" "2021-03-08" "2021-05-25" "2021-05-25" "2021-08-14" "2021-10-18"
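As a small illustration of the conversion, the sketch below applies as.Date() to two timestamp strings in the same form as the TARIH column. The explicit format argument is an addition for clarity, not something the analysis itself uses; as.Date() ignores any characters that trail the given format, so the time part is dropped.

```r
# Timestamp strings in the same form as the TARIH column
timestamps <- c("2021-01-02 00:00:00", "2021-10-18 00:00:00")

# as.Date() parses the date part; characters after the matched format are ignored
dates <- as.Date(timestamps, format = "%Y-%m-%d")

class(dates)  # the vector is now of class Date rather than character
range(dates)  # earliest and latest date
```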

Check distinct values for MAL_ADI, MAL_TURU and BIRIM attributes.

There are 4 distinct values for MAL_TURU.

unique(df['MAL_TURU'])
##         MAL_TURU
## 1          BALIK
## 8         KÜLTÜR
## 42 ITHAL (DONUK)
## 43      TATLI SU

There are 2 distinct values for BIRIM.

unique(df['BIRIM'])
##    BIRIM
## 1     KG
## 17  ADET

MAL_ADI has too many distinct values to list, so I just count the unique fish names. There are 126 distinct values for MAL_ADI; in short, there are 126 different fish in the market.

count(unique(df['MAL_ADI']))
##     n
## 1 126
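The names in the head() output show inconsistent spacing around parentheses (for example HANOS ( DENIZ ) and TIRSI (DENIZ)), so a raw count could treat two spellings of one fish as different names. Whether that actually happens in this data is not checked here. The sketch below, on toy names and with a hypothetical helper normalise_name(), shows one way to normalise the spacing before counting.

```r
# Toy names; the second spelling of HANOS is hypothetical, added for illustration
fish_names <- c("HANOS ( DENIZ )", "HANOS (DENIZ)", "TIRSI (DENIZ)")

# Collapse whitespace around parentheses so spelling variants compare equal
normalise_name <- function(x) {
  x <- gsub("\\s*\\(\\s*", " (", x)  # exactly one space before "(", none after it
  x <- gsub("\\s*\\)", ")", x)       # no space before ")"
  trimws(x)
}

length(unique(fish_names))                  # distinct raw names
length(unique(normalise_name(fish_names)))  # distinct names after normalising
```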

Data Visualization

Percentage share of each MAL_TURU.

BALIK is the most common fish type in the market, followed by TATLI SU, KÜLTÜR and ITHAL (DONUK).
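The percentage shares can be computed in a few lines. The following is a minimal sketch on a made-up vector of fish types, not the real data, and not necessarily how the original chart was produced.

```r
# Toy vector of fish types (made-up counts, not the real data)
mal_turu <- c("BALIK", "BALIK", "BALIK", "TATLI SU", "TATLI SU", "ITHAL (DONUK)")

# table() counts rows per type; prop.table() turns the counts into proportions
share_pct <- sort(prop.table(table(mal_turu)) * 100, decreasing = TRUE)

round(share_pct, 1)  # percentage share per type, largest first
```

Passing share_pct to base R's pie() would draw these shares as a pie chart.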

Fish above the mean AZAMI_UCRET (Maximum Price) of 2021.

Find the maximum price of each fish over the whole of 2021, then the average of those maxima.

maksimum_fiyat <- df %>% group_by(MAL_ADI) %>% summarise(maksimum_fiyat = max(AZAMI_UCRET))
ortalama_fiyat <- maksimum_fiyat %>% summarise(mean_max_price = mean(maksimum_fiyat))
fishes_above_mean <- maksimum_fiyat %>% filter(maksimum_fiyat >= as.vector(ortalama_fiyat$mean_max_price))

Visualize the top 20 fish above the mean maximum price
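Selecting the top entries before plotting can be sketched as follows. The data is a toy table with made-up prices, and slice_max() is one possible way to pick the top rows, not necessarily the one used for the original chart. The report uses n = 20; the toy example uses n = 2.

```r
library(dplyr)

# Toy per-fish maximum prices (made-up values)
max_prices <- tibble(
  MAL_ADI = c("A", "B", "C", "D"),
  maksimum_fiyat = c(10, 200, 50, 140)
)

# Keep fish at or above the mean of the per-fish maxima, then take the top n
top_n_fish <- max_prices %>%
  filter(maksimum_fiyat >= mean(maksimum_fiyat)) %>%
  slice_max(maksimum_fiyat, n = 2)

top_n_fish
```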

Let’s explore my favorite fish, Somon (salmon), using a line plot and answer questions like:

  • Which month has the lowest price?
  • Is there a significant difference in price between imported Somon and local Somon?

Filter for imported Somon and local Somon, and find the average monthly price.

somon_prices_df <- df %>%
  filter(MAL_ADI %in% c("SOMON ( TATLI )", "ITHAL SOMON(TATLI)")) %>%
  mutate(GUNLUK_ORTALAMA_UCRET = (AZAMI_UCRET + ASGARI_UCRET) / 2) %>% # daily midpoint of max and min price
  group_by(MAL_ADI, AY = lubridate::month(TARIH)) %>%
  summarise(AYLIK_ORTALAMA_UCRET = mean(GUNLUK_ORTALAMA_UCRET))
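The daily average price is meant to be the midpoint between the minimum and maximum price. In R, division binds more tightly than addition, so the placement of the parentheses matters. A quick check on the first row of the head() output (the variable names are made up for this example):

```r
# First row of the head() output: minimum 5.83, maximum 12.5
asgari <- 5.83
azami  <- 12.5

midpoint     <- (azami + asgari) / 2  # average of the two prices
not_midpoint <- azami + asgari / 2    # division happens first: azami + (asgari / 2)

c(midpoint = midpoint, not_midpoint = not_midpoint)
```

The second expression gives a value above the maximum price, which cannot be an average of the two.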

month_numeric <- sort(unique(somon_prices_df$AY)) # month numbers for the x axis
month_label <- lubridate::month(month_numeric, label = TRUE) # matching month names for the x axis
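For axis labelling, the month numbers and month names need to line up one to one. A minimal base R sketch of that mapping, with example month numbers rather than values from the data:

```r
# Month numbers as they might appear in the AY column (example values)
month_numeric <- c(1, 2, 5, 10)

# month.abb is a built-in vector of English month abbreviations, indexed 1 to 12
month_label <- month.abb[month_numeric]

month_label
```

Two vectors of equal length like these can be passed as the breaks and labels arguments of ggplot2's scale_x_continuous().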

Looking at the line chart, there is a significant price difference between local Somon and imported Somon.

ITHAL SOMON(TATLI) has its lowest price in February. Local Somon (SOMON ( TATLI )), on the other hand, is cheapest in January.

We can see price spikes after February for both fish, and after May Somon prices gradually decrease.
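The lowest-price month per fish can also be read off programmatically instead of from the chart. This is a sketch on a toy monthly table with made-up names and prices, using slice_min() as one possible approach.

```r
library(dplyr)

# Toy monthly averages for two fish (made-up values)
monthly <- tibble(
  MAL_ADI = c("X", "X", "X", "Y", "Y", "Y"),
  AY = c(1, 2, 3, 1, 2, 3),
  AYLIK_ORTALAMA_UCRET = c(30, 35, 40, 60, 55, 70)
)

# For each fish, keep the row with the lowest monthly average price
lowest_month <- monthly %>%
  group_by(MAL_ADI) %>%
  slice_min(AYLIK_ORTALAMA_UCRET, n = 1) %>%
  ungroup()

lowest_month
```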

Conclusion

In this analysis I briefly analyzed Izmir’s Fish Market.

First, I visualized the percentage share of each MAL_TURU (Fish Type) in a pie chart.

Next, I found the top 20 fish above the mean maximum price. BARBUN(TEKIR) can go up to 3500 Turkish liras!

Finally, I analyzed my favorite fish, Somon. The average price is high in May, June and July. We can expect high prices in these months because this is the season for fresh, quality Somon. There is a huge price difference between imported and local Somon. In my experience, the difference in taste between these two types of Somon is not large enough to justify the price difference :)