In this assignment I explore Izmir’s fish market prices to look for interesting findings.
The data covers 2021, from 2 January to 18 October.
Let’s start by importing the necessary libraries.
library(tidyverse) # data manipulation and visualization
library(lubridate) # date handling
library(readr) # CSV reading (already attached by tidyverse; read.csv() below is base R)
Read the CSV file with the read.csv() function. The file is semicolon-delimited, so sep is ';', and the fileEncoding parameter is set to “utf-8” because the file contains Turkish characters.
df <- read.csv("balik_hal_fiyatlari.csv",sep=';',fileEncoding = "utf-8")
First 6 rows of the data (the default for head()).
head(df)
## TARIH MAL_TURU MAL_ADI BIRIM ASGARI_UCRET AZAMI_UCRET
## 1 2021-01-02 00:00:00 BALIK TIRSI (DENIZ) KG 5.83 12.5
## 2 2021-01-02 00:00:00 BALIK KIRLANGIÇ (DENIZ) KG 3.00 80.0
## 3 2021-01-02 00:00:00 BALIK ÇIMÇIM (DENIZ) KG 3.50 8.0
## 4 2021-01-02 00:00:00 BALIK HANOS ( DENIZ ) KG 2.50 5.0
## 5 2021-01-02 00:00:00 BALIK KILIÇ (DENIZ) KG 45.00 45.0
## 6 2021-01-02 00:00:00 BALIK LÜFER ( DENIZ) KG 130.00 130.0
Summary statistics of the data.
summary(df)
## TARIH MAL_TURU MAL_ADI BIRIM
## Length:18369 Length:18369 Length:18369 Length:18369
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
## ASGARI_UCRET AZAMI_UCRET
## Min. : 0.00 Min. : 0.42
## 1st Qu.: 5.00 1st Qu.: 15.00
## Median : 10.00 Median : 35.00
## Mean : 30.46 Mean : 65.53
## 3rd Qu.: 34.00 3rd Qu.: 90.00
## Max. :650.00 Max. :3500.00
The data has 6 attributes and 18,369 rows.
The data type of the TARIH column is incorrect: it was read as character. Convert it to Date and look at the summary again.
After the conversion, the summary shows the minimum and maximum dates.
df$TARIH <- as.Date(df$TARIH)
summary(df$TARIH)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## "2021-01-02" "2021-03-08" "2021-05-25" "2021-05-25" "2021-08-14" "2021-10-18"
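As a small check of the conversion above: with character input, as.Date() matches the date part and ignores the trailing time. The sketch below uses two made-up strings shaped like the TARIH values and states the format explicitly. The names raw_dates and parsed are illustrative and not part of the original analysis.

```r
# Made-up strings in the same shape as the TARIH column
raw_dates <- c("2021-01-02 00:00:00", "2021-10-18 00:00:00")

# Stating the format makes the intent explicit;
# the time part after the matched date is ignored
parsed <- as.Date(raw_dates, format = "%Y-%m-%d")

class(parsed)   # the result is a Date vector
range(parsed)   # earliest and latest date
```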
Check the distinct values of the MAL_ADI, MAL_TURU and BIRIM attributes.
There are 4 distinct values for MAL_TURU.
unique(df['MAL_TURU'])
## MAL_TURU
## 1 BALIK
## 8 KÜLTÜR
## 42 ITHAL (DONUK)
## 43 TATLI SU
There are 2 distinct values for BIRIM.
unique(df['BIRIM'])
## BIRIM
## 1 KG
## 17 ADET
MAL_ADI has too many distinct values to list, so I only count them. There are 126 distinct values for MAL_ADI; in short, 126 different fish are sold in the market.
count(unique(df['MAL_ADI']))
## n
## 1 126
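One caveat on this count: the head() output shows that spacing inside the parentheses varies between names (for example HANOS ( DENIZ ) versus LÜFER ( DENIZ)). If the same fish were ever entered with different spacing, it would be counted twice. A minimal sketch of how this could be checked follows. normalize_name is a hypothetical helper, the second HANOS spelling is made up for illustration, and whether the count of 126 actually changes on the real data has not been verified here.

```r
# Hypothetical helper: make spacing around parentheses uniform
normalize_name <- function(x) {
  x <- gsub("\\s+", " ", trimws(x))   # collapse runs of whitespace
  x <- gsub("\\(\\s+", "(", x)        # "( DENIZ" -> "(DENIZ"
  x <- gsub("\\s+\\)", ")", x)        # "DENIZ )" -> "DENIZ)"
  gsub("\\s*\\(", " (", x)            # exactly one space before "("
}

# Illustrative names; "HANOS (DENIZ)" is an invented variant
example_names <- c("HANOS ( DENIZ )", "HANOS (DENIZ)",
                   "TIRSI (DENIZ)", "ITHAL SOMON(TATLI)")

normalize_name(example_names)
length(unique(normalize_name(example_names)))   # distinct names after cleaning
```

On the real data, the same idea would be to compare length(unique(df$MAL_ADI)) with length(unique(normalize_name(df$MAL_ADI))).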
Percentages of each MAL_TURU.
BALIK is the most common fish type in the market, followed by TATLI SU, KÜLTÜR and ITHAL (DONUK).
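The percentages can be computed with base R before plotting. This is only a sketch on a toy vector: mal_turu stands in for df$MAL_TURU, and its values and proportions are made up, not the real shares.

```r
# Toy stand-in for df$MAL_TURU (made-up proportions)
mal_turu <- c(rep("BALIK", 5), rep("TATLI SU", 3), rep("ITHAL (DONUK)", 2))

counts <- table(mal_turu)                          # rows per type
share_pct <- round(100 * prop.table(counts), 1)    # share of each type in percent
share_pct <- sort(share_pct, decreasing = TRUE)    # largest share first
share_pct

# Pie chart with the percentage in each label
pie(share_pct, labels = paste0(names(share_pct), " (", share_pct, "%)"))
```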
Fish above the mean AZAMI_UCRET (maximum price) of 2021.
Find the maximum price of each fish over 2021, then the average of those maximums, and keep the fish whose maximum is at or above that average.
maksimum_fiyat <- df %>% group_by(MAL_ADI) %>% summarise(maksimum_fiyat = max(AZAMI_UCRET))
ortalama_fiyat <- maksimum_fiyat %>% summarise(mean_max_price = mean(maksimum_fiyat))
fishes_above_mean <- maksimum_fiyat %>% filter(maksimum_fiyat >= as.vector(ortalama_fiyat$mean_max_price))
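To make the three steps concrete, here is the same logic on a tiny made-up table. The names toy, toy_max, mean_of_max and toy_above are illustrative only, and the prices are invented.

```r
library(dplyr)

# Made-up prices for three fish
toy <- tibble(
  MAL_ADI     = c("A", "A", "B", "B", "C"),
  AZAMI_UCRET = c(10, 20, 50, 40, 90)
)

# Step 1: maximum price per fish (A = 20, B = 50, C = 90)
toy_max <- toy %>%
  group_by(MAL_ADI) %>%
  summarise(max_price = max(AZAMI_UCRET))

# Step 2: mean of those maximums, (20 + 50 + 90) / 3
mean_of_max <- mean(toy_max$max_price)

# Step 3: keep fish whose maximum is at or above that mean
toy_above <- toy_max %>% filter(max_price >= mean_of_max)
toy_above
```

Only fish C passes the filter here, because B's maximum of 50 is below the mean of about 53.3.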
Visualize the top 20 fish above the mean maximum price.
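A chart of this kind could be drawn roughly as follows. This is a sketch and not necessarily the plot used in the original report: toy_above_mean is a made-up stand-in for fishes_above_mean, and the horizontal bar layout is an assumption.

```r
library(dplyr)
library(ggplot2)

# Toy stand-in for fishes_above_mean (names and prices are made up)
toy_above_mean <- tibble(
  MAL_ADI        = c("FISH A", "FISH B", "FISH C", "FISH D"),
  maksimum_fiyat = c(120, 300, 90, 210)
)

# Keep the 20 highest maximum prices (all 4 rows here, 20 on the real data)
top_fish <- toy_above_mean %>%
  slice_max(maksimum_fiyat, n = 20)

# Horizontal bars ordered by price
p_top <- ggplot(top_fish, aes(x = reorder(MAL_ADI, maksimum_fiyat), y = maksimum_fiyat)) +
  geom_col() +
  coord_flip() +
  labs(x = "MAL_ADI", y = "Maximum AZAMI_UCRET in 2021")
p_top
```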
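Let’s explore my favorite fish, somon (salmon), with a line plot to see how local and imported prices compare and in which months each is cheapest.
Filter for imported and local somon, take the midpoint of the daily minimum and maximum price as the daily price, and average it per month.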
somon_prices_df <- df %>% filter(MAL_ADI %in% c("SOMON ( TATLI )","ITHAL SOMON(TATLI)")) %>% mutate(GUNLUK_ORTALAMA_UCRET = (AZAMI_UCRET + ASGARI_UCRET) / 2) %>% group_by(MAL_ADI,AY = lubridate::month(TARIH)) %>% summarise(AYLIK_ORTALAMA_UCRET = mean(GUNLUK_ORTALAMA_UCRET))
month_numeric <- unique(somon_prices_df$AY) # Get Month Numbers for visualization x axis
month_label <- lubridate::month(month_numeric, label = TRUE) # Get Month Names for visualization x axis (one label per month number)
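The line chart itself might look like the sketch below. It runs on toy_somon, a made-up table in the same shape as somon_prices_df, and uses base R's month.abb for the axis labels. The prices and the exact styling are assumptions, not the original plot.

```r
library(ggplot2)

# Toy monthly averages (made-up numbers) shaped like somon_prices_df
toy_somon <- data.frame(
  MAL_ADI = rep(c("SOMON ( TATLI )", "ITHAL SOMON(TATLI)"), each = 3),
  AY = rep(1:3, times = 2),
  AYLIK_ORTALAMA_UCRET = c(30, 34, 38, 60, 55, 70)
)

month_breaks <- sort(unique(toy_somon$AY))   # month numbers present in the data
month_names  <- month.abb[month_breaks]      # matching month abbreviations

# One line per fish, months on the x axis
p_somon <- ggplot(toy_somon, aes(x = AY, y = AYLIK_ORTALAMA_UCRET, colour = MAL_ADI)) +
  geom_line() +
  geom_point() +
  scale_x_continuous(breaks = month_breaks, labels = month_names) +
  labs(x = "Month", y = "Average monthly price", colour = "MAL_ADI")
p_somon
```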
The line chart shows a significant price difference between local and imported somon.
ITHAL SOMON(TATLI) has its lowest price in February. Local somon (SOMON ( TATLI )), on the other hand, is cheapest in January.
Prices of both fish spike after February, and after May somon prices gradually decrease.
In this analysis I briefly analyzed Izmir’s fish market.
First, I used a pie chart to show the percentage of each MAL_TURU (fish type).
Then I found the top 20 fish above the mean maximum price. BARBUN(TEKIR) can go up to 3,500 Turkish liras!
Finally, I analyzed my favorite fish, somon. The average price is high in May, June and July. We can expect high prices in these months because this is the season for fresh, quality somon. There is a huge price difference between imported and local somon. In my experience, the difference in taste between the two types is not large enough to justify the price :)