Brief Analysis

This is a brief analysis of the fish market dataset. We start by loading the necessary libraries.

library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5     v purrr   0.3.4
## v tibble  3.1.4     v dplyr   1.0.7
## v tidyr   1.1.3     v stringr 1.4.0
## v readr   2.0.1     v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(lubridate)
## 
## Attaching package: 'lubridate'
## The following objects are masked from 'package:base':
## 
##     date, intersect, setdiff, union
library(ggplot2)

Second, we read the dataset with the read.csv function. The file is delimited by “;”, so we have to specify the separator. The dataset also contains Turkish characters, so we specify an encoding argument as well.

fish <- read.csv("C:/Users/mctas/Desktop/MEF_BDA/R/balik_hal_fiyatlari.csv", sep=';', encoding = 'UTF-8')
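To check the separator setting without the full file, the same kind of call can be tried on a few rows held in memory. This is only a minimal sketch: the three records are the first rows of this dataset, and `csv_text` and `sample_fish` are names made up for the illustration.

```r
# Three records in the same layout as the real file (semicolon-delimited).
csv_text <- paste(
  "TARIH;MAL_TURU;MAL_ADI;BIRIM;ASGARI_UCRET;AZAMI_UCRET",
  "2021-01-02 00:00:00;BALIK;TIRSI (DENIZ);KG;5.83;12.50",
  "2021-01-02 00:00:00;BALIK;KIRLANGIÇ (DENIZ);KG;3.00;80.00",
  "2021-01-02 00:00:00;BALIK;ÇIMÇIM (DENIZ);KG;3.50;8.00",
  sep = "\n"
)

# sep = ";" splits the fields; without it every row would land in one column.
sample_fish <- read.csv(text = csv_text, sep = ";", stringsAsFactors = FALSE)

# Expect 3 observations of 6 variables, with the two price columns numeric.
str(sample_fish)
```

When reading from a file on disk, `fileEncoding = "UTF-8"` is the read.csv argument that re-encodes the input, whereas `encoding` only marks the strings as UTF-8. If the Turkish characters come out garbled, trying `fileEncoding` is a reasonable next step.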

Then, we pipe the dataset into View() to inspect it in the data viewer.

fish %>% View()

A glimpse of the fish dataframe shows its columns and their types:

glimpse(fish)
## Rows: 18,596
## Columns: 6
## $ TARIH        <chr> "2021-01-02 00:00:00", "2021-01-02 00:00:00", "2021-01-02~
## $ MAL_TURU     <chr> "BALIK", "BALIK", "BALIK", "BALIK", "BALIK", "BALIK", "BA~
## $ MAL_ADI      <chr> "TIRSI (DENIZ)", "KIRLANGIÇ (DENIZ)", "ÇIMÇIM (DENIZ)", "~
## $ BIRIM        <chr> "KG", "KG", "KG", "KG", "KG", "KG", "KG", "KG", "KG", "KG~
## $ ASGARI_UCRET <dbl> 5.83, 3.00, 3.50, 2.50, 45.00, 130.00, 38.00, 25.00, 10.0~
## $ AZAMI_UCRET  <dbl> 12.50, 80.00, 8.00, 5.00, 45.00, 130.00, 38.00, 55.00, 10~

As you can see, the date column, TARIH, is of type character. We convert it into a datetime with the command below.

fish$TARIH <- ymd_hms(fish$TARIH)
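After a conversion like this it can be worth checking that no value failed to parse, since ymd_hms returns NA for strings it cannot read. A minimal sketch, using a made-up vector `raw_dates` whose last element is deliberately invalid:

```r
library(lubridate)

# Two timestamps in the dataset's format plus one invalid string (illustrative).
raw_dates <- c("2021-01-02 00:00:00", "2021-10-21 00:00:00", "not a date")

# quiet = TRUE suppresses the parse warning; failures still come back as NA.
parsed <- ymd_hms(raw_dates, quiet = TRUE)

# Count the values that could not be parsed.
failed <- sum(is.na(parsed))
failed
```

Applied to the real column, a non-zero count would suggest that some rows use a different date format.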

Let’s check the minimum and maximum dates in the dataframe.

min(fish$TARIH)
## [1] "2021-01-02 UTC"
max(fish$TARIH)
## [1] "2021-10-21 UTC"

Next, let’s get a general idea of the overall price levels. The table below shows the mean of the minimum price (ASGARI_UCRET) and the mean of the maximum price (AZAMI_UCRET) for each product type, including regular fish, imported fish, etc.

ortalama_ucretler <- fish %>% group_by(MAL_TURU) %>% summarise(ASGARI_UCRET_ORTALAMA=mean(ASGARI_UCRET), AZAMI_UCRET_ORTALAMA=mean(AZAMI_UCRET))
ortalama_ucretler
## # A tibble: 4 x 3
##   MAL_TURU      ASGARI_UCRET_ORTALAMA AZAMI_UCRET_ORTALAMA
##   <chr>                         <dbl>                <dbl>
## 1 BALIK                          30.9                 68.4
## 2 ITHAL (DONUK)                  24.4                 30.8
## 3 KÜLTÜR                         22.2                 64.6
## 4 TATLI SU                       29.6                 35.8
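A group mean is easier to interpret when the number of records behind it is known. The sketch below adds a count with n() to the same kind of summary. It runs on a small made-up tibble, `toy`, and the column name `KAYIT_SAYISI` (record count) is an assumption, not part of the original analysis.

```r
library(dplyr)

# Made-up records in the same shape as the fish data (illustrative values).
toy <- tibble(
  MAL_TURU     = c("BALIK", "BALIK", "TATLI SU"),
  ASGARI_UCRET = c(5.83, 3.00, 29.00),
  AZAMI_UCRET  = c(12.50, 80.00, 35.00)
)

# Same grouping as above, with the number of records per product type added.
toy_summary <- toy %>%
  group_by(MAL_TURU) %>%
  summarise(
    KAYIT_SAYISI          = n(),
    ASGARI_UCRET_ORTALAMA = mean(ASGARI_UCRET),
    AZAMI_UCRET_ORTALAMA  = mean(AZAMI_UCRET)
  )

toy_summary
```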

To see the daily means of the minimum and maximum prices across all fish products, we group by date and get the following dataframe.

historic_fish_prices <- fish %>% group_by(TARIH) %>% summarise(ASGARI_gunluk_ortalama=mean(ASGARI_UCRET), AZAMI_gunluk_ortalama=mean(AZAMI_UCRET))
historic_fish_prices
## # A tibble: 284 x 3
##    TARIH               ASGARI_gunluk_ortalama AZAMI_gunluk_ortalama
##    <dttm>                               <dbl>                 <dbl>
##  1 2021-01-02 00:00:00                   21.1                  36.7
##  2 2021-01-03 00:00:00                   19.5                  51.1
##  3 2021-01-04 00:00:00                   23.3                  48.4
##  4 2021-01-05 00:00:00                   26.2                  49.7
##  5 2021-01-06 00:00:00                   21.8                  46.7
##  6 2021-01-07 00:00:00                   23.8                  60.3
##  7 2021-01-08 00:00:00                   21.1                  51.8
##  8 2021-01-09 00:00:00                   21.9                  43.8
##  9 2021-01-10 00:00:00                   22.3                  44.9
## 10 2021-01-11 00:00:00                   20.0                  49.6
## # ... with 274 more rows

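Let’s add two more columns to this dataframe. Each one holds a single constant value, the mean of the daily averages over the whole period, so that it appears as a horizontal reference line in a plot. Note that a mean of daily means is not necessarily equal to the mean over all individual records.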

historic_fish_prices$ASGARI_tum_zamanlar_ortalama <- mean(historic_fish_prices$ASGARI_gunluk_ortalama)
historic_fish_prices$AZAMI_tum_zamanlar_ortalama <- mean(historic_fish_prices$AZAMI_gunluk_ortalama)
mean(historic_fish_prices$ASGARI_gunluk_ortalama)
## [1] 30.74063
mean(historic_fish_prices$AZAMI_gunluk_ortalama)
## [1] 65.42487
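The values above are means of daily means, which weight every day equally regardless of how many records it has. The sketch below shows on made-up numbers how this can differ from the mean over all records. It uses base R only, and `toy`, `mean_of_daily_means` and `pooled_mean` are illustrative names.

```r
# One record on the first day, three on the second (made-up prices).
toy <- data.frame(
  TARIH = as.Date(c("2021-01-02", "2021-01-03", "2021-01-03", "2021-01-03")),
  ASGARI_UCRET = c(10, 20, 20, 20)
)

# Mean per day, then the mean of those daily values: (10 + 20) / 2.
daily_means <- tapply(toy$ASGARI_UCRET, toy$TARIH, mean)
mean_of_daily_means <- mean(daily_means)

# Mean over all four records at once: (10 + 20 + 20 + 20) / 4.
pooled_mean <- mean(toy$ASGARI_UCRET)

c(mean_of_daily_means = mean_of_daily_means, pooled_mean = pooled_mean)
```

Here the two results are 15 and 17.5. Which one is appropriate depends on whether each day or each record should carry equal weight.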

Let’s plot the historical fish prices. We use the reshape2 package to convert the dataframe into long format first.

library(reshape2)
## 
## Attaching package: 'reshape2'
## The following object is masked from 'package:tidyr':
## 
##     smiths
historic_fish_prices.long <- melt(historic_fish_prices,
                                  id.vars = "TARIH",
                                  measure.vars = c("ASGARI_tum_zamanlar_ortalama",
                                                   "AZAMI_gunluk_ortalama",
                                                   "ASGARI_gunluk_ortalama",
                                                   "AZAMI_tum_zamanlar_ortalama"))
ggplot(historic_fish_prices.long, aes(TARIH, value, colour = variable)) +
  geom_line(size = 1)
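The same wide-to-long reshaping can also be done with pivot_longer from tidyr, which is part of the tidyverse. This is an alternative sketch, not the method used above. The two rows in `toy_prices` are the first two days of the daily table, and `toy_long` is an illustrative name.

```r
library(tidyr)

# First two days of the daily averages, in wide format.
toy_prices <- data.frame(
  TARIH = as.Date(c("2021-01-02", "2021-01-03")),
  ASGARI_gunluk_ortalama = c(21.1, 19.5),
  AZAMI_gunluk_ortalama  = c(36.7, 51.1)
)

# Every column except TARIH becomes a (variable, value) pair, as melt() does.
toy_long <- pivot_longer(
  toy_prices,
  cols      = -TARIH,
  names_to  = "variable",
  values_to = "value"
)

toy_long
```

Because the output columns are named variable and value, a frame built this way should work with the same ggplot call.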

Results

As can be seen from the graph above, fish prices rose until July 2021 and then started to fall later in the year. Also note that there is an unusual peak in July. A price of 110 TL, whether per kilogram or per single fish, is very high compared to the other prices. Maybe an expensive fish was caught during that time, or someone sold theirs for a very high profit. More analysis is required to answer this question.
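One possible starting point for that analysis is to find the day with the highest daily mean and then list that day's records from most to least expensive. The sketch below is hypothetical: it runs on a made-up tibble, `toy_fish`, with invented product names and prices, and only illustrates the steps.

```r
library(dplyr)

# Made-up records: one very expensive item on the second day.
toy_fish <- tibble(
  TARIH       = as.Date(c("2021-07-01", "2021-07-01", "2021-07-02", "2021-07-02")),
  MAL_ADI     = c("URUN_A", "URUN_B", "URUN_A", "URUN_C"),
  AZAMI_UCRET = c(40, 50, 45, 400)
)

# Step 1: the day with the highest mean of the maximum price.
peak_day <- toy_fish %>%
  group_by(TARIH) %>%
  summarise(AZAMI_gunluk_ortalama = mean(AZAMI_UCRET)) %>%
  slice_max(AZAMI_gunluk_ortalama, n = 1)

# Step 2: that day's records, most expensive first.
peak_rows <- toy_fish %>%
  filter(TARIH %in% peak_day$TARIH) %>%
  arrange(desc(AZAMI_UCRET))

peak_rows
```

On the real data, the top rows of such a listing would show whether a single product drives the peak or whether prices were high across the board that day.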