First, we inspect the data to determine its structure, format, and size, and to spot any data defects.
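library(dplyr)  # provides %>%, filter(), mutate(), group_by(), summarise() and inner_join() used below
fish_price <- read.csv(file = 'https://raw.githubusercontent.com/pjournal/mef05-ustame/gh-pages/balik_hal_fiyatlari.csv',
                       stringsAsFactors = FALSE, header = TRUE, sep = ";", encoding = "UTF-8")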
head(fish_price)
## TARIH MAL_TURU MAL_ADI BIRIM ASGARI_UCRET AZAMI_UCRET
## 1 2021-01-02 00:00:00 BALIK TIRSI (DENİZ) KG 5.83 12.5
## 2 2021-01-02 00:00:00 BALIK KIRLANGIÇ (DENİZ) KG 3.00 80.0
## 3 2021-01-02 00:00:00 BALIK ÇİMÇİM (DENİZ) KG 3.50 8.0
## 4 2021-01-02 00:00:00 BALIK HANOS ( DENİZ ) KG 2.50 5.0
## 5 2021-01-02 00:00:00 BALIK KILIÇ (DENİZ) KG 45.00 45.0
## 6 2021-01-02 00:00:00 BALIK LÜFER ( DENİZ) KG 130.00 130.0
summary(fish_price)
## TARIH MAL_TURU MAL_ADI BIRIM
## Length:18369 Length:18369 Length:18369 Length:18369
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
## ASGARI_UCRET AZAMI_UCRET
## Min. : 0.00 Min. : 0.42
## 1st Qu.: 5.00 1st Qu.: 15.00
## Median : 10.00 Median : 35.00
## Mean : 30.46 Mean : 65.53
## 3rd Qu.: 34.00 3rd Qu.: 90.00
## Max. :650.00 Max. :3500.00
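The output above hints at two possible defects: TARIH is stored as a character string, and the fish names are spaced inconsistently around the parentheses (for example “HANOS ( DENİZ )” and “LÜFER ( DENİZ)”). The following is a minimal sketch of how both could be cleaned, written in base R on a toy data frame. `normalize_name` is a hypothetical helper, not part of the original analysis, and whether the same fish really appears under several spellings in the full dataset is an assumption that would need checking.

```r
# Toy rows mimicking two columns of fish_price (illustrative only)
toy <- data.frame(
  TARIH = c("2021-01-02 00:00:00", "2021-06-02 00:00:00"),
  MAL_ADI = c("HANOS ( DENİZ )", "LÜFER ( DENİZ)"),
  stringsAsFactors = FALSE
)

# Parse the timestamp strings into Date objects (the time part is dropped)
toy$TARIH <- as.Date(toy$TARIH, format = "%Y-%m-%d")

# Hypothetical helper: one space before "(", none just inside the parentheses
normalize_name <- function(x) {
  x <- gsub("\\s*\\(\\s*", " (", x)  # "NAME ( X" or "NAME(X" -> "NAME (X"
  x <- gsub("\\s*\\)", ")", x)       # "X )" -> "X)"
  trimws(gsub("\\s+", " ", x))       # collapse repeated spaces, trim the ends
}

toy$MAL_ADI <- normalize_name(toy$MAL_ADI)
print(toy)
```

Applying such a normalization to the real data would merge names that differ only in spacing, so it could change the per-fish counts and averages reported in this analysis.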
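Using the 2021 data, we can determine which fish stay cheap throughout the year. First, we calculate the average seasonal price for each fish name (“MAL_ADI”), and then use it to decide which fish are suitable to buy this year. The column names are Turkish: TARIH is the date, MAL_TURU the product type (BALIK means fish), MAL_ADI the product name, BIRIM the unit, and ASGARI_UCRET and AZAMI_UCRET the minimum and maximum price.
1. The five fish with the most volatile prices. Volatility is measured as half of the daily price range (AZAMI_UCRET minus ASGARI_UCRET), expressed as a percentage of the daily midpoint price and averaged over all records of each fish.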
fish_price %>%
  filter(MAL_TURU == "BALIK") %>%
  mutate(ay = lubridate::month(TARIH),
         daily_change = AZAMI_UCRET - ASGARI_UCRET,
         daily_avg = (AZAMI_UCRET + ASGARI_UCRET) / 2,
         avg_change = (daily_change / 2) / daily_avg * 100) %>%  # half the range as % of the midpoint
  group_by(MAL_ADI) %>%
  summarise(avg_seasonal_change = mean(avg_change), avg_seasonal = mean(daily_avg)) %>%
  arrange(desc(avg_seasonal_change)) %>%
  head(5)
## # A tibble: 5 x 3
## MAL_ADI avg_seasonal_change avg_seasonal
## <chr> <dbl> <dbl>
## 1 MERCAN (BÜYÜKBOY) 86.3 106.
## 2 KUPEZ (DENİZ) 84.9 11.2
## 3 LİDAKİ (DENİZ) 78.7 35.9
## 4 BARBUN (TEKİR) 78.5 88.0
## 5 ISKORPIT (DENİZ) 77.8 57.8
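To make the volatility measure concrete, here is a small worked example in base R. It uses the minimum and maximum price of row 2 of `head(fish_price)` (KIRLANGIÇ, 3.00 and 80.0) and repeats the arithmetic of the pipeline for that single row. The variable names `asgari` and `azami` are introduced here for illustration only.

```r
asgari <- 3.00  # ASGARI_UCRET (minimum price) of row 2 in head(fish_price)
azami  <- 80.0  # AZAMI_UCRET (maximum price) of the same row

daily_change <- azami - asgari        # daily price range: 77
daily_avg    <- (azami + asgari) / 2  # daily midpoint price: 41.5

# Half of the range as a percentage of the midpoint
avg_change <- (daily_change / 2) / daily_avg * 100
print(round(avg_change, 1))  # 92.8
```

A row whose minimum and maximum prices are equal, such as row 5 (45.00 and 45.0), gives 0. As long as the minimum price is not negative, the measure stays between 0 and 100.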
2. We take a closer look at the fish with the most volatile price, “MERCAN (BÜYÜKBOY)”.
The boxplot below shows the distribution of its daily average price by month.
mercan <- fish_price %>%
  filter(MAL_ADI == "MERCAN (BÜYÜKBOY)") %>%
  mutate(ay = lubridate::month(TARIH), daily_change = AZAMI_UCRET - ASGARI_UCRET,
         daily_avg = (AZAMI_UCRET + ASGARI_UCRET) / 2)
boxplot(daily_avg~ay,
data=mercan,
main="Mercan Data",
xlab="Month Number",
ylab="Daily Average",
col="orange",
border="brown"
)
On average, “MERCAN” can be purchased at a suitable price in the first four months of the year.
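The boxplot is read by eye, and a table of monthly medians would make the same comparison numeric. The following is a minimal sketch in base R on made-up values. The `toy` data frame and its numbers are illustrative assumptions, not rows from the dataset, and only the column names `ay` and `daily_avg` are taken from the code above.

```r
# Made-up daily average prices for three months (illustrative only)
toy <- data.frame(
  ay        = c(1, 1, 2, 2, 5, 5),
  daily_avg = c(60, 80, 70, 90, 120, 140)
)

# Median daily average price per month
monthly <- aggregate(daily_avg ~ ay, data = toy, FUN = median)

# Months ordered from cheapest to most expensive
monthly <- monthly[order(monthly$daily_avg), ]
print(monthly)
```

The same call on the real data would be `aggregate(daily_avg ~ ay, data = mercan, FUN = median)`.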
3. The five fish with the least volatile prices:
fish_price %>%
  filter(MAL_TURU == "BALIK") %>%
  mutate(ay = lubridate::month(TARIH),
         daily_change = AZAMI_UCRET - ASGARI_UCRET,
         daily_avg = (AZAMI_UCRET + ASGARI_UCRET) / 2,
         avg_change = (daily_change / 2) / daily_avg * 100) %>%  # half the range as % of the midpoint
  group_by(MAL_ADI) %>%
  summarise(avg_seasonal_change = mean(avg_change), avg_seasonal = mean(daily_avg)) %>%
  arrange(desc(avg_seasonal_change)) %>%
  tail(5)
## # A tibble: 5 x 3
## MAL_ADI avg_seasonal_change avg_seasonal
## <chr> <dbl> <dbl>
## 1 KELER(DENİZ) 0 60
## 2 KRAÇA (DONUK) 0 2.5
## 3 LİPSÖZ ( DENİZ ) 0 60
## 4 MARYA(DENİZ) 0 33
## 5 MEZGİT (DONUK) 0 25
Next, we examine the distribution of the daily average price of “MEZGİT (DONUK)” by month.
mezgit <- fish_price %>%
  filter(MAL_ADI == "MEZGİT (DONUK)") %>%
  mutate(ay = lubridate::month(TARIH), daily_change = AZAMI_UCRET - ASGARI_UCRET,
         daily_avg = (AZAMI_UCRET + ASGARI_UCRET) / 2)
boxplot(daily_avg~ay,
data=mezgit,
main="Mezgit Data",
xlab="Month Number",
ylab="Daily Average",
col="red",
border="brown"
)
Mezgit data
mezgit
## TARIH MAL_TURU MAL_ADI BIRIM ASGARI_UCRET AZAMI_UCRET ay
## 1 2021-06-02 00:00:00 BALIK MEZGİT (DONUK) KG 25 25 6
## daily_change daily_avg
## 1 0 25
The analysis shows that “MEZGİT (DONUK)” does not have enough data, because it was sold on only one day. Its volatility of 0 therefore comes from a single record.
I therefore excluded fish with fewer than 180 days of sales.
elimanted <- fish_price %>%
  filter(MAL_TURU == "BALIK") %>%
  group_by(MAL_ADI) %>%
  count() %>%
  filter(n >= 180)  # keep fish with at least 180 rows
fish_prise_eliminated <- fish_price %>% inner_join(elimanted, by = "MAL_ADI")
str(fish_prise_eliminated)
## 'data.frame': 13283 obs. of 7 variables:
## $ TARIH : chr "2021-01-02 00:00:00" "2021-01-02 00:00:00" "2021-01-02 00:00:00" "2021-01-02 00:00:00" ...
## $ MAL_TURU : chr "BALIK" "BALIK" "BALIK" "BALIK" ...
## $ MAL_ADI : chr "TIRSI (DENİZ)" "KIRLANGIÇ (DENİZ)" "ÇİMÇİM (DENİZ)" "HANOS ( DENİZ )" ...
## $ BIRIM : chr "KG" "KG" "KG" "KG" ...
## $ ASGARI_UCRET: num 5.83 3 3.5 2.5 130 38 10 2.5 3 90 ...
## $ AZAMI_UCRET : num 12.5 80 8 5 130 38 10 6.67 6 120 ...
## $ n : int 281 247 247 193 209 194 181 247 259 277 ...
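The filter above counts rows per fish, which equals the number of sale days only if each fish has at most one row per date. That is an assumption about the data, not something the output confirms. A minimal base R sketch that counts distinct dates instead, on a toy data frame with a hypothetical threshold `min_days`:

```r
# Toy data: fish "A" has two rows on the same date, fish "B" was sold on one day only
toy <- data.frame(
  MAL_ADI = c("A", "A", "A", "B"),
  TARIH   = c("2021-01-02", "2021-01-02", "2021-01-03", "2021-06-02"),
  stringsAsFactors = FALSE
)

min_days <- 2  # the report uses 180; 2 keeps the toy example small

# Number of distinct sale dates per fish
days_sold <- tapply(toy$TARIH, toy$MAL_ADI, function(d) length(unique(d)))

# Keep only the rows of fish that were sold on enough distinct days
keep     <- names(days_sold)[days_sold >= min_days]
toy_kept <- toy[toy$MAL_ADI %in% keep, ]
print(toy_kept)
```

Unlike the `inner_join` above, this row filter does not add the count column `n` to the result.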
With the filtered data, we again determine the five fish with the least volatile prices.
datamin <- fish_prise_eliminated %>%
  filter(MAL_TURU == "BALIK") %>%
  mutate(ay = lubridate::month(TARIH),
         daily_change = AZAMI_UCRET - ASGARI_UCRET,
         daily_avg = (AZAMI_UCRET + ASGARI_UCRET) / 2,
         avg_change = (daily_change / 2) / daily_avg * 100) %>%  # half the range as % of the midpoint
  group_by(MAL_ADI) %>%
  summarise(avg_seasonal_change = mean(avg_change), avg_seasonal = mean(daily_avg)) %>%
  arrange(desc(avg_seasonal_change)) %>%
  tail(5)
datamin
## # A tibble: 5 x 3
## MAL_ADI avg_seasonal_change avg_seasonal
## <chr> <dbl> <dbl>
## 1 LÜFER(İRİ BOY) 18.7 103.
## 2 LAHOZ (DENİZ) 17.3 163.
## 3 MASKO DENİZ 17.3 14.9
## 4 LÜFER ( DENİZ) 14.4 108.
## 5 KOFANA ( DENİZ ) 10.5 83.0
“KOFANA ( DENİZ )” has the least volatile price.
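# datamin is sorted in descending order, so its last row is the least volatile fish (Kofana)
kofana <- fish_prise_eliminated %>%
  inner_join(tail(datamin, 1), by = "MAL_ADI") %>%
  mutate(ay = lubridate::month(TARIH), daily_change = AZAMI_UCRET - ASGARI_UCRET, daily_avg = (AZAMI_UCRET + ASGARI_UCRET) / 2)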
boxplot(daily_avg~ay,
data=kofana,
main="Kofana Data",
xlab="Month Number",
ylab="Daily Average",
col="green",
border="brown"
)
“Kofana” can be bought all season long, because its price changes less than that of the other fish species.