Four data sets are used in the project.
These data sets were taken from TUIK, an official data source that provides data about Turkey on diverse topics. Link to their website. Links for the data sets that are used here are in link1, link2, link3 and link4.
They will be processed before any calculations or visualizations, both to avoid possible errors and to make them easier to use.
The data sets contain multiple header rows, column names written in two languages and other explanations that have no place in a data frame. There could also be missing values, symbols and incorrect data types in the data. These will be examined and fixed as well.
Processing will start with importing the necessary libraries.
library(readxl)
library(knitr)
library(dplyr)
library(tidyr)
library(reshape2)
library(writexl)
dep <- readxl::read_excel("term_project/Number of Departing Visitors by Country of Residence.xls", skip = 4)
glimpse(dep)
Rows: 47
Columns: 11
$ Nationality <chr> "A.B.D. - USA", "Almanya - Germany", "Avusturya - Austria"…
$ `2012` <dbl> 883408, 7305228, 832019, 663758, 838895, 2643699, 1514894,…
$ `2013` <dbl> 856728, 7378650, 916069, 715578, 859199, 2738368, 1640259,…
$ `2014` <dbl> 888077, 7794762, 802133, 726078, 809843, 2818021, 1701021,…
$ `2015` <dbl> 833850, 8402180, 836755, 650569, 814868, 2776057, 1826947,…
$ `2016` <dbl> 505989, 6960545, 677284, 656685, 627223, 1957576, 1710276,…
$ `2017` <dbl> 331239, 7117716, 578074, 792883, 651702, 1951637, 1854683,…
$ `2018` <dbl> 468281, 8022883, 620002, 916429, 751660, 2575768, 2387679,…
$ `2019` <dbl> 626298, 8861124, 652020, 933291, 755681, 2978764, 2719962,…
$ `2020(1)` <dbl> 148914, 2903189, 238682, 250087, 219196, 1122967, 1190803,…
$ `2021` <dbl> 365211, 6314266, 516081, 512215, 459091, 473681, 1339552, …
The first 4 rows are skipped to get the proper header. A similar step excludes the descriptions at the bottom of the data frame.
dep <- head(dep, -15)
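As a small illustration of the trimming step, a negative `n` in `head()` keeps everything except the last `|n|` rows. The toy data frame below is made up for the example; only the shape of the call matches the code above.

```r
# Toy data: three data rows followed by two footnote rows (made-up values).
toy <- data.frame(
  Nationality = c("A", "B", "C", "Note 1", "Note 2"),
  `2012` = c(10, 20, 30, NA, NA),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# head() with a negative n drops the last |n| rows.
trimmed <- head(toy, -2)
nrow(trimmed)
```

The number of rows to drop has to be read off the source file by hand, so it is worth rechecking whenever the file is updated.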
The second problem is in the “Nationality” column. Country names are written in two languages separated by a dash, which can be used to drop one of them. The string " - " (including the surrounding spaces) is passed to the separate function so that the white space is removed as well.
dep <- dep %>%
  separate(Nationality, c(NA,"Nationality")," - ")
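The same idea can be sketched in base R, which makes the string operation itself visible. This is an alternative to `separate()`, not the method used above. The labels are copied from the `glimpse()` output, and the sketch keeps everything after the first " - ", so it would behave differently from `separate()` on a label containing more than one separator.

```r
# Bilingual labels in the "Turkish - English" form used by the source file.
labels <- c("A.B.D. - USA", "Almanya - Germany", "Avusturya - Austria")

# Drop everything up to and including the first " - ",
# which leaves only the English part.
english <- sub("^.*? - ", "", labels, perl = TRUE)
english
```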
The column for the year 2020 carries a “(1)” suffix. It is a footnote marker saying that the column only contains data for the first, third and last quarters of the year due to the lack of surveys (COVID-19).
It will be removed from the column name.
names(dep)[names(dep) == "2020(1)"] <- "2020"
kable(head(dep, 3))
Nationality | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 | 2020 | 2021 |
---|---|---|---|---|---|---|---|---|---|---|
USA | 883408 | 856728 | 888077 | 833850 | 505989 | 331239 | 468281 | 626298 | 148914 | 365211 |
Germany | 7305228 | 7378650 | 7794762 | 8402180 | 6960545 | 7117716 | 8022883 | 8861124 | 2903189 | 6314266 |
Austria | 832019 | 916069 | 802133 | 836755 | 677284 | 578074 | 620002 | 652020 | 238682 | 516081 |
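If more columns carried footnote markers, renaming them one by one would get tedious. A minimal sketch of a pattern-based alternative follows, assuming every marker has the form "(digits)" at the end of the name. The vector of names is illustrative.

```r
# Column names as they appear after import; "2020(1)" carries a footnote marker.
cols <- c("Nationality", "2019", "2020(1)", "2021")

# Remove a trailing "(digits)" marker from any name that has one.
clean <- sub("\\([0-9]+\\)$", "", cols)
clean
```

Applied to the data frame this would read `names(dep) <- sub("\\([0-9]+\\)$", "", names(dep))`.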
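Lastly, the year columns should be integers. The conversion only warns when a value cannot be coerced, and cells that are already missing pass through silently, so empty cells are worth checking separately. The data frame is saved as an Excel file and as an RDS file, in both its wide and its melted (long) form.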
dep <- dep %>%
  mutate(across("2012":"2021", as.integer))
write_xlsx(dep,"term_project/depart_by_residence.xlsx")
saveRDS(dep, file = "term_project/depart_by_residence.rds")
melted_dep <- melt(dep, "Nationality")
colnames(melted_dep) <- c('Nationality','Dep_Year','Departing_Visitors')
write_xlsx(melted_dep,"term_project/melted_depart_by_residence.xlsx")
saveRDS(melted_dep, file = "term_project/melted_depart_by_residence.rds")
glimpse(dep)
Rows: 32
Columns: 11
$ Nationality <chr> "USA", "Germany", "Austria", "Azerbaijan", "Belgium", "Uni…
$ `2012` <int> 883408, 7305228, 832019, 663758, 838895, 2643699, 1514894,…
$ `2013` <int> 856728, 7378650, 916069, 715578, 859199, 2738368, 1640259,…
$ `2014` <int> 888077, 7794762, 802133, 726078, 809843, 2818021, 1701021,…
$ `2015` <int> 833850, 8402180, 836755, 650569, 814868, 2776057, 1826947,…
$ `2016` <int> 505989, 6960545, 677284, 656685, 627223, 1957576, 1710276,…
$ `2017` <int> 331239, 7117716, 578074, 792883, 651702, 1951637, 1854683,…
$ `2018` <int> 468281, 8022883, 620002, 916429, 751660, 2575768, 2387679,…
$ `2019` <int> 626298, 8861124, 652020, 933291, 755681, 2978764, 2719962,…
$ `2020` <int> 148914, 2903189, 238682, 250087, 219196, 1122967, 1190803,…
$ `2021` <int> 365211, 6314266, 516081, 512215, 459091, 473681, 1339552, …
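Because `as.integer()` converts a missing double to a missing integer without any message, an explicit count of missing cells is a more direct check than watching for warnings. A minimal sketch on made-up values:

```r
# Toy wide table with one missing cell (made-up values).
toy <- data.frame(
  Nationality = c("USA", "Germany"),
  `2020` = c(148914, NA),
  `2021` = c(365211, 6314266),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# as.integer() on a double column turns NA into NA_integer_ without a warning.
toy[["2020"]] <- as.integer(toy[["2020"]])

# Count missing cells per column to make gaps visible.
na_per_column <- colSums(is.na(toy))
na_per_column
```

The same `colSums(is.na(...))` call could be run on `dep` after the conversion.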
The second data set is processed next. Manipulations that repeat the previous section are not explained in detail, to keep the text readable. Unusable rows are excluded again.
This time, column names will be checked first. Only the first 5 column names are shown.
purp <- readxl::read_excel("term_project/Foreign and Citizen Visitors by Purpose of Visit (Foreigner and Citizens Resident Abroad).xls", skip = 5)
purp <- head(purp, -9)
colnames(purp)[1:5]
[1] "Yıl"
[2] "Çeyrek"
[3] "Toplam \nTotal...3"
[4] "Gezi, eğlence, sportif ve kültürel faaliyetler \nTravel, entertainment, sportive or cultural activities...4"
[5] "Akraba ve arkadaş ziyareti \nVisiting relatives and friends...5"
These column names are chaotic enough to confuse the user. There are several things that can be done.
First of all, they are given in two languages again. Most of them are separated by the new line character “\n”. The year and quarter columns are duplicated in two languages; the duplicates will be dropped.
In this chunk, everything before the “\n” character is dropped if there is one, duplicated columns are dropped, some other characters are removed and spaces are replaced with “_”.
names(purp)[3:23] <- sub(".*?\n", "", names(purp)[3:23])
names(purp) <- gsub(r"{\s*\([^\)]+\)}","",names(purp))
names(purp) <- trimws(names(purp), "l")
names(purp) <- gsub(" / ", "_", names(purp))
names(purp) <- gsub(" ", "_", names(purp))
names(purp)[3:7] <- substr(names(purp)[3:7],1,nchar(names(purp)[3:7])-4)
names(purp)[8:23] <- substr(names(purp)[8:23],1,nchar(names(purp)[8:23])-5)
names(purp) <- gsub(",", "", names(purp))
purp <- subset(purp, select = -c(Yıl,Çeyrek))
purp <- purp[,-11]
names(purp)[1:5]
[1] "Total"
[2] "Travel_entertainment_sportive_or_cultural_activities"
[3] "Visiting_relatives_and_friends"
[4] "Education_training"
[5] "Health_or_medical_reasons"
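The `substr()` calls above cut a fixed number of characters: 4 for columns 3 to 7 and 5 for columns 8 to 23. That matches the `...N` suffix that readxl appends to repeated headers only when the suffix has the assumed width. Columns 8 and 9 would carry a 4-character suffix, which may be why the names `Religion_Pilgrimag` and `Shoppin` appear clipped in the output further down. A pattern-based sketch that does not depend on the number of digits follows. The names are illustrative, modelled on the output above.

```r
# Names with the positional suffix added to repeated headers.
raw <- c("Total...3", "Shopping...9", "Transit...10", "Year")

# Strip "..." followed by digits at the end, whatever the number of digits.
clean <- sub("\\.{3}[0-9]+$", "", raw)
clean
```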
There are still duplicated column names, yet they hold different values. The first set of columns represents the values for all tourists, while the second set represents the values for Turkish citizens who live abroad. They can be distinguished with prefixes.
names(purp)[1:10] <- paste0("ALL_", names(purp)[1:10])
names(purp)[11:20] <- paste0("TR_", names(purp)[11:20])
purp <- purp %>% select(Year, Quarter, ALL_Total:TR_Other)
glimpse(purp)
Rows: 54
Columns: 22
$ Year <dbl> 2012, NA, NA,…
$ Quarter <chr> "Annual", "I"…
$ ALL_Total <chr> "36463921.041…
$ ALL_Travel_entertainment_sportive_or_cultural_activities <chr> "24953961", "…
$ ALL_Visiting_relatives_and_friends <chr> "6792033", "1…
$ ALL_Education_training <chr> "231152", "51…
$ ALL_Health_or_medical_reasons <chr> "240682", "63…
$ ALL_Religion_Pilgrimag <chr> "73510", "112…
$ ALL_Shoppin <chr> "934204", "14…
$ ALL_Transit <chr> "45194", "161…
$ ALL_Business <chr> "2224844", "5…
$ ALL_Other <chr> "968339", "22…
$ TR_Total <chr> "5121457", "8…
$ TR_Travel_entertainment_sportive_or_cultural_activities <chr> "1083976", "1…
$ TR_Visiting_relatives_and_friends <chr> "3645145", "5…
$ TR_Education_training <chr> "21768", "621…
$ TR_Health_or_medical_reasons <chr> "67151", "239…
$ TR_Religion_Pilgrimage <chr> "5973", "1775…
$ TR_Shopping <chr> "26330", "931…
$ TR_Transit <chr> "-", "-", "-"…
$ TR_Business <chr> "244774", "68…
$ TR_Other <chr> "26340", "399…
kable(head(purp))
Year | Quarter | ALL_Total | ALL_Travel_entertainment_sportive_or_cultural_activities | ALL_Visiting_relatives_and_friends | ALL_Education_training | ALL_Health_or_medical_reasons | ALL_Religion_Pilgrimag | ALL_Shoppin | ALL_Transit | ALL_Business | ALL_Other | TR_Total | TR_Travel_entertainment_sportive_or_cultural_activities | TR_Visiting_relatives_and_friends | TR_Education_training | TR_Health_or_medical_reasons | TR_Religion_Pilgrimage | TR_Shopping | TR_Transit | TR_Business | TR_Other |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2012 | Annual | 36463921.041000001 | 24953961 | 6792033 | 231152 | 240682 | 73510 | 934204 | 45194 | 2224844 | 968339 | 5121457 | 1083976 | 3645145 | 21768 | 67151 | 5973 | 26330 | - | 244774 | 26340 |
NA | I | 4219162 | 2005504 | 1168263 | 51489 | 63843 | 11203 | 148181 | 16131 | 532073 | 222474 | 844430 | 130458 | 599931 | 6212 | 23962 | 1775 | 9318 | - | 68779 | 3994 |
NA | II | 9323459 | 6752489 | 1189394 | 88929 | 58283 | 28866 | 244106 | 10525 | 654678 | 296190 | 911152 | 199092 | 586250 | 7134 | 16861 | 649 | 9079 | - | 81063 | 11025 |
NA | III | 15437123 | 11439809 | 3072660 | 43643 | 44905 | 14690 | 204199 | 13743 | 353468 | 250005 | 2276006 | 499200 | 1722066 | 5164 | 8951 | 1377 | 4131 | - | 26509 | 8607 |
NA | IV | 7484177 | 4756160 | 1361716 | 47091 | 73652 | 18751 | 337718 | 4795 | 684625 | 199669 | 1089869 | 255226 | 736898 | 3258 | 17377 | 2172 | 3801 | - | 68422 | 2715 |
2013 | Annual | 39226225.794699997 | 26817201 | 7239397 | 195918 | 300102 | 62762 | 1000734 | 41172 | 2404344 | 1164596 | 5398751.7854000004 | 1322033 | 3655078 | 20330 | 90579 | 4681 | 39986 | - | 255493 | 10571 |
There are still problems with the data. The year is only given on the first row of each block and should be repeated on the rows that follow. There are problems with the data types as well. Lastly, the Transit column for Turkish citizens is filled with dashes; it is simpler to keep them as zeros.
Missing values in the Year column are filled below.
purp <- purp %>% fill(Year)
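To make the mechanics of the fill step visible, here is a base-R sketch of the same "carry the last value forward" idea on a made-up vector. It is an equivalent shown for illustration, not the call used above, and it assumes the first element is not missing.

```r
# Year is only present on the first row of each block (made-up toy vector).
year <- c(2012, NA, NA, 2013, NA)

# Index the non-missing values by a running count of non-missing positions,
# which repeats each value until the next one appears.
filled <- year[!is.na(year)][cumsum(!is.na(year))]
filled
```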
Transit column is filled with zeros.
purp$TR_Transit <- 0
Annual values for 2022 are empty, and the values for the second quarter of 2020 are filled with dashes, as explained before. There are other instances of dashes in the data set as well. They will all be replaced with zeros.
kable(filter(purp, rowSums(is.na(purp)) > 0 | (Year == 2020 & Quarter == "II")))
Year | Quarter | ALL_Total | ALL_Travel_entertainment_sportive_or_cultural_activities | ALL_Visiting_relatives_and_friends | ALL_Education_training | ALL_Health_or_medical_reasons | ALL_Religion_Pilgrimag | ALL_Shoppin | ALL_Transit | ALL_Business | ALL_Other | TR_Total | TR_Travel_entertainment_sportive_or_cultural_activities | TR_Visiting_relatives_and_friends | TR_Education_training | TR_Health_or_medical_reasons | TR_Religion_Pilgrimage | TR_Shopping | TR_Transit | TR_Business | TR_Other |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2020 | II | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | - | - |
2022 | Annual | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 0 | NA | NA |
purp <- data.frame(lapply(purp, gsub, pattern = "-", replacement = 0))
purp[is.na(purp)] <- 0
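Note that `gsub()` with the pattern "-" replaces every hyphen in every column, so a hyphen inside a label or a minus sign would be rewritten as well. A stricter sketch, on made-up values, that only touches cells consisting of a single dash:

```r
# Toy character table: one placeholder dash, one missing cell and one
# hyphenated label (made-up values).
toy <- data.frame(
  Quarter = c("Annual", "II", "Jan-Mar"),
  Total = c("100", "-", NA),
  stringsAsFactors = FALSE
)

# Replace only cells that are exactly "-", then fill the missing cells.
toy[] <- lapply(toy, function(col) {
  col[!is.na(col) & trimws(col) == "-"] <- "0"
  col[is.na(col)] <- "0"
  col
})
toy
```

Whether zero is the right stand-in for "not surveyed" depends on the later analysis. Keeping `NA` would avoid zeros being read as real counts.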
Lastly, the columns are ready to be converted to their correct types and the data is ready to be exported.
purp[,-2] <- sapply(purp[,-2],as.integer)
write_xlsx(purp,"term_project/purposes.xlsx")
saveRDS(purp, file = "term_project/purposes.rds")
glimpse(purp)
Rows: 54
Columns: 22
$ Year <int> 2012, 2012, 2…
$ Quarter <chr> "Annual", "I"…
$ ALL_Total <int> 36463921, 421…
$ ALL_Travel_entertainment_sportive_or_cultural_activities <int> 24953961, 200…
$ ALL_Visiting_relatives_and_friends <int> 6792033, 1168…
$ ALL_Education_training <int> 231152, 51489…
$ ALL_Health_or_medical_reasons <int> 240682, 63843…
$ ALL_Religion_Pilgrimag <int> 73510, 11203,…
$ ALL_Shoppin <int> 934204, 14818…
$ ALL_Transit <int> 45194, 16131,…
$ ALL_Business <int> 2224844, 5320…
$ ALL_Other <int> 968339, 22247…
$ TR_Total <int> 5121457, 8444…
$ TR_Travel_entertainment_sportive_or_cultural_activities <int> 1083976, 1304…
$ TR_Visiting_relatives_and_friends <int> 3645145, 5999…
$ TR_Education_training <int> 21768, 6212, …
$ TR_Health_or_medical_reasons <int> 67151, 23962,…
$ TR_Religion_Pilgrimage <int> 5973, 1775, 6…
$ TR_Shopping <int> 26330, 9318, …
$ TR_Transit <int> 0, 0, 0, 0, 0…
$ TR_Business <int> 244774, 68779…
$ TR_Other <int> 26340, 3994, …
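Each table is written both as an Excel file and as an RDS file. One practical difference is that an RDS file restores the R object exactly, including column types, so the integer conversion does not have to be repeated after reloading. A minimal round-trip sketch on made-up values, using a temporary file rather than the project paths:

```r
# Toy frame with an integer column (made-up values).
toy <- data.frame(
  Year = c(2012L, 2013L),
  Quarter = c("Annual", "I"),
  stringsAsFactors = FALSE
)

# Write to a temporary file and read it back.
path <- tempfile(fileext = ".rds")
saveRDS(toy, file = path)
restored <- readRDS(path)

# Compare the restored object with the original.
identical(toy, restored)
```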
This data set is quite similar to the previous one.
night <- readxl::read_excel("term_project/Tourism Income, Expenditure and Average Number of Nights.xls", skip = 4)
night <- head(night, -9)
colnames(night)
[1] "Yıl\nYear"
[2] "Yıllık-Annual\nÇeyrek-Quarter"
[3] "Turizm geliri\nTourism income\n( 000 $)"
[4] "Ziyaretçi sayısı\nNumber of departing \nvisitors"
[5] "Kişi başı ortalama \nharcama\nAverage expenditure per capita\n($)...5"
[6] "Ortalama geceleme sayısı \nAverage number of overnights...6"
[7] "...7"
[8] "Turizm gideri\nTourism expenditure\n( 000 $)"
[9] "\nTürkiye'de ikamet eden yurt dışını ziyaret eden vatandaş sayısı\nNumber of citizens (resident in Turkey) visited abroad"
[10] "Kişi başı ortalama \nharcama\nAverage expenditure per capita\n($)...10"
[11] "Ortalama geceleme sayısı \nAverage number of overnights...11"
The usual suspects could be eliminated with similar methods. Yet these few column names each have their own problems, so it is shorter to rename most of them manually.
night <- night[,-7]
names(night) <- sub(".*?\n", "", names(night))
colnames(night)[2] <- "Quarter"
colnames(night)[3] <- "Tourism_Income_in_ThousandDollars"
colnames(night)[4] <- "Number_of_Departing_Visitors"
colnames(night)[5] <- "ALL_Average_expenditure_per_capita_in_Dollars"
colnames(night)[6] <- "ALL_Average_number_of_overnights"
colnames(night)[7] <- "Tourism_expenditure_in_ThousandDollars"
colnames(night)[8] <- "Number_of_Turkish_citizens_visited_abroad"
colnames(night)[9] <- "TR_Average_expenditure_per_capita_in_Dollars"
colnames(night)[10] <- "TR_Average_number_of_overnights"
names(night)
[1] "Year"
[2] "Quarter"
[3] "Tourism_Income_in_ThousandDollars"
[4] "Number_of_Departing_Visitors"
[5] "ALL_Average_expenditure_per_capita_in_Dollars"
[6] "ALL_Average_number_of_overnights"
[7] "Tourism_expenditure_in_ThousandDollars"
[8] "Number_of_Turkish_citizens_visited_abroad"
[9] "TR_Average_expenditure_per_capita_in_Dollars"
[10] "TR_Average_number_of_overnights"
Current status of the data frame:
kable(tail(night))
Year | Quarter | Tourism_Income_in_ThousandDollars | Number_of_Departing_Visitors | ALL_Average_expenditure_per_capita_in_Dollars | ALL_Average_number_of_overnights | Tourism_expenditure_in_ThousandDollars | Number_of_Turkish_citizens_visited_abroad | TR_Average_expenditure_per_capita_in_Dollars | TR_Average_number_of_overnights |
---|---|---|---|---|---|---|---|---|---|
NA | III | 14126732 | 13640672.334207579 | 1035.6331193861683 | 11.36096473622375 | 584378.93482789095 | 873026.66579242004 | 669.37123197430367 | 18.717043940553779 |
NA | IV | 9306804 | 9050112.2298819609 | 1028.3634228612639 | 13.23171319647536 | 696183.34344183875 | 1188802.7701180396 | 585.61719482930937 | 16.496943017713928 |
2022 | Yıllık-Annual | NA | NA | NA | NA | NA | NA | NA | NA |
NA | I | 6561011 | 6451656.9355938099 | 1016.949711910887 | 12.455328498538254 | 664989.01667372952 | 1039666.0644061896 | 639.61789216765612 | 17.470795565958081 |
NA | II | 10515168 | 11939130.535165597 | 880.73147956867979 | 9.440213100360511 | 1057787.2055852062 | 1666135.4648344023 | 634.87467130431673 | 16.310156750210435 |
NA | III | 17952361 | 21000127.519801684 | 854.8691370122466 | 9.7038686869594368 | 1106285.4788077581 | 2072116.4801983174 | 533.89154971726202 | 9.8233256160661639 |
Missing values in the Year column are filled again.
night <- night %>% fill(Year)
Some of the cells contain two languages. They are changed.
night$Quarter[night$Quarter == "Yıllık-Annual"] <- "Anual"
Next, null values and dashes are filled with zeros.
night <- data.frame(lapply(night, gsub, pattern = "-", replacement = 0))
night[is.na(night)] <- 0
Lastly, data types are changed and the data frame is exported.
night[,-2] <- sapply(night[,-2],as.numeric)
night[,1] <- sapply(night[,1],as.integer)
night[,4] <- sapply(night[,4],as.integer)
night[,8] <- sapply(night[,8],as.integer)
write_xlsx(night,"term_project/income_nights.xlsx")
saveRDS(night, file = "term_project/income_nights.rds")
kable(head(night))
Year | Quarter | Tourism_Income_in_ThousandDollars | Number_of_Departing_Visitors | ALL_Average_expenditure_per_capita_in_Dollars | ALL_Average_number_of_overnights | Tourism_expenditure_in_ThousandDollars | Number_of_Turkish_citizens_visited_abroad | TR_Average_expenditure_per_capita_in_Dollars | TR_Average_number_of_overnights |
---|---|---|---|---|---|---|---|---|---|
2012 | Anual | 29689249 | 36463921 | 814.2089 | 10.820622 | 4593390 | 5802949 | 791.5612 | 12.500000 |
2013 | Anual | 33073502 | 39226225 | 843.1477 | 10.203777 | 5253565 | 7525869 | 698.0675 | 13.085486 |
2014 | Anual | 35137949 | 41415070 | 848.4339 | 9.986969 | 5470481 | 7982263 | 685.3295 | 12.901660 |
2015 | Anual | 32492212 | 41617530 | 780.7338 | 10.065018 | 5698423 | 8750851 | 651.1850 | 11.942026 |
2016 | Anual | 22839468 | 31365329 | 728.1756 | 11.353400 | 5049793 | 7891909 | 639.8697 | 10.997575 |
2017 | Anual | 27044542 | 38620345 | 700.2667 | 10.864540 | 5137244 | 8886916 | 578.0681 | 9.872048 |
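The visitor counts in this file are estimates with fractional parts, as the `tail(night)` output above shows. `as.integer()` drops the fractional part instead of rounding, so a value just below a whole number ends up one lower than a rounded value would. A small sketch with made-up numbers shows the difference:

```r
# Made-up estimates with fractional parts.
x <- c(36463920.99, 13640672.33)

truncated <- as.integer(x)         # drops the fractional part
rounded   <- as.integer(round(x))  # rounds to the nearest whole number first

truncated
rounded
```

If rounded counts are preferred, wrapping the column in `round()` before `as.integer()` would be enough.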
This data set has two header rows. The first one, which shows the years, is not included here.
mon <- readxl::read_excel("term_project/Tourism income, number of visitors and average expenditure per capita by months.xls", skip = 4)
mon <- head(mon, -9)
colnames(mon)[1:7]
[1] "Aylar - Months"
[2] "Turizm geliri Tourism\nincome\n(000 $)...2"
[3] "Ziyaretçi\nsayısı\nNumber of\nvisitors...3"
[4] "Kişi başı \nortalama\nharcama\nAverage\nexpenditure\nper capita\n($)...4"
[5] "Turizm geliri Tourism\nincome\n(000 $)...5"
[6] "Ziyaretçi\nsayısı\nNumber of\nvisitors...6"
[7] "Kişi başı \nortalama\nharcama\nAverage\nexpenditure\nper capita\n($)...7"
Column names are edited again.
names(mon)[1] <- "Months"
names(mon)[c(2,5,8,11,14,17,20,23,26,29,32)] <- "Tourism_Income_in_ThousandDollars"
names(mon)[c(3,6,9,12,15,18,21,24,27,30,33)] <- "Number_of_Visitors"
names(mon)[c(4,7,10,13,16,19,22,25,28,31,34)] <- "Average_expenditure_per_capita"
names(mon)[2:4] <- paste0(names(mon)[2:4], "_2012")
names(mon)[5:7] <- paste0(names(mon)[5:7], "_2013")
names(mon)[8:10] <- paste0(names(mon)[8:10], "_2014")
names(mon)[11:13] <- paste0(names(mon)[11:13], "_2015")
names(mon)[14:16] <- paste0(names(mon)[14:16], "_2016")
names(mon)[17:19] <- paste0(names(mon)[17:19], "_2017")
names(mon)[20:22] <- paste0(names(mon)[20:22], "_2018")
names(mon)[23:25] <- paste0(names(mon)[23:25], "_2019")
names(mon)[26:28] <- paste0(names(mon)[26:28], "_2020")
names(mon)[29:31] <- paste0(names(mon)[29:31], "_2021")
names(mon)[32:34] <- paste0(names(mon)[32:34], "_2022")
names(mon)[1:7]
[1] "Months"
[2] "Tourism_Income_in_ThousandDollars_2012"
[3] "Number_of_Visitors_2012"
[4] "Average_expenditure_per_capita_2012"
[5] "Tourism_Income_in_ThousandDollars_2013"
[6] "Number_of_Visitors_2013"
[7] "Average_expenditure_per_capita_2013"
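The renaming above follows a regular pattern: three measures repeated for each year from 2012 to 2022. Under that assumption, the full vector of names can be generated in one step. This is a compact sketch of the same result, not the code that produced the output above.

```r
years <- 2012:2022
measures <- c(
  "Tourism_Income_in_ThousandDollars",
  "Number_of_Visitors",
  "Average_expenditure_per_capita"
)

# Repeat the three measures once per year and pair each with its year.
new_names <- c(
  "Months",
  paste0(rep(measures, times = length(years)), "_",
         rep(years, each = length(measures)))
)

length(new_names)
new_names[1:4]
```

Assigning with `names(mon) <- new_names` would only be safe if `mon` has exactly 34 columns in the order shown.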
The Months column is reduced to its English part.
mon <- mon %>%
  separate(Months, c(NA,"Months")," - ")
Dashes and null values are filled with zeros.
mon <- data.frame(lapply(mon, gsub, pattern = "-", replacement = 0))
mon[is.na(mon)] <- 0
Lastly, data types are changed and the data frame is exported.
mon[,-1] <- sapply(mon[,-1],as.numeric)
mon[,grepl("Number_of_Visitors",names(mon))] <- sapply(mon[,grepl("Number_of_Visitors", names(mon))],as.integer)
write_xlsx(mon,"term_project/income_months.xlsx")
saveRDS(mon, file = "term_project/income_months.rds")
kable(head(mon))
Months | Tourism_Income_in_ThousandDollars_2012 | Number_of_Visitors_2012 | Average_expenditure_per_capita_2012 | Tourism_Income_in_ThousandDollars_2013 | Number_of_Visitors_2013 | Average_expenditure_per_capita_2013 | Tourism_Income_in_ThousandDollars_2014 | Number_of_Visitors_2014 | Average_expenditure_per_capita_2014 | Tourism_Income_in_ThousandDollars_2015 | Number_of_Visitors_2015 | Average_expenditure_per_capita_2015 | Tourism_Income_in_ThousandDollars_2016 | Number_of_Visitors_2016 | Average_expenditure_per_capita_2016 | Tourism_Income_in_ThousandDollars_2017 | Number_of_Visitors_2017 | Average_expenditure_per_capita_2017 | Tourism_Income_in_ThousandDollars_2018 | Number_of_Visitors_2018 | Average_expenditure_per_capita_2018 | Tourism_Income_in_ThousandDollars_2019 | Number_of_Visitors_2019 | Average_expenditure_per_capita_2019 | Tourism_Income_in_ThousandDollars_2020 | Number_of_Visitors_2020 | Average_expenditure_per_capita_2020 | Tourism_Income_in_ThousandDollars_2021 | Number_of_Visitors_2021 | Average_expenditure_per_capita_2021 | Tourism_Income_in_ThousandDollars_2022 | Number_of_Visitors_2022 | Average_expenditure_per_capita_2022 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Total | 29689249 | 36463920 | 814.2089 | 33073502 | 39226225 | 843.1477 | 35137949 | 41415070 | 848.4339 | 32492212 | 41617530 | 780.7338 | 22839468 | 31365329 | 728.1756 | 27044542 | 38620345 | 700.2667 | 30545924 | 45628672 | 669.4458 | 38930474 | 51860042 | 750.6834 | 14817273.3 | 15826266 | 936.2457 | 30173587.5 | 29357463 | 1027.7995 | 0 | 0 | 0.0000 |
January | 1143894 | 1374400 | 832.2854 | 1469297 | 1466127 | 1002.1615 | 1540396 | 1575399 | 977.7813 | 1666096 | 1762004 | 945.5687 | 1442336 | 1691287 | 852.8040 | 1168279 | 1568343 | 744.9123 | 1537993 | 2045340 | 751.9495 | 1755674 | 2226287 | 788.6106 | 2085858.3 | 2529422 | 824.6380 | 854474.5 | 829931 | 1029.5719 | 2259265 | 2158066 | 1046.8934 |
February | 1052891 | 1209064 | 870.8315 | 1401129 | 1415328 | 989.9672 | 1461263 | 1523244 | 959.3096 | 1462829 | 1564925 | 934.7592 | 1214408 | 1517503 | 800.2669 | 1013689 | 1432341 | 707.7149 | 1324531 | 1806821 | 733.0722 | 1505062 | 1944956 | 773.8280 | 1682608.0 | 2051922 | 820.0153 | 722423.9 | 727125 | 993.5335 | 1870892 | 1851394 | 1010.5316 |
March | 1375023 | 1635696 | 840.6349 | 1837104 | 1892369 | 970.7956 | 1869525 | 1967114 | 950.3896 | 1861352 | 2017645 | 922.5370 | 1497145 | 1898762 | 788.4848 | 1260527 | 1844076 | 683.5546 | 1641208 | 2270019 | 722.9929 | 1865797 | 2473146 | 754.4225 | 895925.7 | 1058067 | 846.7564 | 1059070.8 | 1043410 | 1015.0086 | 2430853 | 2442196 | 995.3553 |
April | 1763753 | 2231942 | 790.2322 | 2004636 | 2418962 | 828.7172 | 2158634 | 2573138 | 838.9108 | 1923638 | 2626663 | 732.3503 | 1394602 | 2049238 | 680.5466 | 1415774 | 2278537 | 621.3523 | 1873123 | 2870568 | 652.5267 | 2287216 | 3266255 | 700.2564 | 175638.2 | 0 | 0.0000 | 1198135.0 | 1179561 | 1015.7460 | 2525296 | 2921440 | 864.4012 |
May | 2466505 | 3194546 | 772.0985 | 3074218 | 3717734 | 826.9064 | 3229089 | 3863882 | 835.7109 | 2806667 | 3775012 | 743.4854 | 1895207 | 2749648 | 689.2543 | 1965730 | 3095281 | 635.0730 | 2506235 | 3790524 | 661.1842 | 3024127 | 4219837 | 716.6453 | 196363.2 | 0 | 0.0000 | 1035085.8 | 1025559 | 1009.2889 | 3611055 | 4078424 | 885.4043 |
This was the final step of the processing. Data sets are ready to be explored.