2  Preprocessing the Data

Published

January 9, 2023

The Data Sets

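Four data sets are used in the project: Number of Departing Visitors by Country of Residence; Foreign and Citizen Visitors by Purpose of Visit; Tourism Income, Expenditure and Average Number of Nights; and Tourism Income, Number of Visitors and Average Expenditure per Capita by Months.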

These data sets were taken from TUIK (the Turkish Statistical Institute), the official source of statistics for Turkey on diverse topics. Link to their website. Links for the data sets that are used here are in link1, link2, link3 and link4.

The data sets are processed before any calculation or visualization, both to avoid errors and to make them easier to work with.

They contain multiple header rows, column names written in two languages and other explanatory text that has no place in a data frame. There may also be missing values, stray symbols and incorrect data types. These are examined and fixed as well.

Processing starts by loading the necessary libraries.

Code
library(readxl)
library(knitr)
library(dplyr)
library(tidyr)
library(reshape2)
library(writexl)

2.1 Number of Departing Visitors by Country of Residence

Code
dep <- readxl::read_excel("term_project/Number of Departing Visitors by Country of Residence.xls", skip = 4)
glimpse(dep)
Rows: 47
Columns: 11
$ Nationality <chr> "A.B.D. - USA", "Almanya - Germany", "Avusturya - Austria"…
$ `2012`      <dbl> 883408, 7305228, 832019, 663758, 838895, 2643699, 1514894,…
$ `2013`      <dbl> 856728, 7378650, 916069, 715578, 859199, 2738368, 1640259,…
$ `2014`      <dbl> 888077, 7794762, 802133, 726078, 809843, 2818021, 1701021,…
$ `2015`      <dbl> 833850, 8402180, 836755, 650569, 814868, 2776057, 1826947,…
$ `2016`      <dbl> 505989, 6960545, 677284, 656685, 627223, 1957576, 1710276,…
$ `2017`      <dbl> 331239, 7117716, 578074, 792883, 651702, 1951637, 1854683,…
$ `2018`      <dbl> 468281, 8022883, 620002, 916429, 751660, 2575768, 2387679,…
$ `2019`      <dbl> 626298, 8861124, 652020, 933291, 755681, 2978764, 2719962,…
$ `2020(1)`   <dbl> 148914, 2903189, 238682, 250087, 219196, 1122967, 1190803,…
$ `2021`      <dbl> 365211, 6314266, 516081, 512215, 459091, 473681, 1339552, …

The first 4 rows are skipped so that the proper header row is read. A similar step removes the descriptions at the bottom of the data frame.

Code
dep <- head(dep, -15)
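A negative n in head() drops rows from the end instead of keeping rows from the start. The sketch below uses a small made-up data frame (names and values are illustrative, not taken from the TUIK file) to show the idea, and how tail() can be used to look at the rows before they are discarded.

```r
# Toy data frame: three data rows followed by two footnote rows (illustrative values)
toy <- data.frame(
  Nationality = c("A - A", "B - B", "C - C", "Note 1", "Note 2"),
  `2012` = c(10, 20, 30, NA, NA),
  check.names = FALSE
)

# Inspect the rows that are about to be dropped
footer <- tail(toy, 2)

# head() with a negative n keeps everything except the last |n| rows
trimmed <- head(toy, -2)
nrow(trimmed)
```

Checking the footer first guards against dropping real observations if the number of note rows in the source file ever changes.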

The second problem is in the “Nationality” column. The countries are written in two languages, separated by a dash, which can be used to drop one of the languages. The string “ - ” is passed to the separate function so that the surrounding white space is removed as well.

Code
dep <- dep %>%
  separate(Nationality, c(NA,"Nationality")," - ")
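Passing NA as the first target name makes separate() discard the piece before the separator. As a rough base R equivalent (an alternative sketch, not the code used in this project), the same effect can be obtained with sub(); the sample labels are taken from the output above.

```r
labels <- c("A.B.D. - USA", "Almanya - Germany", "Avusturya - Austria")

# Remove everything up to and including the first " - "
english <- sub("^.*? - ", "", labels, perl = TRUE)
english
```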

The column name for the year 2020 carries a “(1)” suffix. It is a footnote marker saying that the column only contains data for the first, third and last quarters of the year, because the surveys could not be carried out (COVID-19).

It will be removed from the column name.

Code
names(dep)[names(dep) == "2020(1)"] <- "2020"
kable(head(dep, 3))
Nationality 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021
USA 883408 856728 888077 833850 505989 331239 468281 626298 148914 365211
Germany 7305228 7378650 7794762 8402180 6960545 7117716 8022883 8861124 2903189 6314266
Austria 832019 916069 802133 836755 677284 578074 620002 652020 238682 516081

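Lastly, the year columns are converted to integers. Because read_excel() already parsed them as doubles, this conversion cannot raise coercion warnings, so empty cells have to be checked separately, for example with anyNA(dep). The data frame is then saved as an Excel file and as an RDS file, both in its wide form and in a long (melted) form.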

Code
dep <- dep %>%
    mutate(across("2012":"2021", as.integer))
write_xlsx(dep,"term_project/depart_by_residence.xlsx")
saveRDS(dep, file = "term_project/depart_by_residence.rds")

melted_dep <- melt(dep, "Nationality")
colnames(melted_dep) <- c('Nationality','Dep_Year','Departing_Visitors')

write_xlsx(melted_dep,"term_project/melted_depart_by_residence.xlsx")
saveRDS(melted_dep, file = "term_project/melted_depart_by_residence.rds")

glimpse(dep)
Rows: 32
Columns: 11
$ Nationality <chr> "USA", "Germany", "Austria", "Azerbaijan", "Belgium", "Uni…
$ `2012`      <int> 883408, 7305228, 832019, 663758, 838895, 2643699, 1514894,…
$ `2013`      <int> 856728, 7378650, 916069, 715578, 859199, 2738368, 1640259,…
$ `2014`      <int> 888077, 7794762, 802133, 726078, 809843, 2818021, 1701021,…
$ `2015`      <int> 833850, 8402180, 836755, 650569, 814868, 2776057, 1826947,…
$ `2016`      <int> 505989, 6960545, 677284, 656685, 627223, 1957576, 1710276,…
$ `2017`      <int> 331239, 7117716, 578074, 792883, 651702, 1951637, 1854683,…
$ `2018`      <int> 468281, 8022883, 620002, 916429, 751660, 2575768, 2387679,…
$ `2019`      <int> 626298, 8861124, 652020, 933291, 755681, 2978764, 2719962,…
$ `2020`      <int> 148914, 2903189, 238682, 250087, 219196, 1122967, 1190803,…
$ `2021`      <int> 365211, 6314266, 516081, 512215, 459091, 473681, 1339552, …
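The export chunk above also writes a long version of the table with melt(). The base R sketch below mimics that reshaping on a two-row excerpt (values copied from the output above) to show the resulting layout. One detail worth knowing is that melt() returns the former column names as a factor, so a numeric year needs as.integer(as.character(...)); the sketch reproduces that step with a hand-made factor. The helper column Dep_Year_int is hypothetical and does not exist in the exported files.

```r
wide <- data.frame(
  Nationality = c("USA", "Germany"),
  `2012` = c(883408L, 7305228L),
  `2013` = c(856728L, 7378650L),
  check.names = FALSE
)

year_cols <- setdiff(names(wide), "Nationality")

# One row per (Nationality, year) pair, with the same column names as melted_dep
long <- data.frame(
  Nationality = rep(wide$Nationality, times = length(year_cols)),
  Dep_Year = factor(rep(year_cols, each = nrow(wide)), levels = year_cols),
  Departing_Visitors = unlist(wide[year_cols], use.names = FALSE)
)

# Converting a factor of years: go through character first,
# otherwise as.integer() returns the level codes (1, 2, ...)
long$Dep_Year_int <- as.integer(as.character(long$Dep_Year))
long
```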

2.2 Foreign and Citizen Visitors by Purpose of Visit

The second data set is processed next. Steps that repeat the previous section are not explained in detail, for the sake of readability. Unusable rows are excluded again.

This time, the column names are checked first. Only the first 5 are shown.

Code
purp <- readxl::read_excel("term_project/Foreign and Citizen Visitors by Purpose of Visit (Foreigner and Citizens Resident Abroad).xls", skip = 5)
purp <- head(purp, -9)
colnames(purp)[1:5]
[1] "Yıl"                                                                                                        
[2] "Çeyrek"                                                                                                     
[3] "Toplam \nTotal...3"                                                                                         
[4] "Gezi, eğlence, sportif ve kültürel faaliyetler \nTravel, entertainment, sportive or cultural activities...4"
[5] "Akraba ve arkadaş ziyareti \nVisiting relatives and friends...5"                                            

These column names are chaotic enough to confuse the user. Several things can be done.

First of all, they are given in two languages again, mostly separated by the newline character “\n”. The year and quarter columns appear twice, once per language; the duplicates will be dropped.

In this chunk, everything up to the first “\n” is removed where there is one, the duplicated columns are dropped, some other characters are removed and spaces are replaced with “_”.

Code
names(purp)[3:23] <- sub(".*?\n", "", names(purp)[3:23])
names(purp) <- gsub(r"{\s*\([^\)]+\)}","",names(purp))
names(purp) <- trimws(names(purp), "l")
names(purp) <- gsub(" / ", "_", names(purp))
names(purp) <- gsub(" ", "_", names(purp))
names(purp)[3:7] <- substr(names(purp)[3:7],1,nchar(names(purp)[3:7])-4)
names(purp)[8:23] <- substr(names(purp)[8:23],1,nchar(names(purp)[8:23])-5)
names(purp) <- gsub(",", "", names(purp))
purp <- subset(purp, select = -c(Yıl,Çeyrek))
purp <- purp[,-11]
names(purp)[1:5]
[1] "Total"                                               
[2] "Travel_entertainment_sportive_or_cultural_activities"
[3] "Visiting_relatives_and_friends"                      
[4] "Education_training"                                  
[5] "Health_or_medical_reasons"                           
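The two substr() calls above assume that the "...N" suffix added by read_excel() is four characters long for columns 3 to 7 and five characters long for columns 8 to 23. Columns 8 and 9 carry the four-character suffixes "...8" and "...9", so they lose one extra letter, which is why ALL_Religion_Pilgrimag and ALL_Shoppin appear in the later output. A pattern-based alternative, sketched here on a few sample names (the last three are assumed from the output rather than copied from the file), removes the suffix whatever its length.

```r
raw_names <- c("Total...3", "Religion_Pilgrimage...8", "Shopping...9", "Transit...10")

# Drop a trailing "..." followed by digits, regardless of how many digits there are
clean_names <- sub("\\.\\.\\.[0-9]+$", "", raw_names)
clean_names
```

Anchoring the pattern with $ keeps any dots that occur earlier in a name untouched.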

Some column names are still duplicated, yet the columns hold different values. The first set gives the values for all tourists, while the second set gives the values for Turkish citizens who live abroad. They can be distinguished with prefixes.

Code
names(purp)[1:10] <- paste0("ALL_", names(purp)[1:10])
names(purp)[11:20] <- paste0("TR_", names(purp)[11:20])
purp <- purp %>% select(Year, Quarter, ALL_Total:TR_Other)
glimpse(purp)
Rows: 54
Columns: 22
$ Year                                                     <dbl> 2012, NA, NA,…
$ Quarter                                                  <chr> "Annual", "I"…
$ ALL_Total                                                <chr> "36463921.041…
$ ALL_Travel_entertainment_sportive_or_cultural_activities <chr> "24953961", "…
$ ALL_Visiting_relatives_and_friends                       <chr> "6792033", "1…
$ ALL_Education_training                                   <chr> "231152", "51…
$ ALL_Health_or_medical_reasons                            <chr> "240682", "63…
$ ALL_Religion_Pilgrimag                                   <chr> "73510", "112…
$ ALL_Shoppin                                              <chr> "934204", "14…
$ ALL_Transit                                              <chr> "45194", "161…
$ ALL_Business                                             <chr> "2224844", "5…
$ ALL_Other                                                <chr> "968339", "22…
$ TR_Total                                                 <chr> "5121457", "8…
$ TR_Travel_entertainment_sportive_or_cultural_activities  <chr> "1083976", "1…
$ TR_Visiting_relatives_and_friends                        <chr> "3645145", "5…
$ TR_Education_training                                    <chr> "21768", "621…
$ TR_Health_or_medical_reasons                             <chr> "67151", "239…
$ TR_Religion_Pilgrimage                                   <chr> "5973", "1775…
$ TR_Shopping                                              <chr> "26330", "931…
$ TR_Transit                                               <chr> "-", "-", "-"…
$ TR_Business                                              <chr> "244774", "68…
$ TR_Other                                                 <chr> "26340", "399…
Code
kable(head(purp))
Year Quarter ALL_Total ALL_Travel_entertainment_sportive_or_cultural_activities ALL_Visiting_relatives_and_friends ALL_Education_training ALL_Health_or_medical_reasons ALL_Religion_Pilgrimag ALL_Shoppin ALL_Transit ALL_Business ALL_Other TR_Total TR_Travel_entertainment_sportive_or_cultural_activities TR_Visiting_relatives_and_friends TR_Education_training TR_Health_or_medical_reasons TR_Religion_Pilgrimage TR_Shopping TR_Transit TR_Business TR_Other
2012 Annual 36463921.041000001 24953961 6792033 231152 240682 73510 934204 45194 2224844 968339 5121457 1083976 3645145 21768 67151 5973 26330 - 244774 26340
NA I 4219162 2005504 1168263 51489 63843 11203 148181 16131 532073 222474 844430 130458 599931 6212 23962 1775 9318 - 68779 3994
NA II 9323459 6752489 1189394 88929 58283 28866 244106 10525 654678 296190 911152 199092 586250 7134 16861 649 9079 - 81063 11025
NA III 15437123 11439809 3072660 43643 44905 14690 204199 13743 353468 250005 2276006 499200 1722066 5164 8951 1377 4131 - 26509 8607
NA IV 7484177 4756160 1361716 47091 73652 18751 337718 4795 684625 199669 1089869 255226 736898 3258 17377 2172 3801 - 68422 2715
2013 Annual 39226225.794699997 26817201 7239397 195918 300102 62762 1000734 41172 2404344 1164596 5398751.7854000004 1322033 3655078 20330 90579 4681 39986 - 255493 10571

There are still problems with the data. The Year column is filled only on the first row of each year and should be repeated on the rows that follow. The data types are wrong as well. Lastly, the Transit column for Turkish citizens contains only dashes; it is simpler to store it as zeros.

Missing values in the Year column are filled downward.

Code
purp <- purp %>% fill(Year)
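fill() carries the last non-missing value downward. A base R sketch of the same idea, on a made-up Year vector shaped like the one above (not the project's code):

```r
year <- c(2012, NA, NA, NA, NA, 2013, NA)

# Index of the most recent non-missing entry at each position
last_seen <- cummax(seq_along(year) * !is.na(year))

# Assumes the first element is not NA; otherwise last_seen starts at 0
# and the leading positions would be silently dropped
filled <- year[last_seen]
filled
```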

The Transit column for Turkish citizens is set to zero.

Code
purp$TR_Transit <- 0

The annual values for 2022 are empty, and the second quarter of 2020 is filled with dashes because no survey was carried out, as noted earlier. Dashes occur elsewhere in the data set as well. All of them will be replaced with zeros.

Code
kable(filter(purp, rowSums(is.na(purp)) > 0 | (Year == 2020 & Quarter == "II")))
Year Quarter ALL_Total ALL_Travel_entertainment_sportive_or_cultural_activities ALL_Visiting_relatives_and_friends ALL_Education_training ALL_Health_or_medical_reasons ALL_Religion_Pilgrimag ALL_Shoppin ALL_Transit ALL_Business ALL_Other TR_Total TR_Travel_entertainment_sportive_or_cultural_activities TR_Visiting_relatives_and_friends TR_Education_training TR_Health_or_medical_reasons TR_Religion_Pilgrimage TR_Shopping TR_Transit TR_Business TR_Other
2020 II - - - - - - - - - - - - - - - - - 0 - -
2022 Annual NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0 NA NA
Code
purp <- data.frame(lapply(purp, gsub, pattern = "-", replacement = 0))
purp[is.na(purp)] <- 0
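Note that gsub() with the pattern "-" replaces every hyphen, including one inside a longer value such as a negative number or a hyphenated label. If only cells that consist of a lone dash should become zero, the comparison can be made exact. A minimal sketch on a made-up data frame (column names and values are illustrative):

```r
toy <- data.frame(
  Period = c("Annual", "II", "Jan-Mar"),
  Value  = c("120", "-", NA),
  stringsAsFactors = FALSE
)

# Replace a cell only when the whole cell is a dash
toy[] <- lapply(toy, function(col) {
  col[!is.na(col) & trimws(col) == "-"] <- "0"
  col
})

# Missing cells are set to "0" as well, matching the approach above
toy[is.na(toy)] <- "0"
toy
```

The hyphen in "Jan-Mar" survives, whereas a blanket gsub() would have turned it into "Jan0Mar".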

Lastly, the columns can be converted to their correct types and the data exported.

Code
purp[,-2] <- sapply(purp[,-2],as.integer)
write_xlsx(purp,"term_project/purposes.xlsx")
saveRDS(purp, file = "term_project/purposes.rds")
glimpse(purp)
Rows: 54
Columns: 22
$ Year                                                     <int> 2012, 2012, 2…
$ Quarter                                                  <chr> "Annual", "I"…
$ ALL_Total                                                <int> 36463921, 421…
$ ALL_Travel_entertainment_sportive_or_cultural_activities <int> 24953961, 200…
$ ALL_Visiting_relatives_and_friends                       <int> 6792033, 1168…
$ ALL_Education_training                                   <int> 231152, 51489…
$ ALL_Health_or_medical_reasons                            <int> 240682, 63843…
$ ALL_Religion_Pilgrimag                                   <int> 73510, 11203,…
$ ALL_Shoppin                                              <int> 934204, 14818…
$ ALL_Transit                                              <int> 45194, 16131,…
$ ALL_Business                                             <int> 2224844, 5320…
$ ALL_Other                                                <int> 968339, 22247…
$ TR_Total                                                 <int> 5121457, 8444…
$ TR_Travel_entertainment_sportive_or_cultural_activities  <int> 1083976, 1304…
$ TR_Visiting_relatives_and_friends                        <int> 3645145, 5999…
$ TR_Education_training                                    <int> 21768, 6212, …
$ TR_Health_or_medical_reasons                             <int> 67151, 23962,…
$ TR_Religion_Pilgrimage                                   <int> 5973, 1775, 6…
$ TR_Shopping                                              <int> 26330, 9318, …
$ TR_Transit                                               <int> 0, 0, 0, 0, 0…
$ TR_Business                                              <int> 244774, 68779…
$ TR_Other                                                 <int> 26340, 3994, …
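Because these columns were read as text, as.integer() has to parse each string. Values with a decimal part are truncated toward zero, and anything that is not a number becomes NA with a coercion warning, which makes the warning a useful signal of leftover symbols. A short illustration (the first value is taken from the output above, the others are made up):

```r
x <- c("36463921.041000001", "4219162", "-")

# Strings that do not parse as numbers become NA and trigger a warning;
# the warning is suppressed here only to keep the example quiet
parsed <- suppressWarnings(as.integer(x))
parsed

# Count the cells that failed to parse
sum(is.na(parsed))
```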

2.3 Tourism Income, Expenditure and Average Number of Nights

This data set is quite similar to the previous one.

Code
night <- readxl::read_excel("term_project/Tourism Income, Expenditure and Average Number of Nights.xls", skip = 4)
night <- head(night, -9)
colnames(night)
 [1] "Yıl\nYear"                                                                                                                
 [2] "Yıllık-Annual\nÇeyrek-Quarter"                                                                                            
 [3] "Turizm  geliri\nTourism income\n( 000 $)"                                                                                 
 [4] "Ziyaretçi sayısı\nNumber of departing \nvisitors"                                                                         
 [5] "Kişi başı ortalama \nharcama\nAverage expenditure per capita\n($)...5"                                                    
 [6] "Ortalama geceleme sayısı \nAverage number of overnights...6"                                                              
 [7] "...7"                                                                                                                     
 [8] "Turizm  gideri\nTourism expenditure\n( 000 $)"                                                                            
 [9] "\nTürkiye'de ikamet eden yurt dışını ziyaret eden vatandaş sayısı\nNumber of citizens (resident in Turkey) visited abroad"
[10] "Kişi başı ortalama \nharcama\nAverage expenditure per capita\n($)...10"                                                   
[11] "Ortalama geceleme sayısı \nAverage number of overnights...11"                                                             

The usual suspects could be eliminated with similar methods. Yet these column names have so many individual quirks, and there are so few of them, that it is shorter to rename most of them manually.

Code
night <- night[,-7]
names(night) <- sub(".*?\n", "", names(night))
colnames(night)[2] <- "Quarter"
colnames(night)[3] <- "Tourism_Income_in_ThousandDollars"
colnames(night)[4] <- "Number_of_Departing_Visitors"
colnames(night)[5] <- "ALL_Average_expenditure_per_capita_in_Dollars"
colnames(night)[6] <- "ALL_Average_number_of_overnights"
colnames(night)[7] <- "Tourism_expenditure_in_ThousandDollars"
colnames(night)[8] <- "Number_of_Turkish_citizens_visited_abroad"
colnames(night)[9] <- "TR_Average_expenditure_per_capita_in_Dollars"
colnames(night)[10] <- "TR_Average_number_of_overnights"

names(night)
 [1] "Year"                                         
 [2] "Quarter"                                      
 [3] "Tourism_Income_in_ThousandDollars"            
 [4] "Number_of_Departing_Visitors"                 
 [5] "ALL_Average_expenditure_per_capita_in_Dollars"
 [6] "ALL_Average_number_of_overnights"             
 [7] "Tourism_expenditure_in_ThousandDollars"       
 [8] "Number_of_Turkish_citizens_visited_abroad"    
 [9] "TR_Average_expenditure_per_capita_in_Dollars" 
[10] "TR_Average_number_of_overnights"              

Current status of the data frame:

Code
kable(tail(night))
Year Quarter Tourism_Income_in_ThousandDollars Number_of_Departing_Visitors ALL_Average_expenditure_per_capita_in_Dollars ALL_Average_number_of_overnights Tourism_expenditure_in_ThousandDollars Number_of_Turkish_citizens_visited_abroad TR_Average_expenditure_per_capita_in_Dollars TR_Average_number_of_overnights
NA III 14126732 13640672.334207579 1035.6331193861683 11.36096473622375 584378.93482789095 873026.66579242004 669.37123197430367 18.717043940553779
NA IV 9306804 9050112.2298819609 1028.3634228612639 13.23171319647536 696183.34344183875 1188802.7701180396 585.61719482930937 16.496943017713928
2022 Yıllık-Annual NA NA NA NA NA NA NA NA
NA I 6561011 6451656.9355938099 1016.949711910887 12.455328498538254 664989.01667372952 1039666.0644061896 639.61789216765612 17.470795565958081
NA II 10515168 11939130.535165597 880.73147956867979 9.440213100360511 1057787.2055852062 1666135.4648344023 634.87467130431673 16.310156750210435
NA III 17952361 21000127.519801684 854.8691370122466 9.7038686869594368 1106285.4788077581 2072116.4801983174 533.89154971726202 9.8233256160661639

Missing values in the Year column are filled downward again.

Code
night <- night %>% fill(Year)

The annual rows are labeled in two languages in the Quarter column. They are relabeled in English only. This has to happen before the dashes are replaced below, because the label itself contains one.

Code
night$Quarter[night$Quarter == "Yıllık-Annual"] <- "Annual"

Next, missing values and dashes are filled with zeros.

Code
night <- data.frame(lapply(night, gsub, pattern = "-", replacement = 0))
night[is.na(night)] <- 0

Lastly, data types are changed and the data frame is exported.

Code
night[,-2] <- sapply(night[,-2],as.numeric)
night[, 1] <- sapply(night[,1],as.integer)
night[, 4] <- sapply(night[,4],as.integer)
night[, 8] <- sapply(night[,8],as.integer)

write_xlsx(night,"term_project/income_nights.xlsx")
saveRDS(night, file = "term_project/income_nights.rds")

kable(head(night))
Year Quarter Tourism_Income_in_ThousandDollars Number_of_Departing_Visitors ALL_Average_expenditure_per_capita_in_Dollars ALL_Average_number_of_overnights Tourism_expenditure_in_ThousandDollars Number_of_Turkish_citizens_visited_abroad TR_Average_expenditure_per_capita_in_Dollars TR_Average_number_of_overnights
2012 Annual 29689249 36463921 814.2089 10.820622 4593390 5802949 791.5612 12.500000
2013 Annual 33073502 39226225 843.1477 10.203777 5253565 7525869 698.0675 13.085486
2014 Annual 35137949 41415070 848.4339 9.986969 5470481 7982263 685.3295 12.901660
2015 Annual 32492212 41617530 780.7338 10.065018 5698423 8750851 651.1850 11.942026
2016 Annual 22839468 31365329 728.1756 11.353400 5049793 7891909 639.8697 10.997575
2017 Annual 27044542 38620345 700.2667 10.864540 5137244 8886916 578.0681 9.872048

2.4 Tourism Income, Number of Visitors and Average Expenditure per Capita by Months

This data set has two header rows. The first one holds the years and is skipped here.

Code
mon <- readxl::read_excel("term_project/Tourism income, number of visitors and average expenditure per capita by months.xls", skip = 4)
mon <- head(mon, -9)
colnames(mon)[1:7]
[1] "Aylar - Months"                                                          
[2] "Turizm geliri Tourism\nincome\n(000 $)...2"                              
[3] "Ziyaretçi\nsayısı\nNumber of\nvisitors...3"                              
[4] "Kişi başı \nortalama\nharcama\nAverage\nexpenditure\nper capita\n($)...4"
[5] "Turizm geliri Tourism\nincome\n(000 $)...5"                              
[6] "Ziyaretçi\nsayısı\nNumber of\nvisitors...6"                              
[7] "Kişi başı \nortalama\nharcama\nAverage\nexpenditure\nper capita\n($)...7"

Column names are edited again.

Code
names(mon)[1] <- "Months"
# Columns 2 to 34 repeat the same three measures for each year from 2012 to 2022
measures <- c("Tourism_Income_in_ThousandDollars",
              "Number_of_Visitors",
              "Average_expenditure_per_capita")
names(mon)[2:34] <- paste0(rep(measures, times = 11), "_", rep(2012:2022, each = 3))

names(mon)[1:7]
[1] "Months"                                
[2] "Tourism_Income_in_ThousandDollars_2012"
[3] "Number_of_Visitors_2012"               
[4] "Average_expenditure_per_capita_2012"   
[5] "Tourism_Income_in_ThousandDollars_2013"
[6] "Number_of_Visitors_2013"               
[7] "Average_expenditure_per_capita_2013"   

The Months column is reduced to its English labels.

Code
mon <- mon %>%
  separate(Months, c(NA,"Months")," - ")

Dashes and null values are filled with zeros.

Code
mon <- data.frame(lapply(mon, gsub, pattern = "-", replacement = 0))
mon[is.na(mon)] <- 0

Lastly, data types are changed and the data frame is exported.

Code
mon[,-1] <- sapply(mon[,-1],as.numeric)
mon[,grepl("Number_of_Visitors",names(mon))]<-sapply(mon[,grepl("Number_of_Visitors", names(mon))],as.integer)


write_xlsx(mon,"term_project/income_months.xlsx")
saveRDS(mon, file = "term_project/income_months.rds")

kable(head(mon))
Months Tourism_Income_in_ThousandDollars_2012 Number_of_Visitors_2012 Average_expenditure_per_capita_2012 Tourism_Income_in_ThousandDollars_2013 Number_of_Visitors_2013 Average_expenditure_per_capita_2013 Tourism_Income_in_ThousandDollars_2014 Number_of_Visitors_2014 Average_expenditure_per_capita_2014 Tourism_Income_in_ThousandDollars_2015 Number_of_Visitors_2015 Average_expenditure_per_capita_2015 Tourism_Income_in_ThousandDollars_2016 Number_of_Visitors_2016 Average_expenditure_per_capita_2016 Tourism_Income_in_ThousandDollars_2017 Number_of_Visitors_2017 Average_expenditure_per_capita_2017 Tourism_Income_in_ThousandDollars_2018 Number_of_Visitors_2018 Average_expenditure_per_capita_2018 Tourism_Income_in_ThousandDollars_2019 Number_of_Visitors_2019 Average_expenditure_per_capita_2019 Tourism_Income_in_ThousandDollars_2020 Number_of_Visitors_2020 Average_expenditure_per_capita_2020 Tourism_Income_in_ThousandDollars_2021 Number_of_Visitors_2021 Average_expenditure_per_capita_2021 Tourism_Income_in_ThousandDollars_2022 Number_of_Visitors_2022 Average_expenditure_per_capita_2022
Total 29689249 36463920 814.2089 33073502 39226225 843.1477 35137949 41415070 848.4339 32492212 41617530 780.7338 22839468 31365329 728.1756 27044542 38620345 700.2667 30545924 45628672 669.4458 38930474 51860042 750.6834 14817273.3 15826266 936.2457 30173587.5 29357463 1027.7995 0 0 0.0000
January 1143894 1374400 832.2854 1469297 1466127 1002.1615 1540396 1575399 977.7813 1666096 1762004 945.5687 1442336 1691287 852.8040 1168279 1568343 744.9123 1537993 2045340 751.9495 1755674 2226287 788.6106 2085858.3 2529422 824.6380 854474.5 829931 1029.5719 2259265 2158066 1046.8934
February 1052891 1209064 870.8315 1401129 1415328 989.9672 1461263 1523244 959.3096 1462829 1564925 934.7592 1214408 1517503 800.2669 1013689 1432341 707.7149 1324531 1806821 733.0722 1505062 1944956 773.8280 1682608.0 2051922 820.0153 722423.9 727125 993.5335 1870892 1851394 1010.5316
March 1375023 1635696 840.6349 1837104 1892369 970.7956 1869525 1967114 950.3896 1861352 2017645 922.5370 1497145 1898762 788.4848 1260527 1844076 683.5546 1641208 2270019 722.9929 1865797 2473146 754.4225 895925.7 1058067 846.7564 1059070.8 1043410 1015.0086 2430853 2442196 995.3553
April 1763753 2231942 790.2322 2004636 2418962 828.7172 2158634 2573138 838.9108 1923638 2626663 732.3503 1394602 2049238 680.5466 1415774 2278537 621.3523 1873123 2870568 652.5267 2287216 3266255 700.2564 175638.2 0 0.0000 1198135.0 1179561 1015.7460 2525296 2921440 864.4012
May 2466505 3194546 772.0985 3074218 3717734 826.9064 3229089 3863882 835.7109 2806667 3775012 743.4854 1895207 2749648 689.2543 1965730 3095281 635.0730 2506235 3790524 661.1842 3024127 4219837 716.6453 196363.2 0 0.0000 1035085.8 1025559 1009.2889 3611055 4078424 885.4043

This was the final step of the processing. The data sets are now ready to be explored.